About – v2.0
This is a tool used to discover endpoints (and potential parameters) for a given target. It can find them by:
- crawling a target (pass a domain/URL)
- crawling multiple targets (pass a file of domains/URLs)
- searching files in a given directory (pass a directory name)
- getting them from a Burp project (pass the location of a Burp XML file)
- getting them from an OWASP ZAP project (pass the location of a ZAP ASCII message file)
The Python script is based on the link finding capabilities of my Burp extension GAP. As a starting point, I took the amazing tool LinkFinder by Gerben Javado and used its Regex for finding links, with additional improvements to find even more.
Installation
xnLinkFinder supports Python 3.
$ git clone https://github.com/xnl-h4ck3r/xnLinkFinder.git
$ cd xnLinkFinder
$ sudo python setup.py install
Usage
Arg | Long Arg | Description |
---|---|---|
-i | --input | Input a: URL, text file of URLs, a Directory of files to search, a Burp XML output file or an OWASP ZAP output file. |
-o | --output | The file to save the Links output to, including path if necessary (default: output.txt). If set to cli then output is only written to STDOUT. If the file already exists it will just be appended to (and de-duplicated) unless option -ow is passed. |
-op | --output-params | The file to save the Potential Parameters output to, including path if necessary (default: parameters.txt). If set to cli then output is only written to STDOUT (but not piped to another program). If the file already exists it will just be appended to (and de-duplicated) unless option -ow is passed. |
-ow | --output-overwrite | If the output file already exists, it will be overwritten instead of being appended to. |
-sp | --scope-prefix | Any links found starting with / will be prefixed with scope domains in the output instead of the original link. If the passed value is a valid file name, that file will be used, otherwise the string literal will be used. |
-spo | --scope-prefix-original | If argument -sp is passed, then this determines whether the original link starting with / is also included in the output (default: false) |
-sf | --scope-filter | Will filter output links to only include them if the domain of the link is in the scope specified. If the passed value is a valid file name, that file will be used, otherwise the string literal will be used. |
-c | --cookies † | Add cookies to pass with HTTP requests. Pass in the format 'name1=value1; name2=value2;' |
-H | --headers † | Add custom headers to pass with HTTP requests. Pass in the format 'Header1: value1; Header2: value2;' |
-ra | --regex-after | RegEx for filtering purposes against found endpoints before output (e.g. /api/v[0-9].[0-9]* ). If it matches, the link is output. |
-d | --depth † | The level of depth to search. For example, if a value of 2 is passed, then all links initially found will then be searched for more links (default: 1). This option is ignored for Burp files because they can be huge and consume lots of memory. It is also advisable to use the -sp (--scope-prefix) argument to ensure a request to links found without a domain can be attempted. |
-p | --processes † | Basic multithreading is done when getting requests for a URL, or file of URLs (not a Burp file). This argument determines the number of processes (threads) used (default: 25) |
-x | --exclude | Additional Link exclusions (to the list in config.yml) in a comma separated list, e.g. careers,forum |
-orig | --origin | Whether you want the origin of the link to be in the output. Displayed as LINK-URL [ORIGIN-URL] in the output (default: false) |
-t | --timeout † | How many seconds to wait for the server to send data before giving up (default: 10 seconds) |
-inc | --include | Include input (-i) links in the output (default: false) |
-u | --user-agent † | What User Agents to get links for, e.g. -u desktop mobile |
-insecure | † | Whether TLS certificate checks should be disabled when making requests (default: false) |
-s429 | † | Stop when > 95% of responses return 429 Too Many Requests (default: false) |
-s403 | † | Stop when > 95% of responses return 403 Forbidden (default: false) |
-sTO | † | Stop when > 95% of requests time out (default: false) |
-sCE | † | Stop when > 95% of requests have connection errors (default: false) |
-m | --memory-threshold | The memory threshold percentage. If the machine's memory goes above the threshold, the program will be stopped and ended gracefully before running out of memory (default: 95) |
-mfs | --max-file-size † | The maximum file size (in bytes) of a file to be checked if -i is a directory. If the file size is over, it will be ignored (default: 500 MB). Setting to 0 means no files will be ignored, regardless of size. |
-replay-proxy | † | For active link finding with URL (or file of URLs), replay the requests through this proxy. |
-ascii-only | | Whether links and parameters will only be added if they only contain ASCII characters. This can be useful when you know the target is likely to use ASCII characters and you also get a lot of false positives from binary files for some reason. |
-v | --verbose | Verbose output |
-vv | --vverbose | Increased verbose output |
-h | --help | show the help message and exit |
† NOT RELEVANT FOR INPUT OF DIRECTORY, BURP XML FILE OR OWASP ZAP FILE
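The example pattern given for -ra leaves its dot unescaped, so it matches more than version-style paths. The sketch below is not xnLinkFinder's code: it assumes the pattern is applied with a "match anywhere in the link" search, and the link list and the `filter_links` helper are made up for illustration.

```python
import re

# Hypothetical endpoints, for illustration only.
found_links = [
    "/api/v1.0/users",
    "/api/v2x3/orders",
    "/static/app.js",
    "https://target.com/api/v10",
]

def filter_links(links, pattern):
    """Keep only the links in which the pattern matches somewhere (assumed behaviour)."""
    compiled = re.compile(pattern)
    return [link for link in links if compiled.search(link)]

# Unescaped dot: "." matches any single character.
loose = filter_links(found_links, r"/api/v[0-9].[0-9]*")
# Escaped dot: only a literal "." is accepted after the first digit.
strict = filter_links(found_links, r"/api/v[0-9]\.[0-9]*")

print(loose)   # ['/api/v1.0/users', '/api/v2x3/orders', 'https://target.com/api/v10']
print(strict)  # ['/api/v1.0/users']
```

If only paths such as /api/v1.0 are wanted, escaping the dot is probably the safer choice.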
config.yml
The config.yml file has the keys which can be updated to suit your needs:
- linkExclude – A comma separated list of strings (e.g. .css,.jpg,.jpeg etc.) that all links are checked against. If a link includes any of the strings then it will be excluded from the output. If the input is a directory, then file names are checked against this list.
- contentExclude – A comma separated list of strings (e.g. text/css,image/jpeg,image/jpg etc.) that all responses' Content-Type headers are checked against. Any responses with these content types will be excluded and not checked for links.
- fileExtExclude – A comma separated list of strings (e.g. .zip,.gz,.tar etc.) that all files in Directory mode are checked against. If a file has one of those extensions it will not be searched for links.
- regexFiles – A list of file types separated by a pipe character (e.g. php|php3|php5 etc.). These are used in the Link Finding Regex when there are findings that aren't obvious links, but are interesting file types that you want to pick out. If you add to this list, ensure you escape any dots to ensure correct regex, e.g. js\.map
- respParamLinksFound † – Whether to get potential parameters from links found in responses: True or False
- respParamPathWords † – Whether to add path words in retrieved links as potential parameters: True or False
- respParamJSON † – If the MIME type of the response contains JSON, whether to add JSON Key values as potential parameters: True or False
- respParamJSVars † – Whether javascript variables set with var, let or const are added as potential parameters: True or False
- respParamXML † – If the MIME type of the response contains XML, whether to add XML attributes values as potential parameters: True or False
- respParamInputField † – If the MIME type of the response contains HTML, whether to add NAME and ID attributes of any INPUT fields as potential parameters: True or False
- respParamMetaName † – If the MIME type of the response contains HTML, whether to add NAME attributes of any META tags as potential parameters: True or False
† IF THESE ARE NOT FOUND IN THE CONFIG FILE THEY WILL DEFAULT TO True
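To make the linkExclude and regexFiles descriptions concrete, here is a minimal Python sketch. It is not taken from xnLinkFinder: `is_excluded` is a hypothetical helper, the substring check is assumed to be case-sensitive, and the two regexes only illustrate why dots in regexFiles entries should be escaped.

```python
import re

def is_excluded(link, exclusions):
    """Return True if the link contains any of the comma separated exclusion strings."""
    terms = [term.strip() for term in exclusions.split(",") if term.strip()]
    return any(term in link for term in terms)

link_exclude = ".css,.jpg,.jpeg"  # example value in the style of linkExclude

print(is_excluded("https://target.com/styles/main.css", link_exclude))  # True
print(is_excluded("https://target.com/api/users", link_exclude))        # False

# Illustrative alternations in the style of regexFiles (not the tool's real regex).
escaped = re.compile(r"\.(?:php|js\.map)$")   # dot escaped: literal "js.map" only
unescaped = re.compile(r"\.(?:php|js.map)$")  # dot unescaped: also matches "jsxmap"

print(bool(escaped.search("app.jsxmap")))    # False
print(bool(unescaped.search("app.jsxmap")))  # True
```

The unescaped variant accepts a file name that merely resembles a source map, which is the kind of false positive the escaping advice avoids.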
Examples
Find Links from a specific target – Basic
python3 xnLinkFinder.py -i target.com
Find Links from a specific target – Detailed
Ideally, provide scope prefix (-sp) with the primary domain (including schema), and a scope filter (-sf) to filter the results only to relevant domains (this can be a file of in-scope domains or a literal scope string). Also, you can pass cookies and custom headers to ensure you find links only available to authorised users. Specifying the User Agent (-u desktop mobile) will first search for all links using desktop User Agents, and then try again using mobile user agents. There could be specific endpoints that are related to the user agent given. Giving a depth value (-d) will keep sending requests to links found on the previous depth search to find more links.
python3 xnLinkFinder.py -i target.com -sp target_prefix.txt -sf target_scope.txt -spo -inc -vv -H 'Authorization: Bearer XXXXXXXXXXXXXX' -c 'SessionId=MYSESSIONID' -u desktop mobile -d 10
Find Links from a list of URLs – Basic
If you have a file of JS file URLs for example, you can look for links in those:
python3 xnLinkFinder.py -i target_js.txt
Find Links from files in a directory – Basic
If you have files, e.g. JS files, HTTP responses, etc., you can look for links in those:
python3 xnLinkFinder.py -i ~/Tools/waymore/results/target.com
NOTE: Sub directories are also checked. The -mfs option can be specified to skip files over a certain size.
Find Links from a Burp project – Basic
In Burp, select the items you want to search by highlighting the scope for example, right clicking and selecting Save selected items. Ensure that the base64-encode requests and responses option is checked before saving. To get all links from the file (even with HUGE files, you'll be able to get all the links):
python3 xnLinkFinder.py -i target_burp.xml
NOTE: xnLinkFinder makes the assumption that if the first line of the file passed with -i begins with <?xml then you are trying to process a Burp file.
Find Links from a Burp project – Detailed
Ideally, provide scope prefix (-sp) with the primary domain (including schema), and a scope filter (-sf) to filter the results only to relevant domains.
python3 xnLinkFinder.py -i target_burp.xml -o target_burp.txt -sp https://www.target.com -sf target.* -ow -spo -inc -vv
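How a wildcard scope filter narrows results can be sketched in Python. This is only an illustration under stated assumptions, not xnLinkFinder's implementation: it treats each scope entry as a shell-style wildcard, compares it against the link's host name and each of its parent domains, and treats links without a host as out of scope. The `in_scope` helper is hypothetical.

```python
from fnmatch import fnmatch
from urllib.parse import urlparse

def in_scope(link, scope_patterns):
    """Return True if the link's host, or one of its parent domains, matches a scope pattern."""
    host = urlparse(link).hostname
    if not host:
        # Assumption for this sketch only: links without a domain are not in scope.
        return False
    labels = host.split(".")
    # For www.target.com the candidates are: www.target.com, target.com, com
    candidates = [".".join(labels[i:]) for i in range(len(labels))]
    return any(fnmatch(candidate, pattern)
               for candidate in candidates
               for pattern in scope_patterns)

scope = ["target.*", "static.target-cdn.com"]
print(in_scope("https://www.target.com/account", scope))       # True
print(in_scope("https://static.target-cdn.com/a.js", scope))   # True
print(in_scope("https://evil.com/redirect", scope))            # False
```

Checking parent domains is a design choice made here so that a pattern without a subdomain still covers www and similar hosts. The real matching rules may differ.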
Find Links from an OWASP ZAP project – Basic
In ZAP, select the items you want to search by highlighting the History for example, clicking menu Report and selecting Export Messages to File... . This will let you save an ASCII text file of all requests and responses you want to search. To get all links from the file (even with HUGE files, you'll be able to get all the links):
python3 xnLinkFinder.py -i target_zap.txt
NOTE: xnLinkFinder makes the assumption that if the first line of the file passed with -i is in the format ==== 99 ========== for example, then you are trying to process an OWASP ZAP ASCII text file.
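The two NOTEs about first-line detection (Burp XML and ZAP text) can be pictured with a short Python sketch. The function name and the exact regular expression are assumptions made for illustration. The README only says that the ZAP line looks like ==== 99 ==========, so the tool's real check may be stricter or looser.

```python
import re

# Assumed shape of a ZAP export's first line: equals signs, a number, more equals signs.
ZAP_FIRST_LINE = re.compile(r"^=+ \d+ =+$")

def guess_input_type(first_line):
    """Classify an input file from its first line (hypothetical helper)."""
    line = first_line.strip()
    if line.startswith("<?xml"):
        return "burp"
    if ZAP_FIRST_LINE.match(line):
        return "zap"
    return "other"

print(guess_input_type('<?xml version="1.0"?>'))  # burp
print(guess_input_type("==== 99 =========="))     # zap
print(guess_input_type("https://target.com"))     # other
```

One consequence of this kind of check is that any XML file would be treated as a Burp export, so it is worth passing only genuine Burp output with -i.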
Piping to other Tools
You can pipe xnLinkFinder to other tools. Any errors are sent to stderr and any links found are sent to stdout. The output file is still created in addition to the links being piped to the next program. However, potential parameters are not piped to the next program, but they are still written to file. For example:
python3 xnLinkFinder.py -i redbull.com -sp https://redbull.com -sf redbull.* -d 3 | unfurl keys | sort -u
You can also pass the input through stdin instead of -i.
cat redbull_subs.txt | python3 xnLinkFinder.py -sp https://redbull.com -sf redbull.* -d 3
NOTE: You can't pipe in a Burp or ZAP file, these must be passed using -i.
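If unfurl is not available, the "unique query keys" step of the pipeline can be approximated in Python. This is a rough stand-in, not a reimplementation of unfurl: `unique_param_keys` is a hypothetical helper and the sample links are invented.

```python
from urllib.parse import parse_qsl, urlsplit

def unique_param_keys(lines):
    """Collect the sorted, de-duplicated query-string keys from an iterable of URLs."""
    keys = set()
    for line in lines:
        url = line.strip()
        if not url:
            continue
        # keep_blank_values=True keeps keys that have no value, e.g. "?debug"
        for key, _value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
            keys.add(key)
    return sorted(keys)

# Invented sample of piped links, one per line.
sample = [
    "https://redbull.com/search?q=cans&page=2\n",
    "https://redbull.com/api/items?id=7&page=1\n",
    "\n",
]
print(unique_param_keys(sample))  # ['id', 'page', 'q']
```

Because the function accepts any iterable of lines, `sys.stdin` can be passed to it in a script placed at the end of a pipe.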
Recommendations and Notes
- Always use the Scope Prefix argument -sp. This can be one scope domain, or a file containing multiple scope domains. Below are examples of the format used (no path should be included, and no wildcards used. Schema is optional, but will default to http):
  http://www.target.com
  https://target-payments.com
  https://static.target-cdn.com
  If a link is found that has no domain, e.g. /path/to/example.js then passing -sp http://www.target.com will result in the output http://www.target.com/path/to/example.js and if Depth (-d) is >1 then a request will be able to be made to that URL to search for more links. If a file of domains is passed using -sp then the output will include each domain followed by /path/to/example.js and increase the chance of finding more links.
- If you use -sp but still want the original link of /path/to/example.js (without a domain) additionally returned in the output, then pass the argument -spo.
- Always use the Scope Filter argument -sf. This will ensure that only relevant domains are returned in the output, and more importantly if Depth (-d) is >1 then out of scope targets will not be searched for links or parameters. This can be one scope domain, or a file containing multiple scope domains. Below are examples of the format used (no schema or path should be included):
  target.*
  target-payments.com
  static.target-cdn.com
  THIS IS FOR FILTERING THE LINKS DOMAIN ONLY.
- If you want to filter the final output in any way, use -ra. It's always a good idea to use https://regex101.com/ to check your Regex expression is going to do what you expect.
- Use the -v option to have a better idea of what the tool is doing.
- If you have problems, use the -vv option which will show errors that are occurring, which can possibly be resolved, or you can raise as an issue on github.
- Pass cookies (-c), headers (-H) and regex (-ra) values within single quotes, e.g. -ra '/api/v[0-9].[0-9]*'
- Set the -o option to give a specific output file name for Links, rather than the default of output.txt. If you plan on running a large depth of searches, start with 2 with option -v to check what is being returned. Then you can increase the Depth, and the new output will be appended to the existing file, unless you pass -ow.
- Set the -op option to give a specific output file name for Potential Parameters, rather than the default of parameters.txt. Any output will be appended to the existing file, unless you pass -ow.
- If using a high Depth (-d) be wary of some sites using dynamic links, so it will just keep finding new ones. If no new links are being found, then xnLinkFinder will stop searching. Providing the Stop flags (-s429, -s403, -sTO, -sCE) should also be considered.
- If you are finding a large number of links (especially if the Depth (-d) value is high), and have limited resources, the program will stop when it reaches the memory Threshold (-m) value and end gracefully with data intact before getting killed.
- If you decide to cancel xnLinkFinder (using Ctrl-C) in the middle of running, be patient and any gathered data will be saved before ending gracefully.
- Using the -orig option will show the URL where the link was found. This can mean you have duplicate links in the output if the same link was found on multiple sources, but it will be suffixed with the origin URL in square brackets.
- When making requests, xnLinkFinder will use a random User-Agent from the current group, which defaults to desktop. If you have a target that could have different links for different user agent groups, then specify -u desktop mobile for example (separate with a space). The mobile user agent option is a combination of mobile-apple, mobile-android and mobile-windows.
- When -i has been set to a directory, the contents of the files in the root of that directory will be searched for links. Files in sub-directories are not searched. Any files that are over the size set by -mfs (default: 500 MB) will be skipped.
- When using the -replay-proxy option, sometimes requests can take longer. If you start seeing more Request Timeout errors (you'll see errors if you use -v or -vv options) then consider using -t to raise the timeout limit.
- If you know a target will only have ASCII characters in links and parameters then consider passing -ascii-only. This can eliminate a number of false positives that can sometimes get returned from binary data.
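The Scope Prefix behaviour described in the first two notes can be sketched as follows. This is an illustration rather than the tool's code: `apply_scope_prefix` is a hypothetical helper, and placing the original link last when it is kept is an arbitrary choice made here.

```python
def apply_scope_prefix(link, scope_domains, keep_original=False):
    """Prefix a link starting with "/" with every scope domain (schema defaults to http)."""
    if not link.startswith("/"):
        return [link]  # links that already carry a domain are left untouched
    # Note: protocol-relative links such as //cdn.example.com/a.js are not special-cased here.
    results = []
    for domain in scope_domains:
        prefix = domain if "://" in domain else "http://" + domain
        results.append(prefix.rstrip("/") + link)
    if keep_original:
        results.append(link)  # mirrors the idea behind -spo
    return results

print(apply_scope_prefix("/path/to/example.js", ["http://www.target.com"]))
# ['http://www.target.com/path/to/example.js']
print(apply_scope_prefix("/path/to/example.js",
                         ["https://target-payments.com", "static.target-cdn.com"],
                         keep_original=True))
# ['https://target-payments.com/path/to/example.js',
#  'http://static.target-cdn.com/path/to/example.js',
#  '/path/to/example.js']
```

With several scope domains, each relative link fans out into one candidate URL per domain, which is why a deeper search (-d greater than 1) has more to request.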
Points
If you come across any problems at all, or have ideas for improvements, please feel free to raise an issue on Github. If there is a problem, it will be useful if you can provide the exact command you ran and a detailed description of the problem. If possible, run with -vv to reproduce the problem and let me know about any error messages that are given.
TODO
- I seem to have completed all the TODO's I originally had! If you think of any that need adding, let me know
Example output
Active link finding for a domain:
…
Piped input and output:
Good luck and good hunting! If you really love the tool (or any others), or they helped you find an awesome bounty, consider BUYING ME A COFFEE! ☕ (I could use the caffeine!)
/XNL-h4ck3r