GitDorker — A Better Tool to Perform GitHub Dorks and Snag Easy Bug Bounty Wins

Omar Bheda
Sep 27, 2020

Intro & Why GitDorker?

Automated tools such as gitrob (michenriksen) or GitGot (BishopFox) are great for a quick scan of sensitive information that may be hidden in a target’s GitHub environment. However, these automated secret-discovery tools are far from perfect.

Sensitive secrets stored in a target’s GitHub environment are often overlooked, and thus missing from the tool output, because of the limitations of automated scanning (regex, entropy searches, etc.). On the flip side, automated tools can output too much information, making it difficult to discern true secrets from a sea of false positives.

Enter GitDorker, an easy-to-use tool written in Python. It uses a compiled list of GitHub dorks from various sources across the bug bounty community to perform manual dorking against a user-supplied query such as a GitHub organization, user, or domain name of the intended target. These dorks map out the potential surface for exposure of secrets by giving the user a list of successful dorks, the number of results returned per dork, and a URL for each one, which makes it easy to search manually for secrets across a target’s public GitHub environment.

You can view the current compiled list of more than 230 GitHub dorks, which I will continue to update, here:

https://github.com/obheda12/GitDorker/

Lastly, before I dive into the use cases and explanation of GitDorker, I’d like to thank Gwendell Le Coguic, who wrote “github-dorks”, the script on which I based GitDorker. He also has a fantastic repo of GitHub searching tools available here:

https://github.com/gwen001/github-search

In this post, I will outline use cases and demonstrate how to use GitDorker to map the attack surface for sensitive information exposure in your intended target’s GitHub environment. For this demonstration we will be using Tesla as our target.

Setup

To download GitDorker, run the following command in your terminal of choice:

git clone https://github.com/obheda12/GitDorker

To install the requirements use the following command:

pip3 install -r requirements.txt

Lastly, GitDorker requires a GitHub personal access token, supplied with the “-t” switch, or with the “-tf” switch if you are using a file of multiple tokens. You may follow the documentation below to create your own access token.

https://docs.github.com/en/free-pro-team@latest/github/authenticating-to-github/creating-a-personal-access-token

NOTE: GitHub rate limits accounts to 30 queries every minute. To stay within this limit, I have added a sleep after every 30 requests. I highly encourage using a file of tokens, one per line, which GitDorker will round-robin through to perform dorks and increase the number of queries you can make per minute.
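The round-robin idea in the note above can be sketched in a few lines of Python. This is a minimal illustration and not GitDorker's actual code: the function names, the 60-second window, and the injectable `sleep` argument are all assumptions made for the example. It assigns one token to each request in rotation and pauses only after every token has used up its 30-request budget.

```python
import itertools
import time


def load_tokens(lines):
    """Return the non-empty, stripped tokens from an iterable of lines."""
    return [line.strip() for line in lines if line.strip()]


def schedule_requests(tokens, n_requests, per_token_limit=30,
                      window_seconds=60, sleep=time.sleep):
    """Assign a token to each request, rotating through the tokens in order.

    With k tokens, k * per_token_limit requests fit into one window, so a
    pause is only needed after that many requests have been issued.
    All names and defaults here are hypothetical.
    """
    if not tokens:
        raise ValueError("at least one token is required")
    budget = len(tokens) * per_token_limit
    rotation = itertools.cycle(tokens)
    assigned = []
    for i in range(n_requests):
        if i > 0 and i % budget == 0:
            sleep(window_seconds)  # wait for the rate-limit window to reset
        assigned.append(next(rotation))
    return assigned
```

With two tokens, the pause happens after every 60 requests instead of every 30, which is why supplying more tokens shortens a run.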

Demo and Use Cases

For our demo we will be performing dorks on Tesla as our target to enumerate potential sources of sensitive information exposure. We will first identify Tesla’s GitHub organization account name to use in our query. A quick google search for “tesla github” gives us the organization account, teslamotors.

To view the help menu of GitDorker, use the following command:

python3 GitDorker.py -h

We will be using the “-org” switch to build our query. This switch specifies usage of one of GitHub’s advanced search parameters for code searches across an organization. We will add Tesla’s GitHub org account, teslamotors, to the “-org” switch to build out our query as shown below.

python3 GitDorker.py -tf *tokensfile* -org teslamotors

Alternatively you may also do the same using the “-q” switch, which allows more dynamic querying using GitHub’s advanced search parameters (linked below). The “-q” switch has much more functionality (such as searching on a domain) and is covered towards the end of this post.

python3 GitDorker.py -tf *tokensfile* -q org:teslamotors

NOTE: GitDorker will NOT run correctly unless you specify a dork file. This is covered in the steps below.

The query built thus far is org:teslamotors.

To perform dorks against our query, we specify a list of dorks using the “-d” switch. In addition, we will output our results using the “-o” switch. The command is now as follows:

python3 GitDorker.py -tf *tokensfile* -org teslamotors -d *dorksfile* -o *outputfile*

The dorks file contains a list of dorks, one per line. Each dork is appended to the query built using the “-org” switch. For example, the query org:teslamotors combined with the dork password produces a search for org:teslamotors password.
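As a rough sketch of how a query and a dork combine into one search, the snippet below builds a GitHub code-search URL from the two parts. The function name and the exact URL layout are assumptions for illustration; the links GitDorker generates may be formatted differently.

```python
from urllib.parse import urlencode


def build_search_url(query, dork):
    """Combine a base query and one dork into a GitHub code-search URL.

    Hypothetical helper: the query and the dork are joined with a space,
    and urlencode takes care of escaping characters such as ':'.
    """
    params = {"q": f"{query} {dork}", "type": "Code"}
    return "https://github.com/search?" + urlencode(params)


print(build_search_url("org:teslamotors", "password"))
# https://github.com/search?q=org%3Ateslamotors+password&type=Code
```

Opening a URL like this in a browser while logged in to GitHub shows the same results you would get by typing the combined query into the search bar.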

Below is an example of the standard output in a terminal using a file of dorks.

Standard Output

Below is the corresponding CSV output opened in Excel and filtered by number of results for easy analysis.

CSV Output

As you can see, a URL for a custom search query is generated for each dork, along with the number of results for reference. We will visit the link generated for the “ftp” dork and analyze our results.
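For readers curious how a per-dork result count can be obtained, here is a hedged sketch against GitHub's REST code-search endpoint, whose JSON response includes a `total_count` field. The helper names are hypothetical and this is not taken from GitDorker's source; the request is only constructed here, not sent.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_URL = "https://api.github.com/search/code"


def build_request(query, dork, token):
    """Build (but do not send) an authenticated code-search request."""
    url = API_URL + "?" + urlencode({"q": f"{query} {dork}"})
    headers = {
        "Authorization": f"token {token}",  # personal access token
        "Accept": "application/vnd.github+json",
    }
    return urllib.request.Request(url, headers=headers)


def result_count(response_body):
    """Extract the number of matches from a search response body."""
    return json.loads(response_body).get("total_count", 0)
```

Passing the request to `urllib.request.urlopen` and feeding the response body to `result_count` would give the number shown next to each dork.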

Our result for the “ftp” dork produced a large surface of potential code to analyze. Given the size of Tesla’s GitHub environment, looking through all of the code and commit history would take some time.

While starting here could prove fruitful, I advise analyzing dork results in order from fewest results to most, or by degree of sensitivity (subjective), to optimize your time spent searching.
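The triage order described above is easy to reproduce outside of Excel. The sketch below assumes each result is a (dork, count) pair; apart from the single “connectionstring” hit, the counts are made-up placeholders and not real output from the Tesla run.

```python
def prioritize(results):
    """Drop dorks with no hits and order the rest from fewest to most results."""
    hits = [row for row in results if row[1] > 0]
    return sorted(hits, key=lambda row: row[1])


# Placeholder data: only the connectionstring count comes from the demo.
sample = [
    ("ftp", 1200),
    ("connectionstring", 1),
    ("password", 340),
    ("api_key", 0),
]

for dork, count in prioritize(sample):
    print(f"{count:>6}  {dork}")
```

Dorks with zero results are dropped because there is nothing to review, and the smallest result sets come first since they are the quickest to check by hand.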

On the lower end of our results, the “connectionstring” dork produced one result. Taking a look, we see a single result with “connectionstring” apparent in the code. This would be an ideal place to start digging into the commit history.

Result Analysis

In our case, Tesla is a fairly mature target and does not seem to have any sensitive information in this instance or in its commit history. However, other less mature targets are much more likely to be hiding secrets.

Enumerating the potential locations of sensitive information exposure on GitHub makes manual GitHub dorking a far smoother process and greatly increases the likelihood of findings. The core purpose of GitDorker is to identify and map where sensitive information may be hiding, to enhance your manual search for sensitive information exposure and give you detailed insight into your target’s GitHub environment.

Other Example Use Cases

Below are additional use cases for GitDorker using Tesla as our example target:

Utilizing a domain as a search term

You may search on a domain or any search term per the GitHub search guidelines I mentioned earlier in my post. For this example we will utilize the “-q” switch and “tesla.com” as our target domain.

python3 GitDorker.py -tf tokensfile.txt -q tesla.com -d dorks/demo_dorks.txt

Utilizing “tesla.com” as a target domain

Utilizing a user or multiple users as a target with threading

You may search on a user or a list of users for sensitive information in their repositories. For this example, we will use a file containing users listed on Tesla’s teslamotors GitHub page as our targets.

First we will visit Tesla’s teslamotors GitHub page and identify users on the People tab. Gwen001 has also written a script to scrape users automatically in his “github-search” repository, which I linked earlier.

We will target 3 public users and put their GitHub usernames into a text file, one per line. For this example I am using a shorter list of dorks, since every dork is run once per user.
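To see why a shorter dork list helps, the sketch below expands users and dorks into individual queries and estimates a lower bound on run time from the 30-queries-per-minute limit mentioned earlier. The usernames are placeholders, and the use of GitHub's `user:` search qualifier is an assumption about how the user file is turned into queries.

```python
import itertools
import math


def build_user_queries(users, dorks):
    """Pair every user with every dork using GitHub's user: qualifier."""
    return [f"user:{user} {dork}"
            for user, dork in itertools.product(users, dorks)]


def min_minutes(n_queries, n_tokens, per_token_limit=30):
    """Lower bound on run time if each token allows 30 queries per minute."""
    return math.ceil(n_queries / (n_tokens * per_token_limit))


users = ["user_one", "user_two", "user_three"]  # placeholder usernames
dorks = ["password", "ftp"]

queries = build_user_queries(users, dorks)
print(len(queries))        # 3 users x 2 dorks = 6 queries
print(min_minutes(690, 4))  # 3 users x 230 dorks with 4 tokens
```

The query count grows as users times dorks, so the full list of 230 dorks against 3 users already means 690 searches.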

Tesla GitHub Users

We will now run the following command to dork 3 GitHub users, using a thread count of 2 and a tokens file containing 4 unique GitHub access tokens.

python3 GitDorker.py -tf *tokensfile* -uf *userfile* -d dorks/*DORK_FILE*

These are only a few use cases. The advantages of GitDorker depend on how you choose to perform advanced querying to pinpoint sensitive information and gain further insight into a target’s publicly facing GitHub environment.

To date, I’ve personally found a decent amount of success with this tool on bug bounty targets. If it works well for you, great! I’d love to hear about any wins or successes you have. If you think this modified tool sucks, then let me know how to improve it. I am more than open to feedback :)

Feel free to follow me on Twitter for updates on the new tools and research insights I plan to share in the future:

Twitter

Other platforms to connect with me on

GitHub

LinkedIn

Omar Bheda

Amateur bug bounty hunter, tool developer, and offensive security researcher.