Getting access to internal source code of multiple organizations

Arpit Kubadia
Aug 31, 2020

The story started with me looking through various Shodan dorks for something interesting, until I came across this issue on the awesome-shodan-queries repo on GitHub. It described how misconfigured SonarQube instances can expose internal source code.
That is more or less what this blog post is about, so you can skip it and just try out this dork: http.favicon.hash:"1485257654"
However, in the sections ahead, I will share my approach to identify the companies whose source code I found, and how I scanned the results on a (relatively) large scale.

According to Wikipedia:

SonarQube is an open-source platform developed by SonarSource for continuous inspection of code quality to perform automatic reviews with static analysis of code to detect bugs, code smells, and security vulnerabilities on 20+ programming languages.

This is a goldmine for a bug bounty hunter since you are not only finding critical internal source code that can have all sorts of things like API keys, IP addresses, etc, but also an analysis of the vulnerabilities and bugs in the code.

This is what an open SonarQube instance looks like

The way to find these is simple:

  1. Use a Shodan dork to find candidate instances. Two that I know of are product:sonarqube and http.favicon.hash:"1485257654"

  2. Of the thousands of results that come up, open them individually to see whether they expose any code.

Next comes the interesting, and personally my favorite part: Identifying the owner
This isn’t very straightforward because, in my experience, the instances that have a hostname or SSL certificate are generally secured behind a login screen. So there needs to be another way to identify the owner. I’ll go from the least intrusive method to some slightly more intrusive ones. Remember that if you have no idea whom something belongs to, it’s best to be cautious and use a VPN whenever possible.

Methods to identify the owner:

  1. Less Intrusive:
    a. Reverse whois the IP of the instance
    b. Check for any other web services running on other ports
  2. Medium Intrusive (most effective):
    This requires you to actually navigate a bit through the SonarQube instance to identify the owner. Go to the Issues tab, and on the left-hand side expand the list of Authors. There you will find the people who have authored any of the bugs/issues/code smells on the instance. In most cases, the authors are listed by their company email IDs, which lets you identify the company and then check whether it has an RVDP/bug bounty program.

Thanks to Nitesh Surana for the next tip: you can also go to the /api/users/search endpoint and enumerate the usernames and names of people who have login IDs on a particular SonarQube instance.

  3. More Intrusive:
    This should be done only in rare cases where you already have a good idea of who owns the code and want additional information to support it. It requires actually diving into the code to look for clues that help identify the company. Some things that I like to do are:
    a. Looking for license files
    b. Looking at the project names and Googling them
    Apart from this, just looking through the bugs and code can also give clues, such as commented-out credentials, links, and URLs.
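
Once the author addresses from method 2 have been copied out of the Authors list, counting their email domains is a quick way to see which company dominates. Below is a minimal Python sketch of that idea. The function name `rank_owner_domains`, the sample addresses, and the list of free-mail providers are all assumptions made for illustration; none of them come from SonarQube or from my original workflow.

```python
from collections import Counter

# Hypothetical set of personal mail providers to ignore; extend as needed.
FREE_MAIL = {"gmail.com", "yahoo.com", "outlook.com", "hotmail.com"}


def rank_owner_domains(authors):
    """Count email domains among author strings, most common first."""
    counts = Counter()
    for author in authors:
        # Entries without an "@" are plain usernames and carry no domain.
        _, sep, domain = author.strip().lower().rpartition("@")
        if sep and domain and domain not in FREE_MAIL:
            counts[domain] += 1
    return counts.most_common()


# Illustrative input only; real author lists would be pasted in by hand.
sample = ["alice@example.com", "Bob@Example.com", "carol@gmail.com", "dave"]
print(rank_owner_domains(sample))
```

A domain that clearly leads the count is only a hint about the owner, so it is still worth confirming it through one of the other methods before reporting.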

Personally, I have used all the above techniques in the past month and have reported over two dozen issues to various Fortune 500 companies, government organizations, and banks. While I don’t have explicit permission to give any names at the moment, I will keep updating this section whenever I can share more. (There might be a disclosure here soon!)
Edit: https://hackerone.com/reports/947946

(Semi) Automation:

While doing this, I ran into some annoying issues. Shodan only allows 200 search results on the free tier of its GUI, and it is slow to click and check links individually in the UI. I tried some clever search dorking to limit the results (like only showing results from a particular city on a particular port), but it was still very inconvenient. So this is the methodology I hacked together to do it somewhat more efficiently.

Note that it can still be optimized. For example, I have not done any scripting or automated scraping. I have just moved the searching to the Shodan CLI.

Steps and Scripts:

shodan download sonar-favicon http.favicon.hash:"1485257654" --limit 7000
mkdir usa
cd usa
shodan parse -f location.country_code:US --fields ip_str,port --separator : ../sonar-favicon.json.gz | httpx -o hosts_usa

This downloads all the results, extracts the IP and port in ip:port format for a particular country (USA in my case), and then passes them through httpx to see which hosts are alive.
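
If you would rather do the filtering in a script than with `shodan parse`, the downloaded file can be read directly. This is a rough Python sketch, assuming the download is a gzip-compressed file with one JSON banner per line and that each banner carries `ip_str`, `port`, and `location.country_code` (the same fields used in the command above). The function name `hosts_for_country` is my own and is not part of the Shodan CLI.

```python
import gzip
import json


def hosts_for_country(path, country_code):
    """Yield "ip:port" strings for banners located in the given country."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            banner = json.loads(line)
            # Some banners may have no location data at all.
            location = banner.get("location") or {}
            if location.get("country_code") == country_code:
                yield f"{banner['ip_str']}:{banner['port']}"
```

Printing every item from `hosts_for_country("sonar-favicon.json.gz", "US")` should give the same kind of ip:port list that is piped into httpx above.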

Next, I used a bulk URL opener to open multiple URLs at once and quickly go through them. This little exercise considerably increased my speed.
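
The bulk-opening step can also be approximated locally. The sketch below is not the tool I used; it is a small Python stand-in that splits the lines of a file such as hosts_usa into fixed-size batches and opens one batch in the default browser with the standard `webbrowser` module. The names `batches` and `open_batch` are hypothetical.

```python
import webbrowser
from itertools import islice


def batches(lines, size):
    """Yield lists of up to `size` non-empty, stripped lines."""
    cleaned = (line.strip() for line in lines if line.strip())
    while True:
        batch = list(islice(cleaned, size))
        if not batch:
            return
        yield batch


def open_batch(urls):
    """Open each URL of one batch in a new browser tab."""
    for url in urls:
        webbrowser.open_new_tab(url)
```

Opening one batch at a time, rather than the whole file, keeps the number of tabs manageable while reviewing.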

So far, I have only checked for SonarQubes in a few countries (USA, India, Netherlands). There is a goldmine out there! Let me know if things mentioned in this article helped you land any good stuff ;-)

You can follow me on Twitter for more!
