Using Twitter to notify careless developers — the unorthodox way (Or, how you could use GitHub to compromise 9.5K Twitter accounts without “hacking”)
If you are reading this, you probably already know how critical it is NOT to hardcode credentials into source code, especially if that code is going to be committed to a public GitHub repo.
Certainly, GitHub knows this too, and they provide secret scanning for all public repositories, as part of the GitHub Advanced Security suite.
Specifically, GitHub docs state that:
GitHub scans repositories for known secret formats to prevent fraudulent use of credentials that were committed accidentally. Secret scanning happens by default on public repositories, and can be enabled on private repositories by repository administrators or organization owners. As a service provider, you can partner with GitHub so that your secret formats are included in our secret scanning.
From the above, it is evident that GitHub's secret scanning is capable of identifying secrets that conform to a well-defined format.
But what if this is not the case?
One major example of a service provider whose secrets lack such a format is Twitter, which, notably, is not included in the “Secret scanning partner program”. The Twitter API uses a typical 3-legged OAuth 1.0a flow to authorize applications to perform actions on behalf of users, with a set of token credentials comprising an access token and an access token secret.
Here is the catch: the access tokens generated by Twitter do not have a well-defined structure. But that shouldn’t pose a serious challenge to GitHub Advanced Security, right?
We could only think of two logical reasons why GitHub does not scan code for Twitter tokens:
- It is exceptionally rare for Twitter Access Tokens/Secrets to be hardcoded into source code, so the impact of committing these secrets to a public GitHub repository is negligible (i.e. no malicious actors will ever bother to misuse them).
- The secret scanning provided as part of GitHub’s “Advanced” Security is, in fact, not very advanced. (Any decent entropy-based secret scanning tool would be able to catch Twitter secrets.)
So, at IncognitaTech, we decided to test both hypotheses.
Detecting Twitter tokens and secrets with the intelligent contextual analysis implemented by GoldDigger, our upcoming secret scanning solution, was a simple task. While assembling the initial version of the PinataHub dataset, we had already detected approximately 25K unique Twitter Access Token/Secret combinations.
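To make the entropy-based idea mentioned above concrete, here is a minimal sketch in Python. It is not GoldDigger's method, which is not described here. The candidate pattern, the 20-character minimum, and the 4.0 bits-per-character threshold are all assumptions chosen for illustration, and the sample string is a made-up placeholder rather than a real token.

```python
import math
import re
from collections import Counter
from typing import List

# Assumption: candidate secrets are runs of 20+ URL-safe characters.
CANDIDATE = re.compile(r"[A-Za-z0-9_\-]{20,}")


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def find_high_entropy_strings(source: str, threshold: float = 4.0) -> List[str]:
    """Return candidate strings whose entropy meets the assumed threshold."""
    return [
        candidate
        for candidate in CANDIDATE.findall(source)
        if shannon_entropy(candidate) >= threshold
    ]


# Made-up sample: one random-looking placeholder and one repetitive string.
sample = (
    'token = "aB3dE5gH7jK9mN1pQ2rS4tU6vW8xY0zc"\n'
    'padding = "aaaaaaaaaaaaaaaaaaaaaaaa"\n'
)
flagged = find_high_entropy_strings(sample)
```

Entropy alone produces false positives (hashes, UUIDs, minified assets), which is presumably why contextual analysis matters in a real scanner.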
We deliberately selected Twitter to raise awareness of the underlying issues. Clearly, it is the easiest way to reach a wide audience. However, many of the associated accounts may have already been used in malicious campaigns: Twitter account takeovers are used on a daily basis to spread disinformation, crypto scams, and false advertisements, among other things.
For credential leaks, though, raw quantities alone do not mean much. There is a chance that all of these tokens are dummy or non-functional.
But is this really the case? Thankfully, in the case of Twitter, we were able to quickly verify our findings using the Tweepy library and collect some interesting statistics.
Lo and behold, it turned out that around 9,500 of these access tokens are valid and working.
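A validity check of this kind can be sketched with Tweepy roughly as follows. This is an illustration under assumptions, not the script we ran: it assumes Tweepy 4.x, the `TokenSet` type and both function names are hypothetical, and every credential value is a placeholder. Note that OAuth 1.0a requests are also signed with the application's consumer key and secret, so those are needed alongside the access token pair.

```python
from typing import Callable, Iterable, NamedTuple, Optional


class TokenSet(NamedTuple):
    """OAuth 1.0a credentials (hypothetical container; values are placeholders)."""
    consumer_key: str
    consumer_secret: str
    access_token: str
    access_token_secret: str


def verify_with_tweepy(creds: TokenSet) -> Optional[str]:
    """Return the account's screen name if the tokens work, else None."""
    import tweepy  # third-party (assumes Tweepy 4.x); imported lazily

    auth = tweepy.OAuth1UserHandler(
        creds.consumer_key,
        creds.consumer_secret,
        creds.access_token,
        creds.access_token_secret,
    )
    api = tweepy.API(auth)
    try:
        # Read-only call: asks the API which user the tokens belong to.
        user = api.verify_credentials()
    except tweepy.Unauthorized:
        # HTTP 401: the tokens are invalid, expired, or revoked.
        return None
    return user.screen_name


def count_working(
    token_sets: Iterable[TokenSet],
    verify: Callable[[TokenSet], Optional[str]] = verify_with_tweepy,
) -> int:
    """Count the token sets for which `verify` returns a screen name."""
    return sum(1 for creds in token_sets if verify(creds) is not None)
```

The same check is useful defensively: after revoking an app's access, `verify_with_tweepy` should return `None` for the old token set.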
In practice, anyone could take over these accounts (if someone has not done so already), since programmatic access to Twitter’s API allows a wide range of actions.
If you consider this a big deal, imagine what this would mean for other services or platforms.
At this point several questions arise:
Q: How popular/recent are the GitHub repositories containing these access tokens?
From a simple inspection of the metadata we collected when scanning GitHub with Magellan, we found that while ~80% of the repositories with working tokens had zero watchers/forks/stars, the most popular project had over 600 forks and 7K stars at the time of crawling. It’s quite interesting that such a popular project contains leaked credentials that no one has bothered to remove.
This also shows that convincing developers not to hardcode credentials in source code is by no means a simple task, and that leaked credentials are easily overlooked, regardless of how many eyes see the code. Someone with malicious intent, however, will not overlook them.
Next, to explore the age of the affected repositories, we plot the dates of the last commit per project (grouped by month).
The results are pretty self-explanatory. Not only is the situation getting worse, as more and more tokens get committed each month, but there are also cases where tokens have been sitting there for over 8 years and are still working.
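The monthly grouping behind such a plot can be sketched in a few lines. The function name is hypothetical, the input is assumed to be ISO 8601 timestamps without a timezone suffix, and the sample dates are invented for illustration rather than taken from our dataset.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable


def commits_per_month(last_commit_dates: Iterable[str]) -> Dict[str, int]:
    """Count repositories per YYYY-MM bucket of their last commit date."""
    months = Counter(
        datetime.fromisoformat(stamp).strftime("%Y-%m")
        for stamp in last_commit_dates
    )
    # Sort chronologically; YYYY-MM strings sort correctly as plain text.
    return dict(sorted(months.items()))


# Invented sample timestamps, one per repository.
sample_dates = [
    "2021-03-14T09:26:53",
    "2021-03-02T11:00:00",
    "2014-07-30T18:45:10",
]
histogram = commits_per_month(sample_dates)
```

The resulting mapping can be passed to any plotting library as a bar chart, with months on the x-axis and repository counts on the y-axis.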
Q: How popular are the exposed Twitter accounts?
A different dimension to explore is the popularity of these accounts, which lets us assess the potential impact of their takeover by a malicious party. A simple way to gauge this is to look at their follower counts. For this, we plot the cumulative distribution function of the followers_count attribute of the leaked accounts.
Notably, while around 60% of accounts have 10 followers or fewer, there are users with more than a MILLION followers! But let’s not allow curiosity to get the better of us and expose these accounts directly.
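For readers unfamiliar with the term, the empirical cumulative distribution function at a value x is simply the fraction of accounts with at most x followers. A minimal sketch follows, with a hypothetical function name and invented follower counts rather than our real data.

```python
from bisect import bisect_right
from typing import Sequence


def empirical_cdf(values: Sequence[int], x: int) -> float:
    """Return the fraction of `values` that are less than or equal to `x`."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # bisect_right gives the number of elements <= x in a sorted list.
    return bisect_right(ordered, x) / len(ordered)


# Invented follower counts for ten accounts.
followers = [0, 2, 5, 10, 10, 40, 300, 1500, 20000, 1200000]
share_up_to_10 = empirical_cdf(followers, 10)
share_up_to_1m = empirical_cdf(followers, 1_000_000)
```

Because follower counts span several orders of magnitude, a logarithmic x-axis usually makes such a plot easier to read.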
We left the best part for last: actually notifying the affected users!
Given that approximately 80% of the repositories have no watchers, GitHub is probably not the best channel for notifying the owners of affected accounts. So let’s use Twitter instead.
We created a dummy Twitter account @PinataHub_Bot, and used the valid tokens to perform two actions: a) follow this account and b) retweet the following:
That’s a rather unconventional way to raise awareness, but we hope it makes developers think twice about what they commit, especially to public repositories. While this qualifies as the largest Twitter account takeover to date, our goal is noble.
Unlike the Twitter account hijacking incident in 2020, such a takeover would be more difficult to contain, since any actions taken would appear totally legitimate, as they originate from properly authorized applications. Thus, we urge all affected users to revoke access for third-party apps immediately in order to invalidate the leaked tokens. Here is how.
We don’t know how long Twitter will allow our @PinataHub_Bot to live, but we hope our message reaches both careless developers and GitHub.
Follow us on Twitter for more updates and news.