Boy, do we have a doozy for you! Last week, Twitter claimed that it’s not to blame for a leak that exposed data from over 200 million users. “There is no evidence that the data being sold online was obtained by exploiting a vulnerability of Twitter systems,” the social media giant concluded after — er — a self “investigation.”
As we mentioned before, we’re skeptical about this. After all, can you really trust a company that uses an internal team to thoroughly (and truthfully) investigate its own flaws? On top of that, Alon Gal, founder of Israel-based cybersecurity firm Hudson Rock and the first to blow the whistle on the data leak, is also leery of Twitter’s “investigation.” He still maintains that Twitter shouldn’t escape culpability for the data breach.
And it doesn’t stop there. The anon behind Twitter’s 200 million+ data dump on Breached (a hackers’ forum) reached out to Laptop Mag to tell us that Twitter, as we suspected, is “almost certainly lying” — here’s why.
Twitter’s data-breach debacle
Before we dig into Twitter’s questionable report, here’s some background on the data-breach debacle. In January 2022, a hawk-eyed observer from Twitter’s bug bounty program told the social media giant about an API vulnerability that exposed users’ data. How could one exploit this flaw? Good question. Here’s how Twitter described it:
“By submitting an email address or phone number to Twitter’s systems, Twitter’s systems would tell the person what Twitter account the submitted email addresses or phone number was associated with it. This issue came from an update to Twitter’s code in June 2021.”
Twitter claimed that it patched the vulnerability in January 2022. Unfortunately, it was too little, too late. In July 2022, a hacker took to Breached to reveal that they were in possession of a dataset covering more than five million Twitter users, with email addresses and phone numbers exposed. They had managed to obtain the data before Twitter patched the security flaw.
Twitter then informed users about the incident in August 2022. Now, here’s where it gets interesting. In late December 2022, a threat actor claimed to have used the same vulnerability (the one patched in January 2022) to obtain a data dump of 400 million Twitter users, and requested $200,000 for the dataset. (To be clear, this is another instance in which a threat actor exploited the notorious vulnerability before the patch, likely some time during the tail-end of 2021.)
The threat actor (who calls himself “Ryushi”) made it clear that he ideally wanted that $200,000 from Twitter:
“Twitter, or Elon Musk, if you are reading this, you are risking a GDPR fine over 5.4m breach imaging the fine of 400m users breach source,” Ryushi said. “Your best option is to avoid paying $276 million USD in GDPR breach fines like Facebook did (due to 533m users being scraped) is to buy this data exclusively.”
Ryushi boasted that the dataset contains emails and phone numbers of celebrities and politicians, including Alexandria Ocasio-Cortez, Donald Trump Jr., Mark Cuban, Piers Morgan, and more. (Keep in mind, however, that Breached member “StayMad” exposed Ryushi for lying about the existence of phone numbers in the dataset.)
In early January 2023, Breached member ThinkingOne (the anon who reached out to us) published the same dataset from Ryushi, but de-duplicated, meaning redundant entries had been removed. As such, the true number of users reportedly scraped through Twitter’s vulnerability is over 200 million, not 400 million (as originally reported by Ryushi).
In response, Twitter published a blog post on Jan. 11 claiming that the dataset was NOT obtained by exploiting a security flaw in its systems. However, as we mentioned at the outset, we think Twitter’s full of it, and we have more information on why we believe the social media giant is lying.
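For readers curious what de-duplication looks like in practice, here is a minimal Python sketch. It is not ThinkingOne's actual process, which we have not seen. The field names (`email`, `handle`), the case-insensitive matching rule, and the sample rows are all assumptions made for illustration.

```python
def deduplicate(records):
    """Drop repeated (email, handle) pairs, keeping the first occurrence.

    Assumption: two rows count as duplicates when their email and handle
    match after trimming whitespace and ignoring letter case.
    """
    seen = set()
    unique = []
    for record in records:
        key = (record["email"].strip().lower(), record["handle"].strip().lower())
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Made-up sample rows; none of this is real leaked data.
sample = [
    {"email": "a@example.com", "handle": "alice"},
    {"email": "A@example.com ", "handle": "Alice"},  # same pair, different casing
    {"email": "b@example.com", "handle": "bob"},
    {"email": "a@example.com", "handle": "alice"},   # exact repeat
]

unique_records = deduplicate(sample)
print(len(sample), "rows in,", len(unique_records), "unique rows out")
```

A dump padded with repeats like these can shrink dramatically once duplicates are removed, which is how a claimed 400 million rows can turn out to cover roughly half as many users.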
Why Twitter is likely lying
After conducting a “thorough investigation,” the social media giant concluded that the email addresses and Twitter accounts featured in the dataset of over 200 million Twitter users are likely a “collection of data already publicly available online through different sources,” adding that it is blameless for the leak.
However, what Twitter is conveniently leaving out, according to the anon who de-duplicated the notorious dataset (they go by the moniker ThinkingOne), is that there’s a link between the emails and Twitter accounts in the data dump.
“[That link] is not public, and if that link was obtained through Twitter, it would be a vulnerability/exploit (as they acknowledged in August 2022),” ThinkingOne told Laptop Mag in an email.
In other words, this isn’t just a random dataset of email addresses and Twitter handles. Those email addresses are paired with the correct Twitter handles. For example, if you have a Twitter account where you’re being your authentic self, but you have another where you’d like to remain anonymous, the dataset will likely reveal that you are the owner of both accounts because they’re linked to the same email address.
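To make that risk concrete, the short Python sketch below groups handles by email address and reports any address tied to more than one account. The record layout, the handles, and the addresses are invented for illustration, and we are assuming a simple list of email/handle pairs rather than the dump's real format.

```python
from collections import defaultdict


def linked_accounts(records):
    """Map each email to its handles, keeping only emails with 2+ handles."""
    groups = defaultdict(set)
    for record in records:
        groups[record["email"].strip().lower()].add(record["handle"])
    return {
        email: sorted(handles)
        for email, handles in groups.items()
        if len(handles) > 1
    }


# Made-up sample rows; none of this is real leaked data.
sample = [
    {"email": "jane@example.com", "handle": "JaneDoeOfficial"},
    {"email": "jane@example.com", "handle": "anon_birdwatcher"},
    {"email": "sam@example.com", "handle": "sam_codes"},
]

print(linked_accounts(sample))
```

In this toy example, the public account and the anonymous one surface together because they share `jane@example.com`, even though neither profile mentions the other.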
ThinkingOne did try to give Twitter the benefit of the doubt, though, running through another possibility of how one could obtain a data dump of Twitter handles with correct pairings to email addresses without exploiting a vulnerability:
“The only other plausible possibility I can think of is that someone took a massive email list, a massive list of Twitter accounts and matched them possibly using data enrichment (e.g., real names known to be associated with the emails),” ThinkingOne said, but added that this doesn’t hold water. “There are over 10,000 Twitter accounts with the real name of just ‘Sarah’ and a username of ‘Sarah’ followed by numbers. There’s just no way someone could know which was which.”
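ThinkingOne's objection can be sketched in a few lines of Python. The function below is a hypothetical name-based matcher of our own, not a tool anyone is known to have used: given a first name taken from an email list, it returns every handle made up of that name followed by digits. The handles are invented.

```python
import re


def candidate_handles(first_name, handles):
    """Return handles that are `first_name` followed by one or more digits."""
    pattern = re.compile(rf"{re.escape(first_name)}\d+", re.IGNORECASE)
    return [handle for handle in handles if pattern.fullmatch(handle)]


# Made-up handles for illustration only.
handles = ["Sarah1", "sarah22", "Sarah_K", "Sarah9041", "Tom7"]

matches = candidate_handles("Sarah", handles)
print(len(matches), "possible accounts for one 'Sarah':", matches)
```

Even with five handles, one name maps to three candidates. Scale that to the more than 10,000 "Sarah" accounts ThinkingOne describes, and a name alone cannot tell you which account belongs to which email address.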
ThinkingOne said that it’s also possible that Twitter supplied these email/username pairings to a third-party company, and that the data leaked from there. But if that’s the case, Twitter is still wrong in saying that the data is “already publicly available.”
It’s worth noting that a New York resident, seeking class-action status, sued Twitter for being negligent with his personal data and is requesting that a third-party security auditor investigate the data dump of more than 200 million users.
Twitter’s got a lot of explaining to do, so we reached out to the social media giant to get a comment. We haven’t heard back yet, but if we do, we’ll update this article.