How can Twitter and GitHub be sure that they haven't been hacked?

They can't be sure. In fact, you can never be sure you haven't been hacked. A thorough examination can only tell you whether a hack is more or less likely.

Twitter's statement only says that there is no indication of a hack. That doesn't exclude the possibility that they were hacked, and by urging their users to change their passwords they implicitly admit that.

As for GitHub, the wording is a bit more categorical. But I think forcing a password reset shows that they understand the risks involved.


One more thing to note is that in both cases, the leak was in a purely internal logging system. There is no indication that third-party users ever had access to this system. Internal logging systems are rarely exposed externally, and are usually only consulted internally when a system needs troubleshooting. That's probably also why this bug went unnoticed for months: individual log entries buried in what's probably a gigantic volume of other log statements usually don't get noticed unless they happen to sit right next to the statements someone needs while debugging something else.

Twitter also only recently found out about the bug themselves, which means it's unlikely that people outside the company were aware of it before Twitter was, let alone that they devised and executed an attack to retrieve the log entries.


It's hard to prove a negative.

So how do you prove a positive? In this case: how do you prove an attack from the outside? Typically there are several systems in place to monitor different forms of attacks, breaches or access. These can be firewalls, intrusion detection systems, SIEMs and a variety of monitoring and logging systems. In today's networks each component either has some form of monitoring built in or allows monitoring through third-party tools like Check_MK.

So each step of the way - from the border of the corporate network to the machine that held the valuable information itself - is monitored in some shape or form. These logs are, depending on the network and corporate policies, regularly analyzed. The analyzing systems can distinguish between expected and unexpected traffic or behaviour. File access is one example of behaviour that can be classified as expected or unexpected.

Internal log files are typically considered confidential data, so file access is probably monitored as well. If someone who is not part of a certain user group tried to copy or access an internal log file, that would probably have been logged as unexpected or even forbidden behaviour. If a possible adversary was able to impersonate someone with the rights to access this file, it would have been logged as well, but as expected behaviour.
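To make the expected/forbidden distinction concrete, here is a minimal sketch of how such a classification could work. Everything in it is an assumption for illustration: the `ALLOWED_GROUPS` policy, the path, the group names and the `classify` function are hypothetical and not taken from Twitter's or GitHub's actual tooling.

```python
from dataclasses import dataclass

# Hypothetical policy: which user groups may read files under which path prefix.
ALLOWED_GROUPS = {
    "/var/log/internal/": {"sre", "security"},
}


@dataclass(frozen=True)
class AccessEvent:
    user: str
    groups: frozenset
    path: str


def classify(event: AccessEvent) -> str:
    """Label a file access as 'expected', 'forbidden' or 'unmonitored'."""
    for prefix, allowed in ALLOWED_GROUPS.items():
        if event.path.startswith(prefix):
            # Any overlap between the user's groups and the allowed groups
            # counts as expected access.
            return "expected" if event.groups & allowed else "forbidden"
    # Paths that no policy entry covers are not classified at all.
    return "unmonitored"


if __name__ == "__main__":
    events = [
        AccessEvent("alice", frozenset({"sre"}), "/var/log/internal/auth.log"),
        AccessEvent("mallory", frozenset({"marketing"}), "/var/log/internal/auth.log"),
    ]
    for event in events:
        print(event.user, classify(event))
```

Note that an attacker using alice's stolen credentials would also be labelled "expected" here. The access still leaves a log entry, but nothing about it stands out, so it would only be found by someone reviewing those entries afterwards.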

In theory it is possible that an attacker is able to overcome all security controls, exploit 0day vulnerabilities, leave no trace in any log on any component, the IDS, the SIEM and so on, copy the internal log file and smuggle it outside, but it is very unlikely.

My guess is that after the log file was discovered, all these logs were thoroughly analyzed to determine whether there had been an attack from the outside. The analysts did not find any suspicious data and therefore concluded with almost absolute certainty that there was no attack from the outside. And this is actually what you see in Twitter's press release (see Florin Coada's comment). Again, my guess: GitHub's press release used stricter language to stop speculation about whether there was a hack. (Didn't really work out. ;)

Of course it's also possible that Twitter and GitHub have no such security controls in place, but I really hope not.