That 16B password story (a.k.a. "data troll") (troyhunt.com)

79 points by el_duderino 4 days ago | 8 comments

miki123211 4 hours ago [-]

I always find it funny how the media characterizes a data breach in terms of number of records stolen, or, even worse, its size on disk.

There are ~335 million Americans. Assume for simplicity that each of them owns one phone, and hence one SIM card. Generously assume that each SIM card has 1 kB of authentication material. A data breach of all US consumer SIM keys would hence be ~335 million records and ~335 GB.

Such a breach would be far, far more catastrophic than anything we have ever seen (and probably anything we will ever see) in computer security, despite being half the size of this one, and containing less than 10% as many records.
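The arithmetic above can be checked with a quick sketch. Every input is one of the comment's stated assumptions (one SIM per person, 1 kB per SIM, decimal units), and the 16B figure is just the headline number, not a verified count.

```python
# Back-of-envelope check of the SIM example; all inputs are assumptions from the comment.
AMERICANS = 335_000_000             # ~335 million people, one SIM each
BYTES_PER_SIM = 1_000               # a generous 1 kB of authentication material
HEADLINE_RECORDS = 16_000_000_000   # the "16B" headline figure


def sim_breach_size_gb(people: int = AMERICANS, bytes_per_sim: int = BYTES_PER_SIM) -> float:
    """Total size in decimal gigabytes of one key record per person."""
    return people * bytes_per_sim / 1e9


def share_of_headline(records: int = AMERICANS, headline: int = HEADLINE_RECORDS) -> float:
    """Fraction of the headline record count that the SIM breach would represent."""
    return records / headline


print(f"{sim_breach_size_gb():.0f} GB")
print(f"{share_of_headline():.1%} of the headline record count")
```

The record share comes out at roughly 2%, comfortably under the "less than 10%" stated above.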

nojs 5 hours ago [-]

In other words, 2.7B -> 109M is a 96% reduction from headline to people. Could we apply the same maths to the 16B headline?

I mean there’s not 16B people in the world, so a row per person can be ruled out pretty easily
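As a purely illustrative sketch of "the same maths": the snippet below reproduces the 96% figure and then scales the 16B headline by the same ratio. Nothing suggests the two datasets share a duplication rate, so the second number is a thought experiment, not an estimate of real victims.

```python
def reduction(headline: int, people: int) -> float:
    """Fractional drop from headline row count to distinct people."""
    return 1 - people / headline


# Figures quoted in the comment above.
HEADLINE_ROWS = 2_700_000_000
PEOPLE = 109_000_000

print(f"{reduction(HEADLINE_ROWS, PEOPLE):.0%} reduction")

# Hypothetical: assume the 16B headline shrinks by the same ratio.
# This is an assumption for illustration only.
ratio = PEOPLE / HEADLINE_ROWS
scaled = 16_000_000_000 * ratio
print(f"~{scaled / 1e6:.0f} million people if the same ratio held")
```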

NitpickLawyer 3 hours ago [-]

> I mean there’s not 16B people in the world, so a row per person can be ruled out pretty easily

In a hypothetical "master dump", a mix of all the dumps ever leaked, you'd expect dozens if not more entries for every "real person" out there. Think about how many people had a yahoo account, then how many had several yahoo accounts, and then multiply that by the hundreds of leaks out there. I can see the number getting into billions easily, just because of how many accounts people have on many platforms that got hacked in the past ~20 years.

Sure, 99% of those won't be active accounts anymore, but the passwords used serve as a signal, at least for "what kinds of passwords do people use". There's lots to be learned about wordwordnumber, wordnumbernumber, and so on.
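A minimal sketch of that kind of shape analysis, under loud assumptions: a run of ASCII letters counts as one "word" (splitting "wordword" into two words would need a dictionary, which this skips), each digit counts as one "number", and the sample passwords are made up rather than taken from any leak.

```python
import re
from collections import Counter

# One named group per token class; exactly one group matches per token.
TOKEN = re.compile(r"(?P<word>[A-Za-z]+)|(?P<number>[0-9])|(?P<symbol>[^A-Za-z0-9]+)")


def shape(password: str) -> str:
    """Reduce a password to its structural shape, e.g. 'monkey12' -> 'wordnumbernumber'."""
    return "".join(m.lastgroup for m in TOKEN.finditer(password))


# Made-up sample passwords, for illustration only.
sample = ["monkey12", "dragon1", "letmein", "hunter2", "p@ss99"]
shape_counts = Counter(shape(p) for p in sample)

for name, count in shape_counts.most_common():
    print(count, name)
```

Run over a large corpus, a tally like this shows which shapes dominate without needing to know whether any individual account is still active.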

charcircuit 11 hours ago [-]

If there were an open database of password breaches, it would be easier for people to research whether a leak is new or just passwords taken from a previous leak. Of course you can get closer to the actual number by filtering out duplicates, but you can't figure out what's new if you can't know what's old.
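The distinction between deduplicating and finding what is new can be sketched as below. The `email:password` record format, the example.com addresses, and the `known` set standing in for an open database of earlier breaches are all hypothetical.

```python
def split_new_and_known(leak: list[str], known: set[str]) -> tuple[set[str], set[str]]:
    """Return (records not seen before, records already in earlier breaches)."""
    unique = set(leak)              # filtering duplicates needs only the leak itself
    return unique - known, unique & known   # finding what is new needs the old data


# Hypothetical records for illustration.
leak = [
    "a@example.com:hunter2",
    "a@example.com:hunter2",        # duplicate row inflating the headline count
    "b@example.com:letmein",
    "c@example.com:dragon1",
]
known = {"a@example.com:hunter2", "b@example.com:letmein"}

new, seen = split_new_and_known(leak, known)
print(f"headline rows: {len(leak)}, unique: {len(new) + len(seen)}, new: {len(new)}")
```

Without `known`, the best available figure is the unique count; only the comparison against earlier data reveals how little of it is new.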

mananaysiempre 11 hours ago [-]

Pwned Passwords[1] is just such a database (with passwords hashed using either SHA-1 or NTLM as an obfuscation measure, and without any emails). Hunt used to distribute versioned snapshots, but these days he directs you to an API scraper[2] in C# instead, so you can still get a list but it probably won’t exactly match anyone else’s.

[1] https://haveibeenpwned.com/passwords

[2] https://github.com/HaveIBeenPwned/PwnedPasswordsDownloader
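For a sense of how a lookup against the SHA-1 data works, here is a hedged sketch of the range-style query as I understand it: only the first five hex characters of the hash are sent, and the response lists `SUFFIX:COUNT` lines to compare locally. The snippet makes no network call; `sample_body` is a made-up response with invented counts, and the helper names are my own.

```python
import hashlib

# Range endpoint; the 5-character prefix is appended. Not called in this sketch.
RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_sha1(password: str) -> tuple[str, str]:
    """Return the (5-char prefix, 35-char suffix) of the uppercase SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(password: str, range_body: str) -> int:
    """Look up a password's suffix in a 'SUFFIX:COUNT' response body; 0 if absent."""
    _, suffix = split_sha1(password)
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


prefix, suffix = split_sha1("password")
# Made-up response body for illustration; real counts will differ.
sample_body = f"{'0' * 35}:3\n{suffix}:42\n"
print(RANGE_URL + prefix, count_in_range("password", sample_body))
```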

charcircuit 10 hours ago [-]

This isn't sufficient for all cases. For example, a breach could contain hashed passwords. If you only have the obfuscated passwords of previous breaches, you can't re-hash them yourself to find out that the new breach is just a rehash of an existing one.

Data breaches can also contain things other than passwords. Things like phone numbers, addresses, etc. would also be useful for checking.
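The hashed-breach case can be illustrated with a small sketch. The salted SHA-256 scheme, the salt, and the password are hypothetical stand-ins for whatever a given leak actually uses; the point is only that a SHA-1 corpus cannot be compared against hashes made under a different scheme, while plaintext can be re-hashed to match.

```python
import hashlib


def sha1_hex(password: str) -> str:
    """Unsalted SHA-1, the form an obfuscated corpus of old breaches might hold."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def leak_scheme_hex(password: str, salt: bytes) -> str:
    """Hypothetical scheme used by the new leak: SHA-256 over salt + password."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


SALT = b"hypothetical-salt"                       # assumed to be included in the leak
leaked_hash = leak_scheme_hex("hunter2", SALT)    # what the new breach contains

# With only obfuscated (SHA-1) history, the same password never matches.
known_sha1 = {sha1_hex("hunter2")}
match_with_hashes_only = leaked_hash in known_sha1

# With plaintext history, old passwords can be re-hashed under the leak's scheme.
known_plaintext = {"hunter2", "letmein"}
match_with_plaintext = any(leak_scheme_hex(p, SALT) == leaked_hash for p in known_plaintext)

print(match_with_hashes_only, match_with_plaintext)
```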

anon7000 7 hours ago [-]

Publishing someone’s leaked credentials in plaintext for anyone to look at also isn’t ideal. I mean, yes, it’s been leaked, but we also don’t need to make it easier for someone to get hacked.

charcircuit 51 minutes ago [-]

Pretending it's private is also problematic. People get a false impression of what is public and what isn't.
