Generate an XKCD #936 style 4 word password

shuf -n4 /usr/share/dict/words | tr -d '\n'
4 random words are better than one obfuscated word

By: Strawp
2011-08-10 13:04:39

Unfortunately, Randall Munroe doesn't understand entropy from information theory. Entropy is defined as H=L*log_2(N), where H is the entropy in bits, L is the length of the message, log_2() is the logarithm base 2, and N is the number of possible characters in the set the password draws from. Thus, "Tr0ub4d0r&3" has 72 bits of entropy, and "correcthorsebatterystaple" has 117 bits of entropy.
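For anyone who wants to check the arithmetic of H=L*log_2(N), here is a minimal awk sketch. The helper name `entropy_bits` is hypothetical, and the alphabet sizes are assumptions (94 printable ASCII characters for the mixed password, 26 lowercase letters for the four-word one); the comment itself does not state N.

```shell
# entropy_bits LENGTH ALPHABET_SIZE
# Hypothetical helper: prints H = L * log2(N), rounded to one decimal place.
# awk's log() is the natural logarithm, so divide by log(2) to get base 2.
entropy_bits() {
    awk -v l="$1" -v n="$2" 'BEGIN { printf "%.1f\n", l * log(n) / log(2) }'
}

entropy_bits 11 94   # 11 characters, assuming 94 printable ASCII symbols: prints 72.1
entropy_bits 25 26   # 25 characters, assuming 26 lowercase letters: prints 117.5
```

Note that this only evaluates the per-character formula; it says nothing about whether treating each character as independent is the right model for a word-based password.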
atoponce · 358 weeks and 2 days ago
Sadly, his reasoning is flawed as well, as discussed in great detail here...
costa24 · 358 weeks and 2 days ago
If you're going to downvote this, don't do it because the idea that inspired it is flawed. For what it's supposed to do, it does it well, and could be used for a non-flawed idea. Sadly, 'shuf' is not installed on every machine. It should be. Some later editions of GNU sort have a '-R' option, so you could use 'sort -R /usr/share/dict/words |head -4 |' instead... but not every distro has an up-to-date version of GNU sort, either. For example, I can see both are missing from RHEL / CentOS 5.6
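One way to combine the two commands mentioned here is to try shuf first and fall back to sort -R. This is only a sketch: the function name `xkcd_pass` and its optional word-list argument are made up for illustration, and a machine with neither tool will still fail.

```shell
# xkcd_pass [WORDLIST]
# Hypothetical wrapper: prints 4 random words joined together.
# Prefers shuf; falls back to GNU sort -R when shuf is not installed.
xkcd_pass() {
    dict="${1:-/usr/share/dict/words}"
    if command -v shuf >/dev/null 2>&1; then
        shuf -n4 "$dict" | tr -d '\n'
    else
        sort -R "$dict" | head -4 | tr -d '\n'
    fi
    echo   # finish with a newline so the prompt does not run into the password
}
```

Call it as `xkcd_pass` for the default dictionary, or `xkcd_pass /path/to/wordlist` for another file.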
Mozai · 358 weeks and 2 days ago
Thus, the reason it was downvoted by me. shuf(1) isn't portable across all Unices. However, last I checked, I didn't need to explain why I voted one way or the other.
atoponce · 358 weeks and 2 days ago
Ah, I see. I've always aimed for brevity in commands whereas I see here we're aiming for universally supported, core function based commands. I did wonder about some of my previous submissions...
Strawp · 358 weeks and 2 days ago
@atoponce: It's *generally* a good practice to comment and explain whenever you vote down.
parga_nanbat · 358 weeks and 2 days ago
Man, that Troy Hunt guy is super verbose... but after skimming the article I don't see where he says Munroe's idea is flawed. The closest thing would be that Munroe doesn't specify that the total number of characters is what's important, not the use of 4 words. The 4 words are just a convenient way to get a decent number of characters that are easy to remember. All lower case alpha is perfectly fine as long as it's long enough. Do the math. The biggest flaw I see is that if an attacker knows you use 4 words like this it will make your password easy to brute force.
eikenberry · 358 weeks and 2 days ago
I think some of you missed the point Randall was making. @atoponce: your equation for entropy is only true for a string which doesn't exhibit "inter-symbol memory" (each character is random and unrelated to adjacent characters). A password based on a word does have inter-symbol memory, so you have to consider the entire word as a single symbol, and N becomes the size of the list of words it was drawn from. Randall apparently came up with a list of 64 thousand words (16 bits). The additional bits account for the likely ways those 64k base words are typically obfuscated, so you effectively have a single symbol drawn from 256 million possible values (28 bits). For the 'secure' password, my dictionary file has 98,569 words, so using your equation: H = L*log_2(N) = 4*log_2(98569) = 66 bits. @eikenberry: It's more conservative to consider each word as a 'symbol', so instead of a 25-symbol string with 4.7 bits each (26 different letters), you have a string of 4 symbols with 16.5 bits each. So, it doesn't matter if you tell the attacker that you used this command to generate the password, because he still has 2^66 combinations to try, which would take 2.3 billion years based on the rates Randall is using (he might get lucky and find the answer in ~1 billion years). An important point (which Troy also mentions) is that these rates only apply to brute force attacks via a web interface. I wouldn't use any of these for a Truecrypt volume or any other local or offline system that doesn't inherently limit the attempt rate.
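The word-as-symbol numbers in this comment can be reproduced with a short awk sketch. Both helper names are hypothetical, and the rate of 1000 guesses per second is an assumption taken from the comic, since the comment only refers to "the rates Randall is using".

```shell
# word_entropy_bits WORDS DICT_SIZE
# Hypothetical helper: treats each word as one symbol drawn from DICT_SIZE choices.
word_entropy_bits() {
    awk -v w="$1" -v n="$2" 'BEGIN { printf "%.1f\n", w * log(n) / log(2) }'
}

# years_to_exhaust BITS GUESSES_PER_SEC
# Hypothetical helper: time to try all 2^BITS candidates, in billions of years.
years_to_exhaust() {
    awk -v b="$1" -v r="$2" 'BEGIN { printf "%.1f\n", 2^b / r / (365.25 * 24 * 3600) / 1e9 }'
}

word_entropy_bits 4 98569   # 4 words from a 98,569-word list: prints 66.4
years_to_exhaust 66 1000    # 2^66 guesses at an assumed 1000 per second: prints 2.3
```

The second figure is the time to exhaust the whole space; an attacker would on average need about half of that.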
__ · 358 weeks and 2 days ago
Randall himself has discussed this a little: . . .
__ · 358 weeks and 2 days ago
zyf543 · 115 weeks ago
@atoponce That formula only works if each character is chosen uniformly at random, which is not how passwords like 'Tr0ub4d0r&3' or 'correcthorsebatterystaple' are generated. (I.e.: the user didn't just randomly pick letters one by one that happened to spell out a word).
alec · 49 weeks and 4 days ago

