🍄Passwords And Passphrases

Many people believe passphrases are less secure than passwords simply because passphrases are susceptible to dictionary attacks. This is an inaccurate and thus dangerous belief. To understand why, we need to think about passwords and passphrases in terms of tokens.

A password's tokens are its individual characters and a passphrase's tokens are its words. We count them this way because it's always best to operate under the assumption that an attacker knows your secret generation methodology: if you always prefix your secrets with particular characters, assume the attacker knows that. Assume they know whether you generate 13-character secrets or 7-word secrets.

This assumption allows us to take a step back and look at the broader picture. Three things matter when generating secrets.

Token space refers to the set of tokens a generation tool can pick from when creating a secret. For passphrases, a token space might be the 7,776 words from the EFF's long wordlist. For passwords, a token space might be the 93 letters, numbers, and special characters commonly present on US keyboards.

Token count refers to the total number of tokens in your secret. A 13 character secret would have 13 tokens and a 7 word secret would have 7 tokens.

In this context, entropy is a measure of the aggregate randomness of a given secret. When you're coming up with a secret and you pick the name of your dog plus the year you were born, that secret's entropy is extremely low. If you instead use a secret generator like the one Bitwarden provides, entropy can be extremely high because Bitwarden will provide random characters or random words.
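As a rough illustration of what such a generator does, here is a minimal Python sketch. It is not Bitwarden's implementation: the function names and the tiny stand-in wordlist are hypothetical, and a real passphrase generator would load a full wordlist such as the EFF's. The character set built here from Python's `string` constants has 94 characters, which is close to, but not exactly, the 93 used in this article. The point is only that every token is drawn by the standard library's `secrets` module rather than chosen by a human.

```python
import secrets
import string
from typing import Sequence

# Hypothetical stand-in wordlist; a real generator would load thousands of words.
WORDS = ["bunch", "google", "squall", "handwash", "poker", "goggles", "item", "staple"]

# Letters, digits, and punctuation: 94 characters (an assumption, not the article's exact 93).
CHARSET = string.ascii_letters + string.digits + string.punctuation


def random_password(length: int, charset: str = CHARSET) -> str:
    """Pick `length` characters independently and uniformly from `charset`."""
    return "".join(secrets.choice(charset) for _ in range(length))


def random_passphrase(count: int, words: Sequence[str] = WORDS, sep: str = " ") -> str:
    """Pick `count` words independently and uniformly from `words`."""
    return sep.join(secrets.choice(words) for _ in range(count))


print(random_password(13))
print(random_passphrase(7))
```

Because each token is chosen independently and uniformly, the secret's entropy depends only on the size of the token space and the number of tokens.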

The math we'll use to calculate secret entropy comes from combinatorics: there are A arrangements of a secret with a token space of size S and a token count of C. Each of the C positions can independently hold any of the S tokens, so the formula is S^C = A.
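The formula can be sketched in a few lines of Python. The function names are my own, and the conversion to bits (the base-2 logarithm of A) is a common way of expressing entropy that this article does not otherwise use. It only holds when every token is chosen uniformly at random.

```python
import math


def arrangements(token_space: int, token_count: int) -> int:
    """Number of distinct secrets, A = S^C (exact, since Python ints are unbounded)."""
    return token_space ** token_count


def entropy_bits(token_space: int, token_count: int) -> float:
    """Entropy in bits: log2(S^C) = C * log2(S), assuming uniformly random tokens."""
    return token_count * math.log2(token_space)


# A 4-digit PIN: token space of 10 digits, token count of 4.
print(arrangements(10, 4))            # 10000
print(round(entropy_bits(10, 4), 2))  # 13.29
```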

Random character passwords

This example uses a token space of 93 letters, numbers, and special characters from a US keyboard. (Printable ASCII, counting the space character, has 95 characters, so 93 is a slightly conservative figure.) If you always prepend or append particular characters or words to your secret, exclude those from the calculation because they're not random. Again, we should always operate under the assumption that an attacker knows how we generate secrets so we can ensure they'll hold up against intelligent attacks.

  • C = 13

  • S = 93

  • 93^13 = 38,929,455,665,581,472,638,810,893 = A

There are 38,929,455,665,581,472,638,810,893 unique secrets we can generate when choosing 13 tokens out of a token space with a size of 93.

Random word passphrases

The EFF's long wordlist contains 7,776 words. If you use a 7-word passphrase, you're choosing 7 tokens out of a token space with a size of 7,776.

  • C = 7

  • S = 7,776

  • 7,776^7 = 1,719,070,799,748,422,591,028,658,176 = A

There are 1,719,070,799,748,422,591,028,658,176 unique secrets we can generate when choosing 7 tokens out of a token space with a size of 7,776.

Note that I'm not factoring in the very slight complexity increase from word separators, like periods or spaces (correct.horse.battery.staple vs correct horse battery staple). Again, it's best to assume an attacker knows our generation methodologies, and if they know we separate our words with -, there's effectively no complexity advantage.

Comparison results

13 random tokens out of 93:       38,929,455,665,581,472,638,810,893
7 random tokens out of 7,776:  1,719,070,799,748,422,591,028,658,176

7-word secrets are clearly more complex than 13-character secrets: there are roughly 44 times as many possible 7-word passphrases as 13-character passwords.
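To put a number on that gap, the two counts can be divided directly. This is a small sketch using only the token spaces and token counts from the comparison above; the variable names are my own.

```python
password_space = 93 ** 13      # 13 random characters out of 93
passphrase_space = 7776 ** 7   # 7 random words out of 7,776

# Integer division keeps the arithmetic exact; floats would round values this large.
ratio = passphrase_space // password_space
print(ratio)  # 44
```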

A secret such as bunch google squall handwash poker goggles item is not only much more complex than @8F$h*ZeS&d}, it's also probably a little easier and faster to type. I generated those two secrets then timed myself typing them; the passphrase took 7.3 seconds and the password took 7.6 seconds.

Typing the technically much longer string of words was quicker because English-speaking individuals usually understand each word at a glance, know how to spell it, and maybe even know where the letters are on their keyboard without looking. Compare that with a sequence of random characters and you'll probably find you have to look at the first character of the password, find it on your keyboard, type it, look at the second character, find it, press it, and so on. Passwords like this typically require more mental gymnastics to type and are provably less complex than passphrases.

Long randomised secrets

The above comparisons are for secrets you might need to remember or hand-type often. If a secret manager can type them for you, or if you can copy/paste out of the secret manager and into the secret field, you should really use much longer secrets with random characters.

Personally, I like to keep my passphrases to seven words just because it's secure “enough” and quite easy to memorise after typing it a few times. So at what point do passwords become more secure than passphrases with seven random words?

The crossover comes at 14 characters, since 93^14 = 3,620,439,376,899,076,955,409,413,049:

14 random characters: 3,620,439,376,899,076,955,409,413,049
7 random words:       1,719,070,799,748,422,591,028,658,176
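The same crossover can be found for any pair of token spaces with a short search. This is a sketch under the article's figures (93 characters, 7,776 words); the helper name `min_token_count` is hypothetical.

```python
def min_token_count(token_space: int, target: int) -> int:
    """Smallest token count C such that token_space ** C exceeds target."""
    count = 0
    while token_space ** count <= target:
        count += 1
    return count


seven_words = 7776 ** 7
print(min_token_count(93, seven_words))  # 14
```

Swapping in a different character set size or passphrase length shows how the threshold moves.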

This does not mean all your passwords should be 14 characters! That is an absolute minimum baseline. When your secret manager can type for you or when you can copy/paste, more tokens is always better: if you're creating a secret for a site and that site has a maximum character count, it's best to hit that limit. If there is no maximum character count, any count higher than 20 is probably “enough”. 50 is better than 20 and 100 is way better than 50, but you should determine for yourself what you think is reasonable.

Personally, the long and random passwords I generate are at most 50 characters.

What if the attacker doesn't know my methodology?

If they're expecting passphrases, they'll have to use a word-by-word brute-force attack to try and get in. If they're not expecting passphrases, they'll first try a dictionary attack, and when that doesn't yield quick results, they'll fall back to a character-by-character brute-force attack. Your seven-word passphrase might total 56 characters, and in that attack each character is a token.

If they're trying all 93 characters on a keyboard for each of those 56 tokens, there are 93^56 possible combinations, a 111-digit number of approximately 1.718 × 10^110.
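The size of that character-level search can be checked quickly. In this sketch the 56-character length is the article's illustrative figure rather than a property of every seven-word passphrase, and expressing the results in bits is my own addition.

```python
import math

char_level = 93 ** 56    # attacker guesses 56 characters, 93 options each
word_level = 7776 ** 7   # attacker guesses 7 words from a known 7,776-word list

print(len(str(char_level)))          # 111 (digits)
print(round(math.log2(char_level)))  # 366 (bits)
print(round(math.log2(word_level)))  # 90 (bits)
```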

In a worst-case scenario where an attacker knows our generation methodology, a 7 word passphrase is still more secure than a 13 character password. In an average-case scenario, a 7 word passphrase is way more secure than a 13 character password.
