35

I recently saw the movie Olympus Has Fallen.
Like in many action movies, at the end a missile is launched, and the hero (Mike Banning, played by Gerard Butler) has 60 seconds to recall the launch in order to prevent a disaster. (Spoilers!)

The way to recall it is by entering a password, but Banning doesn't have the password. Instead, he has radio contact with the Pentagon, who do have it.

So the person from the Pentagon is reading the password to Banning over the radio: "Lima, Charlie, Hashtag...". But Banning doesn't have a clue what a hashtag is, so he yells: "What?" And the Pentagon person repeats: "Hashtag?" And time is running out... And then someone else from the Pentagon yells: "Shift 3!".

And America is saved...

After the movie was over, I thought a lot about that scene. I realize that in real life there are rarely cases in which a password should be read out loud (most of the time, if it happens, it's because people share passwords, and that's a different problem...).
But here is a case in which the only way to get the password is by saying it over the phone, and it's not a personal password - it's a password only used in an emergency, by whomever has access to the missile launch dashboard.

Now, I know there are various guidelines for passwords: How to make them easy to remember but hard to guess, how to avoid confusing characters, etc. But has anyone come up with rules for over-the-phone-read passwords? I agree, it's a small niche, but at least according to Hollywood – it could be crucial…


To be clear, this is not about "how to read it", but rather "how to choose a password that CAN be read, but is still strong".

UPDATE: Here is the scene, 1:30 minutes on YouTube.

Lea Cohen
  • 469
  • 4
  • 9
  • 19
    Could have been a problem if he was using a UK keyboard like mine - Shift+3 is £ – razethestray Aug 07 '13 at 15:47
  • 4
    Can't help but think that saying "pound key" or even "number sign" would have been much easier than saying Shift + 3 – DKNUCKLES Aug 07 '13 at 15:57
  • 28
    Imagine a hacker on the phone yelling "Octothorpe! Octothorpe!" :-) – John Deters Aug 07 '13 at 17:18
  • 3
    Commander Data stuck to letters and numbers: http://www.youtube.com/watch?v=oNrWgjh9tnU – Neil McGuigan Aug 07 '13 at 19:37
  • 5
    I hear using Japanese phonetics is [quite effective](http://thedailywtf.com/Articles/The-Automated-Curse-Generator.aspx). – user3490 Aug 07 '13 at 23:10
  • 5
    There are far more annoying passwords to read out over the phone, how about ["One1TheFirstJustTheNumberTheSecondSpelledOut"](http://www.mcsweeneys.net/articles/e-mail-addresses-it-would-be-really-annoying-to-give-out-over-the-phone) – Morphed Aug 08 '13 at 07:53
  • 1
    @DKNUCKLES "number sign" would work fine, but "pound key" might be mistaken for the British Pound £. And the timeout after an incorrect password would probably have an inconvenient side-effect in that movie... – Tobias Kienzler Aug 08 '13 at 09:45
  • 1
    @TobiasKienzler. That's not the "British Pound". It's the pound. The Irish pound and the Cypriot pound used the same symbol, and so, according to Wikipedia, do the Egyptian, Syrian, and Lebanese pounds. (It's also Shift+3, by the way.) – TRiG Aug 08 '13 at 10:52
  • 2
    Strong password, easy to tell on the phone? "CorrectHorseBatteryStaple" isn't mandatory here? – woliveirajr Aug 08 '13 at 12:39
  • @TRiG indeed, though that makes the confusion even worse – Tobias Kienzler Aug 08 '13 at 13:00

5 Answers

41

Microsoft has already done something like this with their product key alphabet. They selected a subset of characters that are distinctive, and excluded characters that could lead to either confusion or offensive words.

The 24 used are: 2346789BCDFGHJKMPQRTVWXY

The 12 unused are: 015AEILNOSUZ

The hyphen character is used to separate five-character groups, but is not significant.

Product keys are broken into five-character groups, separated by hyphens for readability. The fifth character of each group serves as an independent check character, ensuring only that the group of five was typed correctly. (It's just a checksum, and doesn't indicate whether that group is part of a valid password or not.)

Why this relates to you is that if you are generating passwords, these can be completely unambiguous whether spoken or typed. Nobody has to ask "Oh or zero?" "Two or zed?" There are no symbols that can cause the difficulties you encountered. Furthermore, you would translate them to upper case in software to avoid case issues. And they are common to all languages that use the Latin alphabet.
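As a rough illustration of the idea, here is a minimal Python sketch that draws a code uniformly from the 24-character alphabet quoted above and canonicalizes whatever the listener types. The function names `generate_code` and `normalize` are made up for this example, and the sketch deliberately leaves out the per-group check character, since the actual check algorithm is not described here.

```python
import secrets

# The 24-symbol alphabet quoted in the answer above.
ALPHABET = "2346789BCDFGHJKMPQRTVWXY"


def generate_code(groups: int = 5, group_size: int = 5) -> str:
    """Return a random code in hyphen-separated groups, drawn uniformly from ALPHABET.

    Assumption: every character is random; no check character is computed.
    """
    chunks = [
        "".join(secrets.choice(ALPHABET) for _ in range(group_size))
        for _ in range(groups)
    ]
    return "-".join(chunks)


def normalize(typed: str) -> str:
    """Canonicalize typed input: drop hyphens and spaces, fold to upper case."""
    cleaned = "".join(ch for ch in typed.upper() if ch not in "- ")
    bad = set(cleaned) - set(ALPHABET)
    if bad:
        # Anything outside the alphabet is a typing or hearing error by construction.
        raise ValueError(f"characters outside the alphabet: {sorted(bad)}")
    return cleaned


code = generate_code()
print(code)
print(normalize(code.lower()))
```

Because the alphabet is restricted, a stray "O", "0", or "I" can be rejected immediately instead of silently producing a wrong password.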

The drawback is that with only 24 possible symbols, one character can offer only about 4.6 bits of unpredictability (log2(24) ≈ 4.58). For equivalent security a password has to be roughly 40% longer than a password that can have upper case, lower case, numeric digits, and symbols. To that, add the check characters, which make up 20% of the final length. That makes every 128-bit password a hefty 35 characters long, or seven groups of five. (Microsoft's five-group scheme, with 20 significant characters, offers about 92 bits of uncertainty.)
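The arithmetic can be checked with a few lines of Python. This is only a sketch: the helper name `symbols_for` is invented here, and the "one check character per four significant characters" rule is taken from the description of the five-character groups above.

```python
import math


def symbols_for(target_bits: int, alphabet_size: int) -> int:
    """Smallest number of uniformly random symbols giving at least target_bits."""
    return math.ceil(target_bits / math.log2(alphabet_size))


payload = symbols_for(128, 24)      # significant characters for 128 bits
checks = math.ceil(payload / 4)     # one check character per four significant ones
total = payload + checks

print(payload, checks, total)       # 28 7 35
print(symbols_for(128, 95))         # 20 characters from the 95 printable ASCII set
print(round(20 * math.log2(24), 1)) # bits carried by 20 significant characters
```

So a 128-bit secret needs 28 significant characters from the 24-symbol alphabet versus 20 from full printable ASCII, and the check characters bring the spoken length to 35.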

John Deters
  • 33,897
  • 3
  • 58
  • 112
  • Don't modern MS apps all phone home to validate product keys? If so they could use randomly generated values leaving nothing to crack algorithmically. 96 bits is enough to prevent brute force guessing from being practical which is all they really need. – Dan Is Fiddling By Firelight Aug 07 '13 at 18:54
  • @DanNeely, yes, these are used for online authorization of products, and as far as I know, they are simply random numbers and not passwords. The point is it is an unambiguous character set that avoids most of the problems of readability and usability. (Most, because 8 and B are still structurally fairly similar.) I've updated the answer to reduce my speculation. – John Deters Aug 07 '13 at 22:10
  • 4
    This is a great answer, but I'm curious about the source of this information. Do you have a reference for this? – Steve Aug 08 '13 at 03:22
  • 1
    Thanks for a great answer. But isn't 35 characters a little long, especially if there is a time limit, such as 60 seconds? – Lea Cohen Aug 08 '13 at 09:23
  • @LeaCohen I doubt there are many scenarios for '60 secs', but I'm sure that in these cases there are multi-layer solutions for fewer than 35 chars, i.e. the computer that offers the password has already passed some security certifications, or may actually encode the 35-or-fewer-character password into a certain key they already use. – Alex Aug 08 '13 at 11:03
  • 3
    @SteveS, this is an observation and analysis of what Microsoft has been very successful at using over the last 10 years. I edited the question to remove the assumption that they did research on the topic (although I'm sure there was usability research done.) The rest of the information is easily observable - the characters in use and not in use, letters with confusingly similar shapes to numbers are omitted, vowels that lead to bad words are omitted, case insensitive, auto-error if any member of a 5 digit group is mistyped, etc. And the rest is just math. – John Deters Aug 08 '13 at 17:12
  • 1
    @LeaCohen, I obviously don't know how the military protects their nuclear missile launch codes, nor how much time they allow someone to type a mythical "abort" sequence, but if the end of the world is at stake, is 35 characters enough to protect it? :-) – John Deters Aug 08 '13 at 17:16
  • 3
    Your answer removes *visually* ambiguous letters, which will prevent whoever's reading the password from getting it wrong, but we should also be concerned with *phonetically* ambiguous ones to prevent the person hearing the password from getting it wrong. If you remove N, for example, then you'd surely want to remove M as well. Otherwise you risk hearing, "did you say em or en?" from the recipient. Whoever is hearing the password wouldn't know that their options have been restricted. So phonetically ambiguous letter groups should be removed too: [M and N], [P, D, and E], [G and J], [C and Z]. – Nick Aug 10 '13 at 09:31
  • @Nick, I agree that more work would be needed, but that quickly gets even more complex. Are the speakers Americans or Brits? Americans say "zee" while Brits say "zed", and there are undoubtedly other dialect and language issues. – John Deters Aug 10 '13 at 19:00
  • @JohnDeters Absolutely – dialect issues will cause problems if we're trying to create a universally 'safe' alphabet. For that reason, perhaps whole-word key phrases are a safer approach than letter-based passwords from a crippled alphabet? – Nick Aug 11 '13 at 09:56
  • @JohnDeters 128bit passwords actually only need to be 28 chars long: (24^28) / (2^128) = 1.30 – Perseids Aug 11 '13 at 16:47
  • @Perseids, Microsoft added an extra check character for every four characters to serve as a check that each group of five was correctly entered. 28/4= 7, 28+7 = 35. If you're following their model, of course, but part of the idea is to make it very easy to enter. – John Deters Aug 13 '13 at 02:21
  • 1
    @Nick, Thomas Pornin acknowledges the "whole word" problem doesn't solve everything in his answer ("Battery Staple" could be misheard as "bat tryst able".) For spoken characters, the phonetic alphabet is in specialized use across English speaking countries - I can't say about other languages that use the Latin character set. But a phonetic alphabet only helps if the speaker knows and uses it. Otherwise, you can get people disambiguating letters with homophones: "That's N as in Nail." "Is that M as in Male?" "Yes, N as in Nail." – John Deters Aug 13 '13 at 02:37
14

The ICAO/NATO phonetic alphabet is used primarily to distinguish between letters that sound similar when spelled out, like d and b, or n and m. The other special characters cannot be spelled out; they have their own names.

# is a pound sign or a hash sign, & is an ampersand, and so on.

The Australian Amateur Radio Service Emergency Communications has a training manual that talks about that.

Alphabet phonetics: Alpha, Bravo, Charlie, Delta, Echo, Foxtrot, Golf, Hotel, India, Juliet, Kilo, Lima, Mike, November, Oscar, Papa, Quebec, Romeo, Sierra, Tango, Uniform, Victor, Whisky, Xray, Yankee, Zulu.

Numeral phonetics: Zero, Wun, Too, Thuh-ree, Fo-wer, Fiy-iv, Six, Seven, Ate, Niner.

Punctuation: Full Stop, Comma, Slash, Dash, Colon, Semi Colon, Quote, Unquote, Open Bracket, Close Bracket, At Sign.
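To make the mapping concrete, here is a small Python sketch that turns a password into the words a reader would say, using the letter and numeral lists above. The function name `spell` is hypothetical, and the sketch assumes a case-insensitive password made only of letters and digits; anything else is rejected rather than guessed at, since punctuation names vary by locale.

```python
import string

# Spoken forms as listed in the training manual quoted above.
LETTER_WORDS = (
    "Alpha Bravo Charlie Delta Echo Foxtrot Golf Hotel India Juliet Kilo Lima "
    "Mike November Oscar Papa Quebec Romeo Sierra Tango Uniform Victor Whisky "
    "Xray Yankee Zulu"
).split()
DIGIT_WORDS = "Zero Wun Too Thuh-ree Fo-wer Fiy-iv Six Seven Ate Niner".split()

SPOKEN = dict(zip(string.ascii_uppercase, LETTER_WORDS))
SPOKEN.update(zip(string.digits, DIGIT_WORDS))


def spell(password: str) -> list[str]:
    """Return the spoken word for each character; case is ignored (an assumption)."""
    words = []
    for ch in password:
        word = SPOKEN.get(ch.upper())
        if word is None:
            # No unambiguous spoken form: better to refuse than to improvise.
            raise ValueError(f"no agreed spoken form for {ch!r}")
        words.append(word)
    return words


print(" ".join(spell("LC3")))  # Lima Charlie Thuh-ree
```

A password generator could run candidate passwords through a check like this and discard any that cannot be spelled.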

Other sources even list almost all needed special characters:

[image: chart of names for special characters]

Adi
  • 43,953
  • 16
  • 137
  • 168
  • 2
    The problem is that not all of those symbols really lend themselves to being read out loud, unambiguously. Then there is the complications of keyboard layouts, as @razethestray mentioned, and different names for some characters, as DKNUCKLES noted. Bottom line, if you remove the ambiguous characters - and that would (apparently) include any case-sensitivity - how much does that harm the entropy of the password? How should the password selection process be modified, given these constraints? – AviD Aug 07 '13 at 16:49
  • 1
    Also note that the question is about *generating* such a password, that would lend itself well to unambiguous-reading-aloud, as opposed to a general "how do I spell this aloud". – AviD Aug 07 '13 at 16:51
  • 1
    @AviD Well, I kind of felt this doesn't fully answer the question, I forgot to post it as a CW. – Adi Aug 07 '13 at 16:52
  • 1
    @AviD Also, I don't see how case sensitivity and keyboard layout is a problem here. An Upper Alfa is an Upper Alfa on all keyboards, a Percent Sign is a Percent Sign on all keyboards. – Adi Aug 07 '13 at 16:54
  • Not all keyboards have all the signs. Some signs are called by different names depending on locale. And, the movie's fallback of SHIFT+3 would not work on other layouts... – AviD Aug 07 '13 at 16:58
  • 7
    And on my keyboard a pound sign is very different to a hash sign. As in, it looks like a pound sign. Surely no one thinks that in Britain we write #5 to mean £5? – Rory Alsop Aug 07 '13 at 17:39
  • @RoryAlsop Well, you DID introduce us to the Imperial, or English system. If I were a 133t-speaking rank amateur from the U.S.A., I could make a syntactically-motivated mistake such as that ;o) You are correct, of course. In fact, 5lbs is an equally plausible misrepresentation of £5 as #5 – Ellie Kesselman Aug 08 '13 at 12:56
  • For years, I thought that £5 meant `five Israeli pounds` or `five Italian pounds`. In forex, we'd say BPS for British Pound Sterling. I am being facetious, but also trying to explain why priority is critical. If being understood, quickly, is important, I would dispense with all punctuation marks, despite loss of entropy. Amateur radio operators' defined pronunciation of alpha-numeric's as specified by @Svetlana is all that I would use. – Ellie Kesselman Aug 08 '13 at 13:05
10

"Reading aloud" is about enunciating a sequence of "phonetic symbols" in due sequence. You want these symbols to be unambiguous when pronounced. It so happens that we humans have such a system: it is called words. When I speak a sentence, it consists of a lot of words, which other people don't have trouble understanding because their brains are highly trained, from their earliest infancy, to do such a thing.

So what you want is a nice "alphabet" of words which are easy to tell apart, as well as a convention for turning the words into the characters you type. The convention with Alpha = A, Bravo = B, Charlie = C... is just that: a set of 26 symbols, each being encoded as one letter. Each symbol, randomly generated in the list of 26, is worth about 4.7 bits of entropy (because 26 is almost equal to 2^4.7), so just generate as many as needed to reach the appropriate entropy for your target security level. 20 letters ought to be amply sufficient for most purposes, including launching nuclear missiles (that's 94 bits of entropy; anything beyond 80 bits is really good).

Another similar method is the famous one which leads to passwords like "correcthorsebatterystaple". In this case, we have a "list of common words", assumed (in the comic) to contain 2048 words. These are your symbols. Each is worth 11 bits of entropy (because 2048 = 2^11), so eight words would bring you to the very comfortable level of 88 bits. The symbol-to-keystrokes convention is then: type the whole word. Such passwords have the distinct benefit of being easy to remember, easier than sequences of random letters. However, they imply more key typing.

The longer the symbol list, the more probable confusion can occur. For instance, you say "battery staple" but the secret agent at the other end of the line understands "bat tryst apple". Also, I would not like the idea that the nuclear safety of America relies on the spelling skills of some field agent: he is highly trained at beheading enemy spies with his bare hands, but will he know that "battery" is not spelled "batery"? It's not like password entry fields contain spell checkers...

I thus tend to consider that the best passwords that can be "read aloud" are sequences of random letters. Sequences of digits could be used, too: less entropy per digit (3.32 bits), but we have decades of experience of customers reading out their credit card number through a phone line to some underpaid operators, and this works. In any case, as with all password things, randomness reigns supreme: make each symbol a random uniform choice, independent of the other symbols, and accumulate as many as necessary to reach the entropy goal.
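The "uniform, independent, accumulate until the goal is reached" recipe can be sketched in Python as follows. The function name `random_symbols` is invented for this example, the 80-bit target is just the threshold mentioned above, and the 2048-entry word list is a synthetic stand-in (`w0`, `w1`, ...) rather than a real list of common words.

```python
import math
import secrets
import string
from typing import Sequence


def random_symbols(symbols: Sequence[str], target_bits: float) -> list[str]:
    """Pick independent, uniformly random symbols until target_bits is reached."""
    if len(symbols) < 2 or len(set(symbols)) != len(symbols):
        raise ValueError("need at least two distinct symbols")
    bits_each = math.log2(len(symbols))
    count = math.ceil(target_bits / bits_each)
    # secrets.choice uses the operating system's CSPRNG.
    return [secrets.choice(symbols) for _ in range(count)]


letters = random_symbols(string.ascii_uppercase, 80)   # 18 letters
digits = random_symbols(string.digits, 80)             # 25 digits
# Placeholder word list: a real one would hold 2048 phonetically distinct words.
placeholder_words = [f"w{i}" for i in range(2048)]
words = random_symbols(placeholder_words, 80)          # 8 words

print("".join(letters), len(letters))
print("".join(digits), len(digits))
print(" ".join(words), len(words))
```

The same function covers letters, digits, and whole words; only the symbol list changes, which makes the length-versus-alphabet trade-off easy to compare.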

Thomas Pornin
  • 322,884
  • 58
  • 787
  • 955
  • You make an interesting point about a long series of numbers and layperson clarity. – schroeder Aug 07 '13 at 22:35
  • Thank you for this analysis! However, I'm wondering about the conflict between entropy and length: if the password is too long, it could take too long to read it over the phone, and in this case, time is crucial... – Lea Cohen Aug 08 '13 at 09:17
  • If passcodes are assumed to be strings of common words, a system could accept passphrases whose words are sufficiently "close" to the correct ones without really harming security (someone listening for spoken password FAX who rejected "foxglove alpo extra" because it should be "foxtrot alfa x-ray" would ignore the fact that the speaker knew the letters "FAX", which is what's supposed to really matter. – supercat Feb 02 '14 at 22:11
10

The solution is simple and is already widely practiced: don't use any special characters in passwords.

When a password is read out loud, there are many steps that can go wrong:

  • The reader must read the password correctly. Is that a ( or a [ or a {? Is that a - or a _? Is that a l or a I or a 1? Is that a : or a ;?
  • The reader must enunciate the character correctly. Believe it or not, some people believe that " is called “parenthesis”.
  • The listener must understand the character name. That's the same problem as with the reader. Remind me which one is backslash? And what would you type if you hear “dash”?
  • The listener must find the character on the keyboard. Shift+3? Imagine you need to save the world from Germany (§) or Spain (') or France (3)!

There are 95 printable characters in the ASCII character set, which is all you can expect to find on a PC keyboard. On a Mac or mobile device, even some ASCII characters can be hard to type. (Try typing | on a French Mac.) If you generate a random password with n printable ASCII characters, there are 95^n combinations. If you restrict the password to a smaller character set containing only C characters, then in order to have as many possible combinations (i.e. as much entropy), you'll need to make the password longer. There are C^m passwords of length m, so you need to achieve C^m ≥ 95^n, i.e. m ≥ n × log(95) / log(C). Here are a few values for the multiplicative factor:

  • Using only letters of either case and digits: m ≥ 1.103 n. Up to n=9, it's enough to add one more character.
  • Using only letters of either case and digits, excluding l, o, O, 0, and 1: m ≥ 1.127 n. Now one more character is enough up to n=7; for n=8, you get a very slight reduction in strength with m=9.
  • Using only lowercase letters and digits: m ≥ 1.271 n. For example, instead of 7 arbitrary ASCII characters, you'd need 9 characters with this restriction. With n=8 you'd need m=11 (or m=10 if it's ok to slightly reduce the strength).
  • Using only lowercase letters and digits, excluding l, o, 0 and 1: m ≥ 1.314 n. For n=7 you now need m=10, for n=8 you still need m=11.
  • Using only lowercase letters: m ≥ 1.398 n. Instead of 8 arbitrary ASCII characters, you need 12 lowercase letters to achieve the same strength.
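The factors in the list above can be reproduced with a short Python sketch. The helper names `length_factor` and `min_length` are made up for this example; `min_length` compares exact integers (C^m against 95^n) so that rounding in the logarithms cannot affect the result.

```python
import math
import string

FULL = 95  # printable ASCII characters


def length_factor(restricted_size: int, full_size: int = FULL) -> float:
    """Multiplicative factor log(95) / log(C) from the inequality m >= n * factor."""
    return math.log(full_size) / math.log(restricted_size)


def min_length(n: int, restricted_size: int, full_size: int = FULL) -> int:
    """Smallest m with restricted_size**m >= full_size**n, using exact integers."""
    target = full_size ** n
    m = 0
    while restricted_size ** m < target:
        m += 1
    return m


alnum = string.ascii_letters + string.digits
lower_digits = string.ascii_lowercase + string.digits
charsets = {
    "letters of either case and digits": alnum,
    "same, excluding l o O 0 1": [c for c in alnum if c not in "loO01"],
    "lowercase letters and digits": lower_digits,
    "same, excluding l o 0 1": [c for c in lower_digits if c not in "lo01"],
    "lowercase letters only": string.ascii_lowercase,
}

for name, chars in charsets.items():
    size = len(chars)
    print(f"{name}: C={size}, factor={length_factor(size):.3f}, "
          f"n=8 needs m={min_length(8, size)}")
```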

In this case, where the password needs to be communicated quickly over the phone, I'd go for either the second option above or the last one: mixed case but avoiding the most confusing characters (requiring the password to be about 13% longer), or sticking to letters of one case (requiring the password to be about 40% longer).

A lot of systems impose passwords that contain mixed case and punctuation. This is well-known to be counter-productive. For security, it's the entropy that matters, not the choice of characters used to encode this entropy. For usability, the choice of characters is relevant, and up to a point (where the password gets too long) less is better.

Gilles 'SO- stop being evil'
  • 51,415
  • 13
  • 121
  • 180
2

The PGP word list comprises 2 × 256 words (two sets are used as an error detection method) chosen carefully for phonetic distinctiveness. For instance, E582 94F2 becomes "topmost Istanbul Pluto vagabond".

TildalWave
  • 10,801
  • 11
  • 46
  • 85
Evan Harper
  • 121
  • 3
  • But are users required to memorize that whole long list, to translate each code word to byte value? – AviD Aug 09 '13 at 10:45