My guess is that the policy exists for all user-inputted fields, and the same input policy (no special characters) was applied to every field, passwords included, for simplicity. Or at some point some bank wasn't hashing passwords, suffered an SQLi attack through its password field, and a policy was decided that passwords can't have special characters (and the reason for the policy was forgotten once hashing was introduced).
There definitely is a security benefit to disallowing special characters in other user-inputted fields that are not hashed and could be used for various attacks like SQLi or XSS (say, against a bank administrator looking at an account). However, these threats are properly solved by always using bound parameters in SQL and always sanitizing user input before displaying it or saving it to the db.
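For instance, here's a minimal sketch of the bound-parameter approach, using Python's built-in `sqlite3` purely for illustration (the table and values are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, pw_hash TEXT)")

hostile_input = "'; DROP TABLE users; --"  # classic injection attempt

# Bound parameters: the value is passed to the driver separately from the
# SQL text, so the quote characters are never interpreted as SQL syntax.
conn.execute("INSERT INTO users (name, pw_hash) VALUES (?, ?)",
             (hostile_input, "fakehash"))
row = conn.execute("SELECT name FROM users WHERE name = ?",
                   (hostile_input,)).fetchone()
print(row)  # ("'; DROP TABLE users; --", ) -- stored literally, no injection
```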
My other guess is that they want custom rules that differ from other sites, so you can't easily reuse your standard strong password (which may have leaked elsewhere) and have to come up with something unique for their site.
EDIT: On further thought, depending on the language I can think of at least three special ASCII characters that should universally be forbidden/stripped, as allowing them only tempts fate. Specifically: `\0` (null, ascii 0), since it is customarily used to indicate the end of a string in C-style languages (possibly allowing users to alter memory past the end of the string). Also carriage return `\r` (ascii 13) and line feed `\n` (ascii 10), as it is system-dependent whether line breaks are `\r\n` or `\n`, and that brings up the inevitable "I can only log in from Windows, not my Mac/Linux machine." In fact, it seems quite reasonable to only allow printable ASCII (32-126) and ban the non-printable ASCII 0-31 and 127.
But if by special characters you mean unicode characters rather than ASCII ones (like `,./<>?;':"[]{}\|!@#$%^&*(-=_+`), it seems reasonable, for simplicity of implementation across multiple operating systems/keyboards/browsers, to not allow them. Imagine your password had a lowercase pi in it. Was that the Greek pi (π), which is codepoint 0x3c0, or the Coptic lowercase pi (ⲡ), which is codepoint 0x2ca1? Only one will work, and this kind of problem of similar-looking characters with different codepoints pervades unicode. Your hash, which operates on bits, cannot equate the two pis, so if you log in from different places you may be inputting different characters.
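A quick Python illustration that a hash, operating on bits, treats the two visually near-identical pis as completely unrelated inputs:

```python
import hashlib

greek_pi  = "\u03c0"  # GREEK SMALL LETTER PI
coptic_pi = "\u2ca1"  # COPTIC SMALL LETTER PI

# Same glyph to the eye, entirely different digests, so a password typed
# with the "wrong" pi will never verify against the stored hash.
for ch in (greek_pi, coptic_pi):
    print(ch, hex(ord(ch)),
          hashlib.sha256(ch.encode("utf-8")).hexdigest()[:16])
```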
Similarly, though this problem is one the programmer can largely control and attempt to get right, allowing unicode characters creates encoding issues. Basic ASCII characters are each represented as a single byte, but there are a bunch of different schemes for encoding unicode. Is it UTF-7, UTF-8, UTF-16, UTF-32, Latin-1 (iso-8859-1), or another Latin-N encoding, and (for some encodings) what's the byte order (little or big endian)? The unicode codepoint for pi (0x3c0) would be represented as bytes `2b 41 38 41 2d` in UTF-7, `cf 80` in UTF-8, `ff fe c0 03` in UTF-16 (little endian with BOM), and `ff fe 00 00 c0 03 00 00` in UTF-32 (little endian with BOM). You couldn't represent pi in Latin-1 at all (it only has 95 extra printable characters), but if your password contained the byte `a1` and it was interpreted under different encodings, it could be read as any of the characters ¡ Ą Ħ Ё ‘ ก ” Ḃ (each of which maps to `a1` in some Latin-N).
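You can see all of this directly in Python (a sketch; the UTF-16/UTF-32 byte sequences assume a little-endian platform, which is where the `ff fe` BOM comes from):

```python
# One codepoint, four different byte sequences depending on the encoding:
print("\u03c0".encode("utf-7"))   # b'+A8A-'             -> 2b 41 38 41 2d
print("\u03c0".encode("utf-8"))   # b'\xcf\x80'          -> cf 80
print("\u03c0".encode("utf-16"))  # b'\xff\xfe\xc0\x03'  -> BOM + LE
print("\u03c0".encode("utf-32"))  # b'\xff\xfe\x00\x00\xc0\x03\x00\x00'

# And one byte, a different character under each Latin-N variant:
for codec in ("latin-1", "iso8859-2", "iso8859-3", "iso8859-5",
              "iso8859-7", "iso8859-11", "iso8859-13", "iso8859-14"):
    print(codec, bytes([0xA1]).decode(codec))
```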
Yes, the webpage may have a charset defined, but users can override the charset of a page in their browser, or may have copied and pasted the password from somewhere with a different encoding. At the end of the day, it may be simpler to just forbid those characters.