To my naive mind, it seems like the encryption of a document with AES should go like this:
HumanPassword ---|hash|--> key ---|AES|--> Ciphertext
                        plaintext|--^
And the decryption like this:
HumanPassword ---|hash|--> key ---|AES|--> Plaintext
                       ciphertext|--^
As such, it seems that the encrypted package (the Excel file) should not need to contain the hash of the human password and, furthermore, that it would be insecure for it to do so.
One of my coworkers asks, "If the file didn't contain the hash, what would it compare the password to?"
It seems to me that it needn't do any comparison. It should decrypt with whatever key it gets from the hash; if that worked, the result won't be nonsense. If it didn't work, Excel should see nonsense and say "Wrong password, nincompoop." Either way, the hash shouldn't be in the file, and nothing should need to be compared.
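To make the idea concrete, here is a minimal Python sketch of what I mean by "decrypt and see whether the result is nonsense". It is not what Office does. To keep it runnable with only the standard library, AES is swapped for a toy SHA-256 keystream, and the names `derive_key`, `keystream_xor`, and `MAGIC`, as well as the salt and iteration count, are made up for illustration. The only password check is whether the decrypted bytes start with a known header.

```python
import hashlib

# Hypothetical known header; here the ZIP local-file signature.
MAGIC = b"PK\x03\x04"

def derive_key(password: str, salt: bytes) -> bytes:
    # Stand-in for the "hash" box in the diagram (parameters are assumptions).
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 10_000)

def keystream_xor(key: bytes, data: bytes) -> bytes:
    # Toy stream cipher (SHA-256 in counter mode) standing in for AES.
    # XOR is its own inverse, so this both encrypts and decrypts.
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)

def encrypt(password: str, salt: bytes, plaintext: bytes) -> bytes:
    # Nothing derived from the password is stored alongside the ciphertext.
    return keystream_xor(derive_key(password, salt), MAGIC + plaintext)

def decrypt(password: str, salt: bytes, ciphertext: bytes):
    candidate = keystream_xor(derive_key(password, salt), ciphertext)
    if not candidate.startswith(MAGIC):
        return None  # result is nonsense: "Wrong password, nincompoop."
    return candidate[len(MAGIC):]
```

With a 4-byte header, a wrong password would slip through about once in 2^32 tries, so a real design would presumably want a stronger check than this.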
However, it seems that it does have the hash in the file.
When my coworker and I picked this file apart, we found that there was indeed a hash value in there, and we are now running dictionary and brute-force attacks against it.
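In spirit, a dictionary attack on a stored hash looks like the sketch below. It assumes a simple salted, iterated SHA-1 verifier purely for illustration; that is a simplification and not necessarily the actual Office 2010 format, and the function names, round count, and word list are hypothetical.

```python
import hashlib
import hmac

def verifier(password: str, salt: bytes, rounds: int = 1000) -> bytes:
    # Assumed scheme: hash salt + password, then re-hash `rounds` times.
    digest = hashlib.sha1(salt + password.encode("utf-16-le")).digest()
    for i in range(rounds):
        digest = hashlib.sha1(i.to_bytes(4, "little") + digest).digest()
    return digest

def dictionary_attack(stored: bytes, salt: bytes, candidates):
    # Try each candidate until one reproduces the stored hash.
    for word in candidates:
        if hmac.compare_digest(verifier(word, salt), stored):
            return word
    return None
```

The cost per guess is one run of `verifier`, which is why the speed of the hash matters so much once the hash is sitting in the file.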
So I suppose my question is this:
Why does an encrypted MS Office 2010 document need to store the hash of the password?
And, if secondary questions are appropriate on SE,
If it doesn't absolutely need to, why did Microsoft choose to store it? Isn't that less secure if the hash algorithm isn't very good?