
From my understanding, when you digitally sign a file, it changes that file's checksum because they are bundled together.

Example of what I mean:

file1.txt contains only the letter 'd', and the CRC32 checksum of this file is 98dd4acc. However, I want to digitally sign it using my certificate's private key. After I digitally sign it, what is the hash of the file? Does it remain 98dd4acc, or does it change once it's bundled with the certificate?

Am I right or wrong in assuming the checksum of the file will change? Or is a digital signature not actually included in the file itself, but stored in the file's properties or something similar?

Thank you.

  • Note that CRC is linear and thus allows an adversary to change the data without detection, defeating the purpose of signing. Proper signing uses a cryptographic hash (also called a digest) which has at least second-preimage resistance and preferably collision resistance; see https://security.stackexchange.com/questions/69405/difference-between-second-pre-image-resistance-and-collision-resistance-in-crypt . For example, MD5 and SHA1 have been broken for collision and thus are now considered unacceptable for use in signing. – dave_thompson_085 Oct 22 '18 at 08:46
  • Not all checksums are the same; you must use one that suits the scenario. A CRC is useful for grouping files and for detecting accidental errors. Outside that context, e.g. for checking file integrity against deliberate modification, you should not use a CRC. That's why most signed files use a SHA-256/SHA-384/etc. digest to validate file integrity. – mootmoot Oct 22 '18 at 12:45
  • To validate a signed file, you must use appropriate tools that extract the signature and can validate the portion of the file that excludes the signature. For an unsigned file, a simple approach is to publish the SHA-256 hash (as long as you can make sure nobody can modify the published value). – mootmoot Oct 22 '18 at 12:49
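
To illustrate the linearity point from the first comment, here is a minimal sketch using Python's `zlib.crc32`. The helper names and the sample messages are made up for illustration. It shows that someone who knows only the CRC of the original data and the XOR difference they want to apply can predict the CRC of the modified data, which is why a CRC offers no protection against deliberate changes.

```python
import zlib


def xor_bytes(left: bytes, right: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(left) != len(right):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(left, right))


def predicted_crc(original_crc: int, delta: bytes) -> int:
    """Predict the CRC32 after XOR-ing `delta` into the data.

    Only the old CRC and the delta are needed, not the data itself.
    CRC32 is affine over XOR for inputs of a fixed length, so
    crc(m ^ d) == crc(m) ^ crc(d) ^ crc(zeros of the same length).
    """
    zeros = bytes(len(delta))
    return original_crc ^ zlib.crc32(delta) ^ zlib.crc32(zeros)


# Hypothetical sample messages of equal length.
original = b"pay alice 10 usd"
tampered = b"pay carol 99 usd"
delta = xor_bytes(original, tampered)

print(predicted_crc(zlib.crc32(original), delta) == zlib.crc32(tampered))
```

A cryptographic hash such as SHA-256 has no comparable shortcut, which is the reason signing schemes are built on one.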

2 Answers


There are many possibilities.

Some file formats, like PDF as described by Keith, and Authenticode for Microsoft Windows executables, put the signature in the file but arrange for it not to cover itself, and perhaps not some other non-critical data as well.

Some file or message formats, like XMLdsig and optionally S/MIME CMS/PKCS7 and PGP, put both the data and the signature in a larger file structure so that they can easily be separated, the signature verified, and the data (without the signature) used. Java codesigning uses Java's JAR format, which is a slightly modified ZIP file: each class (or other resource) is an entry in the JAR file, the digests of all classes are listed in the manifest which is also an entry in the JAR file, and the signature of the manifest plus related certificate(s) are stored in two additional entries in the JAR file.
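
As a rough sketch of the "container" idea (this is not the real JAR, CMS, or PGP layout), the following packs the data into a ZIP together with a manifest of SHA-256 digests and a signature over that manifest. The entry names `META/manifest.txt` and `META/signature.bin` and the key are hypothetical, and HMAC-SHA256 stands in for a real private-key signature only so that the example runs with the standard library.

```python
import base64
import hashlib
import hmac
import io
import zipfile

# Assumption: a shared demo key; HMAC is a stand-in for a real signature.
DEMO_KEY = b"demo-key-not-a-real-private-key"


def build_signed_archive(entries: dict[str, bytes]) -> bytes:
    """Pack entries plus a digest manifest and a 'signature' over the manifest."""
    lines = []
    for name, data in sorted(entries.items()):
        digest = base64.b64encode(hashlib.sha256(data).digest()).decode()
        lines.append(f"{name} {digest}")
    manifest = "\n".join(lines).encode()
    signature = hmac.new(DEMO_KEY, manifest, hashlib.sha256).digest()

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in entries.items():
            zf.writestr(name, data)
        zf.writestr("META/manifest.txt", manifest)
        zf.writestr("META/signature.bin", signature)
    return buf.getvalue()


def verify_archive(archive: bytes) -> dict[str, bytes]:
    """Check the signature and every listed digest, then return the data entries."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        manifest = zf.read("META/manifest.txt")
        signature = zf.read("META/signature.bin")
        expected = hmac.new(DEMO_KEY, manifest, hashlib.sha256).digest()
        if not hmac.compare_digest(signature, expected):
            raise ValueError("manifest signature mismatch")

        entries = {}
        for line in manifest.decode().splitlines():
            name, digest = line.rsplit(" ", 1)
            data = zf.read(name)
            actual = base64.b64encode(hashlib.sha256(data).digest()).decode()
            if actual != digest:
                raise ValueError(f"digest mismatch for {name}")
            entries[name] = data
    return entries
```

The archive as a whole has a different checksum from the original file, but the data extracted from it is byte-for-byte the original. Note that this toy verifier only checks entries listed in the manifest; anything else in the archive is unsigned and should not be trusted.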

Sometimes people just put the signature in a separate file that is linked to the data file; S/MIME CMS/PKCS7 and PGP 'detached' signatures do this. For example, a software download site might have one file named superwondergizmo-1.2.3.tar and its signature in a file named superwondergizmo-1.2.3.tar.sig (or sometimes ...asc for 'armored' aka ASCII PGP). On modern Windows with NTFS it could make sense to use a supplemental stream to store the signature, but I've never seen anyone do so; similarly for MacOS with the resource fork; but those wouldn't usually work for transfer, which is often important nowadays.
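
A minimal sketch of the detached case, with the same caveat that HMAC-SHA256 and the demo key are stand-ins for a real private-key signature (a real setup would use something like PGP or CMS tooling). The function names are made up for illustration. The point is only that the data file is never touched, so its checksum stays the same.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path

# Assumption: a shared demo key; HMAC is a stand-in for a real signature.
DEMO_KEY = b"demo-key-not-a-real-private-key"


def sign_detached(data_path: Path) -> Path:
    """Write the signature next to the data file as '<name>.sig'."""
    sig_path = data_path.with_name(data_path.name + ".sig")
    signature = hmac.new(DEMO_KEY, data_path.read_bytes(), hashlib.sha256).digest()
    sig_path.write_bytes(signature)
    return sig_path


def verify_detached(data_path: Path, sig_path: Path) -> bool:
    """Recompute the signature over the data file and compare it with the .sig file."""
    expected = hmac.new(DEMO_KEY, data_path.read_bytes(), hashlib.sha256).digest()
    return hmac.compare_digest(expected, sig_path.read_bytes())


def file_sha256(path: Path) -> str:
    """Hex SHA-256 of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "file1.txt"
    data_file.write_bytes(b"d")

    before = file_sha256(data_file)
    sig_file = sign_detached(data_file)
    after = file_sha256(data_file)

    print(before == after)                       # the data file is unchanged
    print(verify_detached(data_file, sig_file))  # the signature still verifies
```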

dave_thompson_085
  • What exactly do you mean by "not to cover itself" when putting the signature with the file? Is it packaged with the file but not exactly the same thing as the file? Would the checksum of this file differ from the checksum taken before the signature was added? –  Oct 22 '18 at 18:30
  • (Sorry, I initially lost the notification.) Yes, a checksum (or hash) of the _file_ changes (with overwhelming probability) when the signature is _included_ in the file, but since the signature is not included in the data that is hashed and signed, the signature can still correctly validate that data. The 'relier' (which checks the signature) then trusts only the data that was signed, and not any data other than the signature that was not signed. – dave_thompson_085 Oct 26 '18 at 05:07

Typically you'd sign the contents of the file but not the signature.

Remember, the signature is not just the hash, but the hash processed with the private key of the signer (loosely speaking, "encrypted" with it, a description that fits RSA better than other algorithms). As an example, here's how Adobe describes the signing of PDF documents:

The hash of the entire file is computed, using the bytes specified by the real ByteRange value using a hash algorithm such as SHA-256. Acrobat always computes the hash for a document signature over the entire PDF file, starting from byte 0 and ending with the last byte in the physical file, but excluding the signature value bytes

I'm no expert, but I gather from this that the signature is excluded from the hash+sign process -- because otherwise you end up in an infinite recursive loop :)
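
Here is a toy sketch of that idea, as far as I understand it. It is not the PDF format: the `[SIG:...]` layout and the function names are invented, the offset is passed around separately rather than stored in the file as PDF's ByteRange is, and HMAC-SHA256 with a demo key stands in for a real private-key signature. It only shows a signature stored inside the file while being computed over everything except its own bytes.

```python
import hashlib
import hmac

# Assumption: a shared demo key; HMAC is a stand-in for a real signature.
DEMO_KEY = b"demo-key-not-a-real-private-key"
SIG_HEX_LEN = 64  # hex characters in a SHA-256 based signature value


def sign_embedded(body: bytes, trailer: bytes) -> tuple[bytes, int]:
    """Return the signed blob and the offset where the signature bytes start."""
    prefix = body + b"[SIG:"
    suffix = b"]" + trailer
    # Sign everything except the signature value itself.
    sig_hex = hmac.new(DEMO_KEY, prefix + suffix, hashlib.sha256).hexdigest().encode()
    return prefix + sig_hex + suffix, len(prefix)


def verify_embedded(blob: bytes, start: int) -> bool:
    """Recompute the signature over the blob with the signature bytes cut out."""
    end = start + SIG_HEX_LEN
    covered = blob[:start] + blob[end:]
    expected = hmac.new(DEMO_KEY, covered, hashlib.sha256).hexdigest().encode()
    return hmac.compare_digest(blob[start:end], expected)


signed, sig_start = sign_embedded(b"d", b"\nEOF")

print(verify_embedded(signed, sig_start))  # signature covers all but itself
print(hashlib.sha256(b"d").hexdigest() == hashlib.sha256(signed).hexdigest())  # whole-file hash differs
```

So there is no recursion: the signer hashes the file with a gap where the signature will go, then fills the gap. A plain checksum of the finished file includes the filled gap and therefore no longer matches the checksum of the original content.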

keithRozario
  • I understand that part, but where does the digital signature go? Is it included in the file itself, thus changing the file's original checksum? Or is it kept separate, the way file properties and information in Windows are not included in the file? –  Oct 22 '18 at 03:25
  • Yes it will (or at least I think it will). Since the signature is included in the file, it will change the checksum if you pass the file through a regular hashing program like md5sum. File signing is format specific -- i.e. the way to check it for PDF is different from Word etc. But when the signature is packaged into the file itself, as it is for PDF, it **will** change the checksum of the file. – keithRozario Oct 23 '18 at 06:07