How to Store Signed Source Code + Encrypted HASH +Code Signing Cert

Question

I have been tasked with designing a solution for signing source code. I have not found much on whether there is a standard for storing the signed code, its hash, and the code signing cert.

As I understand it, signing source code involves taking our private key (stored in our HSMs) and encrypting the hash of a source file. Then a signature would be applied to that hash file using the code signing cert.

For one project, there could be thousands of files that need signing. Where should all of those signed hash files be stored? Is there a standard for naming the hash files?

Hope this makes sense. Thanks for any help!

What problem are you attempting to tackle? Would signed git commits do the trick? If it's for archival, would stuffing it in a tarball and signing the tarball work? What about computing a checksum of every file and signing a file containing checksums? In short - the solution depends on the problem. — vidarlo, Mar 01 '23 at 17:44
It seems like the method that Ubuntu and other linux distros use would work for you. Simply list all the file names and their hashes in a text file. Then, use your private key to sign (not encrypt!) the text file. See https://releases.ubuntu.com/22.10/ as an example - sha256sums is the text file containing the filenames and their hashes, and sha256sums.gpg is the gpg signature of this file, made using Ubuntu's private signing key. https://discourse.ubuntu.com/t/how-to-verify-your-ubuntu-download/14010 explains how verification is done. — mti2935, Mar 01 '23 at 18:48
"encrypting a hash" is not what "signing" is. It's a quick and dirty way to explain the process, but that's not what actually happens. — schroeder, Mar 01 '23 at 19:07
@schroeder I am new to the concept of source code signing. From what I have found in trying to research it, it seems that the hash for the source needs to be encrypted. Here is a link I found showing this: https://pkic.org/uploads/2013/10/CASC-Code-Signing.pdf It seems to follow what most others are stating. — MegaHurts2010, Mar 01 '23 at 20:21
@vidarlo My program has a requirement to sign all of our source code to not only authenticate but also show what security classification the source is certified at. Thanks for the input. — MegaHurts2010, Mar 01 '23 at 20:24
What schroeder means by "'encrypting a hash' is not what signing is" is that signing is hashing, but with a change that only the signer's private key can make and that is verified with the public key. For example: I know a secret number and you give me the math problem 1+1. My answer is 2, and I sign it by saying 6. We can verify that I solved the problem because I said 2 with a signed "hash" of 6. My secret is 4, and we know it's my answer because 2+4 is the signed "hash". That's a rudimentary way to look at it, but it's an example. — AustereGrim, Mar 01 '23 at 20:33
@MegaHurts2010 But *WHAT*. The code repository? Or do you make an archive, and want to ensure the integrity of the archive? In short - ***what*** is your signature protecting? Simply repeating source code is not helpful to understanding your problem. — vidarlo, Mar 01 '23 at 20:35
@AustereGrim I Agree. I guess I worded it wrong. There is also a step involved of adding a digital signature using a code signing cert. It appears that the signature should be added to the hash file. — MegaHurts2010, Mar 01 '23 at 20:38
The signature is added to the hash process. So a "signed" source file's hash would be different from the "unsigned" source's hash, and only that signature's key can determine that the hash is valid. [sourcefile1.txt, sourcefile1.gpg, sourcefile2.txt, sourcefile2.gpg... and so on...] So you could record every file's signed hash, or, as mti2935 says, just sign a hash list of the sources, like Ubuntu does for their repository. [sourcefile1.txt, sourcefile2.txt, sourcehashes.txt, sourcehashes.gpg], the gpg file being the signed hash. — AustereGrim, Mar 01 '23 at 20:57
The signature doesn't protect anything. It's only there to validate that the sources data has not been altered. So having a list of hashes of the source files can validate that source file has not been altered, and having a signed hash of the list of source hashes validates the hash list hasn't been altered, because it must be signed and you can't (practically) forge that. — AustereGrim, Mar 01 '23 at 21:01
That all being said; if an attacker modifies sourcefile1.txt they need to hash it to put it in the sourcehashes.txt list, but since they can't sign that sourcehashes list file and create a new sourcehashes.gpg file, that can identify that something has been changed without approval (signing and creating a new sourcehashes.gpg). — AustereGrim, Mar 01 '23 at 21:06
@AustereGrim the signature *protects* *something*. Probably the integrity of something, but it's useful to know the threat model and what *something* happens to be. As I've specified a number of times the solution may depend on what the problem looks like. Everything from signed commits in git to a simple signed tarball may make sense. — vidarlo, Mar 01 '23 at 22:22
@vidarlo yeah, I misread your comment as being from the OP and made kind of a rough response on the word "protect" trying to make a point I was already midway with the point once I realized. lol ... I agree we can't solve his problem and make the solution. But I think I've given some options that make sense, and at least answered the two specific questions he's posed in the post. — AustereGrim, Mar 01 '23 at 22:30
@MegaHurts2010 as a side note, the CASC-Code-Signing document, isn't "source code signing" at all, but executable/binary code signing. This is done at compile and packaging time saying "this binary is unmodified from the compiled source code", and is verified by the signature certificate and hashes. If you're just trying to track changes, and to validate that code has not been modified without tracked changes, then that document is not what you're looking for. (it does illustrate the concepts discussed though.) — AustereGrim, Mar 01 '23 at 22:59
Not to beat a dead horse, but for a good read on why a digital signature over a file is very different from 'encrypting a file with a private key', see the answer by Thomas Pornin at https://security.stackexchange.com/questions/87325/if-the-public-key-cant-be-used-for-decrypting-something-encrypted-by-the-privat (and pay particular attention to where he notes that the confusion is due to the *deleterious effects of post-Disco pop music*). — mti2935, Mar 01 '23 at 23:21

AustereGrim · Answer 1 · 2023-03-02T15:57:04.777

Okay, after reading the discussion I think I have a way to answer your questions. Note first that the "CASC Code Signing" paper you brought up covers "executable code signing" (publisher verification), not "source code signing" (tracking and validating changes to source files), which is what I'm outlining below. The two do use similar concepts for signing files.

> I have been tasked with designing a solution for signing source code.

You probably don't want to design a new solution, but try to find an existing solution instead. Signed commits in a git server might be all you're looking to really do, especially if the code is still in active development.
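If signed commits turn out to be enough, the per-repository setup is small. The following is only a sketch, assuming Git and GnuPG are installed and a signing key already exists; the function name and the key ID passed to it are placeholders, not part of any standard.

```shell
# Configure one repository so that commits and tags created from now on are signed.
# $1 = path to the repository, $2 = GnuPG key ID (placeholder, supply your own)
enable_commit_signing() {
    git -C "$1" config user.signingkey "$2" &&
    git -C "$1" config commit.gpgsign true &&
    git -C "$1" config tag.gpgSign true
}

# Later, to check the signature on the most recent commit:
#   git -C path/to/repo verify-commit HEAD
#   git -C path/to/repo log --show-signature -1
```

This only changes the repository's local configuration; whether unsigned commits are rejected is a separate policy decision on the server side.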

The discussion didn't clarify the end purpose of signing the source code, so my response is from a change management perspective: your organization is trying to comply with a policy where changes need to be tracked and approved, and file signatures are used to validate that the changes were approved and that nothing was modified without approval.

> sign all of our source code to not only authenticate but also show what security classification the source is certified at

You should rely on the source code submitters to "certify" the classification. Something as simple as portion markings and file markings should be appropriate, but follow your organization's policy on that.

> Where should all of those signed hash files be stored?

Following the discussion, you have some options:

1. You can sign each file individually and store the signature file with each file.
- This validates that each file has not been changed, and can store a copy of the signed file in the signature file itself. (This can be unwieldy and space consuming.)
- The validation process itself could be time consuming.
2. You can hash all the files (`sha256sum * > hash.sha256`), storing just the hashes in a text file, sign only that hash list file (`gpg --output hash.sha256.gpg --sign hash.sha256`), and store the signed file with the sources.
- This is how software repositories, e.g. Ubuntu, allow you to validate that you're getting unmodified versions of software.
- This validates that the hash list file has not been changed, which in turn validates that the sources have not been changed.
- Validation is fast, as you just need to create a new hash list and compare it against the old one.
3. You can have some sort of database or managed repository maintain it. (Git, for example, stores every object under its hash and supports signed commits and tags.)
- This keeps everything in a managed system with tracked changes.
- Validation is automatic.
4. Create an archive of the entire source and hash and/or sign the tar file.
- This is for archiving, not for tracking changes. It is more along the lines of verifying that file corruption has not happened.
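The hash-list approach (option 2) can be sketched in shell. This is a minimal sketch, assuming GNU coreutils and GnuPG are installed and a default signing key exists; the function names and the `hash.sha256` file name are illustrative. It uses a detached signature (`.sig`) instead of `--sign`, and keeps the manifest outside the source tree so the manifest does not end up hashing itself.

```shell
# Write a sorted list of SHA-256 hashes for every file under a source tree.
# $1 = source directory, $2 = manifest file (kept outside the source directory)
make_manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort -k 2 ) > "$2"
}

# Re-hash the tree and compare against the manifest. Fails if any listed file
# changed or is missing; it does not flag files added after the manifest was made.
check_manifest() {
    ( cd "$1" && sha256sum --check --quiet - ) < "$2"
}

# Sign only the manifest; this produces "$1.sig" next to it.
sign_manifest() {
    gpg --output "$1.sig" --detach-sig "$1"
}

# Check the signature on the manifest first, then the files against the manifest.
verify_tree() {
    gpg --verify "$2.sig" "$2" && check_manifest "$1" "$2"
}
```

Typical use would be `make_manifest src hash.sha256 && sign_manifest hash.sha256` at release time, and `verify_tree src hash.sha256` whenever the sources need to be validated.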

Depending on your organization's requirements, I think option 2 or 3 is probably what you're looking for.

Some research from the link below: contrary to my point in the discussion, `gpg --sign` by default does not produce just a signature or a hash of the file. Its output is a compressed copy of the source file with a signature block added. To get only a signature, create a detached signature (`gpg --output doc.sig --detach-sig doc`); validation then needs both the source file and the signature file.
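A detached signature leaves the source file untouched and stores only the signature next to it. Here is a short sketch, assuming GnuPG 2.x with a default signing key in the keyring; the helper names are hypothetical, and the `.sig` suffix simply follows the gnupg.org example.

```shell
# Create "<file>.sig" next to the file; the source file itself is not modified.
# --yes overwrites an existing signature when a file is re-signed.
sign_detached() {
    gpg --batch --yes --output "$1.sig" --detach-sig "$1"
}

# Succeeds only if "<file>.sig" is a good signature over the file's current contents.
verify_detached() {
    gpg --batch --verify "$1.sig" "$1"
}
```

Running `verify_detached` after any edit to the file fails until the file is signed again.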

When you do create a signed copy, the content is technically not encrypted, merely compressed, although a compressed file looks just as unreadable at first glance. gpg does use the `--decrypt` option to extract the source from it.

> Is there a standard for the naming of the hash files?

Hash list files are typically named with an extension matching the hash method used: .md5, .sha256... or sometimes .md5.txt.

The signature file takes the extension of the signing method, e.g. .gpg, so the signature file of sourcefile1.txt could be sourcefile1.txt.gpg (this follows Ubuntu's naming format).

gnupg.org uses .sig in an example, but there's no real requirement for it.

https://www.gnupg.org/gph/en/manual/x135.html

@vidarlo in the hashing? like `sha256sum * > hash.sha256` in #2? — AustereGrim, Mar 01 '23 at 22:38
