94

The Let's Encrypt documentation recommends that when a certificate’s corresponding private key is no longer safe, you should revoke the certificate.

But should you do the same if there are no indications that the key is compromised, but you no longer need the certificate? Let's Encrypt certificates will automatically expire after 90 days. Is it enough to delete the certificate and its private key?

As background, this is my concrete scenario:

  • When we deploy new software, it will create new EC2 instances, which will eventually replace the existing instances (immutable server pattern).
  • At startup, new instances will acquire a new Let's Encrypt certificate.
  • Certificates (and their private keys) never leave the EC2 instance.

So, when old instances are terminated, the certificates assigned to those machines are destroyed with them. From that point on, we no longer have any access to the private keys.

Questions:

  • From my understanding, revoking might be a good practice. But strictly speaking, it will not increase the security of the system (of course, assuming that the private key was not compromised). Is that correct?
  • Will it help the Let's Encrypt operators to explicitly revoke unused certificates, or will it do more harm? (I'm not sure, but revoking could trigger extra processes, which might be unnecessary if there is no indication of the key being compromised.)
Mike Ounsworth
Philipp Claßen
  • 4
    How long do your EC2 instances typically live? I could see the answer varying depending on if it's always much less than 90 days, close to 90 days, or always above 90 days. – Captain Man Aug 04 '17 at 11:50
  • 1
    @CaptainMan Depends greatly. Could be that they are terminated after a few minutes if there is an issue with the release. Otherwise, typically several days or a few weeks. If there are new commits, I try to deploy them as soon as possible. So far, no instance has come near the 90-day limit, but I'm not sure it will always stay that way, as the new system has only been in place for about two months. If there are no changes to the code and there are no security vulnerabilities (e.g., nodejs recently had a security issue, which I patched by redeploying), it could happen that they exceed the 90 days. – Philipp Claßen Aug 04 '17 at 12:18

4 Answers

107

This is a subjective Cost vs Risk decision. We can't make it for you, but I can help you examine the factors involved.

Cost

To you: the effort of revoking the cert. If you have to do this manually, that's annoying, but if you can script it up in 10 mins and add it to your CloudFormation plays, then why not? As @Hildred points out, this also advertises that your server has been decommissioned, which could be considered a privacy / security issue depending on how much you care.
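As a rough idea of how small that script could be, here is a minimal sketch. It assumes the certificates were obtained with certbot and live in certbot's default `/etc/letsencrypt/live/<domain>/` layout; the question does not say which ACME client is used, so treat the client, the path, and the function names `build_revoke_command` and `revoke_on_shutdown` as illustrative assumptions. It would be called from whatever cleanup hook runs before the instance goes away.

```python
import subprocess
from pathlib import Path


def build_revoke_command(domain: str, live_dir: str = "/etc/letsencrypt/live") -> list[str]:
    """Build a certbot revoke command for one domain (assumes certbot's default layout)."""
    cert_path = Path(live_dir) / domain / "cert.pem"
    return [
        "certbot", "revoke",
        "--non-interactive",
        "--cert-path", str(cert_path),
        # The key is not known to be compromised; the server is simply going away.
        "--reason", "cessationofoperation",
    ]


def revoke_on_shutdown(domain: str) -> bool:
    """Run the revocation; return True on success, False if it failed or certbot is missing."""
    try:
        result = subprocess.run(
            build_revoke_command(domain), capture_output=True, text=True
        )
    except FileNotFoundError:
        # certbot is not installed on this machine
        return False
    return result.returncode == 0
```

A termination hook would then call `revoke_on_shutdown("example.com")` with the instance's real domain and log the result. A failed revocation should probably not block the shutdown, since the certificate expires on its own anyway.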

To Let's Encrypt: They need to handle the revocation request, which is not a particularly heavy request. Each revoked cert adds a line to their CRLs, slightly higher bandwidth costs to transmit the CRLs, and a slight performance penalty to their OCSP responders that need to search the CRLs. But it's certainly not a burden since the system is literally designed for this.

Risk

If an attacker finds out that you terminate your VMs without revoking the cert, can they use that to their advantage? A rogue admin (either yours or Amazon's) could pull the cert and key from the VM as it is being terminated and you'd be none the wiser. Is that likely, or any bigger a threat than pulling it from a live system? Probably not.
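To put a number on that risk window: an unrevoked certificate stays usable from the moment the instance is terminated until the certificate expires. A small sketch of that arithmetic, assuming the 90-day lifetime mentioned in the question (the function name `remaining_validity` and the dates are made up for illustration):

```python
from datetime import datetime, timedelta

# Lifetime of a Let's Encrypt certificate, as stated in the question.
CERT_LIFETIME = timedelta(days=90)


def remaining_validity(issued_at: datetime, terminated_at: datetime,
                       lifetime: timedelta = CERT_LIFETIME) -> timedelta:
    """Time an unrevoked cert remains valid after its instance is terminated."""
    remaining = issued_at + lifetime - terminated_at
    # An already-expired cert has no window left, never a negative one.
    return max(remaining, timedelta(0))


# An instance that lived two weeks leaves a 76-day window.
window = remaining_validity(datetime(2017, 8, 1), datetime(2017, 8, 15))
print(window.days)  # 76
```

For instances that are replaced after minutes or days, nearly the full 90 days remain; for an instance that outlives its certificate, revocation changes nothing.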


So really, we're dealing with a very small cost vs a very small risk. Your choice. Thanks for asking the question though, neat to think about!

Mike Ounsworth
  • 1
    Let's Encrypt doesn't have CRLs, only OCSP. – Tom Aug 03 '17 at 18:52
  • 3
    @tom Ah, I didn't know that LetsEncrypt does not offer CRLs to the public. Thanks. But even so, I'd bet they still use them internally. You could design a CA so that it has one central OCSP responder that queries the CA database directly, but for security and performance reasons, it's far better to have OCSP responders in each geographic region that work off a local cache of revocation data. However you implement that, it's basically a CRL. – Mike Ounsworth Aug 03 '17 at 19:17
  • Nice comparison, thank you. After thinking about it, I believe in my specific use case it makes sense to revoke them. I already have a hook that does some cleanup before the instance is destroyed, so it should not be hard to automate it. That a revocation request signals that the server has been decommissioned is a good point that I did not consider. In my specific case, it does not apply, though, as the list of our active instances is public, anyway. If an attacker monitors this list, it should be pretty easy for him to find out when servers are started or go away. – Philipp Claßen Aug 03 '17 at 22:37
  • Won't the OCSP server have less load, as the client uses the downloaded CRL first? – allo Aug 04 '17 at 09:23
  • It's worth pointing out that a rogue admin is able to create certificates for your domain without your knowledge – so in this case you are already lost. @allo In theory you're right, but the few OCSP requests cause much less load than refreshing your CRLs every two weeks... and with OCSP stapling the overhead does not matter. – K. Biermann Aug 04 '17 at 16:02
  • 3
    @K.Biermann Depends on your definition of "without your knowledge". Most publicly trusted CAs now log all certs they issue to Certificate Transparency logs. You can search for certs issued to a specific domain name at [Google's Transparency Report page](https://www.google.com/transparencyreport/https/ct/). Good sysadmins will have notifications set up for certs being issued to their domains ... but in practice I'm sure very few sysadmins do this. Some public CAs even offer CT alerts as a service. – Mike Ounsworth Aug 04 '17 at 16:24
  • @MikeOunsworth Yeah; you’re right - I forgot about this... – K. Biermann Aug 04 '17 at 20:08
26

Revocation is not necessary, from a security point of view, if the private key is not compromised.

Unnecessary revocation will add a little load to the Let's Encrypt infrastructure but not much: https://community.letsencrypt.org/t/does-revocation-cause-additional-load/25203

Tom
  • 7
    How often can one be sure that there's no possible way the private key could have been compromised? If one signs a revocation before destroying the key and retains a copy of that, one will avoid any risk of future key compromise, but retain the ability to respond if one discovers that the key had been compromised before all known copies were destroyed. – supercat Aug 04 '17 at 23:38
  • Your link goes to a throwaway remark. I'm quite sure that the cost of revocations of compromised keys is low. If people start revoking uncompromised keys by the thousands, that might be very different. – gnasher729 Aug 07 '17 at 06:08
  • @supercat: For the certificates that you use, you are not sure there's no possible way the private key could be compromised. You'd better revoke them as well. – gnasher729 Aug 07 '17 at 06:09
  • @gnasher729: There is a difference between signing a revocation and revoking a key. If the damage caused by malicious revocation would be limited, it may make sense to sign revocations for keys one does use and store such revocations off-site. A thief who steals the off-site revocations could use them to revoke keys that hadn't actually been compromised, but that would be the limit of his power (as opposed to what a thief could do if the keys themselves had been kept off-site). If a thief steals the original keys but revocations are stored off site, the keys could be revoked. If... – supercat Jun 08 '18 at 18:51
  • ...the thief steals the originals and they've never been used to sign revocations, then there may be no way to revoke the thief's keys. What's the downside to archiving signed revocations? – supercat Jun 08 '18 at 18:53
18

One possibility you overlooked is to generate a revocation but not publish until needed. It does put a slight load on your infrastructure but hides the takedown of the machine, and has a revocation available if needed.

hildred
4

This is a very subjective question.

There's no harm in revoking the certificate. Whether you revoke explicitly or simply let it expire in due course is really up to your risk analysis. If you don't revoke, there is of course more risk that a leaked key could still be used until the certificate expires, but if you consider that risk acceptable compared to the maintenance hassle of explicitly revoking every time, then you can accept the risk and skip the revocation.

Whether this risk is acceptable or not is something you must evaluate for yourself based on your own infrastructure. You probably want to enumerate the people who have access to the key, whether as employees of your company or of the data center, or through other risks like theft, and evaluate how likely it is that any of them could accidentally or deliberately leak the keys. You'll also want to consider the services running on the system and evaluate whether they could be abused to leak the keys. You'll also have to evaluate what the keys are used for, and how much damage their abuse would do to the company. Based on these and other considerations, you can then make an informed decision about whether to accept these risks.

Lie Ryan