32

I have used ssh-keygen for creating an RSA 4096-bit SSH private and public key pair. I used a passphrase for the private key.

If an attacker, Eve, knows the passphrase in addition to the public key:

  1. Can they derive the private key? - I presume yes with enough time.
  2. If they can derive the public key, what algorithms can they use to do this? - I don't know.
  3. What is the number (or order) of operations needed for each algorithm to derive the private key?

Update - it seems that with using "yafu" on one computer (http://iamnirosh.blogspot.co.uk/2015/02/factoring-rsa-keys.html) that the brute force cracking process / factoring takes significantly less time.

  • Would be interesting to see how much difference yafu makes on a distributed environment and on supercomputers.
unseen_rider
  • 423
  • 4
  • 10
  • Although a user's public and private keys are mathematically related, knowledge of a public key does not make it possible to calculate the corresponding private key. If the attacker would know the passphrase that protects the private key I guess he/she would not be that far from "obtaining" the private key from the user ;) – cyzczy Mar 13 '17 at 21:39
  • I realise that, however what about if the passphrase is also known? - This is central to my question. – unseen_rider Mar 13 '17 at 21:40
  • 1
    @unseen_rider: Given your above comment, the simple answer is that the passphrase provides no value in recovering the private key from the pubic key. It is entirely unrelated to the public and private keys' mathematical relationship to each other. Michael Kjörling's response below is very comprehensive. – adfaklsdjf Mar 14 '17 at 17:42
  • There is work in this area, however. https://github.com/gdisneyleugers/NED-RSA. – h4ckNinja Mar 15 '17 at 02:41
  • 6
    @unseen_rider The passphrase has nothing to do with the keys. Basically when you provide a passphrase what happens is that instead of storing the private key in the clear, it encrypts it using your passphrase. This means that in order to access the private key the attacker cannot simply steal your file, but it has to know the passphrase too. But the passphrase has literally no part in the production of the private & public key pair. – Bakuriu Mar 15 '17 at 07:42

5 Answers5

50

The private key is unrelated to the passphrase. So is the public key. The public key is also generally stored unencrypted, even when the private key is protected by a passphrase. (Exceptions may exist where the public key is stored in an encrypted form, but in the basic case and assuming a sufficiently large key, doing so provides no additional security because the public key is meant to be publicly distributed or at least can be assumed to be available to an adversary.)

Now, don't get me wrong. Protecting your private key with a passphrase is a very good idea, and it is best practice to do so. The private key and the public key have a mathematical relationship to each other. But your private key is still just a number. This number is unrelated to the passphrase you use to encrypt the file that holds this number, which is all the passphrase does.

If an attacker Eve knows the passphrase in addition to the Public Key; Can they derive the Private key? - I presume yes with enough time.

The public key is mathematically related to the private key, and the simple answer is that yes, with sufficient time, in theory, one can be derived from the other. The public key almost by definition needs to be transmitted in the clear in order to establish an identity before the encrypted communications channel is set up, so Eve (by convention Eve is a passive eavesdropper; Mallory is an active man in the middle) can see it.

Deriving the public key from the private key in RSA is easy. Deriving the private key from the public key is supposed to be computationally infeasible, and with sufficiently large keys effectively impossible. So far as we know, there is no fast way to derive the private RSA key data from a 4096-bit public RSA modulus on a classical computer (which is what we currently have). Even an efficient attack on a 2048-bit public RSA modulus would be quite a feat, though 1024-bit RSA is within the realm of possible in reasonable amounts of time and a 768-bit RSA private key has been factored in public (it took the equivalent of on the order of 2,000 CPU-years on reasonably modern hardware).

So, "with enough time", it is possible. But that is unrelated to the passphrase, or lack of passphrase.

To understand the why of the above, I will use textbook RSA for simplicity. (Real-world RSA is similar to, but not exactly the same as, textbook RSA, as there are many real-world attacks that textbook RSA doesn't deal with.) Here, among some other numbers, the private key is the pair of prime numbers commonly referred to as p and q, and the public key is their product, n=pq, where n is referred to as the public modulus. Given p and q, calculating n is borderline trivial (you just need to multiply two -- very large -- numbers together), but given only n, determining the corresponding values for p and q is extremely difficult. We have no proof that this is so, though; only the absence of any publicly known easy way to actually do it, even with lots of very smart people having worked at the problem for a long time. It's like figuring out that 15 = 3 * 5, except that instead of numbers of a few digits, the numbers involved are many hundreds of digits. To a first order approximation, with your 4096-bit RSA key, n is about 1300 decimal digits long, and p and q are a little more than 600 decimal digits long each. We can see this because log10(24096) ≈ 1233, and 10615 * 10615 = 101230.

If they can derive the public key, what algorithms can they use to do this? - I don't know.

Semiprime number factorization algorithms. The current state of the art for numbers larger than 10100 (100 decimal digits, about 332 bits) is the General Number Field Sieve (GNFS) algorithm.

A quantum computer could use Shor's algorithm to efficiently factor a semiprime, but (at least in public) we don't yet know how to build a large enough quantum computer that actually works. That's part of what the push toward so-called post-quantum cryptography is about. Last I heard, state of the art was having factored the number 15, but that was a few years ago; it is possible that publicly known quantum computers are now able to factor numbers larger than 15.

What is the number (or order) of operations needed for each algorithm to derive the Private key?

The complexity of factoring a number using the GNFS is discussed on its Wikipedia page. It's really difficult to discuss the number of operations directly, but after some discussion on prerequisites, the Wikipedia article summarizes by stating that the running time of the number field sieve is super-polynomial but sub-exponential in the size of the input. You can compare the various factoring times and algorithms used for the various RSA numbers, which were part of a competition started in the early 1990s to expand the knowledge relating to integer factorization, now inactive.

Compare also How is it possible that people observing an HTTPS connection being established wouldn't know how to decrypt it? Note that HTTPS works differently from SSH, but the basic operating principle remains similar.

user
  • 7,700
  • 2
  • 30
  • 54
  • 8
    "The public key is also generally stored unencrypted." No, the public key is *always* stored unencrypted since it would make no sense to encrypt it. – dr_ Mar 14 '17 at 12:05
  • «Deriving the public key from the private key in RSA is easy.» Is it? Aren't the public and private key essentially interchangeable (except for the part that one of them is kept secret, of course)? Wouldn't that mean that deriving the public from the private should be *exactly* as hard as deriving the private from the public? – Darkhogg Mar 14 '17 at 13:11
  • 2
    @dr01 There are no significant downsides *per se* to storing the local copy of the public key encrypted. It would require some workflows to be somewhat different, but that's about it. I agree that it provides no significant *benefits* either, the way that encrypting the private key for storage provides a benefit over not doing so. And of course, that's only about the SSH client and toolset; on my system, SSH public keys are stored encrypted, and SSH private keys doubly encrypted, because of full-disk encryption. The encryptions are there to solve different problems, though. – user Mar 14 '17 at 13:12
  • 3
    @Darkhogg Among a handful of other values, textbook RSA defines the private key as the two prime numbers (p,q), and the public key as their product n=pq (this is the modulus). Given p and q, calculating n is trivial (after all, it's just a multiplication); given n, calculating the corresponding p and q is *not* trivial for large values of p and q. (It's like figuring out that 15=3x5, except with numbers that are many hundreds or even thousands of digits long.) Real-world RSA also stores some other related numbers particularly associated with the private key to speed up various calculations. – user Mar 14 '17 at 13:15
  • 2
    @MichaelKjörling I see. I was under the impression that both keys were indistinguishable and didn't actually become "public" and "private" until they were chosen as such. My bad. – Darkhogg Mar 14 '17 at 13:18
  • 5
    @Darkhogg It's a simplification (and a dangerous one, at that) to think of RSA private and public keys as interchangable, where encryption is "encrypt with public key, decrypt with private key" and signing is "encrypt with private key, decrypt with public key". This is true in a strictly technical sense in textbook RSA, but it opens a truckload of worms to treat them that way, so no sane implementation actually does it like that. – user Mar 14 '17 at 13:20
  • 1
    As an aside - this is why dragnet and wholesale capture of encrypted traffic is such a worrying thing. 1024-bit encryption was fine. In 20 years perhaps all the traffic we've been sending around encrypted with 4096-bit encryption is reasonably cracked, so all of those cat videos you've been tunneling over your SSH connection can now be decrypted by the three letter agencies who can now blackmail you. Good luck! – Wayne Werner Mar 14 '17 at 13:22
  • @WayneWerner Good thing then that I'm more of a dog person myself! My sharing and tunnelling of cat videos is, to put it mildly, limited, which obviously reduces by a very large degree the risk of such blackmail scenarios as that you describe. – user Mar 14 '17 at 14:08
  • @waynewerder there have been many vulnerabilities in comms protocols and software implementations in the past, so it should not surprise anyone that there are probably numerous unknown vulnerabilities present, known by three letter agencies that can be exploited at will currently. – unseen_rider Mar 14 '17 at 21:42
  • @MichaelKjörling Re dog person - so you have nothing to hide, and therefore nothing to fear! :-) – SusanW Mar 15 '17 at 12:52
  • `The public key is mathematically related to the private key, and the simple answer is that yes, with sufficient time, in theory, one can be derived from the other` Given modern implementation with `e` set to 65537 I find this statement questionable. Public key consists of e and n. You know n because it's part of the private key and you know e because it's preset by the implementation. There is no "derivation" or "time" involved. – Andrew Savinykh Mar 15 '17 at 20:59
  • @AndrewSavinykh I specifically did not go into all of the math, nor touch on the encryption and decryption exponents besides lumping them into "some other numbers", but while the modulus (value of n) may be stored as part of the private key for optimization purposes, the modulus is public. To compute the corresponding private key, you would need the prime factors p and q of n, which are not part of the public key. Remember that [RSA key generation starts out by picking p and q](https://en.wikipedia.org/wiki/RSA_(cryptosystem)#Key_generation), then calculating the other values to match. – user Mar 16 '17 at 08:17
  • @MichaelKjörling, the way I see (and the wikipedia article on RSA seems to agree with me) `p` and `q` are just "temporary variables" in the key generation process. They do not form part of the they. The modulus on the other hand is a part of both private key and the public key. You do not *need* `p` and `q` for decryption, but you do need the modulus. (I know that `p` and `q` derived values can be used to speed up decryption using the chinese reminder theorem, but this is completely beside the point) – Andrew Savinykh Mar 16 '17 at 08:22
  • 1
    I'm guessing we can agree that what you wrote is a simplification, and as all simplifications do, it's not entirely accurate, but good enough in its context. – Andrew Savinykh Mar 16 '17 at 08:26
23

The passphrase guards against accessing the private key

The passphrase is meant to guard the private key in the event of physical access. If the hacker can sign on to your server and can access your certificate store, he could use the passphrase to obtain a copy of the certificate that would include the private key. Hopefully hackers don't routinely have access to your certificate store, but sometimes a hacker is just a disgruntled employee.

The public key algorithm guards against deriving the private key

If the hacker does not have access to your certificate store, and only has the passphrase and the public key, it is not possible to derive the private key. This due to the clever way the RSA algorithm is set up. The relationship from the private key to the public key is a trap door function, meaning it is easy to compute it one direction but not in the other.

Here is an example. If I told you that X * Y = 3014569, and X and Y are integers, could you tell me X or Y? It would be very difficult (because of the way I chose X and Y). On the other hand, if I told you X = 1237 and Y = 2437, it would be very easy for you to tell me that X * Y = 3014569 (you just need to multiply!) So it's easy one way, hard the other way. This is known as the prime factorization problem.

That being said, he could spend a few thousand years going through the trap door over and over to find your private key. Like this:

  1. Iterate through all possible private keys
  2. For each private key, derive the public key
  3. Check the computed public key against the targeted public key

This is called a brute force attack. It would take a long time.

There may be some shortcuts (e.g. use a quantum computer) and mathematicians are always coming up with new ways to get around things like this. But for now, a 4096-bit key is considered plenty long to last into the foreseeable future (see this link for details).

John Wu
  • 9,181
  • 1
  • 29
  • 39
  • 2
    This should be the accepted answer. – dr_ Mar 14 '17 at 12:08
  • To expand on this good answer, you could encrypt with passphrase any sensitive file: zip archive, private key, database, whatever. Nothing unique about private key in this regard, and once accessed, it's irrelevant to your use of it. – Paul Draper Mar 15 '17 at 19:41
5

It might be possible to derive your private key from the public key (it is only thought that no-one knows how to do it efficiently), but the passphrase won’t help at all.

You should understand that ssh-keygen first generates a par of keys that is completely unrelated to the passphrase, and that the passphrase then only serves to encrypt the private key on your disk. Hence, the passphrase is only useful for someone who knows your encrypted private key.

By the way, you may change your private key’s passphrase (with ssh-keygen -p) without changing the public key.

user2233709
  • 745
  • 6
  • 13
  • Ok, and what algorithms are there for working out what the private key is? – unseen_rider Mar 13 '17 at 21:55
  • Deriving the private key from the public key is a matter of finding the 2 factors of a huge (4096-bit in your case) number. I don’t know the best (public) algorithm for this, but Michael Kjörling named one in his (comprehensive) answer – user2233709 Mar 13 '17 at 22:04
  • @unseen_rider The basic algorithm is factorization. You did this in school when for example you did division - say for example the public key is the number 35 (or 0x23 in hex). So how can you divide that number with another number without remainders? Let's try `35/2` you get 17.5 - fail. Let's try again `35/3` - fail and try and try until you get `35/5` and you get 7 - you've found the private key! That's the basic algorithm. People have spent considerable time to come up with optimized ways of doing this – slebetman Mar 14 '17 at 04:33
  • @unseen_rider: Note that what I've said above is true for RSA which uses factorization. There are other public-key algorithms like elliptic curves that are not widely used because factorization is not broken yet (still remains hard). – slebetman Mar 14 '17 at 04:37
2

Can they derive the Private key? - I presume yes with enough time.

The passphrase is just used to encrypt the storage of the private key on the client system, so it is only useful if they have the encrypted private key file. So this boils down to simply whether the attacker can derive the private key from the public key.

The answer to that depends on the type and length of the key and what computational resources they have available.

For a RSA key the best known way of cracking it is to factor the modulus. Once the modulus is factored it is then a simple matter to re-derive the private key from the modulus factors.

The best known algorithm for factoring is the general field number sieve. Factorisation of 768 bit numbers has been publically demonstrated. Factorisation of 1024 bit numbers has not been publically demonstrated but is belived to be feasible for spy agencies and similarly resourced organisations. 2048 bit RSA should be safe unless there are major advances in factorisation techniques or an alternative way to attack RSA is found. 4096-bit is another step up in paranoia.

Note that we cannot absoloutely state that even a 4096 bit key will be secure. The possibility remains that there will be a breakthrough in factoring algorithms or that another route for attacking RSA is found.

For traditional DSA the best known route of attack involves solving the discrete log problem on a modulo-prime field. As it turns out the best known algorithms for this is also the general field number sieve but application of this algorithm to the discrete log problem is harder than applying it to the prime factorisation problem. Figure a 1024 bit DSA key is a bit harder to crack than a 1024 bit RSA key but probablly not totally unfeasable.

For elliptic curve DSA the best known route of attack involves solving the discrete log problem on an elliptic curve field. This is belived to be much harder than solving it on a modulo-prime field and so keys can be shorter for a given level of assumed-security.

Peter Green
  • 4,968
  • 1
  • 22
  • 26
0

Public key encryption relies on the belief that there is no way to derive the private key from the public key. The encryption is called asymmetric because there is no known algorithm to produce the private key from the public key. It doesn't mean that there is no such algorithm, it just isn't known.

Given that no 'reverse' algorithm exists, the process of cracking your private key requires time, computational power and guesses/assumptions. If you're concerned about the security of your RSA keys, use a minimum of 8192 bit key size. Assuming that your platform's key generation algorithm doesn't take shortcuts that compromise the security of the generated key. As long as there is no reverse algorithm increasing the key size from the defaults is the easiest way to protect yourself from suspicious people.

So I must agree with the others that there is no known way that the passphrase and public key can be used to crack your private key.