44

SHA is the hashing mechanism. However, RSA is the encryption algorithm.

So does RSA algorithm use SHA hashing mechanism to generate hashing keys which in turn is used to encrypt the message??

Moreover, RSA itself gives 2 keys. One can be kept public and one private. Now, these keys can be used to encrypt as well as decrypt. Ref : RSA. Then what is the use of SHA in RSA?

In a certificate given by any site that gives HTTPS security, there is an SHA as well as a MD5 key present. How are these produced and used in the eccryption or decryption of data transferred to the browser?

Tom Leek
  • 170,038
  • 29
  • 342
  • 480
whitehat
  • 553
  • 1
  • 5
  • 7

5 Answers5

66

RSA is actually two algorithms, one for asymmetric encryption, and one for digital signatures (the signature algorithm is traditionally -- but incorrectly -- described as "encryption with the private key" and this is an endless source of confusion).

Asymmetric encryption uses keys. Keys are parameters to the algorithm; the algorithm itself is the same for everybody (in software terms, it is the executable file) while keys vary between users. In a key pair, the public key is the key which is used to encrypt data (convert a piece of data, i.e. a sequence of bytes, into another sequence of bytes which is unfathomable for everybody) while the private key is the key which allows one to decrypt data (i.e. reverse the encryption).

Whereas in symmetric encryption, the encryption and decryption keys are identical, but with asymmetric encryption, the encryption and decryption keys are distinct from each other (hence the name); they are mathematically linked together, but it should be unfeasible (i.e. too hard to do with a mere bunch of computers) to recover the decryption key from the encryption key. This is why the encryption key can be made public while the decryption key is kept private: revealing the public key does not reveal the private key.

What asymmetric encryption achieves is no trivial feat. The possibility to reveal the public key while not saying too much about the private key, but such that both keys work together (what is encrypted with the public key can be decrypted by the corresponding private key, but none other), requires a lot of mathematics! RSA is full of mathematics. This contrasts with symmetric encryption algorithms which are "just" ways to make a big mess of data by mixing bits together.

Asymmetric encryption is the natural tool to use when we want to allow for confidential transmissions between any two users among a big population. If you have 1000 users, and you want any of the two users to be able to exchange data with each other without allowing anybody to spy on them (including the 998 other users), then the classical solution would be to distribute keys for symmetric encryption to every pair of users. Alice and Bob would have a known, common key; Alice and Charlie would also have a shared key (not the same); and so would Bob and Charlie; and so on. Each user would need to remember his "shared key" with every other one of the 999 other users, and you would have 499500 keys in total. Adding a 1001th user would involve creating 1000 additional symmetric keys, and give one to each of the 1000 existing users. The whole key distribution soon turns into an unusable/infeasible nightmare. With asymmetric encryption though, things are much more straightforward in terms of key distribution: every user just has to remember his/her own private key; and the public keys(being public) can be distributed through some sort of broadcasting (e.g. a directory).

RSA has some operational constraints. With the most used variant (the one known as PKCS#1 v1.5), if the size of the RSA key is "1024 bits" (meaning that the central mathematical component of the key pair is a 1024-bit integer), then RSA can encrypt a message of up to 117 bytes in length, and yield an encrypted message of length 128 bytes. That limited size, and the size increase when encrypting, are unavoidable consequences of the mathematical structure of the RSA encryption process. Due to these constraints, we do not usually encrypt data directly with RSA; instead, we select a small sequence of random bytes, which we call session key. We encrypt the session key with RSA; and then we use the session key with a symmetric encryption algorithm to process the whole message. This is called hybrid encryption.


SHA is the common name for a family of cryptographic hash functions. The very first member of that family was described under the name 'SHA' but was soon deprecated after a serious weakness was found in it; a fixed version was published under the name SHA-1 (the weak version is colloquially known as SHA-0). Four new SHA-like functions were added to the family later on (SHA-224, SHA-256, SHA-384 and SHA-512: which are collectively known as 'SHA-2').

Hash functions have no key. A hash function is an executable algorithm which is pure code. There is one SHA-1 and everybody uses the same.

Hash functions "just" make a big mess of the input data, which is not meant to be unraveled. Actually, it is meant to be resilient to unraveling. Even though everybody knows all that is to be known about a hash function (there is no key, only code, and nothing of it is secret), it still turns out to be "too hard" to recompute a matching input message, given the hash function output. It is even unfeasible to find two distinct input messages which, when given to the hash function, yield the same output; there must exist such pairs of messages -- called collisions -- because a hash function output has a fixed small size, while accepted inputs can be widely larger, so there are more possible inputs than possible outputs. It is a mathematical certainty that collisions exist for every hash function, but actually finding one is another matter.

A hash function, as itself, does not do anything of immediate high value, but it is a very important building block for other algorithms. For instance, they are used with digital signatures. A digital signature "proves" conscious action of a designated signer over a piece of data; like asymmetric encryption, this involves key pairs and mathematics, and associated constraints on the signed data. A hash function h is such that signing h(m) is as good as signing m itself: since it is unfeasible to find two distinct messages which hash to the same value, approval of the hash output is good enough. The point being that the output of the hash function is small enough to be usable with the mathematics hidden in the signature algorithm, even if the message itself is big (SHA-1 can process gigabytes of data, and yield a 20-byte output).

It can be noted that some recent variants of RSA-the-encryption-algorithm (with the 'OAEP padding' from PKCS#1 v2.0) internally uses hash functions. Hash functions are good "randomizers" (the output of a hash function does not exhibit recognizable structure) and this makes them appropriate for building more elaborate cryptographic algorithms with good security features.


In SSL/TLS (HTTPS is just HTTP-within-a-SSL/TLS-tunnel), hash functions are used for several things:

  • as part of asymmetric encryption and/or digital signatures;
  • as part of HMAC to allow client and server to verify that exchanged data has not been altered in transit;
  • as a building brick for a Key Derivation Function, which "expands" a given session key into several symmetric keys used for symmetric encryption and integrity checks in both directions of the tunnel.

The KDF relies on the "randomizing" and non-invertibility of the hash function. In SSL/TLS up to TLS 1.1, the KDF is built over two hash functions, MD5 and SHA-1, in an attempt to make it robust even if weaknesses were later found in either MD5 or SHA-1. It turns out that weaknesses were found in both, but it did not allow any break on the KDF as used in SSL/TLS. Nevertheless, TLS 1.2 switched to another KDF which uses a single, configurable hash function, usually SHA-256, for which no weakness is currently known.

Thomas Pornin
  • 322,884
  • 58
  • 787
  • 955
  • 1
    No, the signing is decryption of plain text data with the private key, which kinda makes no sense either until one thinks of encryption and decryption as inverse operations we can basically do backwards. – ewanm89 Nov 29 '11 at 13:24
  • " the size of the RSA key is '1024 bits' " - you use the term 'size of the RSA key'. RSA is the algorithm and doesn't have a size. The word 'size' here describes the 'difficulty' of the algorithm in terms of something from the internals of the algorithm? – croraf Oct 23 '17 at 17:08
  • Thank you for this answer. Could you kindly clarify or point to literature about the point made in the first section. About how its common misunderstanding that signing with private key is digital signature. Would love to learn more but everywhere, even in cyber security courses, they say digital signature is the hash result of signing with a private key. Cheers. – KasparTr Mar 02 '21 at 19:19
4

SHA is not used in RSA.

However, cryptographic protocols like SSL, SSH and others, use different algorithms like SHA and RSA for different purposes. SSL uses RSA (encryption) or DH (with RSA, DSA or ECDSA signature) for key negotiation and AES or 3DES for data encryption. In the PGP protocol/file format, RSA, DSA and ElGamal are used for signing and encrypting.

Paŭlo Ebermann
  • 2,477
  • 20
  • 20
chris
  • 3,000
  • 14
  • 22
  • 1
    SHA is used in the signing and integrity checking in SSL. – ewanm89 Nov 29 '11 at 12:55
  • It can be used, you choose a cipher suite (consisting of authentication algorithm, hashing algorithm and encryption algorithm) prior to key exchange. You could also choose MD5 if the server supports it. – chris Nov 29 '11 at 13:03
  • well, no CA in their right name would use that on their certs now though, right? ;) Yes, I'm just pointing out you kind of missed half the question all about hashing! – ewanm89 Nov 29 '11 at 13:14
  • you also missed how PGP uses RSA or ElGamal and DSA doe it's asymmetric stuff. – ewanm89 Nov 29 '11 at 13:19
  • 3
    I didn't miss anything. The question was how is SHA used in RSA. THe answer is, it isn't. I'm not giving a conclusive list of all the possible uses of cryptograhpic algorithms. Also, CA's using MD5 in their certificates is not the same as servers and clients using MD5 in their SSL session negotiation. You can use MD5 in the latter with a SHA-1 certificate. – chris Nov 29 '11 at 13:40
  • of course, and visa-versa, the question also asked about sha specifically! – ewanm89 Nov 29 '11 at 15:00
  • You're right, I missed the last part about SHA vs MD5 in SSL. Still don't see what PGP has to do with that though. Fortunately, Thomas Pornin gave an excellent answer already. – chris Nov 29 '11 at 15:09
  • 1
    PGP != RSA, DSA or El Gamal – ewanm89 Nov 29 '11 at 15:29
  • more PGP is a system for public key crypto on stored messages like S/MIME – ewanm89 Nov 29 '11 at 15:29
  • It's such things that end up with questions like this in the first place. – ewanm89 Nov 29 '11 at 15:30
  • read again: "In PGP, RSA, DSA and ElGamal are used for signing and encrypting.". So I'm saying that RSA, DSA and ElGamal are used in PGP. How is that incorrect? – chris Nov 29 '11 at 15:37
  • Okay, I'll let you have it, it's easily miss read at least, vague where the list starts and finishes. – ewanm89 Nov 29 '11 at 15:44
  • I call B.S. on this answer. RSA CAN and DOES use SHA1: https://www.w3.org/PICS/DSig/RSA-SHA1_1_0.html When you use an encryption algorithm, such as RSA, you have to specify the hashing function for generating the keys. RSA only allows SHA. – vapcguy Apr 11 '17 at 00:49
1

As you noted, they are two different things. Hash (SHA) is to ensure data integrity and encryption (RSA) is for data confidentiality. They are used in conjunction to make sure the data is not being tempered with and only the right party is able to read it.

HTTPS is a form a PKI, which provides integrity, confidentiality, authentication. SHA or MD5 are just two different hashing algorithms used to ensure integrity of https connections.

  • 2
    SHA in itself is not used to ensure data integrity, it is only used to construct a unique value from data. Also, HTTPS is a variant of HTTP that uses SSL, and it is not a PKI. – chris Nov 29 '11 at 08:31
  • Having an unique value, wouldn't you say that give you integrity of the data? http://www.garykessler.net/library/crypto.html#hash – breakingigloo Nov 29 '11 at 08:51
  • https makes use of digital certificate, private and public keys, therefore I think it's a form of PKI. – breakingigloo Nov 29 '11 at 09:03
  • 2
    Having a unique value does not give integrity. Having a protocol that creates and verifies such data is what gives integrity. There's a distinction between algorithms and protocols that use them. About PKI, the SSL PKI is much broader applied than just in HTTPS. HTTPS is not PKI, HTTPS uses SSL which uses a PKI (that is also used by S/MIME and other applications). – chris Nov 29 '11 at 10:28
  • 1
    It may seem like just semantics, but imho the distinctions are important in reasoning about security :-) – chris Nov 29 '11 at 10:29
  • @chris: software signing 'verifies' data integrity. And signed data is a hash but has an associated with it. I'd say SHA ensure data integrity when the author identification is not required (as in you dont care who i am. Just if the package i am offering is intact). –  Sep 02 '12 at 21:54
0

Of course it's possible that SHA is used to construct a PRNG (pseudo random number generator) to generate the RSA keys. But it's possible to use other crypto primitives (e.g. AES) also. For this PRNG there are special specifications (NIS SP 800-90, ANSI X9.31). The "pure" RSA don't care how the key is generated, but to applying RSA in real world application we must be sure that the generated key must generated by a true (or appears to be generated by) random process. If not then for an attacker it's easy to guess what the keys are (If the keays aren't random then the attacker can calculate the too).

user536
  • 51
  • 1
  • 3
  • Actually, there are some things in there about them being large primes, we use large likely primes, but it's not just a random number. – ewanm89 Nov 29 '11 at 13:21
0

RSA is a asymmetric cryptography algorithm, these are used to send information to a specific party without even yourself being able to decrypt it again. But still can be decrypted by the other person. These algorithms are very slow, and so only want to be used on small amounts of data.

SHA is a suite of cryptographic hashing algorithms, this is a smaller representation of a large block of data, for cryptographic purposes we want this to meet the requirement that it's infeasible to find a set of data such that it it matches a given hash (in other words you can't go from hash to data that produces that hash (whether the same data or not).

Finally there are algorithms like AES these are used in symmetric cryptography where both sides have the same key and can both encrypt and decrypt with the same key.

Basically we put these different things together in different ways: So we'll use RSA to send and agree on a set of symmetric (AES) keys (just for the one session), this way no one else can decrypt our data but the other person as they couldn't get hold of the symmetric keys and we are not doing the slow asymmetric cryptography on all the data but just a relatively small symmetric encryption key.

Hashing is usually used in such systems in the following way, we want to check the data did come from the other person, so he decrypts the plain text with his private key, if we encrypt this cipher text we should get the data back out (basically think of encryption as the inverse of decryption and this suddenly makes sense!), however as I said earlier, asymmetric cryptography is slow. So, why not hash the data and sign the hash this way we can verify the integrity of the data without someone being able to make data that seems to match.

ewanm89
  • 2,043
  • 12
  • 15
  • 1
    "sign with private key" != "decrypt with private key". Signing is a fundamentally operation than decrypting/encrypting, despite the superficial resemblance between RSA signing and RSA decryption. – D.W. Nov 29 '11 at 21:56
  • No, it isn't, it's not the encrypt step either, which is why it confuses people both ways! – ewanm89 Dec 05 '11 at 14:51