The most successful public key cryptosystem in use today is RSA – named for its inventors Rivest, Shamir, and Adleman. I first learned about RSA in grad school from one of my professors, Errol Lloyd, who was one of Ron Rivest’s students. Errol is without a doubt the best teacher I’ve ever had (and also a thoroughly nice guy). If you want to go to grad school to study algorithms, you frankly couldn’t do better than heading to Delaware to work with Errol. I have very fond memories of Errol’s class where we talked about this. He’s got a way of teaching where he doesn’t come out and tell you anything; what he does is ask questions that lead you through the process of figuring it out yourself. That’s an incredibly effective way to teach if you can carry it off. Personally, I can’t. And I’ve never known anyone except Errol who could do it for a topic like RSA encryption!
Anyway, moving on… In general, public key cryptosystems are based on problems that are easy to solve computationally in one direction, but really hard to solve computationally in the other. In the case of RSA, the basic underlying problem involves prime numbers: if you’ve got two really large prime numbers, then multiplying them together is easy; but if you’ve got a number that’s the product of two really large primes, factoring it is very hard.
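To get a feel for that asymmetry, here is a minimal sketch. The function name `factor_by_trial_division` and the use of trial division are assumptions made purely for illustration (real factoring algorithms are far more sophisticated), and the primes are toy-sized so that the slow direction still finishes instantly.

```python
def factor_by_trial_division(n):
    """Return (p, q) with p * q == n, trying every candidate divisor in turn."""
    candidate = 2
    while candidate * candidate <= n:
        if n % candidate == 0:
            return candidate, n // candidate
        candidate += 1
    raise ValueError("no nontrivial factorization found")

p, q = 29, 61
n = p * q                             # the easy direction: one multiplication
print(n)                              # 1769
print(factor_by_trial_division(n))    # (29, 61), found only by searching
```

The multiplication is a single step no matter what; the search grows with the size of the smaller prime, which is hopeless once the primes have hundreds of digits.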
One of the things that makes RSA difficult to explain is that it’s hard to find a starting point. The actual encryption and decryption processes are so simple that
they seem like they can’t possibly work; there’s just no way to make sense out
of why they work without understanding where the keys come from; but it’s strange to describe an encryption system by starting with how to generate keys, when you don’t yet know how they’re going to get used.
The reason for this tangling comes back to the fundamental goal of a public key
system: you’ve got to have two keys which share a complex mathematical relationship.
The encryption system itself is really just a simple expression of the relationship
between the keys. So the algorithm is (conceptually) very simple; the whole thing is
based on the nature of the keys, their relationship, and the way that they’re
generated. It’s not like symmetric cryptography, where you can simply choose a key; in
public key systems, the keys have to be generated to satisfy a set of mathematical
relationships.
With that in mind, we’ll start working through the way that RSA works by showing
how you can create a public/private key pair. The key generation process is actually
pretty simple, but most descriptions of it get tangled up because it uses a lot of
terminology from number theory. I’ll try to present the standard terms as clearly as I
can.
The basic structure underlying an RSA key pair is a pair of two large prime
numbers. What’s large? That depends on how hard you want it to be to crack. The
tradeoff is that the larger the original pair of primes are, the more complex it is to
compute ciphertext using a key. So you need to choose a key size which makes your keys
hard to crack, without making the cost of encryption and decryption unacceptably
high. Once you’ve decided on a key size, that dictates the size of your two prime numbers, and you’re ready to compute a key pair, as follows.
- Compute two large primes, which are typically named P and Q.
- Using P and Q, you compute a number N=P×Q, which is called the modulus of the keys being generated.
- Compute the totient of the modulus. For any integer, I, the totient of I (written φ(I)) is the number of integers smaller than I that are relatively prime to I. Because P and Q are prime, φ(P×Q)=(P-1)×(Q-1). (The totient of P is P-1, because there are P-1 numbers relatively prime to P; the totient of Q is Q-1 for the same reason; and since P and Q are (trivially) relatively prime to each other, the totient of P×Q is (P-1)×(Q-1).)
- Choose an integer, E, smaller than and relatively prime to φ(N). E is called the public key exponent.
- Compute an integer D such that D×E=1 mod φ(N). D is called the private key exponent.
The public key is the pair (N, E) of the modulus and the public key exponent; and the private key is the pair (N, D) of the modulus and the private key exponent. So you’ve got your key pair.
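The key generation steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production recipe: the function name `generate_keypair` is made up, the primes and E are handed in rather than generated, and the modular inverse relies on the three-argument `pow` with a negative exponent, which needs Python 3.8 or later.

```python
from math import gcd

def generate_keypair(p, q, e):
    """Build an RSA key pair from two distinct primes p, q and a chosen exponent e."""
    n = p * q                      # the modulus
    phi = (p - 1) * (q - 1)        # totient of n, valid because p and q are distinct primes
    if not (1 < e < phi) or gcd(e, phi) != 1:
        raise ValueError("e must be smaller than and relatively prime to phi(n)")
    d = pow(e, -1, phi)            # d such that d * e = 1 mod phi(n)
    return (n, e), (n, d)

public_key, private_key = generate_keypair(29, 61, 13)
print(public_key)    # (1769, 13)
print(private_key)   # (1769, 517)
```

Notice that computing D needs φ(N), and φ(N) is only easy to get if you know P and Q. That is the whole reason the factors have to stay secret.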
Encryption and decryption are amazingly simple.
Suppose that the ubiquitous Alice and Bob want to communicate. Alice gives Bob her
public key, (N, E). Now, when Alice wants to send a message to Bob, she encodes the
plaintext of the message as an integer, M. (I’ll leave the exact encoding of plaintext open for now.) To encrypt with her private key, (N, D),
she takes that integer and computes:
Ciphertext = M^D mod N
Then to decrypt the message, Bob uses Alice’s public key, (N, E), and computes:
M = Ciphertext^E mod N
For Bob to encrypt a message for Alice, he does exactly the same thing that he did to the ciphertext – except he does it to the encoded message, M. For Alice to decrypt that, she does exactly what she did to encrypt the original M, except that she uses the ciphertext she received from Bob instead of the encoded plaintext M. In other words, if you’ve got a ciphertext message encrypted by the private key, decrypting it is exactly the same process as encrypting a plaintext with the public key, and vice versa. (This point is what used to cause me lots of confusion remembering what was symmetric and what was asymmetric – RSA style asymmetric encryption is really very symmetric in how the algorithm works.)
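Here is a small sketch of that symmetry. The helper name `apply_key` is hypothetical, and the key is a toy one (N=1769, E=13, D=517) that is far too small for real use.

```python
N, E, D = 1769, 13, 517   # toy key pair: public key (N, E), private key (N, D)

def apply_key(value, exponent, modulus):
    """The single RSA operation: raise to the key's exponent, modulo N."""
    return pow(value, exponent, modulus)

m = 236

# Alice to Bob, as described above: transform with D, undo with E.
c1 = apply_key(m, D, N)
assert apply_key(c1, E, N) == m

# Bob to Alice: transform with E, undo with D.
c2 = apply_key(m, E, N)
assert apply_key(c2, D, N) == m

print(c2)  # 573
```

Both directions call the same function; only the exponent changes. Which direction gives you secrecy is a separate question: anything transformed with D can be undone by anyone holding the public key.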
How can this possibly work? On the face of it, it looks ridiculous! You encode by exponentiating once; you decode not by taking a root, but by exponentiating again!
It all comes back to the way the keys were generated. If we look at the process
in terms of modulo arithmetic, it’s pretty easy to see why it works:
- Take an original message, M. Encrypted, it’s C = M^D mod N.
- Now, take the ciphertext, C, and decrypt it: M’ = C^E mod N.
- Now, expand C: M’ = (M^D mod N)^E mod N.
- Now, we can combine the exponents: M’ = M^(D×E) mod N.
- D×E = 1 mod φ(N).
- By some trickiness, related to the fact that D and E are inverses of each other modulo the totient of N, we can show that the fact that D×E = 1 mod φ(N) means that M’ = M^(D×E) mod N = M^1 mod N = M.
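The "trickiness" can at least be checked numerically. The relation that matters is the one from key generation, D×E = 1 mod φ(N), combined with Euler's theorem: M^φ(N) = 1 mod N whenever M is relatively prime to N. The sketch below uses a toy key (N=1769, φ(N)=1680, E=13, D=517); it is a sanity check on one small example, not a proof.

```python
N, PHI, E, D = 1769, 1680, 13, 517

print((D * E) % PHI)   # 1: the exponents are inverses modulo the totient
print((D * E) % N)     # 1414: they are not inverses modulo N itself

# Write D*E as 1 + k*phi(N). Then for M relatively prime to N:
#   M^(D*E) = M^(1 + k*phi(N)) = M * (M^phi(N))^k = M * 1^k = M  (mod N)
k = (D * E - 1) // PHI
assert D * E == 1 + k * PHI
assert pow(236, PHI, N) == 1      # Euler's theorem for M = 236

# Exhaustive check over every possible message for this toy modulus.
assert all(pow(pow(m, D, N), E, N) == m for m in range(N))
```

The exhaustive loop also passes for the few messages that share a factor with N, where Euler's theorem does not apply directly; showing why takes a slightly longer argument than the one in the comment.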
Watch how we can walk through a ridiculously simplified example. Let’s start with
a pair of primes – 29 and 61.
- Generate keys:
  - First, we compute the modulus: 29×61=1769.
  - Then we compute the totient: 28×60=1680.
  - Then we choose an E which is relatively prime with 1680. To make things easy, I’ll just pick a prime number that doesn’t divide 1680: E=13.
  - Now, I need to compute a D, such that D×E=1 mod 1680; or to pull out a bit of standard terminology, D is the modular multiplicative inverse of E mod 1680. Doing that is an exercise in modulo arithmetic which gets beyond the scope of what I want to talk about today, so I’ll cheat: whipping out Mathematica, I get D=517.
  - So, the public key is (1769,13), and the private key is (1769,517).
- Now, let’s encode a message. Say the message is 236. So, encrypted by the public key, that’s 236^13 mod 1769 = 7044595136617487310722334457856 mod 1769 = 573.
- And now, let’s decode it. That’s 573^517 mod 1769 = 9245694881849602770197401240655288154608138663291308421093411147578949
4462244595981994549450775979702694154377539110666284415651476199963925
1321729713042853618725578035043275339491371655153997356248940804495270
8984801258907252216498797520780679511957335315751292762455167434140192
4399386435156204841871858091160737587648566008200581107378219553124074
9374669246678685272914050993442585237015640426625522901746517901417734
7954757584818934596889432709457570226928654243870833959974236930834811
3204280174093386319717080086558480154647619900966739378086145377566860
9195850562761848872414474511076491179637551417498920992360433000710853
3935087767676514264669956171228256961906386882369681633698604708335082
4532038380922358459546076540454839797340473363177061704334483341984626
0992808331110751927026090969022077767175228102822099312339565763871188
4006940217478961345132678704142880421227822352750998264777937728911493
6283819708613556665220171457755862562705567302041244568283475209225680
3828781673802145987568796646578059853165517716374603716155560308698631
2650416428650673915642280074852668462543146649056536263032788685526436
7863995916605455190338263161582118236712167656565774640875461698797822
0025671340803745351386743506255677806855116853206286798808454276083160
2812779603110688541701764925723493557773173917037390830611160570524069
7471206139919064156642428626093922418127017110711925314998235480530680
56675958655700624098981453 mod 1769 = 236.
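For anyone without Mathematica handy, the step that produced D can be reproduced with the extended Euclidean algorithm. The sketch below is one standard way to compute a modular inverse; the function names are my own, and this is not necessarily what Mathematica does internally.

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_x, x = x, old_x - quotient * x
        old_y, y = y, old_y - quotient * y
    return old_r, old_x, old_y

def modular_inverse(e, phi):
    """Return d with (d * e) % phi == 1, or raise if e and phi share a factor."""
    g, x, _ = extended_gcd(e, phi)
    if g != 1:
        raise ValueError("e has no inverse modulo phi")
    return x % phi

print(modular_inverse(13, 1680))   # 517
```

The inputs are E and the totient 1680, not the modulus 1769, so an attacker who only has the public key still has to get the totient from somewhere.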
So it works – but computing those results is not exactly trivial. It happens that there is a fast algorithm for doing those sorts of exponentiations – but “fast” is a relative term. Even with that algorithm, RSA encryption is much slower than most symmetric encryption.
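A minimal sketch of one such fast algorithm, usually called square-and-multiply, is below. The function name `modexp` is made up for this example; Python's built-in three-argument `pow(base, exponent, modulus)` gives the same result and is what you would use in practice.

```python
def modexp(base, exponent, modulus):
    """Compute (base ** exponent) % modulus by repeated squaring."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                       # lowest bit of the exponent is set
            result = (result * base) % modulus
        base = (base * base) % modulus         # square for the next bit
        exponent >>= 1
    return result

print(modexp(236, 13, 1769))    # 573
print(modexp(573, 517, 1769))   # 236
```

Because every intermediate value is reduced mod N, nothing ever grows beyond about N squared, and the loop runs once per bit of the exponent instead of once per unit of it.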
In real-world use, RSA is rarely used for full-message encryption. It’s used for signatures (where you encrypt a small summary of a message), and for protocol negotiations to allow the two sides to settle on a symmetric encryption key to
use for the rest of their communication.
You’ll notice that I still haven’t gotten back to how you take a message
and convert it to an integer. That’s not trivial. The problem is,
for certain message sizes, or messages with certain numeric properties,
RSA can be easily broken. So you need to protect against those cases. That’s
done by padding the message – adding bits to it in a way that removes
or obscures the properties of the message that would otherwise make it vulnerable.
Discussing the vulnerabilities of RSA for certain types of messages, and how
to build a padding system that defeats them is a topic for another post.
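As a taste of what can go wrong, here is one well-known case, chosen by me as an illustration (the post itself does not say which cases it has in mind). If the message is so small that M^E never reaches N, the "mod N" does nothing and the ciphertext is just an ordinary power of M. The key below is a hypothetical toy key, N=11×17=187 with E=3, picked only because it makes the effect visible.

```python
N, E = 187, 3        # hypothetical toy public key; 3 is relatively prime to phi = 160
m = 5                # a message so small that m**E stays below N
c = pow(m, E, N)     # 125: the reduction mod N changed nothing

def integer_root(value, degree):
    """Return the exact integer degree-th root of value, or None if there is none."""
    candidate = 0
    while candidate ** degree < value:
        candidate += 1
    return candidate if candidate ** degree == value else None

print(c)                    # 125
print(integer_root(c, E))   # 5, recovered without the private key
```

Padding the message up to roughly the size of N before exponentiating is what rules this particular case out.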
In future posts, I’ll also describe the two most common uses of RSA in more detail – both using RSA for signing a message to prove that you really wrote it; and using
RSA as part of a secure cryptographic protocol where most of the traffic is really
encrypted using a symmetric algorithm.
Ah! Brilliant, thank you for that post! I always wanted to understand how RSA works. Can I ask for posts here? (Or make a couple of suggestions anyhow). I’d love to see an explanation on the Off-The-Record Messaging protocol, in particular what makes it so different than others and how this was done… and also, drifting a bit from your original intention with this series of posts, I’d like to know how my privacy is assured by third parties… banks, web applications, mobile phone companies. What are they doing to protect my privacy, and should I trust them?
“He’s got a way of teaching where he doesn’t come out and tell you anything; what he does is ask questions that lead you through the process of figuring it out yourself.”
I think that’s colloquially known as the Socratic Method. The same style is used in book form in The Little Schemer (formerly The Little LISPer, I believe) to teach Scheme. On the darker side of things, someone who is very skilled at arguing can use this technique to great effect to trap you with your own reasoning.
It’s indeed a nice post, but I have a question: in the examples you give, encryption is done with the private key. Isn’t this the way digital signatures are done? (Actually on real systems it’s not, as they are typically hybrids that use both asymmetric and symmetric algorithms, to overcome the speed limitations you mentioned, but I digress).
My point is, by saying things like encrypt with her private key, without explaining that’s for digital signatures (instead of encryption proper), aren’t you creating a source of confusion? Even if you reserve the details of public key system for a later post, I think you should have at least mentioned that here. But as I said in the beginning, it’s a nice post nonetheless 🙂
Some nitpicking about academic credit…
RS and A were RE-inventors of the algorithm. It had first been discovered in the UK by James Ellis and Clifford Cocks of the GCHQ. They were soon joined by Malcolm Williamson, who went on and invented what became known as the Diffie-Hellman(-Merkle) key exchange. Unfortunately their role remained hidden, because they were not allowed to publish their discoveries at the time.
You can read the full story in the Code Book by Simon Singh.
Oscar:
Public key encryption is used for digital signatures, but that’s not the only use of it. And even when it is used for signature, the way that it’s used is to encrypt a section of the message which contains an authentication code for the rest of the message.
I’ll talk about this in a later post, but a quick outline of how secure digital signatures work is:
(1) Take a message, M, that you want to send. You don’t
care if other people can read M, but you want the
intended recipients to be absolutely sure that it was
you who sent it.
(2) Using a special kind of hash function, compute a value
D, called a digest of M, where D is a short code
that is produced solely from the content of the message.
(A good digest has a set of interesting properties, but
I’m not going to list them here.)
(3) Encrypt D, using your private key, and append
the result to the end of your message. The encrypted
digest is called a signature.
Now, anyone who sees the message M can compute D. They don’t
need any kind of key – the value of D is solely dependent
on the content of the message.
Anyone can verify that you sent the message, because
only you could have encrypted the digest in a way that could be decrypted with your private key.
The big advantage of this technique is that there’s no secret that needs to be shared with your intended recipients. Unlike a MAC in a symmetric encryption system,
the digest is not encrypted, and anyone can generate it. But only you could produce an encrypted copy of the digest.
So digital signature is really just a particular use-case of public key encryption that encrypts the message digest instead of the entire message.
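The three steps in that outline can be sketched as follows. Everything here is an assumption made for illustration: SHA-256 stands in for the "special kind of hash function", the digest is squeezed down modulo a toy N so that it fits the toy key (N=1769, E=13, D=517), and the function names are made up. The code calls the digest function `digest` to avoid clashing with the private exponent D.

```python
import hashlib

N, E, D = 1769, 13, 517   # toy key pair; far too small for real use

def digest(message: bytes) -> int:
    """Step (2): hash the message, then reduce the hash to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Step (3): transform the digest with the private exponent."""
    return pow(digest(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Recompute the digest and compare it with the signature undone by the public exponent."""
    return pow(signature, E, N) == digest(message)

message = b"I really wrote this"
signature = sign(message)
print(verify(message, signature))   # True
```

Verification needs only the message, the signature, and the public key (N, E), which is why no shared secret is required.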
You can see this in a nice, visible form if you use the FireGPG plugin for Firefox. Given a message, you can
encrypt it without signing it; sign it without encrypting it; or encrypt it and sign it. In all three cases, the encryption is RSA; in two of the three, RSA is used to encrypt the entire message.
Xzy:
I know that it’s called the socratic method; but in my experience, a lot of people don’t know what the socratic method was. I figured this post already had enough
jargon in it without needing to introduce the name for that teaching method, when I would still need to include the description.
And actually, I’ve never seen it used in a serious way to tangle people up. In various texts, particularly Plato, I’ve seen it used to tangle up extremely stupid interviewees. But I’ve never seen it used in real life to tangle up someone who wasn’t already being an idiot to begin with. That sort of rhetorical use of it is generally a dressed up form of proof-by-contradiction, and the questioning process is really just for show.
Nice article, just a small nitpick:
Perhaps you should clarify that you can only effectively compute D because you know the primes p and q. After all that is what makes the system secure.
If you could just plug E and M into Mathematica and find the modular inverse, anyone could do this, since (E,M) is the public key. But luckily the Extended Euclidean Algorithm requires knowledge of the prime factors of M to work.
Mark:
You’re right, but the use you make of the verb “encrypt” is somewhat loose. I mean, usually when one talks about “encrypting”, one means “to write in a form that cannot be easily read”, and you absolutely cannot do that using (“encrypting”) with the private key (because everyone with the corresponding public key will be able to “decrypt” that message). But after reading this paragraph:
Suppose that the ubiquitous Alice and Bob want to communicate. Alice gives Bob her public key, (N, E). Now, when Alice wants to send a message to Bob, she encodes the plaintext of the message […]. To encrypt with her private key, (N, D), she takes that integer and computes […]
This might seem to suggest that the private key should be used to do encryption (as in “only the intended recipient gets to decrypt”) which is far from being the case.
Final remark: in your reply, you said:
Anyone can verify that you sent the message, because
only you could have encrypted the digest in a way that could be decrypted with your private key.
You meant public key, right? 🙂
Lassi: GCHQ did not invent the RSA algorithm. They invented the algorithm also known as DH. While both are public key methods, they are quite different.
Mark: Oscar is right. You used the private key parameter, D, as part of the encryption step (“C = M^D mod N”). Like nearly all mistakes with cryptography, the impact is that your described system is totally insecure. I’ve already written articles (parts 1 and 2) about how to calculate E and N given only two messages encrypted with D. In your system where D, N is the public key, the attacker can calculate E, P, and Q without seeing any encrypted messages.
Take a moment to think about this. Inverting the sense of the E and D exponents in relation to encryption and decryption results in a system where your “private” key is no longer private! This is because of the asymmetry in selecting them, something which you used in your example for choosing the parameters.
While I appreciate your enthusiasm for cryptography, the lack of attention to detail in your posts is troubling. In crypto, being exact with your terms and paranoid in your implementation really matters. Thanks.
You mentioned that for too small M, RSA is easy to break, and that your typical encoding function to turn text into M will add padding for this reason. However, since everything relating to E and D doesn’t depend in any way on the way M is encoded, why can’t you just use a modified encoding function that leaves out the padding, pick a plaintext to make an M that is small enough, and then do whatever you can do when M is small to break the key?
Your detailed list had a small error: it’s D×E=1 mod φ(N), not DxE=1 mod N.
In fact, I’m a little surprised you didn’t mention the core relation:
a^φ(N) = 1 mod N
(from Euler, as I recall).
if you use the basic definition of D and E,
D * E = 1 mod φ(N) → D * E = n φ(N) + 1
then
M^(D*E) = M^(n φ(N) + 1) = M * M^(n φ(N)) = M * (1^n) = M.
Also, on the totient of a product… you were a little too casual imho. It’s not hard to explain.
For the numbers from 1 to pq, you need to eliminate all multiples of ‘p’ (p, 2p, 3p, …, qp) and all multiples of ‘q’ (q, 2q, 3q, …, pq), and then restore the duplicate of ‘pq’.
That gives you pq – q – p + 1, or (p-1)(q-1) if you factor it.
A really cool way of looking at it geometrically is to draw a grid of p by q cells. Color one edge — that’s the multiples of ‘p’. Color an adjacent edge — that’s the multiples of ‘q’. What’s left is the smaller rectangle (p-1)(q-1).
Very nice article. RSA really is quite astonishing in both its simplicity and its effectiveness. It always amazes me that although cryptography has been around for thousands of years, it was only 33 years ago that public-key crypto was conceptually invented (in the classic Diffie-Hellman paper) and 32 years ago that RSA was designed.
One minor point to clarify: You calculated 573^517 mod 1769 by calculating 573^517 first (giving a 1426-digit number) and then reducing the result mod 1769. That’s a big number before reduction mod 1769, and it can get a lot bigger in practice (where the p and q are not 2-digit primes but more like 200-digit primes). So we wouldn’t actually calculate it that way, but rather use a repeated squaring technique where we start by computing 573^2 mod 1769 (=1064), then 573^4 mod 1769 (=1064^2 mod 1769 =1705), then 573^8 mod 1769 (=1705^2 mod 1769 =558), and so on up to 573^512 mod 1769, then use the results of these calculations to find 573^517 mod 1769. This keeps the digit size of the numbers involved down and therefore is a lot faster and more memory-efficient.
Of course you knew that, and were just keeping it simple for us, right? 🙂
Ditto on Lloyd.
I’ve said it before, but I’ll say it again: my math is sorely lacking — though I recognize number theory being used, mostly — but I’m really enjoying these posts.