In 1977, three students from MIT (Massachusetts Institute of Technology), devised a number that was 129 digits long, that is the product of two prime numbers, meaning that there were only two non-trivial factors of this number. The students were Ronald Rivest, Adi Shamir and Leonard Adleman. The number came to be known as RSA129. They used this number and the fact that it had only two factors other than 1 and the number itself as a method of encryption. This method established Rivest, Shamir and Adleman as the leaders in the field of cryptography.
RSA129 was published in the August 1977 issue of Scientific American along with a challenge that if anyone could factor the number and decipher their hidden message, they would win a prize of $100. It was commented by one authority that the money would be safe for the next 20,000 years. It, in fact, was safe for only 17 years. A team of 600 mathematicians factored the number over a period of 8 months. The two prime factors of the 129 digit number were respectively 64 and 65 digits long.
This technique and the extreme difficulty of factoring numbers larger than 100 digits is the basis of modern crypto systems. Most modern systems use a number of 200 digits raised to a power of a number of 100 digits and greater. Now that, my friends, is some serious math. However, this sytem, in theory, can still be broken. The idea is not that any given system is impossible to crack but rather that the time required to crack it is so great that by the time it is cracked, the information is useless. Some estimates say that factoring a number of 400 digits using a desktop computer would take billions of years.
It may be that I love this subject because I have always secretly dreamed of being a James Bond type of secret agent, but encryption is one of my favorite subjects and this month that is what we are going to talk about. I must first qualify this article and any statements made herein by confessing that I am a novice at encryption. Many of you may, and probably do, have much more experience and expertise in this area than I.
While there is no such thing as a 100% secure encryption method, the particular method I will talk about is very basic and the security of it is minimal. It should be accepted for introductory and educational purposes only. If you want to be able to send semi-sensitive emails that can only be read by someone with the “key” to decipher the message, this will suffice. If you are sending sensitive information, like credit card numbers or financial statements you definitely should not use this method.
The average person is not going to be able to crack messages encrypted with this formula but someone with even rudimentary knowledge of code breaking probably can so use this method at your own risk.
Before getting into the actual code it is necessary to define some basic terms that will be used frequently in this article. An understanding of the two basic encryption methods will also be helpful.
As mentioned, there are two basic forms of encryption: transposition and substitution. Almost everyone is familiar with both methods, though they may not realize it. Anyone who has ever played a word scramble puzzle is familiar with transposition. Likewise, anyone who has ever played with morse code has used substitution.
Transposition is a rearrangement of the characters in the message itself. It is the least secure of the two types. One example of encryption of the phrase “The eagle has landed” by the transposition method might produce "het “gleea sha dleand”. Transposition can and normally is much more complex than this example, using geometric patterns and transposition of blocks of the text.
In substitution, on letter of the alphabet is substituted for another. An example of the substitution method is shown in the diagram below.
================================
a b c d e f g h i j k l m n o p q r s t u v w x y z
d e f g h i j k l m n o p q r s t u v w x y z a b c
================================
So our message “The eagle has landed” encrypted with the substitution scheme above is “wkh hdjoh kdv odrghg”.
Keep in mind, that any encryption method needs to be systematic so that once encrypted, there exists some way of unscrambling the message quickly and accurately. To achieve this I will use the following formulas:
For enciphering: C=(mP + s) mod n
C is our ciphertext, m is the number by which we will multiply the value of the Plaintext P, s is the number of characters forward we will shift and n is the size of our character set. For the time being we will use a character set of 26 letters, but ultimately we will want to work with 256 because there are 256 ascii characters.
For deciphering: P=(mC + s) mod n
So, the in the inverse C is our ciphertext, m2 is our multiplicative inverse, s is our shift and this should yield P our plaintext.
It is also going to be important to define a couple of mathematical terms to help us with encryption. They are modulus and multiplicative inverse. Though I had no trouble in understanding the concepts of modulus and multiplicative inverse, it took me weeks to get my head around how to actually find the multiplicative inverse of a given number on paper. I will do my best to explain it here.
The concept of modulus is quite simple. It is what is left over when a number is divided by any other number. So, 5 mod 2 = 1. Why Because 5/2 leaves a remainder of 1. Another example of modulus is a clock. If it is 10:00 AM, and you need to be somewhere in 5 hours, the time you must arrive is 3:00, because a clock uses a modulus of 12 (hours). Modulus is important to our method because we will be adding to the numerical value of our plaintext in order to find our ciphertext. Since there is an upper limit to the size of the character set (in the example shown above it is 26), if we want to encrypt the letter y by the “shift forward 3 characters” method (more commonly known as a Caesar Cipher), simply adding 3 won’t work because the numerical value of y is 25 and there is no character with the numerical value of 25+3. Therefore, the value is relative to the modulus 26. By figuring the value of a character relative to the modulus, y (or 25) becomes b (or 2) because (25 + 3) mod 26 = 2.
Multiplicative inverse is also essential to successfully enciphering and deciphering our message because in unscrambling the message and arriving back at the starting point, we will need to know how to get from say d (character 4 in our set) to a (character 1 in our set). Since we will be multiplying the numerical value of our plaintext b[/b] by some number, it might seem logical that we would do the opposite and divide. However, this will not work. Say we determine the C value of a character by multiplying P by 5 and reducing it mod 26 where C=4 (the letter d). The C value would be 6. If we simply divide 6 by 5 we get a result of 1.2. There is no character with the numerical value of 1.2 so this obviously doesn’t help us.
To get around this, it is necessary to find a number by which we can multiply our C value, when lowered mod 26 will result in the number 4 (the letter d). When working with numbers relative to a modulus (n), a number m2 is the inverse of another number m when the product of the two numbers lowered mod n is 1. The formula for this is m*m2 mod n = 1.
Okay, so now we’re ready to jump in with some AppleScript code. I know what you’re thinking, finally!, right
]To get the inverse
on getInverse(m, n)
set m to m as integer
set m2 to 0
repeat until ((m * m2) mod n) = 1
set m2 to m2 + 1
end repeat
return m2
end getInverse
If we call this subroutine, substituting 5 for m and 26 for n, the result will be 21, because 5*21 mod 26 = 1.
Okay, now we have a formula for finding our enciphering and deciphering keys right Well, not exactly. There is a problem I did not warn you about. That is that the value of m must be relatively prime to the modulus. Two numbers are said to be relatively prime when the only factor they have in common is 1. This is crucial because if we use a value of say 12 with a modulus of 26, whose common factors are 1 and 2, we will not have a one-to-one ratio of characters between our ciphertext and plaintext. This means that there will be more than one possible decription of our message. Our “key finding formula” will not work either because there is no number by which you can multiply 12 that when lowered modulus 26 will yield 1. I won’t go into the math behind all of this so I’m going to have to ask you to trust me. You can, if you like, try using the routine above with m=12 and n=26 but you will be waiting for a very, very long time for the results.
Okay, so now we know that any number, relatively prime to our modulus is a valid value for (m).
We are ready to introduce some simple code for actually encrypting a message. I will first give the formula, then the code.
The formula is C=m*P mod n. And, as promised here’s the code:
]Enciphering routine
on encipherText(p, m, n)
set c to (p * m) mod n
return c
end encipherText
Let’s test it using the letter h whose position in our set of 26 characters is 8.
]Encipher the character “h”
set c to encipherText(8, 5, 26)
The result is 14. So our plaintext character h becomes our ciphertext n because n is the 14th character in our 26 character set. Isn’t this exciting!
You might well ask “That’s great but how do I get back to the plaintext character”. We have already established the tools we need to achieve this.
We can use our subroutine for finding multiplicative inverses above and the formula for deciphering our ciphertext. The formula is P=(m*C) mod n.
Below is the code for this formula. You might notice that it is exactly the same as the encriphering code. I prefer to use two separate routines in order to keep the steps clear in my mind, but in fact, you can use the same routine for both.
]Deciphering routine
on decipherText(c,m,n)
set p to (c * m) mod n
return
end decipherText
So we run the following code:
]Get our Decrkption key (kD)
set kd to getInverse(5, 26)
Which results in 21. Now all we do is plug our values into our deciphering routine:
Decipher the character “v”
set p to decipherText(14, 21, 26)
This returns a value for p of 8, which in turn is the numerical value of the letter h in our 26 character set. So we have arrived back where we started.