Simple Encryption

BillSoo
07-04-2002, 01:36 PM
Encryption is the process of hiding meaning in a message in such a way that only the proper recipient can deduce the meaning.
There are 2 basic methods for encryption, codes and ciphers.
Codes are simple flags that carry meaning previously agreed upon. For instance, the phrase "Venice is nice" could mean "attack at dawn". Codes are nearly impossible to break since any arbitrary string could have any arbitrary meaning. You need to have the code book. However, sometimes it is possible to deduce certain codewords if the code is used too much. For instance, during WWII, the americans regularly intercepted japanese transmissions. They had broken most of the code before and knew that an attack on a target "omega" was imminent. However, they did not know what "omega" was. As a test, they the airbase on Midway Island transmit a message in the clear saying that their water was low. Sure enough, they then intercepted a japanese message stating that target "omega" was low on water. This allowed the americans to prepare for the battle of Midway that changed the course of the war in the pacific.

But for our purposes, codes are too inflexible. You need to have a codebook that carries a code for all occasions....you have to update all the codebooks out there periodically....and you need a secure method of transmitting the codebook in the first place.
A more flexible solution is ciphers. A cipher is a way of scrambling a message in such a way that the original message can be unscrambled. Thus, any message can be encoded, and all you need is an algorithm instead of a codebook.

Ciphers have a long history. One of the first ciphers was a simple shift algorithm used by Julius Caesar. He simply added 3 to each letter in his messages, so A would be D and B would be E for instance. It wasn't very secure, but it did the job against the barbarians he faced, who had a hard enough time comprehending Latin let alone encryption....
A more modern variant is ROT-13. Even Pig Latin is a sort of Shift algorithm. In both of these cases, the encryption is mainly used to hide meaning from children, which tells you how weak it is.

Here is some sample code:

Private Function ShiftCode(ByVal InText As String, ByVal Shift As Integer) As String
'encodes by shifting each character a certain amount specified by 'shift'
'can decode by simply specifying -shift
'to prevent certain problems, we will restrict ourselves to the
'asc values 32 to 126 inclusive
Dim i As Long
Dim c As Integer
Dim s As String

For i = 1 To Len(InText)
c = Asc(Mid$(InText, i, 1))
If (c > 31) And (c < 127) Then 'in range so code
c = ((c - 32 + Shift + 95) Mod 95) + 32
End If
s = s & Chr$(c)
Next i
ShiftCode = s
End Function

By the time of the Elizabethan period, more sophisticated codes had been developed. At this time, substitution ciphers were in vogue. A substitution cipher replaces all instances of one letter with another letter. It is 26 times harder than the shift algorithm since you have to deduce the shift for each of the 26 characters instead of just one. It was proof against the ordinary citizen of the day and even today, it has some value. However, Queen Elizabeth had a spymaster who was more than ordinary and he was able to break these codes using frequency tables. As it happens, every letter in a language has a characteristic frequency in which it appears in a word. For instance, the letter "e" is the most commonly used word in the english language. If you scan a substitution cipher, you will find that some letters are used more often than others. If you start by trying "e" for the more common letters, this can help you break the rest of the cipher. Thus was born the modern cryptographer.

Arthur Conan Doyle wrote of a substitution cipher in his Sherlock Holmes story "The Case of the Dancing Men". In it, Holmes uses frequency tables to break the cipher to the utter amazement of Watson.

In many newspapers or puzzlebooks, you can find cryptograms which use substitution ciphers. You can try to break them yourself for fun.

Here is a variation of a substitution cipher. It is somewhat harder because it differentiates between upper and lower case, and it adds numbers and some punctuation:

Private Function SubstitutionEncode(ByVal PlainText As String) As String
'encodes plaintext by using a simple substitution cipher.
Dim s As String
Dim i As Long, j As Long
Const InText As String = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijkl" & _
"mnopqrstuvwxyz0123456789 .,?!"
Const OUTCODE As String = "Kv iFyaehOVGpM.HfT60StRDBZ3XUmWdCo" & _
"P8u2,cqIwj!J9sbLnQ?EAlz7rk41xg5NY"
For i = 1 To Len(PlainText)
j = InStr(InText, Mid$(PlainText, i, 1))
If j Then
s = s & Mid$(OUTCODE, j, 1)
Else
s = s & Mid$(PlainText, i, 1)
End If
Next i
SubstitutionEncode = s
End Function

Private Function SubstitutionDecode(ByVal CodeText As String) As String
'Decodes codetext by using a simple substitution cipher.
Dim s As String
Dim i As Long, j As Long
Const OUTTEXT As String = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijkl" & _
"mnopqrstuvwxyz0123456789 .,?!"
Const INCODE As String = "Kv iFyaehOVGpM.HfT60StRDBZ3XUmWdCo" & _
"P8u2,cqIwj!J9sbLnQ?EAlz7rk41xg5NY"
For i = 1 To Len(CodeText)
j = InStr(INCODE, Mid$(CodeText, i, 1))
If j Then
s = s & Mid$(OUTTEXT, j, 1)
Else
s = s & Mid$(CodeText, i, 1)
End If
Next i
SubstitutionDecode = s
End Function


The main difficulty with the previous algorithms was that it was like a code book; once you knew the algorithm, you could easily break the code. To get around this, the Vignere Cipher was developed. With Vignere, you have a password as well as the cipher. In this way, the message cannot be decoded without the password, even if the algorithm is known.

The vignere cipher is similar to a series of substitution ciphers. Each series is based on a letter of the password. If the message is longer than the password, the series repeats. If you have a sufficiently long message and a small password, it is possible to analyze the message to deduce the length of the password. You can then set up a number of separate substitution cyphers and solve each one independantly using frequency tables. Thus, you really need a long password for this to work. Currently, this method is used for super secure transmissions from embassies. They use special "one-time" pads which are very long keys composed of random numbers which are only ever used once. Thus, it is similar to a code book. Nowadays, these one-time pads are coded onto CDs and shipped to embassies in diplomatic pouches. As long as the keys are truly random, the code is unbreakable.

Here is a variation of the vigner method which uses XOR instead of adding or subtracting. This allows the same function to be able to decode as well as encode.

Private Function VignereCode(ByVal InText As String, ByVal Password As String) As String
'uses the vignere method with XOR. Can decode as well as encode
Dim i As Long, j As Long
Dim s As String
j = 0
If Len(Password) Then
For i = 1 To Len(InText)
s = s & Chr$(Asc(Mid$(InText, i, 1)) Xor Asc(Mid$(Password, j + 1, 1)))
j = (j + 1) Mod Len(Password)
Next i
Else
For i = 1 To Len(InText)
s = s & Chr$(&HFF Xor Asc(Mid$(InText, i, 1)))
Next i
End If
VignereCode = s
End Function

A problem with Vignere ciphers is that it is essentially only good for one to one communication. If you have many people communicating with one, each will require their own password. Otherwise, they would be able to read each others messages.
From this need was born Public Key encryption. With public key encryption, the key is composed of a very long number that is the product of 2 very long prime numbers. The key is made public along with an encryption algorithm. With this key, any message can be encrypted, but not decrypted unless you know the prime numbers that make up the key. Since current technology is not able to factor very large numbers very fast, this is a fairly secure method. Unfortunately, the method is a bit hard to explain for a simple primer on encryption so I have no example code.

Another problem relates to password storage. Everybody needs a password and this password should be long. For instance, many organizations require at least an 8 character password, but some individuals may have even longer passwords, like 20 or more characters. Storing all these passwords can take up a lot of room so most programs don't store the password at all. Instead, they store a HASH of the password.
A hash is very similar to a pseudorandom number generator, as well it should be since the object is the same; to generate a pseudo random number from a known input. To do this, we use simple mathematical techniques to convert a string into a number. We then store this number. When the use requires access, we use the same method to hash the password they enter and compare the hashed value against the stored hash. If they match, the user is validated.

Aside from saving storage, this method also increases security since your password isn't stored anywhere on the system, therefore a hacker looking through your files can't get your passwords.

Here is a really simple way to Hash a password:

Private Function HashPassword(ByVal pw As String) As Long
Dim i As Long, t As Long

For i = 1 To Len(pw)
t = Asc(Mid$(pw, i, 1)) Xor t
Next i
Call Rnd(-1)
Randomize t
HashPassword = (Rnd() * &H7FFFFFFF) And &HFFFFFFFF
End Function


An encrypted message should look random. There should be no order to give cryptologists a foothold to decrypt your message. Coincidentally, a compressed message should also look random, since order represents a waste of space. Therefore, compressing a message is a good way to encrypt it as long as the algorithm is unknown. Plus you save space.

BillSoo
07-04-2002, 03:32 PM
For more about encryption, do a search on the web. For example, here is a site with some stories about encryption.
http://cybercrimes.net/Cryptography/Articles/Hebert.html

BillSoo
07-04-2002, 04:35 PM
Errata: I did some more research into RSA Public Key encryption and I found that I was slightly in error....the public key is actually a pair of numbers. One is the huge product of 2 primes, as I said, but the other is a smaller, possibly non-prime number called the public exponent. The private key is also a pair of numbers. The same huge product plus another number called the private exponent.

The encoding/decoding functions are the same:

A = B^E MOD P

where A and B are the characters to be encoded/decoded, E is the exponent (public or private) and P is the huge prime product.

It's a simple enough algorithm, but VB has no easy way to handle numbers that large.

For an explanation of how these numbers are derived, I recommend this site:
http://world.std.com/~franl/crypto/rsa-guts.html

Here is an example using very small numbers:
P, Q = initial prime numbers
E = Public exponent
D = Private exponent
T = Plain text character
C = encoded character

To start, let's have P=11 and Q = 13. Thus the "huge" prime product (PQ) would be P*Q = 143.

E is determined to be a number between 1 and (P-1) * (Q-1) that has no prime factors in common with (P-1)*(Q-1). It doesn't have to be prime.

(P-1)*(Q-1) = 120 = 2*2*2*3*5 'prime factors

So we decide to choose E = 7

D is selected such that D *E - 1 = ((P-1)*(Q-1)) * X where X is any integer. In other words, D*E - 1 MOD ((P-1)*(Q-1)) = 0

In our case, D * 7 - 1 = 120 * X

D = (120*X + 1) / 7

We now just select a value for X that yields an integer value for D. As it happens, X = 6 works.

D = (120 * 6 +1)/7 = 721/7 = 103

So now we have the following keys.
Public Key:
PQ = 143 E = 7

Private Key:
PQ = 143 D = 103

Let's now encode the letter "a" (ascii value 97 so T=97)

C = 97 ^ 7 MOD 143 = 59

To Decode, we simply use the private key

T = 59 ^ 103 MOD 143 = 97

Why I think it works:

C = T^E MOD PQ
T = C^D MOD PQ

So
T = (T^E MOD PQ) ^ D MOD PQ

This simplifies to:

T = T ^ED MOD PQ

Thinker
07-04-2002, 05:03 PM
Really good! I have enjoyed reading this.

Derek Stone
07-04-2002, 08:39 PM
As have I. Thanks Bill.

-CL

Volte
07-04-2002, 08:47 PM
http://www.visualbasicforum.com/images/icons/icon14.gif

BillSoo
09-26-2002, 10:55 AM
For more advanced methods, you can download DLLs or OCXs that encapsulate the methods. For example, you can find DLLs for DES or Blowfish here:
http://www.di-mgt.com.au/crypto.html

Mathimagics
10-29-2003, 02:03 AM
If interested, I've just written (for another thread) a demo RSA encoder/decoder function in VB, short (20 lines), but optimised, quite fast, and capable (32-bit keys). I'll post a demo in Code LIbrary

Optikal
11-20-2003, 04:49 PM
Just thought I'd make mention of the current standard for symmetric encryption (ie. not public-key). AES (Advanced Encryption Standard) has been endorsed by the US government and is used by all departments including the US military. It is based on what is called the "Rijndael" algorithm. The specification for how it works and how to implement it can be found here: http://csrc.nist.gov/CryptoToolkit/aes/

*** This replaces the out-dated DES and Triple-DES algorithms.

Mathimagics
11-22-2003, 06:38 AM
First off, apologies for not following up on my promise to post 32-bit RSA code. I've been busy getting the general extended-precision version going. I will prepare and post that demo as promised soon.

Secondly, thanks to Optikal for the info about Rijndael - this relatively new topic should keep Dr Memory off the streets for a while ;)

Finally, here are a couple of links to short, easily digested pages that describe the essential differences between asymmetric (public) key and symmetric key encryption methods:

http://www.anujseth.com/crypto/publickey.html

http://www.brienposey.com/kb/public_vs__symmetric_key_encryption.asp

EZ Archive Ads Plugin for vBulletin Copyright 2006 Computer Help Forum