Substitution Ciphers

A cipher is a method of encryption that works at the letter level.  A substitution cipher, as the name suggests, works by replacing each letter of a message with another symbol - a letter, a number or some entirely different symbol. 

The ROT-13 cipher that is so common on the internet is an example of this kind of cipher: each letter is replaced by the letter that is 13 places forward in the alphabet, wrapping around from Z to A.  In this special case, since the English alphabet is 26 characters long, it does not matter whether you count forward or back, so the same operation both encrypts and decrypts.  The entire cipher can be represented by a grid that will be familiar to you if you often visit sites that use ROT-13:

A B C D E F G H I J K L M
N O P Q R S T U V W X Y Z
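
The grid translates directly into code.  What follows is only a minimal sketch in Python, not a reference implementation; the function name rot13 is my own, and it assumes that only the 26 unaccented English letters should be rotated while everything else passes through untouched:

```python
import string

def rot13(text: str) -> str:
    """Rotate each ASCII letter 13 places, leaving other characters alone."""
    upper = string.ascii_uppercase
    lower = string.ascii_lowercase
    # Map A-M onto N-Z and N-Z onto A-M, for both cases.
    table = str.maketrans(
        upper + lower,
        upper[13:] + upper[:13] + lower[13:] + lower[:13],
    )
    return text.translate(table)

scrambled = rot13("Hello, World!")
print(scrambled)         # Uryyb, Jbeyq!
print(rot13(scrambled))  # Hello, World!
```

Because 13 is exactly half of 26, running the function a second time restores the original text.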

Substitution ciphers may instead use a mixed alphabet, in which each plaintext character always translates to the same ciphertext character.  The greatest difference between a rotation or Caesar cipher and a mixed-alphabet substitution cipher is the number of possible keys: while there are only 25 possible variations on the Caesar cipher, there are 26! (roughly 4E26) possible mixed alphabets.  The simple substitution cipher is frequently presented as a puzzle - as in the following example, which you may wish to solve yourself:

“GIKDV KW KF WQMV WIPW WIV MAVOPSKAVU DKCV KF AEW
GEQWI DKLKAH, KW KF YMFW PF WQMV WIPW WIV MADKLVU
DKCV KF AEW GEQWI VOPSKAKAH.” -*FGPSK *ZVNEAUPAPAUP
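
The number of possible alphabets mentioned before the puzzle is easy to check for yourself.  This is a small Python sketch, assuming a plain 26-letter alphabet; the variable names are my own.  A Caesar cipher has 25 useful rotations, while a mixed alphabet can be any ordering of the 26 letters, which is 26 factorial:

```python
import math

caesar_keys = 25                      # non-trivial rotations of a 26-letter alphabet
mixed_alphabets = math.factorial(26)  # every possible ordering of 26 letters

print(caesar_keys)               # 25
print(f"{mixed_alphabets:.2e}")  # 4.03e+26
```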

Mixed alphabet substitution may replace each plaintext letter with a randomly selected ciphertext letter, or may rearrange the ciphertext alphabet with the use of a keyword: the keyword is written out first and the unused letters of the alphabet follow in their normal order.  Purists seem to prefer to use a single word with no repeating letters, but any word or combination of words can be used so long as repeated letters are elided when constructing the ciphertext alphabet.  The keyword ROLLING STONE would create this ciphertext alphabet:

a b c d e f g h i j k l m n o p q r s t u v w x y z
R O L I N G S T E A B C D F H J K M P Q U V W X Y Z
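
Building such an alphabet is mechanical enough to sketch in a few lines of Python.  The function name keyword_alphabet is my own invention, and the sketch assumes an English keyword, ignoring spaces and anything else that is not one of the 26 letters:

```python
import string

def keyword_alphabet(keyword: str) -> str:
    """Keyword letters first (repeats dropped), then the unused letters in order."""
    seen = []
    for ch in keyword.upper() + string.ascii_uppercase:
        if ch in string.ascii_uppercase and ch not in seen:
            seen.append(ch)
    return "".join(seen)

cipher_alphabet = keyword_alphabet("ROLLING STONE")
print(cipher_alphabet)  # ROLINGSTEABCDFHJKMPQUVWXYZ

# Encrypt by mapping each lowercase plaintext letter to its ciphertext letter.
encrypt_table = str.maketrans(string.ascii_lowercase, cipher_alphabet)
print("come at once".translate(encrypt_table))  # LHDN RQ HFLN
```

Note that a keyword alphabet of this kind leaves the tail of the alphabet (here U through Z) unchanged, which is one reason a long keyword with late letters in it is a better choice.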

It is not uncommon to include numerals in the alphabet in applications where numbers are a frequent, integral part of messages.  The general formula for including numbers, so that they are not grouped at the beginning or end of the alphabet, is to place each numeral directly after the letter of the same value (1 after 'a', 2 after 'b' and so on), whether or not that letter is in the keyword.  The numeral 0 (zero) is placed after 'j', which has a value of 10.  The above ciphertext alphabet would become:

a b c d e f g h i j k l m n o p q r s t u v w x y z 1 2 3 4 5 6 7 8 9 0
R O L I 9 N G 7 S T E 5 A 1 B 2 C 3 D 4 F 6 H 8 J 0 K M P Q U V W X Y Z
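
The same rule can be sketched in Python.  This is only an illustration under the convention just described; the function name add_numerals is hypothetical, and the starting alphabet is the ROLLING STONE alphabet copied from the table further up:

```python
import string

def add_numerals(cipher_alphabet: str) -> str:
    """Insert each numeral after the letter of the same value
    (1 after A ... 9 after I, 0 after J) in a ciphertext alphabet."""
    numeral_after = {
        string.ascii_uppercase[i]: str((i + 1) % 10) for i in range(10)
    }
    out = []
    for ch in cipher_alphabet:
        out.append(ch)
        if ch in numeral_after:
            out.append(numeral_after[ch])
    return "".join(out)

plain_symbols = string.ascii_lowercase + "1234567890"
cipher_symbols = add_numerals("ROLINGSTEABCDFHJKMPQUVWXYZ")
print(cipher_symbols)  # ROLI9NG7STE5A1B2C3D4F6H8J0KMPQUVWXYZ
```

Both strings are 36 characters long, so they can be paired off position by position exactly as in the table.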

Other simple substitution ciphers replace letters with numbers, such as A=01, B=02, C=03, through Z=26, or with other characters.  Edgar Allan Poe's story "The Gold-Bug" used other characters found on an old-fashioned typewriter keyboard, and the Sherlock Holmes story "The Adventure of the Dancing Men" used stick figures in different poses to represent the various letters.  (The Dancing Men cipher also adds a flag to a figure to mark the last letter of a word.)  One cipher that uses other characters is the PigPen or Mason's cipher:

[Figure: PigPen cipher grids]

By drawing a pair of tic-tac-toe grids and a pair of X's and entering the letter values as shown, you have a tool for deciphering this cipher (there are two common ways of producing the cipher tool by arranging the constructs: # X # X as I have done above or # # X X).  Using only the lines directly adjacent to the letter you want, you can write out a message of any length.  Using the arrangement shown, the message "come at once" would become:

[Figure: "come at once" written in PigPen symbols]

In fact, the symbology used is entirely arbitrary so long as both the sender and the intended recipient know the system.

Simple substitution ciphers - those in which one character replaces another throughout a message - are easily cracked, usually by starting with a frequency analysis and then applying logic.
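
A frequency analysis starts with nothing more than a tally of how often each symbol appears.  Here is a minimal Python sketch applied to the puzzle given earlier (without its attribution line); the function name letter_frequencies is my own.  The letters that turn up most often are candidates for the most common English letters such as E, T and A, though in a message this short the ranking is only a hint, and the rest is up to your logic:

```python
from collections import Counter

def letter_frequencies(text: str) -> Counter:
    """Count how often each letter occurs, ignoring spaces and punctuation."""
    return Counter(ch for ch in text.upper() if ch.isalpha())

ciphertext = (
    "GIKDV KW KF WQMV WIPW WIV MAVOPSKAVU DKCV KF AEW "
    "GEQWI DKLKAH, KW KF YMFW PF WQMV WIPW WIV MADKLVU "
    "DKCV KF AEW GEQWI VOPSKAKAH."
)

counts = letter_frequencies(ciphertext)
# Show the five most frequent ciphertext letters and their counts.
for letter, n in counts.most_common(5):
    print(letter, n)
```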

Some substitution ciphers are more complex, using multiple character sets to encrypt the message in order to hide the frequency of common letters and disguise recognizable words.  A common textbook example of this is the Vigenère cipher, which uses a grid of 26 alphabets, each starting with a different letter (the lowercase letters in the top row represent the plaintext):

   a b c d e f g h i j k l m n o p q r s t u v w x y z
 1 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
 2 B C D E F G H I J K L M N O P Q R S T U V W X Y Z A
 3 C D E F G H I J K L M N O P Q R S T U V W X Y Z A B
 4 D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
 5 E F G H I J K L M N O P Q R S T U V W X Y Z A B C D
 6 F G H I J K L M N O P Q R S T U V W X Y Z A B C D E
 7 G H I J K L M N O P Q R S T U V W X Y Z A B C D E F
 8 H I J K L M N O P Q R S T U V W X Y Z A B C D E F G
 9 I J K L M N O P Q R S T U V W X Y Z A B C D E F G H
10 J K L M N O P Q R S T U V W X Y Z A B C D E F G H I
11 K L M N O P Q R S T U V W X Y Z A B C D E F G H I J
12 L M N O P Q R S T U V W X Y Z A B C D E F G H I J K
13 M N O P Q R S T U V W X Y Z A B C D E F G H I J K L
14 N O P Q R S T U V W X Y Z A B C D E F G H I J K L M
15 O P Q R S T U V W X Y Z A B C D E F G H I J K L M N
16 P Q R S T U V W X Y Z A B C D E F G H I J K L M N O
17 Q R S T U V W X Y Z A B C D E F G H I J K L M N O P
18 R S T U V W X Y Z A B C D E F G H I J K L M N O P Q
19 S T U V W X Y Z A B C D E F G H I J K L M N O P Q R
20 T U V W X Y Z A B C D E F G H I J K L M N O P Q R S
21 U V W X Y Z A B C D E F G H I J K L M N O P Q R S T
22 V W X Y Z A B C D E F G H I J K L M N O P Q R S T U
23 W X Y Z A B C D E F G H I J K L M N O P Q R S T U V
24 X Y Z A B C D E F G H I J K L M N O P Q R S T U V W
25 Y Z A B C D E F G H I J K L M N O P Q R S T U V W X
26 Z A B C D E F G H I J K L M N O P Q R S T U V W X Y
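
Each row of the grid is simply the alphabet rotated one more place to the left than the row above it, so there is no need to type it out by hand.  A short Python sketch, with names of my own choosing:

```python
import string

ALPHABET = string.ascii_uppercase

def tableau_rows():
    """Yield the 26 shifted alphabets of the grid, one per row."""
    for shift in range(26):
        yield ALPHABET[shift:] + ALPHABET[:shift]

rows = list(tableau_rows())
# Print the grid with the same row numbers used above.
for number, row in enumerate(rows, start=1):
    print(f"{number:2d} " + " ".join(row))
```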

The secret to using this method is to cycle through several alphabets as you encrypt your message.  This is most frequently done by establishing a keyword and using that to index which alphabet to use for each letter of a message.  For instance, using the keyword CIPHER you might encode a message such as "the quick brown fox jumped over the lazy dog."  A successful mechanism for keeping track of what you are doing while enciphering the message is to write the keyword repeatedly over the plaintext, skipping the spaces, thus avoiding confusion about which alphabet you are using at any given moment:

C I P   H E R C I   P H E R C   I P H   E R C I P H   E R C I   P H E   R C I P   H E R
t h e _ q u i c k _ b r o w n _ f o x _ j u m p e d _ o v e r _ t h e _ l a z y _ d o g

The first letter of the keyword tells us to use the alphabet that begins with the letter 'C' - so reading down from the lower-case 't' in the first row to the 'C' alphabet, we find 'V' which becomes the first character of our ciphertext.  The next letter of the keyword indicates that we should use the 'I' alphabet, so we go down from the 'h' in the top row to find 'P.'  The third letter of the keyword is 'P' so reading in the 'P' alphabet we find that 'e' becomes 'T.'  When we reach the last letter of the keyword, we go back to the beginning and start through it again, so that our coded message becomes:

C I P   H E R C I   P H E R C   I P H   E R C I P H   E R C I   P H E   R C I P   H E R
t h e _ q u i c k _ b r o w n _ f o x _ j u m p e d _ o v e r _ t h e _ l a z y _ d o g
V P T   X Y Z E S   Q Y S N P   N D E   N L O X T K   S M G Z   I O I   C C H N   K S X
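
Looking a letter up in the grid amounts to adding the position of the keyword letter to the position of the plaintext letter and wrapping around at 26.  The following is a minimal Python sketch of that idea, not a definitive implementation; the function name vigenere_encrypt is my own, and it assumes the keyword contains only the 26 English letters.  Spaces and punctuation pass through without using up a keyword letter, as in the worked example:

```python
import string
from itertools import cycle

ALPHABET = string.ascii_uppercase

def vigenere_encrypt(plaintext: str, keyword: str) -> str:
    """Shift each letter by the current keyword letter; copy anything else."""
    shifts = cycle(ALPHABET.index(k) for k in keyword.upper())
    out = []
    for ch in plaintext.upper():
        if ch in ALPHABET:
            shift = next(shifts)  # only letters advance the keyword
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)

print(vigenere_encrypt("the quick brown fox jumped over the lazy dog", "CIPHER"))
# VPT XYZES QYSNP NDE NLOXTK SMGZ IOI CCHN KSX
```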

To further obfuscate matters, we could remove the spaces between words and present one long string of letters, or break the string into groups of some uniform length.  As you can see in this example, the first word of the message, 'the,' and the seventh word, also 'the,' do not encrypt the same.  They have become VPT and IOI respectively - and to further complicate matters, the second 'the' contains two I's, each representing a different letter.  This makes the Vigenère cipher much harder to crack than a simple substitution cipher.

In fact, though this cipher was first recorded in the 16th century, it was not until 1854 that Charles Babbage developed a methodology for breaking this type of cipher.  Solving it begins with analysis to determine the length of the keyword; the ciphertext can then be treated as that many separate Caesar ciphers, one for each letter of the keyword - more difficult than a monoalphabetic substitution cipher, but not impossible.

There is a handy applet that may be useful in solving Vigenère ciphers here: http://islab.oregonstate.edu/koc/ece575/02Project/Mun+Lee/VigenereCipher.html.
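
Once the length of the keyword is known or guessed, the message can be pulled apart into one group of letters per keyword position, and each group attacked as a simple rotation cipher.  A minimal Python sketch of that splitting step, assuming the key length of 6 is already known (as it is here, from the keyword CIPHER); the function name is hypothetical, and a message as short as this one would really be too short for the counts in each group to mean much:

```python
def split_by_key_position(ciphertext: str, key_length: int) -> list:
    """Group ciphertext letters by which keyword letter encrypted them."""
    letters = [ch for ch in ciphertext.upper() if ch.isalpha()]
    return ["".join(letters[i::key_length]) for i in range(key_length)]

ciphertext = "VPT XYZES QYSNP NDE NLOXTK SMGZ IOI CCHN KSX"
columns = split_by_key_position(ciphertext, 6)
for position, column in enumerate(columns):
    print(position, column)
# The first line printed is: 0 VEPOGC
```

Every letter in the first group was encrypted with the 'C' alphabet, every letter in the second with the 'I' alphabet, and so on, so within a group the same plaintext letter always produces the same ciphertext letter.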

Further discussion of substitution ciphers can be found at Wikipedia.