8
Author: Dr. Kent D. Boklan Director, Security Research Razorpoint Security Technologies, Inc. (Dedicated to the memory of Lydia M. Kawka) Version: 1.1 Date of current version: 2006-05/01 Date of original version: 2005-09/28 Copyright © 2006 Razorpoint Security Technologies, Inc. All Rights Reserved. How I Broke the Confederate Code (137 Years Too Late) [ ARTICLE ]

HOW I BROKE THE CONFEDERATE CODE (137 YEARS TOO LATE)

Embed Size (px)

DESCRIPTION

A very interesting article by Razorpoint's Director of Security Research (and former NSA cryptologist) about cracking a confederate code from the Civil War. This story is one of intrigue, history, and cryptography. A compelling read dealing with real world security, albeit 137 years after the fact.

Citation preview

Page 1: HOW I BROKE THE CONFEDERATE CODE (137 YEARS TOO LATE)

Author:Dr. Kent D. BoklanDirector, Security ResearchRazorpoint Security Technologies, Inc.(Dedicated to the memory of Lydia M. Kawka)

Version:1.1

Date of current version:2006-05/01

Date of original version:2005-09/28

Copyright © 2006 Razorpoint Security Technologies, Inc.All Rights Reserved.

How I Broke the Confederate Code(137 Years Too Late)

[ ARTICLE ]

Page 2: HOW I BROKE THE CONFEDERATE CODE (137 YEARS TOO LATE)

Table of Contents.

Introduction .................................................................................................................................................. i

An Historical Artifact ................................................................................................................................... 1

Decrypting The Confederacy ..................................................................................................................... 1

The Vigenère Method ................................................................................................................................. 3

A New Key ..................................................................................................................................................... 4

The Message, 137 Years Too Late ............................................................................................................ 4

About The Author ......................................................................................................................................... 5

About Razorpoint Security .......................................................................................................................... 5

May 1, 2006 How I Broke the Confederate Code (137 Years Too Late) [v1.1]

31 east 32nd street, sixth floor | new york city, new york 10016-5509 usa | tel: 212.744.6900 | fax: 212.744.6344 | www.razorpointsecurity.com | [email protected] Copyright © 2006 Razorpoint Security Technologies, Inc. All Rights Reserved.

Page 3: HOW I BROKE THE CONFEDERATE CODE (137 YEARS TOO LATE)

Introduction.Often in the security world discussion turns toward firewalls, strong encryption, intrusion detection, exploits, and other such topics. While interesting and rewarding, it can become routine. When our Director of Security Research, Dr. Kent D. Boklan, mentioned the following story to me, I strongly encouraged him to finish this paper.

At Razorpoint Security, we are frequently asked to relate “real world examples” to our work, but rare is it that we have the opportunity to discuss security with an example plucked right from American history. Dr. Boklan’s story is one of intrigue, history, and cryptography. An interesting read dealing with real world security, albeit 137 years after the fact.

We hope you enjoy it.

Gary C. Morse, CISSP, CISMPresident / FounderRazorpoint Security Technologies, Inc.

May 1, 2006 How I Broke the Confederate Code (137 Years Too Late) [v1.1] Page i of i

31 east 32nd street, sixth floor | new york city, new york 10016-5509 usa | tel: 212.744.6900 | fax: 212.744.6344 | www.razorpointsecurity.com | [email protected] Copyright © 2006 Razorpoint Security Technologies, Inc. All Rights Reserved.

Page 4: HOW I BROKE THE CONFEDERATE CODE (137 YEARS TOO LATE)

How I Broke the Confederate Code(137 Years Too Late)

by Dr. Kent D. Boklan

An Historical Artifact.In the spring of 1999, I received a catalogue for a sale of Fine Books and Manuscripts Including Americana that Sotheby’s was to hold in New York on June 22nd. Lot 79 was described as,

Kirby-Smith, Edmund, C.S.A. GeneralLetter signed (“E.K.S.”), 1 page (10 x 7 7/8 in.; 254 x 200 mm), n.p., 14 September n.y. [1862], to an unidentified recipient; the text of the letter in pencil, tiny chip to right margin.

A LETTER OF INTELLIGENCE, PARTIALLY WRITTEN IN CONFEDERATE CODE. The first eleven lines of this intriguing document are in undeciphered code, but the last paragraph provides pertinent information regarding the Union army and its movements. “... A part of Genl Grant’s army is reported to have arrived at Louisville. [General] Buell was expected to come on in advance of his army, and to arrive there yesterday. Several old batteries have arrived there within the past few days.”

“General Smith and his forces ... were waiting to join forces with [General] Bragg before advancing on Louisville. “... Bragg’s advance ... pushed on towards Louisville, and on the 14th [of September], two brigades under ... General Duncan ... encountered a little more than 2000 National troops, under Colonel T.J. Wilder at Mumfordsville [sic] ... Duncan ... demanded an unconditional surrender. It was refused, and ... the next morning the Confederates drove in the National pickets. A battle began ... and raged for about five hours, when four hundred of the Fiftieth Indiana ... came to the aid of the garrison. The assailants were repulsed with heavy losses. Assured of final success, the Confederates remained quiet until the 16th, when a large portion of Bragg’s main body ... appeared [and overpowered the Union forces] ...” (Losing, The Civil War, p. 238)”

The letter was pictured (Illustration 1). I thought that it would be good fun to (try to) decrypt the message, and a few days later I traveled from my home (then in Baltimore, Maryland) to New York City. I visited Sotheby’s and expressed my desire to break the code. I was forewarned that I would not be paid if I succeeded.

Decrypting The Confederacy.On the train ride back to Baltimore, I realized that accurately transcribing the very deliberate penmanship of the cipher clerk was going to be a challenge. A single error, I expected, would render a good deal of the message undecipherable. Fortunately, I was able to compare the cipher characters against the plain text (that is, unencrypted text) characters in the last paragraph of the letter. But there were other problems. In the left margin of the third line of the text, there was a capital Z in a different hand. I didn’t know if this was a part of the cipher. And not only had a few letters faded but there was a very unusual looking character that resembled a spermatozoon.

I performed my initial investigations on the train, first counting the number of appearances of each of the cipher characters (the letters and that one odd symbol) in the body of the, roughly, 280 character long encrypted section. Twenty-five of the twenty-six English letters were employed. The letter ‘g’, however, was not present and, so, I surmised that the ‘o’ with a tail was indeed the cipher clerk’s own stylish way of writing a ‘g.’ Now, had the method of encryption been a mono-alphabetic substitution whereby each distinct cipher character would have been a unique representative of a plain text letter (so the cipher text alphabet is a simple permutation of the plain text alphabet), it would have been exceedingly unlikely that all twenty-six letters would have appeared in so short a message. And it was at this point that I concluded my first analysis since the canter of the railroad car was making me feel ill.

My childhood Cap’n Crunch coding disc (which came free in the cereal box) allowed for both the encrypting and decrypting of messages by way of a simple linear shift. The plain text letter A was transformed into the cipher letter B and the plain text B became the cipher C and this continued until the Z was encrypted as A (to complete the cyclic rotation). A shift of three letters is called a Caesar Shift in honor of Julius Caesar who, historians tell us, used this method to create his encrypted messages. To decode (decrypt) such a message, one need only shift the right number of steps backwards. This is the simplest example of a mono-alphabetic substitution. In general, mono-alphabetic substitution ciphers may be broken by considering the relative frequencies of the appearing letters and comparing them to a model of the underlying language. The most common letter in written English is E with T coming in a distant second. In descending order, the top eight are ETNORIAS. The character counts for Lot 79, though, were too flat, too even to indicate simple mono-alphabetic method; this suggested a more sophisticated technique was used. This point is also immediately evident when one considers the cipher “word” TTTET. I did notice, before my train arrived home, that the parsing of the presumptive words in the Lot 79 cipher seemed reasonably natural to English – but I did sense an abundance of long words.

May 1, 2006 How I Broke the Confederate Code (137 Years Too Late) [v1.1] Page 1 of 5

31 east 32nd street, sixth floor | new york city, new york 10016-5509 usa | tel: 212.744.6900 | fax: 212.744.6344 | www.razorpointsecurity.com | [email protected] Copyright © 2006 Razorpoint Security Technologies, Inc. All Rights Reserved.

Page 5: HOW I BROKE THE CONFEDERATE CODE (137 YEARS TOO LATE)

My next step was to do a bit of research to see if I could discern which types of encryption techniques were employed during the War (that is, the Civil War, or the War of the Northern Aggression depending upon your point of view). I consulted David Kahn’s excellent resource, The Codebreakers, and what I found was very interesting,

“The rebels reposed their major trust, however, in the Vigenère (Illustration 2), sometimes using it in the form of a brass cipher disc. In theory, it was an excellent choice, for so far as the South knew the cipher was unbreakable. In practice, it proved a dismal failure. For one thing, transmission errors that added or subtracted a letter ... unmeshed the key from the cipher and caused no end of difficulty. Once Major Cunningham of General Kirby-Smith’s staff tried for twelve hours to decipher a garbled message; he finally gave up in disgust and galloped around the Union flank to the sender to find out what it said.”

So here was direct evidence that Kirby-Smith’s staff had used the Vigenère scheme. I read on,

“Lincoln’s three young cipher operators – Tinker, Chandler and Bates ... solved it. It proved to be Vigenère key MANCHESTER BLUFF ... This was only one of a number of Confederate cryptograms solved by the triumvirate ... it [the solution] provided the three young men with a Confederate keyword, of which the South apparently used only three during the war, MANCHESTER BLUFF, COMPLETE VICTORY (a phrase the Confederates clung to long after that cherished hope had dissipated) ... [and at] about the same time that Booth and others were being hunted down and captured, Jefferson Davis was using the third Vigenère key to compose the last official cryptogram of the Confederacy ... COME RETRIBUTION.”

Now I was almost convinced (a touch of skepticism always kept in reserve) that Lot 79 was indeed encrypted by the Vigenère method.

Lot 79, Edmund Kirby-Smith’s Letter

[ Illustration 1 ]

Reproduced courtesy of Sotheby’s, Inc.

May 1, 2006 How I Broke the Confederate Code (137 Years Too Late) [v1.1] Page 2 of 5

31 east 32nd street, sixth floor | new york city, new york 10016-5509 usa | tel: 212.744.6900 | fax: 212.744.6344 | www.razorpointsecurity.com | [email protected] Copyright © 2006 Razorpoint Security Technologies, Inc. All Rights Reserved.

Page 6: HOW I BROKE THE CONFEDERATE CODE (137 YEARS TOO LATE)

May 1, 2006 How I Broke the Confederate Code (137 Years Too Late) [v1.1] Page 3 of 5

31 east 32nd street, sixth floor | new york city, new york 10016-5509 usa | tel: 212.744.6900 | fax: 212.744.6344 | www.razorpointsecurity.com | [email protected] Copyright © 2006 Razorpoint Security Technologies, Inc. All Rights Reserved.

The Vigenère Method.The method of encryption we call Vigenère today began with the work of Alberti in the middle of the 15th century and culminated with the work of Vigenère a hundred years later. The protocol employs a poly-alphabetic substitution and its security is predicated on a certain key word (a key) that is known to only the sender and recipient of the message. The key may be a word, or words, or gibberish. A modern English Vigenère tableau (or, Vigenère square) (Illustration 2) is a 26x26 array, the first row of which is the alphabet beginning with A and ending with Z. Each successive row is a cyclic shift of the preceding row by one letter to the left.

The Vigenère Tableau

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

B C D E F G H I J K L M N O P Q R S T U V W X Y Z A

C D E F G H I J K L M N O P Q R S T U V W X Y Z A B

D E F G H I J K L M N O P Q R S T U V W X Y Z A B C

E F G H I J K L M N O P Q R S T U V W X Y Z A B C D

F G H I J K L M N O P Q R S T U V W X Y Z A B C D E

G H I J K L M N O P Q R S T U V W X Y Z A B C D E F

H I J K L M N O P Q R S T U V W X Y Z A B C D E F G

I J K L M N O P Q R S T U V W X Y Z A B C D E F G H

J K L M N O P Q R S T U V W X Y Z A B C D E F G H I

K L M N O P Q R S T U V W X Y Z A B C D E F G H I J

L M N O P Q R S T U V W X Y Z A B C D E F G H I J K

M N O P Q R S T U V W X Y Z A B C D E F G H I J K L

N O P Q R S T U V W X Y Z A B C D E F G H I J K L M

O P Q R S T U V W X Y Z A B C D E F G H I J K L M N

P Q R S T U V W X Y Z A B C D E F G H I J K L M N O

Q R S T U V W X Y Z A B C D E F G H I J K L M N O P

R S T U V W X Y Z A B C D E F G H I J K L M N O P Q

S T U V W X Y Z A B C D E F G H I J K L M N O P Q R

T U V W X Y Z A B C D E F G H I J K L M N O P Q R S

U V W X Y Z A B C D E F G H I J K L M N O P Q R S T

V W X Y Z A B C D E F G H I J K L M N O P Q R S T U

W X Y Z A B C D E F G H I J K L M N O P Q R S T U V

X Y Z A B C D E F G H I J K L M N O P Q R S T U V W

Y Z A B C D E F G H I J K L M N O P Q R S T U V W X

Z A B C D E F G H I J K L M N O P Q R S T U V W X Y

[ Illustration 2 ]

To encrypt a message of plain text, the key word is written, letter by letter, above each of the letters in the plain text until each letter of the plain text has a ‘key letter’ above it. Now using the Vigenère tableau, locate the plain text letter on the top row of the tableau and the key letter in the left column and trace into the table. The letter you get is the cipher text letter, the encrypted form of the plain text letter. This process is repeated until the whole message is encrypted.

As an example, to encrypt the name John Wilkes Booth with the key word BUFFY, we first write: B U F F Y B U F F Y B U F F Y J O H N W I L K E S B O O T H

and then use the Vigenère tableau to create the cipher message KIMS UJFPJQ CITYF. (An interesting fact, a Vigenère tableau was found in John Wilkes Booth’s room at the National Hotel and was used at the trial of the eight sympathizers charged with conspiracy to assassinate President Lincoln.) In our example, the key (BUFFY) is secret, so anyone intercepting our cipher message (KIMS UJFPJQ CITYF) would not have both required components for decryption.

To decrypt a Vigenère message, knowing the key, write the key word, again, letter by letter above each character (now a cipher character). Use the tableau and the key letter to trace into the tableau from the left-side to find the cipher character. Then trace upwards to the top row and the plain text letter is identified.

If Lot 79 in the Sotheby’s sale was encrypted by the Vigenère scheme and Kahn’s suggestion that only three key words were used by the Confederacy was correct, testing each of MANCHESTER BLUFF, COMPLETE VICTORY and COME RETRIBUTION should lead to a proper decryption of the message. So I wrote the Lot 79 cipher (Illustration 1) on graph paper on a width of fifteen – as all three of these key words have length fifteen characters. I did this twice, once with the mysterious Z on line three present and again without it. Since the cipher text had two one-letter words, Z and U, I naturally surmised that each of these must correspond to either a plain

Page 7: HOW I BROKE THE CONFEDERATE CODE (137 YEARS TOO LATE)

May 1, 2006 How I Broke the Confederate Code (137 Years Too Late) [v1.1] Page 4 of 5

31 east 32nd street, sixth floor | new york city, new york 10016-5509 usa | tel: 212.744.6900 | fax: 212.744.6344 | www.razorpointsecurity.com | [email protected] Copyright © 2006 Razorpoint Security Technologies, Inc. All Rights Reserved.

text A or a plain text I. The tableau indicates that if a cipher Z is the encrypted form of a plain text A, there must be a Z in the key word. But none of the three keys had Z’s so this was not the case, I thought. If the cipher Z came from a plain text I, there would be a corresponding R in the key. This was a possibility. So if the spaces in Lot 79 were indeed indicative of breaks between words, the cipher word Z was followed by the cipher word WIAWE. Our line of reasoning would then suggest that one of the key fragments BLUFF, YCOMP, ETRIB and IBUTI (that is, those parts following the R’s) should yield a decrypt of WIAWE. And trying this out, the four decrypts are, respectively, VXGRZ, YGMKP, SPJOD and OHGDW. These are not good, they’re pretty clearly not English words or even nearly English words.

The key for Lot 79, I realized, was not one of the three known Confederate keys.

A New Key.Throughout my investigation, I was conscious of the fact that there were likely to be garbles – sporadic errors – that had been accidentally introduced into the cipher (noted with square brackets to signify the plaintext letter to which a cipher letter should have decoded). If the true key length had been fifteen or a divisor of fifteen, the cipher characters in each column on my graph paper would have been generated from a single row in the Vigenère tableau. That is, each of my fifteen columns would have followed a mono-alphabetic substitution scheme. So I considered these columns of about eighteen characters each. The letter frequency counts were, again, very flat with no apparent disposition towards causality. They seemed “too random.” This new key, I thought, did not have a length of fifteen, or five, or three.

I then could have tried variable widths in order to gauge the key word length but there was only a small amount of cipher. Setting aside my awareness that the true (key) length might not shine prominently because of this, I was also growing weary of recopying the cipher on graph paper.

Since I stubbornly refused the aid of a computer (since no such device had been available to Tinker or Chandler or Bates), I decided that it was a good time to make use of an apparent signature in the cipher, the initials QYJ.

Once a method of encryption has been identified, the most potent weapon in a cryptanalyst’s arsenal is to exploit mistakes and lapses. The last part of the Lot 79 letter was not encrypted and was signed EKS, for Edmund Kirby-Smith. Perhaps this signature QYJ, then, was the encrypted form of EKS. If this was the case, the Vigenère tableau provided the related key fragment MOR.

And then I made a small leap of faith, an educated guess based upon the observation that I had made on the train ride home, that some of these “long” words in the cipher were names and locations. Since Louisville was mentioned prominently in the plain text part of the Lot 79 letter, so might it be in the cipher section too. The first ten-letter word in the cipher was EWGWJMJLWX. If this was the encrypted form of Louisville, I determined, using the Vigenère tableau, that the associated key must be TIMORRBALT. So I shifted IMOREBALT (making an educated correction because I certainly recognized the city in which I lived) ahead nine spaces, and nine spaces again, and tried to decrypt the next part of the cipher – to see if English would come out. And out came AND COVINGTON THEY AR.

So BALTIMORE it was. BALTIMORE is a new Confederate key word, a fourth key.

But why Baltimore? Perhaps the key choice was a portent of what would occur near Baltimore only three days later on the 17th of September, the Battle of Antietam (also known as the Battle of Sharpsburg).

The Message, 137 Years Too Late.With a Vigenère tableau to the left of me and graph paper and colored pencils in front of me, the drama unfolded after 137 years. The first two words in the cipher, UHP XVQAP became THE ENEMY and my excitement was suddenly tempered with a sense of the scope and intimacy of history.

Kirby-Smith’s missive, Lot 79, was probably sent to General Bragg who was headed towards Munfordville to join forces before marching to Louisville. The Union troops had set up a garrison at Munfordville and had repelled the initial Confederate attack. The siege lasted from the 14th until the 17th of September. Bragg’s troops arrived on the 15th and 16th and surrounded the garrison. On the 17th, they appeared and overwhelmed the Union:

THE ENEMY RAPIDLY CONCEN[T]RATING AT LOUIS[V]ILLE AND COVINGTON. THEY ARE CONFI[D]E[N]T OF SOON CRUSHING MY FORCE HERE IT IS IMPORTANT OUR COMMUNICATION WITH EACH OTHER S[H]OULD BE KEPT OPE[N] I SHALL PRES[E]NT A BOL[D] FRONT IN ORDER TO DECEIVE THE ENEMY AS LONG AS POSSIBLE AND WHEN COMPELLED I SHALL FALL BACK UPON YOU. MARSHALL IS STILL FAR BEHI[N]D. E.K.S.

Page 8: HOW I BROKE THE CONFEDERATE CODE (137 YEARS TOO LATE)

About the Author.Dr. Kent D. Boklan is, by trade, a cryptologist and a mathematician. He designed his first public key cryptosystem in 1988 and worked for the National Security Agency (1996-1999) where he was professionalized as a Cryptologic Mathematician. Dr. Boklan received his SB from MIT and Ph.D. from the University of Michigan. He has published more than twenty research papers in cryptography and cryptanalysis (most CLASSIFIED), on mining large data sets and in analytic number theory.

Dr. Boklan lived in Iceland for 3 years where he spent much of his time correcting errors and identifying inversions in the sequencing of the human genome. He has taught at several schools including Vanderbilt University, MIT, the University of Michigan, the University of Iceland, NYU, and presently at Queens College and the CUNY Graduate Center. Dr. Boklan has provided cryptographic consulting services to several New York City agencies including the NYC Police Department, the Mayor’s Office of Contract Services, the Department of Sanitation, the Department of Environmental Protection and DoITT.

Currently, Dr. Boklan serves as Director of Security Research for Razorpoint Security Technologies where he focuses on extending Razorpoint’s capabilities in the research and analysis of security vulnerabilities. His work directly enhances Razorpoint’s ability to ascertain vulnerability levels and develop specialized and effective initiatives for securing business environments.

About Razorpoint Security.Razorpoint Security Technologies, Inc. specializes in researching and analyzing security vulnerabilities and conducting comprehensive security assessments. These assessments provide business leaders and corporate clients the necessary security services and solutions that help keep corporate networks secure. Razorpoint Security has exceptional expertise in network security, attack/penetration testing and identifying security vulnerabilities especially as they relate to Internet solutions and web applications. Razorpoint offers all sectors of business the services necessary to maintain a firm grasp on the evolving state of network security.

For more information, Razorpoint Security Technologies, Inc. can be reached at their headquarters at Madison Avenue and 32nd Street in New York City.

Razorpoint Security Technologies, Inc.31 East 32nd StreetSixth FloorNew York City, NY 10016-5509t: 212.744.6900f: 212.744.6344e: [email protected]: www.razorpointsecurity.com

May 1, 2006 How I Broke the Confederate Code (137 Years Too Late) [v1.1] Page 5 of 5

31 east 32nd street, sixth floor | new york city, new york 10016-5509 usa | tel: 212.744.6900 | fax: 212.744.6344 | www.razorpointsecurity.com | [email protected] Copyright © 2006 Razorpoint Security Technologies, Inc. All Rights Reserved.