7/30/2019 Index Coincidence
1/101
IA CRYPTOGRAPHIC SERIES
THE INDEX OF COINCIDENCE AND ITS APPLICATIONS
IN CRYPTANALYSIS
by William F. Friedman
From Aegean Park Press
1987 AEGEAN PARK PRESS
ISBN: 0-89412-137-5 (soft cover)
ISBN: 0-89412-138-3 (library bound)
AEGEAN PARK PRESS
P.O. Box 2837
Laguna Hills, California 92654
(714) 586-8811
Manufactured in the United States of America
TABLE OF CONTENTS
Introduction
Part I
The Vogel Quintuple Disk
Part II
The Schneider Cipher
A Special Problem
THE INDEX OF COINCIDENCE AND ITS APPLICATIONS IN CRYPTANALYSIS
INTRODUCTION
Frequency tables in the analysis and solution of ciphers have commonly been employed to make assumptions of plain-text equivalents for the cipher letters constituting a message. The significance of the various phases of the curves themselves, i.e., the crests and troughs and their relative positions in such frequency tables, has been recognized to some extent, but largely only in connection with the determination of two more or less preliminary points in their analysis: (1) whether the frequency distribution approximates that of a substitution cipher involving only one alphabet or more than one alphabet; (2) whether this approximation corresponds to that of a standard alphabet, direct or reversed, or that of a mixed alphabet.
It will be shown in this paper that the frequency tables of certain types of ciphers have definite characteristics of a mathematical or rather statistical nature, approaching more or less closely those of ordinary statistical curves. These characteristics may be used in the solution of such ciphers to the exclusion of any analysis of the frequencies of individual letters constituting the tables or curves, and without any assumptions whatever of plain-text values for the cipher letters.
It is true that cipher systems admitting of such treatment are not very commonly encountered. But inasmuch as such systems are always of a complex nature, which the ordinary methods of cryptanalysis would find rather baffling, a description of a purely mathematical analysis that may be applied to other cases similar to the ones herein described may be considered valuable. In fact, it is possible that the principles to be set forth may find considerably wider application in other phases of cryptanalysis than is apparent at this time.
Two examples of such a treatment will be given in detail: one dealing with a substitution cipher wherein a series of messages employing as many as 125 random mixed secondary alphabets can be solved without assuming a plain-text value for a single cipher letter; the other a multiple alphabet, combined substitution-transposition cipher, solved from a single message of fair length.
PART I
THE VOGEL QUINTUPLE DISK CIPHER ¹
This cipher system involves the use of five superimposed disks bearing dissimilar random mixed alphabets. These disks are mounted upon a circular base plate, the periphery of which is divided into 26 segments; one of these is marked "Plain", indicating the segment in line with which the successive letters of the plain text as found on the five revolving disks are to be brought for encipherment. The remaining 25 segments of the base plate bear the numbers from 1 to 25 in a mixed sequence, which we have called the "Numerical key." This key may, however, consist of less than 25 numbers, in which case one or more of the segments of the base plate will remain blank. The numbers constituting the key are written on the base plate in a clockwise direction beginning immediately at the right of the plain segment (fig. 1).
METHOD OF ENCIPHERMENT
In the accompanying example illustrating the details of encipherment it will be seen that the numerical key consists of 21 numbers, leaving blank, therefore, the four segments immediately preceding the plain segment.² Assuming a series of messages, let us suppose the first three begin as follows:
1. Prepare for bombardment at Harvey . . .
2. Enemy attack on Hunterstown . . .
3. Second Field Artillery Brigade . . .
Revolving the five cipher disks successively, and thus bringing the first 5 letters of message 1, PREPA, in line with the plain segment, reading from the outer disk inward in the order 1-2-3-4-5, the cipher letters for this first set of 5 plain-text letters are then taken in the same order from the segments of the disks directly in line with that segment of the base plate that bears the number 1. In this case it is the eighteenth segment after the plain, in a clockwise direction, and, as shown in figure 1, the equivalent cipher letters for this group are MEKJR. The second set of five plain-text letters of message 1, REFOR, are then in a similar manner set in line under the plain segment, and their equivalent cipher letters are taken from the segment immediately following segment 1 of the numerical key, in a clockwise direction, viz, segment 13. The cipher letters in this case are VZQWH. The third group of letters in message 1 finds its cipher equivalents at segment 18; the fourth, at segment 8. The fifth group of plain-text letters, however, will take its cipher equivalents from the first segment to the right of the plain segment, inasmuch as the segments immediately following segment 8 are blank. Plain-text letters TATHA, therefore, will be enciphered on segment 9, becoming XONJE. This method is continued in like manner throughout message 1. If message 1 contains more than 21 groups of 5 letters, the twenty-second group will take its cipher equivalents on segment 1 again; the twenty-third on segment 13, and so on.
¹ While on duty in the Code and Cipher Section of the Intelligence Division of the General Staff, G.H.Q., A.E.F., Lt. Col. F. Moorman, Chief of Section, turned over to the writer for study a cipher system together with a series of 26 test messages submitted by Mr. E. J. Vogel, Chief Clerk, who had taken considerable interest in cryptography and had, as a result of his studies, devised the system presented for examination. The writer worked upon the cipher during his leisure moments, but the problem involved considerable labor and solution was not completed before being relieved from duty at that station. The main principles for solution, however, were established and only the detailed work remained to be completed. After an interval of more than a year, while Director of the Cipher Department of the Riverbank Laboratories, Geneva, Ill., the writer turned his attention once more to this cipher and succeeded in completely solving the problem by carrying out those principles to their logical conclusion.
² It is recommended that the reader prepare a duplicate of the set of disks in order that he may more readily follow the various steps in the analysis.
FIGURE 1
Proceeding now to message 2, the first 5 plain-text letters, ENEMY, are set up under the plain segment and the cipher letters are taken from segment 2, becoming LTVBM. The second group of 5 letters is enciphered on segment 1, the third group on segment 13, and so on, throughout the message. The first 5 letters of message 3, SECON, are enciphered on segment 3, becoming YAPAC. The first group of cipher letters in a message is always to be taken from the segment bearing the number which coincides with the serial or accession number of the message in the day's activity.
It will be seen, therefore, that no matter what repetitions occur in the plain-text beginnings of messages, the cipher letters will give no evidences of such repetitions, for each message has a different starting point, determined, as said before, by the serial number of the message in the day's activity. This automatic prevention of initial repetitions in the cipher text is true, however, only of a number of messages equal to the length of the numerical key; for, with a number of messages greater than the length of the key, the initial segments from which the cipher letters are to be taken begin to repeat. In this case, message 22 would necessarily have the same initial point as message 1; message 23, the same as message 2, and so on. Messages 43, 64, 85, . . . , would all begin with segment 1; messages 44, 65, 86, . . . with segment 2, and so on.
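The wrap-around of initial segments can be sketched in a few lines of Python. This is only an illustration of the rule just stated; the function name initial_segment is ours, and the sketch assumes that serial numbers beyond the key length simply start the key over, as the examples in the text indicate.

```python
def initial_segment(serial_number: int, key_length: int) -> int:
    """Number of the key segment on which a message's first group is enciphered.

    Assumption: message k of the day starts on the segment numbered k, and
    the numbering starts over once the serial number exceeds the key length.
    """
    return (serial_number - 1) % key_length + 1


# With the 21-number key of the example, these messages share starting points.
for serial in (1, 22, 43, 64, 85, 2, 23, 44):
    print(serial, initial_segment(serial, 21))
```

Any two messages whose serial numbers differ by a multiple of the key length therefore begin at the same point of the cycle.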
The secrecy of the system is dependent upon a frequent change of alphabets and a still more frequent change of numerical key, since it must be assumed, as with all cipher systems or devices, that sooner or later the general method of encipherment will become known to the enemy. The only reliance, therefore, for the safety of the messages must be placed in keeping the specific alphabets and the numerical key for a given series of messages from the enemy.
PRINCIPLES OF SOLUTION
The following messages ¹ are assumed to have been intercepted within one day and therefore to be in the same alphabets and key:
MESSAGE No. 1
MLVXK QNXVD GIRIE IMNEE FEXVP HPVZR UKSEK MVQCI VXSFW
GVART YBZKJ WVUPV XZCBD BDOLS GHINZ LJCTE KSLPY VPBYD
WTRJK BDDFA ANJXE XGHED ERYVP YPWDJ DFTJV ZHTWB WXTMF
OZDOJ
MESSAGE No. 2
ULJCY GXAEU DTEIL UZBRW GJZSS QLUOX PTFOO NWSHD BPTJO
HQRYY YAXRZ KTEMP UAYMK ISRDZ VUVKW HXAYD YAGSM CURBZ
LBXOV EBBPI BMLCB UMAXF ZSLXV QFXUE MPZMK MQZZT KMURW
EJVB
MESSAGE No. 3
YLMKW CBGSF VGABP HOZFV QQNSQ NQLQL DIGXM XCWAI QFJOQ
TYDEL MBMJB SEPSO DHREM ELKIP KXNMW QYBIH BHFDC GLYWC
YGMMP EEXZH UBBSB SBONG URQKW YRAYU NYUCS LNEMV VNSXN
WGVME MXPDF WGTZE KRLGU ZJFZJ W
MESSAGE No. 4
UFHUJ LTMKJ PONFG RIUGG OZGWS UBNMW WGILB JNXTD BPREX
MMWHB OBFVO TFGSJ SLXEH RTZMI LLUUX FIFWC PGSBA KRCAS
XWKQV SLDKS NTESD QVQBN RZDMB JQLYH LTMXB BSVWL ILKYU
NMFEB HB
MESSAGE No. 5
VJZCQ KZJJK DCVQD KSSXY TSUXE GRROH PXKZF ZKFMS VGDWU
XLTBW EXHEF AWQWF ZESMX GWCEM JPNVB GRJGB IBWOH YMOAP
IZYNX GXMSB OZZGK FVURN NJFGQ DTPLV STOID DWVLR TXTBH
WNWIE YJXXW BKOJQ FOUHO T
¹ These are the actual messages submitted by Mr. Vogel.
MESSAGE No. 6
XGORF GCHAX DUEIQ XAOWK BKBUH SKWWW XNEYZIWTVU NPCLU KQDIG YLMNC KYJJF SDSTU DCBUKVGQSW OGHPI XCYIO SUAEU BQAPY RMDMW FWGSQLNGQT IPJRI HUMAD DZUTW BFMO
MESSAGE No. 7UFUCL HJYDY ZTHHA NWJWJ IFMZI VZIJE QODUTDKXXR TOSVL MNHNO RZRTX QIPFN HOUNL FGUVORTDCO LNTTM MXMXR TUIIG ZOJCU BTXJK KGUEJIKGSL APA MESSAGE No. 8NWWVY UMHVB RPQHF XOHQN IPATI CFMZT DIMQIVTEGJ IAFEO BMUUT VPSKO HUYNA VPRXS SCUZBGXHRP OIHWF LBTKF QESIO YXVNK AWDAA EQLVJKLXHR NCVNY XSQMC XVXJC TJVSC UJEGZ TFONC
MESSAGE No. 9NYMPE YZYRZ AWLPP IMPBB VWAXZ QSVFG ITZMTNNCMP SOJKI GDPQZ IIPMV ZOCBO UCKXR SEPEMNDEUH YUEIP RFOHI QLQHG IJFRT UTNQC JEGAFYKXXQ VRQUW MESSAGE No. 10TEPDA XXHHC FYMFK QRBJV YVJID JBNXF JBLXUUFRXL WELQJ QJKFW RLSSF BQJWR BZKYN EAUWPCQRVW ZVXXH NZHCW SVEYH NEANW G
MESSAGE No. 1 1SEYBZ MGSOZ CMPSQ BASFH VFSCG CHKSB ZOPRZLGRXO XXCKK MVQGJ XYUOC FFFVJ OZCJE AQSJYKEXYA VWONX GTCWT TGLOI IWORT JJVQE HYNK
MESSAGE No. 12OMTXX WUXZE YEHOJ ALJCO EPLPJ RBCVX WARWXEUJG ROHUP OFSGQ PONLW RAIBP KACIB GMYSBWKZTE KZYIZ ZXFJO HZIVG ESGOX YYCEO B
MESSAGENo. 13QNKIT FZHNC GJBSA JIQBX PFTAO RJLUD IKPKILJUFZ UBSMD VNMNZ UWVYQ DJPFY KETMN CLEQSJBCPG RNNNH BSYZR STGBV E
JAKUK BEQMGOQUVA WHSXEHYMRQ LPUVN
MZPRX SWNFCTPOWI VHNCQMSJBS ESVWJ
IRUJI NSBWRJHCDE WUHJVMYYTH GSEYAHNPM
KZNFC DHSUAOTZNM AMHWXSQRWO DAJMT
KSOXR KNHJKAYSKO UWJDX
CZEVH OCADCWZCMO TOXVM
DYBDQ FTZDAVKHEU XSPFG
TNUWU EVGHUDQXNA GEYHC
ZJRKK CLYZK RINOKYVWFL DXZSK ZPWNIKKGEF FJWWR ZXROXZMKMI D
OSULC WZAFW, SQAUCQQDCX CBIMT DGIASUUHTF FOQOK KNRPIXDCAS BEHQV OPWEY
VHXVD RSRYI PVNWQRRQDM STTAL GKPPODIPSG ZUFVH OZIGB
UMMOL HVVEQ GWTJIRAABX ZCVWR UUJEGCJEDW AKHZA SDZIE
MESSAGENo. 14NBTAK NOBBI WRZRX DQTAR AEKMY MOXLTJUULI TPUJR TPQGH RJZCQ XCNHU NOKMIPZUGG HMYNB DUHQZ BJDFA FJAYU RKHKB
MESSAGE No. 15
WPRBI WCFUO JKQHYBAQRK RUHHF JYUJGPLLXQ DRDEM JMCFBUTISE NZFCU BXI
MESSAGE No. 16QEDMJ LTKFN RMDGTPZTCU BNCEE HAWMBTQXND HMHEW VHTLT
MESSAGE No. 17PQTNS AEFZH TOFQHNMJHQ CHZUM TTSIUHIHWG LBUFZ DVPUB
OIWVX KSMSX MBXOZDYNMA CWULT PKWEEWLHVX FIFZR STVJV
DNMBM JDGVP KJMNDKRGNU ZXCWQ VEZTC
OOXNQ NPDDT QVSJR
GHELW HUMFH LZHNHDCMXO NAYEY GQHID
MESSAGE No. 18
LALNH QUDUA ZBZUD VSJFE MEHXW EUWZT OKWNO OOSIL TASEG
OVQVX PMKKW BQRBI VGDGJ JDAHW RZDJA WQAXB FBRLA AHJEP
PUMEU HJQJP ZSPPQ VZWDL HECDA LPJJS ZOJYB MO
MESSAGE No. 19
YTVUL TWEVD MBHKV IHTPI GNXBQ XAUAQ OUFVO GSMKB BAKIG
YRNAF BKIJC ZJSRN WBQHM UYJPT GHCCB RLNVH OLDQA ZZDCV
UWMNZ OPRFG RDONY RCZAM ZYNVQ WFONZ CITES IRWER GKETA
YUSUK TFECD BMQVB VBWVV VWCZP TWCTJ FHFEH VNDCO MZVLK
YUJPZ BHLDY VVPMD KFHPB VCYU
MESSAGE No. 20
OBHOK UWRON AJDFH FRQMI ULOTG XXIEV HMAKV PVMAV OITKD
LDQIW UYVWI JEJRQ MCUZP KGUDN QSPBF TQVZP IZJTU RBUFZ
JUSIZ DCIGX QESJD LIZVM AOPME YNEXI HOXJK KUYHK AORUY
LVD
MESSAGE No. 21
OMXOW LTWEW KFJHN ZMSKK GXFDL YLGPT YYYNQ CYYMA ZWRAD
HGWBI HHCGM FDGMN XIQDN NPLYQ NPJNZ OWFVV KMGVH KHCUC
STNVZ CGXHV LZZXX SVTKG POEAC OJYQU MEULH KHYDD ETPDN
QYZVU HIMGG RBHEW CUSO
MESSAGE No.SFDFMYJOUN
CXEUFJJMIV
AQMQWAEGQG
UKCTBJFCEU
FFONYHTZPB
BETEX
QFPMO
GVFZHYRNEIZSLRK
TFSOQ
JTPSY
BWIVZQFXUVXKESC
MESSAGE No.LHKGPXUWVELCSMIDZAUO
KUKNYYZAJE
YVSWWW
WJROYNFDEX
DQNAP
XZHPK
GDZGA
KFPAO
JJIHUJBZET
FIQTRMESSAGE No.
OMOOK
YJODO
WFKKZ
OKWHRXIRIXPXXZL
MPXQIKDCBX
XFRWL
PIQMNUEQTUFHTEG
ALWNKVPIWAINNJI
MESSAGE No.FPETJ
YXDKRGWILMFJYAZ
CDVNYGWMQLFGCVS
YQNVO
LMKQUFQMEJNCZGB
WSMSS
CDALXKOBMSHUIRTCCMFY
FIYGRZURANXUWAIKVKRA
MESSAGE No.
XFYKUVDCYOZAECE
TRJTG
VWRIHESOWP
DLZCS
HQWVZYQXGD
DJBCW
VEASF
XRLCH
NDBFZGYZLIXQDYBPZJCMMPMHGMJQECEMVZLXZUAB
SECCE
IHSAMTZCWL
IOKZO
KNDMFTRURCFDBPB
LSFXW
DURHGGGJBFGAMFQGLHMHCIJTIRHCMJAUHKAEPLQU
GBVIJGBIWLSEBMAANHJOIJKIRSGMFCULGXCZFTIQBUILZRAQBJ
MSTZZ
XJZKI
TKNDA
VEKYGSRUEUUZHRQUFRNA
TQRGTWIZWKOTJEP
ZUAPV
IWORYDDQYB
JNSSI
22
IAONRJNPXG
CLRILCJCFI23
LUVBUXYHALKQQIE24
HZIKQNFWOH
ETVUP25
HQMIPULZFE
DGAMLY
26
LGPKS
YOOHV
RUWKEULXFUAPKQPJODEE
IIZYYVZXCL
QRYXB
OISIERCMUSHCHXX
TWSZJSXXSZ
LHIGGBN
ZRUDA
DGECA
HHYCA
XQNUYGWXXYQHTJY
FAONRYDLOT
QBTFK
CWPSK
IAUNO
RIYITXSINCKAXUF
XHRXEHTAVV
YFYOE
VHMWLRESKE
UXWQX
ZMZKW
OJUGA
HVRUCFVDSQ
XXHRFCDPYMGHEBF
GQZNG
EKBMG
OP
QCAVJ
UZMJMVYIGT
BSNWHLPSJJ
BDWVG
MVKNTCACVC
KIOZGOTTTO
LSIJULPXGC
PAUJD
WDPDB
PVUGD
HOGXWQEHQRRJCSH
MOAFT
BTMYKPFTYF
DFTFO
MGETD
MFIACSETNP
FLUVK
LTTXNUTNIEMZNGB
ZWXXQ
LNICDDLRZVRFTHL
SYBMC
RLETISMQVBBYGMQ
QCVQ
It may be of advantage to begin the elucidation of the principles of solution by translating this cipher into terms of the sliding of primary alphabets against one another with the consequent production of a multiplicity of secondary alphabets. For example, by using ordinary sliding alphabets such as are commonly used in cryptanalysis, we may produce the same results as are given by the set of concentric disks. Let us use the alphabets of the illustrative disks, mounted upon sliding strips in pairs, and let us slide each pair of alphabets 8 letters apart. Thus, if we consider the upper one of each pair of alphabets in figure 2 as the plain-text alphabet and begin each alphabet arbitrarily with the letter A, we have the following:
FIGURE 2
1. Plain text:  A U F Q Z E R H Y G W J O I D M N C T X S B L P K V A U F Q Z E R H Y G W J O I D
   Cipher:                      A U F Q Z E R H Y G W J O I D M N C T X S B L P K V
2. Plain text:  A O X G F Z M Y L P U K E T D J V S W I R B N H C Q A O X G F Z M Y L P U K E T D
   Cipher:                      A O X G F Z M Y L P U K E T D J V S W I R B N H C Q
3. Plain text:  A W J B Q I H V K P U F O G T N E D S Z X C M L R Y A W J B Q I H V K P U F O G T
   Cipher:                      A W J B Q I H V K P U F O G T N E D S Z X C M L R Y
4. Plain text:  A C V X Z W M T F N U I O S E H J R D L G K Q B P Y A C V X Z W M T F N U I O S E
   Cipher:                      A C V X Z W M T F N U I O S E H J R D L G K Q B P Y
5. Plain text:  A E T J U D M V Z N W H X O G Y K F R L Q B P C I S A E T J U D M V Z N W H X O G
   Cipher:                      A E T J U D M V Z N W H X O G Y K F R L Q B P C I S
Note now that the first set of 5 plain-text letters, PREPA, yields the same set of 5 cipher letters, MEKJR, that we found on page 2 by using the disks. The only thing which these five pairs of independent sliding alphabets have in common in figure 2 is the fact that each pair has been slid apart the same number of letters, viz, 8; if we consider the upper alphabet in each pair as the stationary alphabet, then the lower one has been shifted 8 intervals to the right, or 18 intervals to the left, of the upper alphabet. This corresponds to the position of number 1 in figure 1, for the latter number occupies the eighteenth segment to the right of the plain segment, or the eighth to the left. The relative positions of the numbers in the numerical key, therefore, correspond to the numbers of intervals the primary alphabets in the form of sliding strips would have to be displaced in order to produce the same results as the disks.
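The equivalence between disks and sliding strips can be checked with a short Python sketch. The five alphabets are those of figure 2; the function name encipher_group and the convention of expressing a segment by its clockwise distance from the plain segment are our own assumptions for illustration, not part of Vogel's description.

```python
# The five primary alphabets of the illustrative disks (figure 2), outer disk first.
DISKS = [
    "AUFQZERHYGWJOIDMNCTXSBLPKV",
    "AOXGFZMYLPUKETDJVSWIRBNHCQ",
    "AWJBQIHVKPUFOGTNEDSZXCMLRY",
    "ACVXZWMTFNUIOSEHJRDLGKQBPY",
    "AETJUDMVZNWHXOGYKFRLQBPCIS",
]


def encipher_group(group: str, segments_clockwise: int) -> str:
    """Encipher one 5-letter group, one letter per disk.

    segments_clockwise is the distance of the reading segment from the plain
    segment, counted clockwise (18 for the segment numbered 1 in the example).
    A distance of 18 is the same as sliding the lower strip 8 to the right.
    """
    cipher = []
    for letter, alphabet in zip(group, DISKS):
        position = alphabet.index(letter)
        cipher.append(alphabet[(position + segments_clockwise) % 26])
    return "".join(cipher)


print(encipher_group("PREPA", 18))  # segment 1, eighteenth after the plain
print(encipher_group("REFOR", 19))  # segment 13, the next one clockwise
print(encipher_group("TATHA", 1))   # segment 9, first to the right of the plain
```

Under these assumptions the three groups come out as MEKJR, VZQWH, and XONJE, agreeing with the cipher groups given in the method of encipherment.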
Now the sliding against itself of a primary sequence containing 26 letters will give rise to a series of 25 secondary cipher alphabets;¹ likewise, each primary concentric sequence will give rise to a series of 25 secondary alphabets. If the numerical key consists of 25 numbers, all these secondary alphabets will be employed; if it consists of less than 25 numbers, then a correspondingly decreased number of secondaries will be employed.
Since each primary sequence can give rise to a set of 25 secondaries, the total number of possible secondary alphabets in the whole system is 125; but if the numerical key consists of less than 25 numbers, then the total number of secondaries will be less than 125 by exact multiples of 5, since the absence of one or more numbers from the key affects all five primary concentric sequences. For example, if the key consists of 21 numbers, then there will be involved 21 × 5, or 105 secondary alphabets. In a message of exactly 105 letters, then, each letter will be enciphered by a different secondary alphabet. If the message contains more than 105 letters, then all the letters after the 105th will be enciphered by the same secondary alphabets as at the beginning of the message and in the same sequence.
In the explanation of the method of encipherment it was made clear that the substitution proceeds in a regular manner, taking successive groups of 5 letters; the cipher equivalents are taken from the successive segments, proceeding in a clockwise direction from any given initial segment. It follows, therefore, that in a single long message wherein the complete encipherment requires the passing through of this sequence of segments more than one time, there exist periodic or cyclic phenomena of a type found in various ciphers, due to the presence of a definite or regular cycle. In this case, the length of this cycle in terms of groups of 5 letters corresponds exactly with the length of the numerical key; its length in terms of individual letters is five times the length of the key. For the sake of clarity, we shall refer to this cycle when stated in terms of letters as the period. Thus, with a key of 21 numbers, the length of the cycle is 21 groups, and the length of the period is 105 letters. If a message consists of 315 letters, for example, the letters would pass through three complete cycles; the 1st, 106th, and 211th letters would be enciphered in exactly similar positions, and therefore by exactly the same secondary alphabet. The 2d, 107th, and 212th letters would likewise be enciphered by the same secondary alphabet, but of course not the same as the preceding secondary alphabet. With a key of 23 numbers, the length of the cycle is 23 groups, the length of the period, 115 letters; the 1st, 116th, and 231st letters would be enciphered by the same secondary alphabet; the 2d, 117th, and 232d letters by a different secondary, and so on. If we represent the length of the period by n, then the 1st, (n+1)th, (2n+1)th, (3n+1)th, . . . letters fall in the same secondary alphabet; the 2d, (n+2)th, (2n+2)th, (3n+2)th, . . . letters fall in another secondary alphabet; and so on. If a message be longer than the period, therefore, it will follow that the 1st, 2d, 3d, . . . nth secondary alphabets must contain repetitions of cipher letters, representing repetitions of plain-text letters, for these secondary alphabets are after all only single mixed cipher alphabets, and the repetition of high-frequency letters in ordinary plain text is a necessary characteristic of all alphabetical languages. Such repetitions will be evidenced by repetitions in the cipher text at n, 2n, 3n intervals, and they may be used to determine the length of the period. Exactly how this is done will presently be demonstrated.
¹ The twenty-sixth secondary alphabet coincides with the normal alphabet, since each plain-text letter would be represented by itself in that secondary alphabet.
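The relation between key length, period, and the secondary alphabet used for a given letter can be put in a small Python sketch. The function names are ours, and the alphabets are numbered here simply by their order of use within one message, which is an assumption made for illustration.

```python
def period_length(key_length: int) -> int:
    """Length of the period in letters: five letters for each number in the key."""
    return 5 * key_length


def secondary_alphabet(letter_position: int, key_length: int) -> int:
    """Ordinal (1..n) of the secondary alphabet enciphering the given letter.

    letter_position counts letters of one message from 1; alphabets are
    numbered in their order of use from the start of that message.
    """
    n = period_length(key_length)
    return (letter_position - 1) % n + 1


print(period_length(21), period_length(23))               # periods for keys of 21 and 23
print([secondary_alphabet(p, 21) for p in (1, 106, 211)])  # letters sharing one alphabet
print([secondary_alphabet(p, 23) for p in (2, 117, 232)])
```

Letters whose positions differ by a multiple of the period receive the same ordinal, which is the property exploited in determining the period.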
But the determination of the length of the period is only a slight step forward in the analysis. It is true that it will give us the length of the numerical key, but that is all. What we must know next is the sequence of numbers, or rather, the relative positions of the numbers in this key.
We may ascertain this by further scrutiny of the theoretical and actual results of the method of encipherment. It is often the case with various ciphers that the method of encipherment is excellent in principle, and will yield practically indecipherable messages when the messages are very few in number, but the weaknesses in the method are quickly disclosed when it is used for regular traffic such as that necessary in military cryptography, where many messages are to be sent each day in the same key. In the cipher under examination, the weakness is introduced by the fact that the initial segment for each message of the day's activity is determined by the serial number of the message.¹ Now there are as many initial segments for each numerical key as there are numbers in that key. Once the starting point is determined, all the messages pass through the same cycle; different messages merely begin at different points in the cycle. Now, since the numbers applying to these starting points constitute the sequence of numbers in the key, the successive initial segments constitute a series or sequence which, when properly reconstructed, will give us the sequence of numbers in the key.
After the numerical key has been reconstructed, we are yet a long way from solution, for we are still confronted by the more complex problem of reconstructing, or solving, the cipher alphabets.
We have so far analyzed the solution of the problem into the following three steps or phases:
1. The determination of the length of the period.
2. The reconstruction of the numerical key.
3. The reconstruction of the cipher alphabets.
Let us proceed, therefore, to perform each step.
1. The determination of the length of the period. It was explained above how the cipher system will result in the production of repetitions in the cipher text at definite intervals dependent upon the length of the period. The first, (n+1)th, (2n+1)th, . . . letters fall in the same secondary alphabet; the second, (n+2)th, (2n+2)th, . . . letters fall in another secondary alphabet; and so on. If there are repetitions in the plain text at n intervals apart, there will be corresponding repetitions in the cipher text. There would be involved here only a slightly modified case of the ordinary process of factoring the intervals between repetitions in the cipher text, as applied in the solution of typical periodic multiple-alphabet ciphers. Thus, in this case, if it happens that the first, second, and third letters of a message, and also the (n+1)th, (n+2)th, and (n+3)th, are the letters THE, then there must be a repetition of the initial trigraph of the cipher text, representing THE, at a distance of n letters. But in a cipher involving so many alphabets as this one, the repetition of trigraphs and polygraphs would naturally be rather infrequent, except in a very long message.
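The ordinary process of listing the intervals between repeated trigraphs, referred to above, might be sketched as follows in Python. The function name trigraph_intervals and the toy text are ours; the sketch only collects the intervals and leaves their factoring to the analyst.

```python
from collections import defaultdict


def trigraph_intervals(cipher_text: str) -> dict:
    """Map each repeated trigraph to the intervals between its successive occurrences."""
    letters = "".join(ch for ch in cipher_text.upper() if ch.isalpha())
    positions = defaultdict(list)
    for i in range(len(letters) - 2):
        positions[letters[i:i + 3]].append(i)
    return {
        trigraph: [later - earlier for earlier, later in zip(where, where[1:])]
        for trigraph, where in positions.items()
        if len(where) > 1
    }


# Toy text (not one of the Vogel messages): XYZ recurs after an interval of 7.
print(trigraph_intervals("XYZAB CDXYZ"))
```

In a cipher of this kind an interval arising from a true recurrence would be a multiple of the period, so that with a key of 21 numbers one would look for intervals divisible by 105.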
However, the paucity of trigraphs and polygraphs, and even of digraphs, need not prove to be a great obstacle, for the repetitions of individual letters may be used with great accuracy for the same purpose, viz, the determination of the length of the period. The method is based upon the construction of what we have called a "Table of coincidence", which will show us mathematically the most probable length of the period. We may as well use the text of our series of messages to illustrate the process.
¹ However, were the initial segments determined in some other manner, the final results would be the same, and the cipher could be solved by a slight modification of method. Even if the initial segments were subject to no law, the cipher could still be solved by the method hereinafter set forth, with some modifications.
FIGURE 3. Message 26 transcribed upon the assumption of a period of 100 letters, with the number of coincidences tallied beneath each column.
If we assume the numerical key to consist of 20 numbers, then the length of the period would be 100 letters. Let us write the longest message of our series, viz, message 26, in exactly superimposed lines containing 100 letters each, and then make a count of the recurrences, or more accurately, the coincidences,¹ of letters within the individual columns thus formed.
Note the repetition of the letter F in the second column. This fact is indicated by placing a check mark in the tabulation of coincidences. Where a letter appears three times within the same column (B, in column 8), three check marks are recorded, for there we have a coincidence between the first and second, second and third, and first and third occurrences. Where a letter appears four times in the same column, six check marks are recorded. The number of coincidences for each case corresponds to the number of combinations of two things that can be made from a total of n things.²
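The counting rule just described can be sketched in Python as below. The function name count_coincidences is ours, and the sketch assumes the cipher text is written in lines of the assumed period, so that the letters at positions 1, n+1, 2n+1, . . . form the first column, and so on.

```python
from collections import Counter


def count_coincidences(cipher_text: str, period: int) -> int:
    """Total check marks when the text is written in lines of `period` letters.

    A letter occurring k times in one column contributes k(k - 1)/2
    coincidences: 1 for two occurrences, 3 for three, 6 for four.
    """
    letters = [ch for ch in cipher_text.upper() if ch.isalpha()]
    total = 0
    for start in range(period):
        column = letters[start::period]  # every period-th letter
        for occurrences in Counter(column).values():
            total += occurrences * (occurrences - 1) // 2
    return total


# Toy text: columns are (A, A), (B, B), (C, D), giving two coincidences.
print(count_coincidences("ABC ABD", 3))
```

Applied to a message under each assumed period in turn, the function gives the raw totals from which a table of coincidence is built.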
We note that on the assumption of a period of 100 letters there is a total of 39 coincidences. Now, if the period is really 100 letters in length, then the repetitions of letters within columns are not mere coincidences brought about by chance superimposition of identical letters but are actual recurrences in the restricted sense of being the resultants of the encipherment of similar plain-text letters by the same secondary alphabet. But there is no way of determining from this single tabulation whether the assumption of a period of 100 letters is correct or not, and therefore we do not know whether the repetitions in this case are recurrences or coincidences. This we can determine, however, by a comparison of tabulations of coincidences made upon various assumptions of length of period. Theoretically, the correct assumption should yield a higher total of coincidences than the incorrect assumptions, because the recurrence of high-frequency plain-text letters (in English, E, T, O, A, N, I, R, S, H, D) is to be expected, and the number of such causally produced repetitions should certainly be greater than the number of repetitions produced by mere chance in the superimposition.³
Let us proceed, therefore, to make a table of coincidence for the various probable lengths of period, first transcribing message 26 into lines corresponding to hypotheses of 105, 110, 115, 120, and 125 letters.⁴ Before doing so, however, we find it necessary to introduce a few remarks upon the desirability of using a slight correction factor for this table.
¹ We draw the distinction between recurrences and coincidences on the grounds that the former term should, and will here be used to indicate repetitions of letters in the cipher text causally related to each other by being encipherments of identical plain-text letters by identical alphabets; whereas the latter term indicates repetitions not causally related to each other in this manner but simply the result of chance. A repetition may therefore be either a recurrence or a coincidence. The process of factoring in ordinary multiple-alphabet ciphers of the periodic type has for its purpose the separation and classification of repetitions into the two kinds. Until proved otherwise, all repetitions must be considered coincidences.
² The formula is: C = n(n - 1)/2. Thus, when n = 5, the number of coincidences is 10; when n = 6, the number is 15.
³ Since this paper was written, a further study of the concept of coincidences has made it possible to predict, with a fair degree of accuracy, just how many coincidences should be expected for correct and incorrect assumptions. The mathematical and statistical analyses are given in detail in W. F. Friedman, Analysis of a Mechanico-Electrical Cryptograph, Section VI; S. Kullback, Statistical Methods in Cryptanalysis, Section VII (Technical Publications, S. I. S., 1934). It results from these mathematical studies that the ratio of the number of actual coincidences to the total number of possible coincidences is .038 for an incorrect case and .066 for a correct one. This knowledge eliminates the necessity for tabulations corresponding to every possible case and gives a reliable means of determining the correct assumption as soon as it is made. (See Notes 1 and 3 on pages 13 and 14 respectively.)
⁴ The message need not be written out more than once if long strips of cross-section paper are used, writing a line on each strip. Each line should contain 125 letters, and the various strips can then be arranged to bring the proper letters into superimposition according to each hypothesis in turn.
For really accurate comparison, the totals of coincidence obtained for the various hypotheses should be corrected in order to make proper allowance for the differences in totals due solely to the variation in the number of letters in each column when transcribed according to each hypothesis.¹ From a cryptographic point of view, a total of 100 coincidences in an arrangement where there are 6 letters in each column represents a slightly greater degree of coincidence than in an arrangement of the same message also yielding 100 coincidences, where there are 7 letters in most of the columns; there is less opportunity for coincidences to be produced in the former case. We should, therefore, reduce all the totals of coincidence to some common basis. The reasoning we have followed in the establishment of a correction factor to be applied is as follows: Message 26 contains exactly 539 letters. When transcribed into lines of 100, 105, . . . , 125 letters, the columns in each of these five set-ups have the following number of letters:
TABLE I
Period (letters)   Columns
100                39 columns of 6 letters and 61 columns of 5 letters
105                14 columns of 6 letters and 91 columns of 5 letters
110                99 columns of 5 letters and 11 columns of 4 letters
115                79 columns of 5 letters and 36 columns of 4 letters
120                59 columns of 5 letters and 61 columns of 4 letters
125                39 columns of 5 letters and 86 columns of 4 letters
Assuming that perfect coincidence can occur in each column (all letters identical), then in a column of 6 letters we can have 6 × 5/2 = 15 coincidences; in a column of 5 letters, 5 × 4/2 = 10 coincidences; and in a column of 4 letters, 4 × 3/2 = 6 coincidences.
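The counting rule just stated can be sketched in a few lines of Python. This is only an illustration: the function names are our own, and it assumes that a "coincidence" means any pair of identical letters standing in the same column, which is consistent with the figures 15, 10, and 6 given above.

```python
from collections import Counter


def possible_coincidences(n: int) -> int:
    """Greatest number of coincidences in a column of n letters: n(n - 1)/2."""
    return n * (n - 1) // 2


def column_coincidences(column: str) -> int:
    """Number of pairs of identical letters actually present in one column."""
    counts = Counter(column)
    # A letter occurring f times contributes f(f - 1)/2 coinciding pairs.
    return sum(f * (f - 1) // 2 for f in counts.values())
```

A column in which all six letters are identical reaches the maximum of 15, while a column such as ABCAB yields only 2 (one pair of A's and one pair of B's).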
If now we find the total number of chances for coincidences for each of the arrangements given in table I, we have the following:

TABLE II

Period                                                                            Total
(letters)  Conditions                                                             chances
  100      39 columns of 15 chances each, 61 columns of 10 chances each           1,195
  105      14 columns of 15 chances each, 91 columns of 10 chances each           1,120
  110      99 columns of 10 chances each, 11 columns of 6 chances each            1,056
  115      79 columns of 10 chances each, 36 columns of 6 chances each            1,006
  120      59 columns of 10 chances each, 61 columns of 6 chances each              956
  125      39 columns of 10 chances each, 86 columns of 6 chances each              906
Choosing for our basis of comparison the hypothesis of a period of 100 letters, the various proportions of chances for coincidences for each of the remaining hypotheses will constitute correction factors to be applied in each case. They are as follows:

TABLE III

Period      Chances for    Correction
(letters)   coincidence    factor
  100         1,195          1.00
  105         1,120          1.07
  110         1,056          1.13
  115         1,006          1.19
  120           956          1.25
  125           906          1.32
¹ See footnote 3 on the preceding page. This correction factor is unnecessary if the number of actual coincidences is reduced to a percentage basis, in terms of the total possible number of coincidences.
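The arithmetic behind tables I, II, and III can be checked with a short sketch. The function names are hypothetical and the code merely assumes, as the text does, that a message of 539 letters is written out in lines of the trial period, so that the leftover letters lengthen some columns by one.

```python
def column_layout(total_letters: int, period: int) -> dict[int, int]:
    """Map column height -> number of columns of that height (table I)."""
    full_rows, extra = divmod(total_letters, period)
    # `extra` columns receive one letter more than the remaining columns.
    return {full_rows + 1: extra, full_rows: period - extra}


def total_chances(total_letters: int, period: int) -> int:
    """Total possible coincidences over all columns (table II)."""
    layout = column_layout(total_letters, period)
    return sum(cols * n * (n - 1) // 2 for n, cols in layout.items())


def correction_factors(total_letters: int, periods: list[int], base: int) -> dict[int, float]:
    """Chances for the base period divided by chances for each period (table III)."""
    base_chances = total_chances(total_letters, base)
    return {p: round(base_chances / total_chances(total_letters, p), 2) for p in periods}


periods = [100, 105, 110, 115, 120, 125]
factors = correction_factors(539, periods, base=100)
```

With 539 letters and a base period of 100, `factors` reproduces the correction factors of table III.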
We are now ready to establish the tables of coincidence for the various hypotheses. Space forbids the actual demonstration of the several arrangements of message 26 to correspond to the various hypothetical key lengths; that shown in figure 3 is typical of them all. We shall give only the final result in table IV.
TABLE IV

Period      Coincidences on          Correction   Corrected
(letters)   each hypothesis (total)  factor       total
  100              39                  1.00         39.0
  105              45                  1.07         48.2
  110              47                  1.13         53.1
  115              60                  1.19         71.4
  120              36                  1.25         45.0
  125              33                  1.32         43.6
There seems to be no doubt but that the period of 115 letters is correct. The cycle, therefore, consists of 115 ÷ 5 = 23 groups, and the numerical key contains 23 numbers. This means that the two final segments bear no numbers, and are therefore blank segments.
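The same conclusion can be reached without correction factors by reducing each total to a ratio of actual to possible coincidences, as the footnotes suggest. The following is a minimal sketch using the totals of table IV and the chances of table II; the names are our own and nothing beyond those published figures is assumed.

```python
# Totals of coincidence (table IV) and total chances (table II), keyed by period.
ACTUAL = {100: 39, 105: 45, 110: 47, 115: 60, 120: 36, 125: 33}
POSSIBLE = {100: 1195, 105: 1120, 110: 1056, 115: 1006, 120: 956, 125: 906}


def coincidence_ratios(actual: dict[int, int], possible: dict[int, int]) -> dict[int, float]:
    """Ratio of actual to possible coincidences for each trial period."""
    return {period: actual[period] / possible[period] for period in actual}


def most_likely_period(actual: dict[int, int], possible: dict[int, int]) -> int:
    """Trial period whose ratio is greatest."""
    ratios = coincidence_ratios(actual, possible)
    return max(ratios, key=ratios.get)


ratios = coincidence_ratios(ACTUAL, POSSIBLE)
period = most_likely_period(ACTUAL, POSSIBLE)
```

The ratio for 115 letters is the only one that approaches the value of .066 quoted for a correct assumption; the others lie near the .038 quoted for an incorrect one.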
2. The reconstruction of the numerical key. Having ascertained the length of the period, and thus the length of the numerical key, the next step is to reconstruct the sequence of numbers constituting the key. As stated before, this process is made possible in this case by the method of encipherment, which is such that all the messages of the day's activity go through exactly the same cycle, but the successive messages begin at different initial points in this cycle, and these points coincide with the relative positions of the numbers making up the sequence of numbers in the key.
We do not know the absolute position of any numbers in the numerical key, but we may proceed first to find their relative positions, regarding the key in the nature of a continuous cycle or chain. Later we may find the absolute positions of the numbers in this cycle, i.e., we shall have reconstructed the numerical key itself.
Now, as stated before, all messages proceed through the same cycle; it is only the initial points for the messages which are different. Hence, if we can determine the relative positions in which messages 1, 2, and 3 should be superimposed in order to make all three messages coincide as regards the portion of the cycle through which they pass simultaneously, we shall thus have determined the relative positions of the numbers 1, 2, and 3 in the cycle. For example, if we should find that the first group of message 2 belongs under the twelfth group of message 1, and the first group of message 3 under the sixth group of message 2, we would conclude that the relative positions of these numbers in the cycle are these:
Position      1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
Key number    1  .  .  .  .  .  .  .  .  .  .  2  .  .  .  .  3

¹ As a verification of Note 3, page 11 above, the percentages of the actual number of coincidences to the total possible number have been calculated:

Period   Percentage
 100       .033
 105       .040
 110       .044
 115       .060
 120       .038
 125       .036
How are these relative positions for messages 1, 2, and 3 to be determined? Clearly, we may use the same basis for this determination as for that concerned with the length of the period, viz., a table of coincidences. For, when messages 1 and 2 are correctly superimposed we should get a higher degree of coincidence between the letters of the superimposed columns than when they are not correctly superimposed. The reasons are the same as in the preceding case: The successive groups of cipher letters in one message represent encipherments by the same secondary alphabets as apply in the other message; hence, repetitions of plain-text letters within columns will result in repetitions of cipher letters within those columns. We can therefore determine the correct superimposition by experiment and the recording of coincidences.
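The experiment described here, sliding one message against another and recording the coincidences at each trial position, might be sketched as follows. This is an illustration only: the function names are hypothetical, the messages are taken as plain strings of cipher letters without spaces, and a coincidence is counted as any pair of identical letters in the same column, including pairs within the base message (which are the same for every trial and so do not affect the comparison).

```python
from collections import Counter


def coincidences_at_shift(base: str, other: str, shift: int,
                          period: int, group: int = 5) -> int:
    """Coincidences when `other` begins `shift` groups to the right of `base`."""
    columns = [Counter() for _ in range(period)]
    for i, letter in enumerate(base):
        columns[i % period][letter] += 1
    offset = shift * group
    for i, letter in enumerate(other):
        # The second message runs through the same cycle from a later start.
        columns[(offset + i) % period][letter] += 1
    return sum(f * (f - 1) // 2 for col in columns for f in col.values())


def best_shift(base: str, other: str, period: int, group: int = 5) -> int:
    """Trial shift (in groups) giving the greatest number of coincidences."""
    shifts = range(1, period // group)
    return max(shifts, key=lambda s: coincidences_at_shift(base, other, s, period, group))
```

For a period of 115 letters and groups of 5, `best_shift` tries the intervals 1 to 22, which is the range of trials used in the text.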
Having found that the key contains 23 numbers, it is obvious that message 24 has the same starting point in the cycle as message 1; we may proceed at once to combine them by direct superimposition. We do likewise with messages 2 and 25, and 3 and 26. The purpose of this step is merely to afford greater accuracy through the increased number of letters with which we shall have to deal in finding the correct relative superimposition of these three sets of messages.
Since message 26 contains the greatest amount of text, we may regard it as our base and try to find the relative position of messages 1 and 24 with respect to it. We now place messages 1 and 24 beneath message 26, beginning the first group of the former beneath the second group of the latter.¹ A tabulation of the number of coincidences in each column is then made. Messages 1 and 24 are again placed beneath message 26, beginning the first group of the former beneath the third group of the latter, and again the total number of coincidences is ascertained. In other words, messages 1 and 24 are moved successively 1, 2, 3 . . . 22 intervals² to the right of message 26, and a table of coincidences is constructed. The greatest total number of coincidences, as shown in table V, is given when messages 1 and 24 are placed three intervals to the right of message 26. This means, then, that the numbers 1 and 3 occupy these relative positions:
Position      1  2  3  4
Key number    3  .  .  1

TABLE V³

Intervals       1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
Coincidences   59 49 86 40 53 39 55 60 59 47 55 45 61 64 53 54 46 58 50 48 52 40
Showing various totals of coincidence when messages 26 and 1 and 24 are superimposed at different intervals corresponding to the successive hypotheses of relative position.

Messages 2 and 25 are next taken for experiment and a similar table of coincidences is made for the various superimpositions with message 26, omitting, of course, the three-interval trial, since the position corresponding to that test we found to be occupied by the number 1. As soon as the relative position of each number is found, the subsequent trials corresponding to those numbers may be omitted. The labor is, of course, somewhat tedious but may be done by
¹ It was thought unnecessary to use both messages 3 and 26 as a base, since the latter alone seemed sufficiently long to give reliable information.
² An interval in this case is equal to a group of 5 letters, because the encipherment proceeds in segments of 5 letters each. We need try only the first 22 intervals since the cycle consists of only 23 positions, and messages 1 and 24 cannot have the same beginning point as messages 3 and 26.
³ With regard to the mathematical notion discussed in Note 3 on p. 11, the following remark is pertinent: The total number of coincidences possible when messages 1 and 24 are placed three intervals to the right of message 26 is 1253. The number of actual coincidences, 86, when divided by 1253 gives .068. Since the expected result for a correct assumption is .066, it is at once evident that this assumption is correct and it is consequently unnecessary to consider any further cases.
FIGURE 4. Correct superimposition of MESSAGES 1 AND 24
1MLVXKYPITOJ
24|MOOKI E T V U P
2!"""iQFXUE25
FPETJDGAMLIYLMKW1 YRAYU
26
XFYKURUWKEMJQECYFYOE
AUHKA
QNXVDDFTJVOKWHRQHTJYGXAEUHPZHKCDVNYQBTFKCBGSF
NYUCSNDBFZRIYITGGJBFLSIJUMSTZZ
GIRIEZHTWBMPXQIOP
DTEILMQZZTLMKQUVYIGTVGABP
LNEMVKNDMFBDffVGSGMFCSYBMCDDQYB
IUNEEWXTMFPIQMN
UZBRWKMURWCDALXFLUVKHOZFVVNSXNGBVIJMZNGBTQRGTYQXGDRCMUS
FEXVPOZDOJALWNK
GJZSSEJVBFIYGRFJYAZ
QQNSQWGVMETKNDATRJTGJODEE
SECCE
UXWQX
HPVZR
HZIKQ
QLUOXHQMIPYQNVONQLQLMXPDFLGPKSPZJCMXHRXECIJTIffDPDB
UKSEKXQNUY
PTFOO
FAONRWSMSSDIGXMWGTZECBPSKLSFXWKIOZGBUILZBYGMQ
MVQCIGQZNG
NWSHD
QCAVJCCMFYXCWAIKRLGUBSNWHANHJODLRZVZUAPV
XRLCH
VXSFWDFTFO
BPTJO
MFIACKVKRAQFJOQZJFZJLTTXNUZHRQDLZCSQRYXBIOKZO
GVART
YJODO
HQRYY
YXDKRY
TYDELWVDCYOULXFUEMVZLVHMWLEPLQU
M E S S A G E S 2 AND 25PTFOO NWSHD BPTJO HQRYY YAXRZ KTEMP UAYMK ISRDZ VUVKW HXAYD YAGSM CURBZ LBXOV EBBPI BMLCB UMAXF ZS
GBMQL FQMEJ KOBMS ZURAN ULZFE YDLOT UZMJM SE7NP GWILM FGCVS NCZGB HUIRT XUMEB8AOE6 3 AND 26MBMJB SEPSO DHREM ELKIP KXNMW QYBIHGYZLI TRURC GBIWL VEKYG YOOHV IAUNOXSINC MVKNT ZWXXQ VWRIH MPUHG DURHGGAMFQ ULGXC WIZWK IIZYY HTAW OTTTOLPXGC RLETI DJBCK IHSAM RHCMJ RAQBJXJZKI JNSSI HCHXX ZMZKW PVUGD QCVQ
BHFDC GLYWC YGMMP EEXZH UBBSB SBONG ULPSJJ UTNIE ZAECE XQDYB FDBPB SEBMA SIJKIR UFRNA APKQP KAXUF CACVC LNICD ERFTHL HQtVZ XZUAB GLHMH ZFTIQ OTJEP VIWORY OISIE RES KE PAUJD SMQVB VEASF T
clerks. This process is continued in similar manner for all the remaining messages. The data for messages 2 and 25 show that they belong five intervals to the right of messages 1 and 24, and the relative positions of the numbers 1, 2, and 3 in the cycle are therefore these:
Position      1  2  3  4  5  6  7  8  9
Key number    3  .  .  1  .  .  .  .  2
The data for this determination of position for all messages are given in table VI.

TABLE VI. Data for determination of the position of the numbers in cycle

MESSAGE 26 USED AS A BASE

Message 1
  Position        2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
  Coincidences   59 49 86 40 53 39 55 60 59 47 55 45 61 64 53 54 46 58 50 48 52 40
Message 2
  Position        2  3  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
  Coincidences   45 49 58 67 62 61 99
Message 4
  Position        2  3  5  6  7  8 10 11 12 13 14 15 16 17 18 19 20 21 22 23
  Coincidences   34 35 38 36 16 33 27 23 19 24 28 24 24 21 19 25 20 42 35 32
Message 5
  Position        2  3  5  6  7  8 10 11 12 13 14 15 16 17 18 19 20 22 23
  Coincidences   44 31 33 31 33 29 26 22 16 23 49
Message 6
  Position        2  3  5  6  7  8 10 11 12 13 15 16 17 18 19 20 22 23
  Coincidences   25 33 37 25 40 31 20 11 66
Message 7
  Position        2  3  5  6  7  8 10 11 13 15 16 17 18 19 20 22 23
  Coincidences   23 33 26 27 25 20 24 31 31 16 19 28 28 28 46
Message 8
  Position        2  3  5  6  7  8 10 11 13 15 16 17 18 19 22 23
  Coincidences   36 27 53
Message 9
  Position        2  3  6  7  8 10 11 13 15 16 17 18 19 22 23
  Coincidences   27 32 26 17 21 21 30 17 13 21 27 58
Message 10
  Position        2  3  6  7  8 10 11 13 15 16 17 19 22 23
  Coincidences   18 18 15 13 23 19 14 18 13 20 10 15 18 52
Message 11
  Position        2  3  6  7  8 10 11 13 15 16 17 19 22
  Coincidences   16 28 10 15 14 16 16 26 17 28 20 31 19
Message 11¹
  Position        3 13 16 19
  Coincidences   21 30 22 16
Message 12
  Position        2  3  6  7  8 10 11 15 16 17 19 22
  Coincidences   18 27 56
Message 13
  Position        2  3  7  8 10 11 15 16 17 19 22
  Coincidences   20 19 15 18 18 19 16 38
Message 14
  Position        2  3  7  8 10 11 15 17 19 22
  Coincidences   29 17 32 30 18 20 23 33 24 42
Message 15
  Position        2  3  7  8 10 11 15 17 19
  Coincidences   38 32 37 27 24 69
Message 16
  Position        2  3  7  8 10 15 17 19
  Coincidences   24 23 47
Message 17
  Position        2  3  8 10 15 17 19
  Coincidences   15 44 30
Message 18
  Position        2  8 10 15 17 19
  Coincidences   48 31 26
Message 19
  Position        8 10 15 17 19
  Coincidences   28 37 66
Message 20
  Position        8 10 17 19
  Coincidences   65 14
Message 21
  Position       10 17 19
  Coincidences   26 33 47
Message 22
  Position       10 17
  Coincidences   48 22
Message 23
  Position       17
  Coincidences   50

¹ Secondary test, using messages 1 and 24 plus 2 and 25 as the base.
When in any trial the total of coincidences for a certain position stands out prominently from the preceding ones, subsequent trials for the message concerned are omitted.
The final result of carrying out this work for all the messages tried against message 26 is that the following cycle is established:
Position      1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
Key number    3 18 17  1  8 12 16 20  2 22 15  6 11  5 19 13 23  9 21  7  4 14 10
This reconstructed cycle represents, as stated before, the relative, not the absolute, positions of the numbers in the key, because there is as yet no indication as to what number occupies any given segment on the base disk. Furthermore, we must remember that there is a break of two intervals somewhere within this cycle, representing the two blank segments on the base disk. This fact may cause some difficulty later on but we shall find a way of overcoming it. In the meantime, we may content ourselves with the cycle as established and proceed to an analysis of the results of its reconstruction.
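Assembling the cycle from the individual determinations is a simple matter of ordering the key numbers by the position found for each. The sketch below is illustrative only; the dictionary records, for each key number, the position at which its coincidence total stands out in table VI (with number 3, the base, at position 1), and the names are our own.

```python
# Key number -> position in the cycle, as read from table VI (base message at 1).
POSITION_OF = {
    3: 1, 1: 4, 2: 9, 4: 21, 5: 14, 6: 12, 7: 20, 8: 5, 9: 18, 10: 23,
    11: 13, 12: 6, 13: 16, 14: 22, 15: 11, 16: 7, 17: 3, 18: 2, 19: 15,
    20: 8, 21: 19, 22: 10, 23: 17,
}


def cycle_from_positions(position_of: dict[int, int]) -> list[int]:
    """Key numbers listed in order of their positions in the cycle."""
    ordered = sorted(position_of.items(), key=lambda item: item[1])
    return [number for number, _position in ordered]


cycle = cycle_from_positions(POSITION_OF)
```

The resulting list agrees with the cycle given above, which serves as a check on the readings taken from table VI.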
The immediate result is to enable us to superimpose all the messages of the day's activity, as shown in figure 5. We may begin with message 1, or with any other message, but for the sake of convenience in analysis, we may as well transcribe them in regular order.
The letters of each of these 115 columns belong to a corresponding number of secondary alphabets, all different, but all single, mixed, substitution alphabets. Individual frequency tables are made, therefore, and are shown in table VII. The tables are given in groups of five, labeled A, B, C, D, and E, corresponding to the five primary alphabets of the system. The groups of alphabets are given in their proper cyclic sequence, so that each set of five alphabets is accompanied by a number which identifies its position in the sequence of segments. Thus, we may refer to any of these secondary alphabets by number and letter, as for example, 5B, meaning the second single alphabet under segment 5. The A alphabets all apply to the outermost primary alphabet, or alphabet 1; the B alphabets to primary alphabet 2, and so on. We are now ready to attempt an analysis of the cipher text with the object of solving these secondary alphabets and reconstructing the primaries.
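The labeling scheme just described (segment number plus a letter A to E) lends itself to a short sketch. This is a hypothetical illustration: it assumes the superimposed text is supplied as a list of rows, each row beginning in the first column and containing no gaps, and that `segment_numbers` lists the segment label belonging to each successive group of five columns.

```python
from collections import Counter


def secondary_alphabet_tables(rows: list[str], segment_numbers: list[int],
                              group: int = 5) -> dict[tuple[int, str], Counter]:
    """Frequency table for every column, keyed by (segment number, alphabet letter)."""
    labels = "ABCDE"[:group]
    tables: dict[tuple[int, str], Counter] = {}
    for row in rows:
        for col, letter in enumerate(row):
            # Column 0-4 fall under the first segment, 5-9 under the second, etc.
            key = (segment_numbers[col // group], labels[col % group])
            tables.setdefault(key, Counter())[letter] += 1
    return tables
```

With this arrangement the table referred to in the text as 5B would be found under the key `(5, "B")`.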
11. MLVXK
YPWDJ
LBXOV
HOZFVVNSXN
WGILB
GWCEM
DCBUK
MZPRX8.
YXVNKNNCMP
YVJID
OZCJE
HZIVGVNMNZWRZRX
D
CWULT
OZIGBHWEQDCMXOZBZUDHO
WBQHMMZVLK
DCIGXZWRAD
QEHQR
YZAJE24. OMOOK
ETVUPGWILH
GBVIJUZNG8TQRGTYQXGDRCHUS
8 1210 11QNXVD GIRIE
DFTJV ZHTWB
EBBPI BHLCBQQNSQ NQLQLWGVME HXPDFJNXTD BPREX
JPNVB GRJGBOQUVA WHSXESWNFC DKXXRNWWVY UMHVBAWDAA EQLVJSOJKI GDPQZ
JBNXF JBLXU
AQSJY WZCMO12. OHTXX
ESGOX YYCEO
UWVYQ DJPFYDQTAR AEKMY
PKWEE UUHTF16.
TQXND HMHEWGWTJI PQTNSNAYEY GQHIDVSJFE MEHXW
UYJPT CHCCBYUJPZ BHLDY
QESJD LIZVMHGWBI HHCGM
CXEUF UKCTB
NFDEX GDZGAOKWHR MPXQIQHTJY OPFGCVS NCZGB
TKNDA LGPKSTRJTG PZJCMJODEE XHRXESECCE CIJTIUXWQX WDPDB
l e 2030 29IHNEE FEXVP
WXTIIF OZDOJ2.
UMAXF ZSLXVDIGXM XCWAIWGTZE KRLGUMMWHB OBFVOIBWOH YMOAPVGQSW OGHPI
TOSVL HNHNORPQHF XOHQNMYYTH GSEYAIIPHV ZOCBO
KSOXR KNHJK
TOXVM KEXYAWUXZE YEHOJB
KETHN CLEQSMOXLT YVWFL
FOQOK KNRPIVHXVD RSRYIVHTLTAEFZH TOFQH
EUWZT OKWNORLNVH OLDQAWPMD KFHPB
20. OBHOKAOPME YNEXIFDGMN XIQDN
BETEX ZSLRK
JBZET XYHALPIQMN ALWNK
25.
HUIRT XUWAICWPSK BSNWHLSFXW ANHJOKIOZG DLRZVBUILZ ZUAPVBYGMQ XRLCH
2 2230 UHPVZR UKSEK
ULJCY GXAEUQFXUE HPZMKQFJOQ TYDELZJFZJ WTFGSJ SLXEH
IZYNX GXMSB
XCYIO SUAEU
R2RTX QIPFNIPATI CFJIZTKLXHR NCVNYUCKXR SEPEM
UFRXL WELQJ
VWONX GTCWTALJCO EPLPJDQXNA GEYHCDXZSK ZPWNI
15.
PLLXQ DRDEMPVNWQ QEDMJOOXNQ NPDDT
OOSIL TASEG
ZZDCV UWMNZ
VCYUUWRON AJDFHHOXJK KUYHKNPLYQ NPJNZ
22. SFDFM
QFXUV CLRIL
DGECA CDPYM
HZIKQ XQNUYFPETJ CDVNYDGAML QBTFK
LTTXN VDCYOUZHRQ ULXFUDLZCS EMVZLQRYXB VHMWLIOKZO EPLQU
i s e40 45MVQCI VXSFW
DTEIL UZBRWHQZZT KMURW
MBMJB SEPSO
RTZMI LLUUX
OZZGK FVURN6. XGORF
BQAPY RMDMW
HOUNL FGUVODIMQI IRUJIXSQMC XVXJCOTZNM AMHWX
QJKFW RLSSF11 .
TGLOI IWORTRBCVX WARWJBCPG RNNNHJUULI TPUJROSULC WZAFWJMCFB WLHVXLTKFN RMDGT
QVSJR RAABX
OVQVX PMKKW
OPRFC RDONY
FRQMI ULOTGAORUY LVD
OWFW KMGVHAQMQW FFONY
LHIGG FVDSQ
BTMYK LCSMIGQZNG DFTFOLMKQU CDALXVYIGT FLUVKGYZLI TRURCXSINC MVKNTGAMFQ ULGXCLPXGC RLETIXJZKI JNSSI
F11 6M U
GVART YBZKJ
GJZSS QLUOXEJVBDHREM ELKIP
FIFWC PGSBA5. VJZCQ
NJFGQ DTPLVGCHAX DUEIQFWGSQ HYMRQ
TPOWI VHNCQNSBWR VTEGJTJVSC UJEGZNDEUH YUEIP
BQJWR BZKYNSEYBZ MGSOZ
JJVQE HYNKDYBDQ FTZDA
BSYZR STGBV
TPQGH RJZCQ
SQAUC WPRBIFIFZR STVJVDNMBM JDGVP
ZCVWR UUJEG
BQRBI VGDGJ19.
RCZAM ZYNVQ
XXIEV HMAKVKHCUC STNVZGVFZH JTPSYRJCSH JJMIV
YVSWW DQNAP
YJODO XIRIXFIYGR HQMIPFJYAZ YQNVO
GBIWL VEKYGZWXXQ VWRIHWIZWK IIZYYDJBCW IHSAMHCHXX ZMZKW
IODRB 51* 130 W
WVUPV XZCBD
PTFOO NWSHD
KXNMW QYBIHKRCAS XWKQVKZJJK DCVQDSTOID DWVLRXAOWK BKBUHLPUVN LNGQT
RTDCO LNTTMIAFEO BMUUTTFONC HNPBRFOHI QLQHG
EAUWP AYSKOCMPSQ BASFH
XEUJG ROHUP13. QNKIT
E
XCNHU NOKMIWCFUO JKQHYXDCAS BEHQVKJMND RRQDM
NMJHQ CHZUM
JDAHW RZDIAYTVUL TWEVDWFONZ CTTES
PVMAV OITKDCGXHV LZZXXIAONR TWSZJJFCEU QFPMO
23.
KFPAO FIQTRKDCBX UEQTUFAONR QCAVJWSMSS CCMFYYOOHV IAUNOMPMHG DURHGHTAVV OTTTORHCMJ RAQBJPVUGD QCVQ
23 n nBDOLS GHIHZ
BPTJO HQRYY
BHFDC GLYWC
SLDKS NTESDKSSXY TSUXETXTBH WNWIESKWWW XNEYZIPJRI HUMAD
HXMXR TUIIGVPSKO HUYNA
9. NYMPEIJFRT UTNQC
UWJDX CQRVWVFSCG CHKSB
OFSGQ PONLWFZHNC GJBSA
KKGEF FJWWROIWVX KSMSXOPWEY UTISESTTAL GKPPO
TTSRU GHELW
WQAXB FBRLAMBHKV IHTPIIRWER GKETA
LDQIW UYVWI21.
SVTKG POEAC
OJUGA HOGXWTFSOQ XKESCLHKGP KUKNYKQQIE HHYCAVPIWA NFWOHMFIAC YXDKRKVKRA YLPSJJ UTNIEIJKIR UFRNARFTHL HQWVZIWORY OISIE
21 7M 85LJCTE KSLPY
YAXRZ KTEMP
YGMMP EEXZH4.
QVQBN RZDMBGRROH PXKZFYJXXW BKOJQJAKUK BEQMGDZUTW BFMO
7. UFUCLZOJCU BTXJKVPRXS SCUZBYZYRZ AWLPPJEGAF SQRWO
ZVXXH NZHCWZOPRZ CZEVH
RAIBP KACIBJIQBX PFTAO
ZXROX PZUGGMBXOZ QQDCX
NZFCU BXIPZTCU BNCEE
HUMFH LZHNH
AHJEP PUMEUGNXBQ XAUAQYUSUK TFECD
JEJRQ MCUZPOMXOW LTWEW
OJYQU MEULHYJOUN AEGQGCJCFI BNWJROY XZHPKGHEBF PFTYFGffXXY EKBMGGWMQL FQMEJ
ZAECE XQDYB
APKQP KAXUFXZUAB GLHMHRESKE PAUJD
4n
VPBYD
UAYMK
UBBSBUFHUJJQLYHZKFMSFOUHOIWTVU
HJYDY
KGUEJJHCDE
IMPBBDAJMTSVEYHOCADC
GMYSB
RJLUD14.
HMYNBCBIMTHAWMB
CJEDW
HJQJP
OUFVO
BMQVB
KGUDNKFJHNKHYDDHTZPB
JJIHUDZAUO
MGETDKOBMSFDBPB
CACVC
ZFTIQSMQVB
II
14MWTRJK
ISRDZ
SBONG
LTMKJLTMXBVGDWUT
NPCLU
ZTHHAMSJBSWUHJVVWAXZYKXXQ
10.
NEANWLGRXO
VKHEUIKPKIZJRKKDUHQZDGIAS
KRGNU
AKHZA
ZSPPQ
GSMKBVBWW
QSPBFZMSKKETPDNYRNEI
LUVBU
W
WFKKZ
ZURAN
SEBMA
LNICDOTJEP
VEASF
10 to oBDDFA
VUVKW3.
URQKWPONFGBSVWLXLTBW
KQDIG
NWJWJESVWJGXHRPQSVFGVRQUWTEPDA
G
XXCKK
XSPFG
TNUWUCLYZK
BJDFABAQRK
ZXCWQ
SDZIE
VZWDLBAKIGVWCZP
TQVZPGXFDL
QYZVUBWIVZ
ZRUDA
PXXZL
ULZFE
26.
SRUEU
ESOWPVZXCL
TZCWL
3 101ANJXE
HXAYDYLMKWYRAYURIUGGILKYUEXHEF
YLMNC
IFMZIIKGSLOIHWFITZMTXXHHC
MVQGJ
WKZTEEVGHURINOKFJAYURUHHF
VEZTC
HIHWG18.
HECDA
YRNAF
TWCTJ
IZJTUYLGPT
HIMGGJNPXGXXHRF
XFRWL
YDLOT
XFYKURUWKE
MJQEC
YFYOE
AUHKA
18 no XGHED E
YAGSM CCBGSF V
NYUCS LOZGWS UNMFEB KAWQWF ZKYJJF S
VZIJE SQ
APALBTKF QKZNFC DHFYMFK Q
XYUOC F
KZYIZ ZLJUFZ UNBTAK NRKHKB ZJYUJG D
DIPSG ZU17. UM
LBUFZ DVLALNH QULPJJS Z
BKIJC ZJFHFEH VN
RBUFZ JUYYYNQ CYRBHEW CU
SXXSZ HV
MOAFT XUFHTEG IUZMJM S
NDBFZ KN
RIYIT BDGGJBF SLSIJU SYMSTZZ DD
The first thought that comes to one is that these individual mixed alphabets may be solved upon the basis of frequency alone, as is commonly done with such frequency distributions. For example, we might assume the most frequently occurring letter in each alphabet to be the equivalent of plain-text letter E, the next, of T, and so on; then substitute the assumed values in the text and try to build up words. But each of these alphabets contains only an average of 36 letters, so that hardly any assumption would carry a considerable degree of certainty. This is especially the case in English text where the letter E does not always stand out prominently as the most frequently used letter in small amounts of text. Were an analysis of this kind absolutely necessary to solution, it is doubtful whether this particular set of messages could be solved except after a long period of patient labor. But it will be shown now that such an analysis is in fact not essential, because we may be able to effect a direct reconstruction of the five primary alphabets, which will not only lead to the solution of all these messages, but will also give us every one of the possible 125 secondary alphabets of the entire system.
TABLE VII

[Individual frequency tables, in tally form, for the 115 secondary alphabets. The tables are given in groups of five (A, B, C, D, and E), each table listing the letters A to Z, under the segment numbers in cyclic sequence: 1, 8, 12, 16, 20, 2, 22, 15, 6, 11, 5, 19, 13, 23, 9, 21, 7, 4, 14, 10, 3, 18, 17. The tally marks are illegible in this copy.]
The method which we are about to demonstrate is based upon the fact that the segments from which the cipher groups are taken follow one another from a given initial point in a regular succession, uninterrupted in this case except for a break of three segments representing the two blank segments of the key plus one blank which is always present representing the plain segment. To explain the principle of this method in detail, attention is directed to the fact that, as a result of the system of encipherment, the series of successive cipher equivalents for any given plain-text letter in any one of the five primary alphabets coincides with the sequence of letters in that alphabet. The series will coincide with the complete alphabet except for the omission of 1, 2, 3, or more letters depending upon the number of blank segments. For example, turn to figure 1 and note that, in alphabet 1, the sequence of letters beginning with A is as follows: A U F Q Z E R H Y G W J . . . .
Now it is patent that if we place letter A of the first primary alphabet in the plain segment, its series of successive cipher equivalents coincides with the sequence of letters succeeding A in the same alphabet, viz, U F Q Z E R H Y G W J O I D M N C . . . .
If we place another letter of the same primary alphabet, for example Z, in the plain segment, its series of successive cipher equivalents constitutes exactly the same sequence, except with a different initial point, viz, E R H Y G W J O I D M N C . . . . In other words the successive cipher equivalents for these 2 plain-text letters come from one and the same cycle or sequence. Now, the same is true with respect to every other letter of alphabet 1, and also of the other primary alphabets. Of course, the sequence is different for each primary alphabet.
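The relation between a plain-text letter and its series of cipher equivalents can be illustrated with the fragment of alphabet 1 quoted above. The sketch uses only the letters actually given in the text; the remainder of the alphabet is not reproduced here, so no wrap-around is attempted, and the function name is our own.

```python
# Only the portion of primary alphabet 1 quoted in the text; the rest is omitted.
ALPHABET_1_FRAGMENT = "AUFQZERHYGWJOIDMNC"


def successive_equivalents(sequence: str, plain_letter: str) -> str:
    """Letters following `plain_letter` in the sequence, in order.

    These are the successive cipher equivalents of that plain-text letter,
    as far as the known portion of the sequence extends.
    """
    start = sequence.index(plain_letter)
    return sequence[start + 1:]
```

For A the function returns UFQZERHYGWJOIDMNC, and for Z it returns ERHYGWJOIDMNC, the same sequence entered at a later point.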
Since this cycle or sequence of letters is the same for all the letters of each primary alphabet, only the series of successive cipher equivalents for one letter of each primary alphabet is necessary in order to effect a complete reconstruction of that alphabet. In other words, if we can select with accuracy the cipher equivalent for one and only one plain-text letter in each of the successive 115 secondary alphabets, we can then arrange these equivalents into five sequences of letters which will coincide with the five primary alphabets, thus resulting in their reconstruction. The reconstructed sequences will be complete except for the omission of one or more letters representing the blank segments. If the numerical key consists of 23 numbers, three letters will be missing from each sequence. These letters will be known, of course, but their relative positions in the omitted section will have to be found later.
Obviously, the letter which will lend itself best to such a procedure is E, for it is the most frequently occurring letter in English text. If, therefore, by a careful study of the individual frequency tables applying to the columns of the superimposed messages, we can select the cipher equivalent of only the letter E with certainty in the successive secondary alphabets, we shall at once have the sequences of letters in the five primary alphabets and the solution of the problem will be at hand. For example, if in a hypothetical sequence of these alphabets we select the letters K, N, Q, and V, respectively, as the four successive cipher equivalents of E, then this will mean that in primary alphabet 1 there is a sequence . . . K N Q V . . ., providing a break in the numerical key does not exist between the members of the sequence of key numbers applying to the segments concerned. Continuing this process, ultimately the five primary alphabets can be completely reconstructed. But we must remember always that this process is dependent upon the correct assumptions for the cipher equivalent of E in each of the 115 secondary alphabets, or columns of cipher text.
Let us attempt such a reconstruction. Turning to the series of secondary alphabets given in table VII, we try to find in each alphabet the letter which undoubtedly represents plain-text letter E. At the very start we encounter difficulties. In alphabet 1A, the letters M and Y are of equal frequency. There is no way of telling which letter represents E, so that we shall have
7/30/2019 Index Coincidence
36/101
29
to consider both M and Y as possibilities. In alphabet 8A again we have difficulties, for bothJ and Q have the same frequency. It begins to look like a very doubtful procedure. As wego further along, the difficulties in selecting the representative of E increase rather than decreaseand the cryptanalyst becomes lost in a multiplicity of possibilities. Evidently this method,as the preceding one, while theoretically correct, is practically out of the question becauseof the limited size of each frequency table. In fact, it is doubtful whether we can select therepresentative of E with certainty in any one of the A alphabets,1 and certainly, if we cannotdo this with the letter that theoretically occurs the most frequently, we cannot do it withany other letter.
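The difficulty just described may be illustrated by a minimal sketch in Python. It is no part of the original procedure: the three columns of cipher letters are invented for the illustration (they are not taken from table VII), and the function name `candidates_for_e` is hypothetical. The sketch lists, for each small column, every letter tied for the highest frequency, and counts how many combinations of assumptions would have to be tested.

```python
from collections import Counter
from math import prod


def candidates_for_e(column):
    """Return every cipher letter tied for the highest count in one column."""
    counts = Counter(column)
    top = max(counts.values())
    return sorted(letter for letter, n in counts.items() if n == top)


# Invented columns, for illustration only (not the data of table VII).
columns = ["MYKMYQJ", "JQJXQBD", "KKTNARK"]

# Candidates for the equivalent of E in each column.
options = [candidates_for_e(col) for col in columns]

# Every tie multiplies the number of assumptions to be tested.
combinations = prod(len(o) for o in options)

print(options)
print(combinations)
```

With only three columns the number of combinations is already more than one; over 23 columns per alphabet, each tie doubles (or worse) the work, and the sketch says nothing of the further risk that the true equivalent of E is not among the most frequent letters at all.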
It was at this point, when apparently a blank wall confronted the writer, and there seemed little hope of solution, that he evolved the method which finally resulted in solution, and which embodied such new principles that he was led to describe them in this paper. This method had recourse to some simple mathematics, easy of comprehension and application when the underlying principles have been grasped.
First, let us make what we have termed a "consolidated frequency" table for all of the secondary alphabets applying to the first, or A, primary alphabet. This is done by collecting the data contained in the individual frequency tables shown in table VII into one large table, taking only the data applying to the letters of primary alphabet 1. This larger table is shown below (table VIII).
1 It was found later that the cipher equivalent of E has the greatest frequency in only 3 out of the 23 alphabets. In one alphabet E did not occur at all, and in six cases it occurred only two times. It will be of interest to the reader to study these tables for the information they contain with regard to the extreme degrees of variation from the normal that small frequency tables can exhibit.
TABLE VIII. Consolidated frequency table for alphabet 1

[The body of the table, giving the frequency of each cipher letter in each of the 23 segments, is not legible in this copy. The three summary columns are given below.]

Cipher    Total        Number of segments    Average frequency
letter    frequency    occupied              per segment
A         22           14                    1.58
B         37           13                    2.85
C         27           16                    1.69
D         39           17                    2.30
E         20           13                    1.64
F         30           17                    1.76
G         40           14                    2.86
H         35           15                    2.34
I         29           12                    2.42
J         27           14                    1.93
K         40           15                    2.67
L         34           14                    2.43
M         31           16                    1.94
N         28           15                    1.86
O         35           14                    2.50
P         22           11                    2.00
Q         30           12                    2.50
R         38           16                    2.37
S         32           14                    2.28
T         31           17                    1.82
U         37           15                    2.47
V         39           16                    2.45
W         28           14                    2.00
X         38           17                    2.24
Y         36           12                    3.00
Z         34           12                    2.83
Total     841

Average frequency per cipher letter = 841 ÷ 26 = 32.4 occurrences.
This consolidated frequency table is of a rather peculiar nature. Each column gives the frequency of the cipher letters in a particular segment and there are 23 such columns, corresponding to the 23 segments of the numerical key. The numbers of the columns are determined by, and coincide with, the sequence of numbers in the cycle as given on page 16, viz, 1, 8, 12, 16, etc. Each row gives the frequency of a particular cipher letter in the successive segments and since the columns succeed one another in the cyclic sequence, it follows that the frequencies in the successive segments on a line with any given cipher letter form a definite sequence of frequencies. There being 26 cipher letters, there are 26 such rows or sequences of frequencies. The total frequency for each cipher letter is given in the column labeled as such and the average frequency for all cipher letters is then found to be 841 ÷ 26 = 32.4 occurrences. The number of different segments in which the cipher letter applying to any given line occurs is indicated in the next column; and the average frequency per segment for each cipher letter is given in the last column.
Before we can proceed it will be advisable to establish certain principles which will enable us to follow the subsequent reasoning more easily. We shall make use of alphabet 1 shown in table VIII, calling attention to the fact that the same principles apply to the other four primary alphabets. In order to make the illustration comparable in all its details with the real situation in the test problem, let us make the numerical key 23 numbers in length by adding numbers 22 and 23 at the end of the key shown in figure 1.

Let us see what successive plain-text letters the cipher letters A, B, and C represent in the sequence of segments.
Cipher    Plain text
A         V K P L B S X T C N M D I O J W G Y H R E Z Q
B         S X T C N M D I O J W G Y H R E Z Q F U A V K
C         N M D I O J W G Y H R E Z Q F U A V K P L B S
It will be noted that the successive plain-text letters which cipher letters A, B, and C represent constitute almost exactly the same sequence in the three lines. This follows from the nature of the cipher system itself, and the cause of it has already been pointed out. In the B line there is a section not present in the A line, consisting of the letters F U A; in the A line, the section not present in the C line consists of the letters X T C; and in the C line, the section not present in the B line consists of the letters P L B. This is due to the interruption in the numerical key; the section omitted will consist of 3 sequent letters in each case, but these letters will be different for every cipher letter.
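The relation between the three lines may be verified by a minimal Python sketch. It assumes, as an inference from the three lines themselves and not as a statement of the original text, that each line is a run of 23 consecutive letters taken from one cyclic sequence of 26 letters, here called `CYCLE`; the names `line_for` and `omitted` are hypothetical.

```python
# Cyclic sequence of plain-text letters inferred from the A, B, and C lines
# above; this is an assumption made for the illustration.
CYCLE = "FUAVKPLBSXTCNMDIOJWGYHREZQ"

# First plain-text letter of the line for each cipher letter, as read
# from the three lines above.
STARTS = {"A": "V", "B": "S", "C": "N"}


def line_for(start_letter, length=23):
    """Return `length` consecutive letters of the cycle from `start_letter`."""
    start = CYCLE.index(start_letter)
    return [CYCLE[(start + i) % len(CYCLE)] for i in range(length)]


def omitted(line):
    """Return the letters of the cycle absent from a line, in cycle order."""
    return [c for c in CYCLE if c not in line]


for cipher, first in STARTS.items():
    line = line_for(first)
    print(cipher, " ".join(line), "| omitted:", " ".join(omitted(line)))
```

Under this assumption the omitted section of each line is simply the three letters of the cycle which immediately precede its first letter, which is why the omitted letters differ for every cipher letter.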
Let us now accompany the sequence of the plain-text letters opposite each of the letters A, B, and C, with a sequence of frequencies corresponding to their normal theoretical frequencies* for English text.
* These theoretical frequencies are given by Hitt on the basis of 200 letters of plain text. See Hitt, Parker. Manual for the Solution of Military Ciphers, 1918, p. 0.
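FIGURE 7

[Graphs of the theoretical frequencies of the successive plain-text letters represented by cipher letters A, B, and C, one graph for each of the following sequences.]

A    V K P L B S X T C N M D I O J W G Y H R E Z Q
B    S X T C N M D I O J W G Y H R E Z Q F U A V K
C    N M D I O J W G Y H R E Z Q F U A V K P L B S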
Now, since the sequences of plain-text letters represented by these sequences of frequencies are the same, it follows that we can so arrange the latter as to make the successive individual frequencies coincide; and if we make due allowance for the break in the sequences caused by the omitted sections of 3 letters, the three sequences should coincide exactly. Thus:
FIGURE 8

[Graphs of the same three sequences of frequencies, superimposed so that the individual frequencies coincide. The brackets mark the omitted section of three letters in each sequence.]

A  V K P L B S X T C N M D I O J W G Y H R E Z Q [   ] V K P L B S X T C N
B            S X T C N M D I O J W G Y H R E Z Q F U A V K [   ] S X T C N
C                    N M D I O J W G Y H R E Z Q F U A V K P L B S [   ] N
In order to make the sequences coincide, we displaced the B sequence five intervals to the right of the A sequence, and the C sequence four intervals to the right of the B sequence. Let us reverse the order of these letters, A, B, and C, and space them in accordance with the number of intervals which each sequence of frequencies has been shifted relative to the others. Thus:

C . . . B . . . . A
Refer now to the illustrative cipher alphabet in figure 1 and note that this corresponds to the order of these letters A, B, and C in this primary alphabet. We have determined the order of these letters in our alphabet merely by correctly superimposing or shifting the three sequences of frequencies relative to one another so as to make the individual frequencies coincide.
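The superimposition may be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure as the original carries it out by inspection: the cyclic sequence `CYCLE` is inferred from the A, B, and C lines; the values in `FREQ` are rough modern figures for English (per 1,000 letters) and are not Hitt's table; the score used, the mean absolute difference of overlapping frequencies, is a choice made for the sketch; and the names `freq_sequence` and `best_shift` are hypothetical. The matching itself uses only the numbers, never the letters they stand for.

```python
# Cyclic sequence of plain-text letters inferred from the A, B, and C lines
# (an assumption of this sketch).
CYCLE = "FUAVKPLBSXTCNMDIOJWGYHREZQ"

# Rough English letter frequencies per 1,000 letters; illustrative values
# only, NOT the table given by Hitt.
FREQ = {
    "E": 127, "T": 91, "A": 82, "O": 75, "I": 70, "N": 67, "S": 63,
    "H": 61, "R": 60, "D": 43, "L": 40, "C": 28, "U": 28, "M": 24,
    "W": 24, "F": 22, "G": 20, "Y": 20, "P": 19, "B": 15, "V": 10,
    "K": 8, "J": 2, "X": 2, "Q": 1, "Z": 1,
}


def freq_sequence(start_letter, length=23):
    """Frequencies for `length` consecutive letters of the cycle, followed
    by None for each omitted interval, so that every sequence has 26 places."""
    start = CYCLE.index(start_letter)
    seq = [FREQ[CYCLE[(start + i) % 26]] for i in range(length)]
    return seq + [None] * (26 - length)


def best_shift(reference, other):
    """Return the number of intervals by which `other` must be displaced to
    the right of `reference` so that their frequencies agree most closely."""
    best_d, best_score = None, None
    for d in range(26):
        pairs = [(reference[(i + d) % 26], other[i]) for i in range(26)]
        known = [(r, o) for r, o in pairs if r is not None and o is not None]
        score = sum(abs(r - o) for r, o in known) / len(known)
        if best_score is None or score < best_score:
            best_d, best_score = d, score
    return best_d


a = freq_sequence("V")  # line for cipher letter A
b = freq_sequence("S")  # line for cipher letter B
c = freq_sequence("N")  # line for cipher letter C

print(best_shift(a, b), best_shift(b, c))
```

With exact theoretical frequencies the correct displacement gives a perfect match. With the scanty frequencies of an actual problem the agreement is only approximate, so the best score may be a poor guide, and it is this weakness that a statistical treatment must overcome.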
Now had we not known what letters these individual frequencies in each sequence of frequencies represented but had merely been given the sequence of frequencies themselves, it would still have been just as easy to find the correct relative positions of the three sequences from a comparison of the positions of high and low points in each sequence of frequencies. In other words, we do not need to know what letters the individual frequencies in each sequence of frequencies represent; it is still possible to determine the relative positions (in the primary alphabet) of the letters applying to each sequence in the cipher alphabet by a study of the positions of the high and low points in each sequence of frequencies. No analysis whatever of the individual frequencies is necessary, the entire frequency table being treated as an ordinary statistical curve. This, in its final analysis, is the meaning of the proposition stated in the opening paragraph of this paper.1 It thus follows that the five alphabets of our problem may be reconstructed, without a knowledge of what letter any individual frequency in the sequences of frequencies (as shown in table VIII) represents, by an analysis of these frequency tables considered as true statistical curves.

Let us return now to the test messages. Table VIII represents a set of 26 sequences of frequencies similar in origin to those for A, B, and C in the illustrative alphabet above. We could superimpose these sequences in the same manner and as easily as we did in the messages themselves were it not for two circumstances: First, we know that there is an interruption of three blanks in the cycle which we have reconstructed but do not know where these blanks must be inserted. Consequently, some allowance must be made for the blank segments in each sequence of frequencies. Secondly, the individual frequencies in each sequence of frequencies in our problem do not exactly correspond to the theoretical frequencies of the plain-text letters to which they apply but only correspond approximately to the theoretical. In some cases this approximation is far from close because of the paucity of text, and this will make the determination of the correct relative positions of two sequences a much more difficult process than was the case with the illustrative sequences above.
We are, therefore, confronted with the problem of superimposing the sequences of frequencies correctly without a knowledge of these two factors, and this we shall accomplish by a slight modification of method and a recourse to some simple mathematics.
First, as to the modification of method due to our ignorance of the exact location of the break of three intervals in the numerical key: this consists in superimposing sequences, not to find the relative positions of any pair of sequences but to find such sequences as are one and only one interval apart; i.e., sequences which represent a relative displacement of only one interval. The reason for this step is now to be explained.
Let us consider the sequence of theoretical frequencies corresponding to the cipher letter A and the letter which immediately follows it in the illustrative alphabet, viz, U, arranging the two sequences as though we had only reconstructed the cycle and had not as yet determined the numerical key. Let us begin both sequences with segment 9, the first segment in the key.
1 The ordinary frequency table applying to a plain text or a cipher alphabet does not correspond to the ordinary frequency distribution of statistical work. In the latter, the position of the points along one of the axes of the graph and their extension along the other axis are either causally related, or the curve treats of data which, being subject to the operation of the laws of probability, form the normal, or Quetelet, curve of error. In the former, the positions and extensions of the coordinates are not related in any way unless one considers the arbitrary order of the letters of the alphabet as constituting a cause. The positions of the coordinates in a cryptographic curve were determined many centuries ago when the English language was first evolved.
But the sequences of frequencies in table VIII are not similar in origin to the ordinary plain-text or cipher alphabet frequency tables of cryptographic work. They are, in fact, closely related to certain frequency distributions of statistical data because the position and extensions of the coordinates are absolutely determined by a cause other than the arbitrary order of the letters of the English alphabet. These two characteristics of the curves of a series of secondary alphabets may be varied at will by changing the sequence of letters in the primary alphabet. Any set of frequency distributions applying to a series of secondary alphabets derived from a variable primary alphabet may be treated in the same mathematical manner as these will be treated in the subsequent pages.
FIGURE 9

[Graphs of the sequences of theoretical frequencies for cipher letters A and U, both beginning with segment 9 of the numerical key.]