Mutually Uncorrelated Codes for DNA Storage

Maya Levy and Eitan Yaakobi

Technion - Israel Institute of Technology

Coding Seminar

Outline

• Motivation

• Mutually Uncorrelated Codes
  • Well-known construction analysis
  • Non-fixed Run Length Limited constraint
  • Efficient construction

• $(d_h, d_m)$-Mutually Uncorrelated Codes
  • Upper bound on cardinality
  • Efficient construction

• Ongoing and future work

DNA Storage

GCCTCAAAGTTACACCGTGCATTT

…ACGTAC

• G. M. Church, Y. Gao, and S. Kosuri, “Next-generation digital information storage in DNA,” Science, vol. 337, no. 6102, pp. 1628–1628, Sep. 2012

• N. Goldman, P. Bertone, S. Chen, C. Dessimoz, E. M. LeProust, B. Sipos, and E. Birney, “Towards practical, high-capacity, low maintenance information storage in synthesized DNA,” Nature, vol. 494, no. 7435, Feb. 2013

• R. N. Grass, R. Heckel, M. Puddu, D. Paunescu, and W. J. Stark, “Robust chemical preservation of digital information on DNA in silica with error correcting codes,” Angewandte Chemie International Edition, vol. 54, no. 8, pp. 2552–2555, Feb. 2015

• M. Blawat, K. Gaedke, I. Huetter, X. M. Chen, B. Turczyk, S. Inverso, B. W. Pruitt, and G. M. Church, “Forward error correction for DNA data storage,” Procedia Computer Science, vol. 80, pp. 1011–1022, 2016

• Y. Erlich and D. Zielinski, “DNA Fountain enables a robust and efficient storage architecture,” Science, vol. 355, no. 6328, pp. 950–954, Mar. 2017

Random Access

• S. M. H. T. Yazdi, Y. Yuan, J. Ma, H. Zhao, and O. Milenkovic, “A rewritable, random-access DNA-based storage system,” Nature Scientific Reports, vol. 5, no. 14138, Aug. 2015

• J. Bornholt, R. Lopez, D. M. Carmean, L. Ceze, G. Seelig, and K. Strauss, “A DNA-based archival storage system,” ASPLOS, pp. 637–649, Atlanta, GA, Apr. 2016


Address Set Constraints

• Mutual uncorrelatedness of the sequences

• Large minimum Hamming distance

S. M. H. T. Yazdi, Y. Yuan, J. Ma, H. Zhao, and O. Milenkovic, “A rewritable, random-access DNA-based storage system,” Nature Scientific Reports, vol. 5, no. 14138, Aug. 2015

Mutually Uncorrelated Codes

Two not necessarily distinct words $a, b \in \mathbb{F}_q^n$ are mutually uncorrelated if no non-trivial prefix of $a$ matches a non-trivial suffix of $b$, and vice versa.

Example:

$a$ = 0 0 0 1 0

$b$ = 1 1 0 0 1

Mutually Uncorrelated Codes

A code $C \subseteq \mathbb{F}_q^n$ is a mutually uncorrelated (MU) code if any two not necessarily distinct codewords of $C$ are mutually uncorrelated.

[Figure: two codewords $u_1 u_2 u_3 u_4 \dots u_n$ and $v_1 v_2 v_3 v_4 \dots v_n$]
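The definition can be checked mechanically on short words. The following is a minimal sketch, assuming that "non-trivial" means lengths 1 through n-1 and that words are given as Python strings; the function names `mutually_uncorrelated` and `is_mu_code` are illustrative and do not come from the slides.

```python
def mutually_uncorrelated(a: str, b: str) -> bool:
    """True if no prefix of length 1..n-1 of one word equals a suffix of the other."""
    n = len(a)
    assert len(b) == n, "both words must have the same length"
    for i in range(1, n):
        if a[:i] == b[-i:] or b[:i] == a[-i:]:
            return False
    return True


def is_mu_code(code) -> bool:
    """True if every pair of codewords, including a word paired with itself, passes."""
    words = list(code)
    return all(mutually_uncorrelated(u, v) for u in words for v in words)


a, b = "00010", "11001"
print(mutually_uncorrelated(a, b))      # True: no prefix of one equals a suffix of the other
print(mutually_uncorrelated(a, a))      # False: the prefix "0" of a equals its suffix "0"
print(is_mu_code({"00101", "00111"}))   # True
```

Because the code definition also pairs each codeword with itself, the two example words pass the pairwise check against each other but would not form an MU code together.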

Construction of MU Codes

• MU codes have been studied since the 1960s for synchronization purposes

• We present results over $\mathbb{F}_2^n$. Most of the results can be extended to larger fields

• The following MU code construction is the best known in terms of redundancy; it was introduced by Gilbert (1960), Levenshtein (1964), and Chee et al. (2013)

0 0 0 0 0 0 0 0 1 … 1

Codeword layout: $k$ zeros, then a 1, then a part in which every zeros run is shorter than $k$, ending with a 1.
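For small parameters the construction can be enumerated and tested by brute force. This is only a sketch under one reading of the figure: a codeword is taken to be $k$ zeros, a 1, a free part of length $n-k-2$ with no run of $k$ zeros, and a final 1. The helper names `build_code` and `is_mu_code` are hypothetical.

```python
from itertools import product


def build_code(n: int, k: int):
    """All words 0^k 1 x 1 of length n, where x contains no run of k zeros."""
    middle_len = n - k - 2
    code = []
    for bits in product("01", repeat=middle_len):
        x = "".join(bits)
        if "0" * k not in x:
            code.append("0" * k + "1" + x + "1")
    return code


def is_mu_code(code) -> bool:
    """Check every ordered pair (u, v): no proper prefix of u equals a suffix of v."""
    n = len(code[0])
    return not any(u[:i] == v[-i:] for u in code for v in code for i in range(1, n))


code = build_code(10, 3)
print(len(code), is_mu_code(code))  # 24 True
```

The exhaustive check is quadratic in the code size and is only meant for toy lengths; it says nothing about how to encode efficiently.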

[Animation: a codeword 0 0 0 0 0 0 0 0 1 … 1 is compared against shifted copies of a codeword of the same form; annotation: zeros run $< k$]

Construction of MU Codes

0 0 0 0 0 0 0 0 1 … 1

($k$ zeros, followed by a part in which every zeros run is shorter than $k$)

For every $n$ and $k$, we have a code $C(n, k)$.

Problem: for a given $n$, what is $\max_k |C(n, k)|$?

Previous results: for $n = 2^i$, $\max_k |C(n, k)| \gtrsim \frac{2^n}{2en}$

Our contribution (*):

$\max_k |C(n, k)| \approx \frac{2^n}{n} 2^{F(\Delta_n)} \le \frac{2^n}{2en}$, where $\Delta_n = \log n - \lceil \log n \rceil$ and $F(\Delta) = \Delta - \min\{2^{\Delta} \log e + 1,\ 2^{\Delta+1} \log e\}$

For $n = 2^i$: $\max_k |C(n, k)| \approx \frac{2^n}{2en}$

Notation: $f(n) \approx g(n)$ if $\lim_{n \to \infty} \frac{f(n)}{g(n)} = 1$; $f(n) \gtrsim g(n)$ if $\lim_{n \to \infty} \frac{f(n)}{g(n)} \ge 1$

(*) Thanks to Ron Roth for his contribution to this proof
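The asymptotic statement can be compared with exact counts for moderate $n$. The sketch below assumes $|C(n,k)| = s(n-k-2, k)$, where $s(\ell, k)$ counts binary words of length $\ell$ with no run of $k$ zeros, and evaluates $F$ with base-2 logarithms; the function names are illustrative. Since the formula is asymptotic, the printed ratios are only expected to be close to 1, not equal to it.

```python
from math import e, log2


def count_no_zero_run(length: int, k: int) -> int:
    """s(length, k): number of binary words of this length with no run of k zeros."""
    s = [1]  # the empty word
    for m in range(1, length + 1):
        if m < k:
            s.append(2 ** m)  # too short to contain k zeros
        else:
            # condition on the position j = 1..k of the first 1
            s.append(sum(s[m - j] for j in range(1, k + 1)))
    return s[length]


def best_code_size(n: int):
    """Return (max_k |C(n,k)|, a maximizing k), assuming |C(n,k)| = s(n-k-2, k)."""
    return max((count_no_zero_run(n - k - 2, k), k) for k in range(1, n - 1))


def F(delta: float) -> float:
    return delta - min(2 ** delta * log2(e) + 1, 2 ** (delta + 1) * log2(e))


for n in (16, 64, 100, 256):
    size, k = best_code_size(n)
    delta = log2(n) - (n - 1).bit_length()   # log n - ceil(log n)
    predicted_fraction = 2 ** F(delta) / n   # predicted size divided by 2^n
    print(n, k, size / 2 ** n / predicted_fraction)
```

Exact integer arithmetic is used for the counts, and only the final ratio is converted to a float, so large $n$ does not overflow.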

Mutually Uncorrelated Codes

0 0 0 0 0 0 0 0 1 … 1

($\lceil \log n \rceil + 1$ zeros, followed by a part in which every zeros run is shorter than $\lceil \log n \rceil + 1$)

              | Construction, $n = 2^i$  | Upper bound by Levenshtein, '70
Cardinality   | $\frac{2^n}{2en}$        | $\frac{2^n}{en}$
Redundancy    | $\log e + \log n + 1$    | $\log e + \log n$

Efficient construction?

Zeros Run Length Limited Encoding

Problem: encode a length-$n$ vector into $n + 1$ bits s.t. every zeros run is shorter than $\lceil \log n \rceil + 1$

Example: $n = 30$, $\lceil \log n \rceil + 1 = 6$

Input:

0 1 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1

Append a 1:

0 1 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1 1

Scan the indices $i = 1, 2, \dots$; the first run of 6 zeros starts at $i = 4$. Remove the run and append the index $i = 4$ in binary (0 0 1 0 0) followed by a 0:

0 1 1 0 1 1 0 0 0 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1 1 0 0 1 0 0 0

Repeat until no run of 6 zeros remains:

0 1 1 0 1 1 0 1 1 0 1 1 1 0 0 1 0 0 0 0 0 1 1 1 0 0 1 1 0 0 0

• Uniquely decodable

• Zeros run length $\le \lceil \log n \rceil$: the appended indices satisfy $i \ne 0$, and consecutive appended indices $i$, $j$ satisfy $i \le j$
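The worked example suggests the following procedure, given here as a sketch reconstructed from the slide's example rather than as the authors' stated algorithm. It assumes 1-indexed positions, pointers written with $\lceil \log n \rceil$ bits, and a scan that never moves backwards; the names `rll_encode` and `rll_decode` are hypothetical.

```python
def rll_encode(x: str) -> str:
    """Map a bit string of length n to n + 1 bits with every zeros run <= ceil(log2 n)."""
    n = len(x)
    L = (n - 1).bit_length()      # ceil(log2 n) for n >= 2
    y = x + "1"                   # the appended 1 separates data from pointers
    i, data_end = 1, n            # 1-indexed scan position and end of the data part
    while i <= data_end - L:
        if y[i - 1:i + L] == "0" * (L + 1):
            # drop the run of L + 1 zeros, append its start index and a 0 flag
            y = y[:i - 1] + y[i + L:] + format(i, f"0{L}b") + "0"
            data_end -= L + 1
        else:
            i += 1
    return y


def rll_decode(y: str) -> str:
    """Invert rll_encode: undo the pointer blocks from last to first."""
    n = len(y) - 1
    L = (n - 1).bit_length()
    while y[-1] == "0":           # a trailing 0 means a pointer block precedes it
        i = int(y[-(L + 1):-1], 2)
        y = y[:-(L + 1)]
        y = y[:i - 1] + "0" * (L + 1) + y[i - 1:]
    return y[:-1]                 # strip the appended 1


x = "011000000011000000011010000001"
y = rll_encode(x)
print(" ".join(y))
print(rll_decode(y) == x)
```

On the slide's 30-bit input this sketch reproduces the 31-bit output shown in the example, and each removal keeps the total length fixed because $\lceil \log n \rceil + 1$ bits are removed and the same number appended.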

MU Codes Summary

              | Construction, $n = 2^i$  | Upper bound by Levenshtein, '70  | Efficient construction, $n = 2^i$
Cardinality   | $\frac{2^n}{2en}$        | $\frac{2^n}{en}$                 | $\frac{2^n}{16n}$
Redundancy    | $\log e + \log n + 1$    | $\log e + \log n$                | $\log n + 4$

• W. Kautz, “Fibonacci codes for synchronization control,” IEEE Transactions on Information Theory, vol. 11, no. 2, pp. 284–292, 1965

• C. Schoeny, A. Wachter-Zeh, R. Gabrys, and E. Yaakobi, “Codes for correcting a burst of deletions or insertions,” in Proc. IEEE Int. Symp. Inf. Theory, pp. 630–634, Barcelona, Spain, Jul. 2016

Non Fixed Zero Run Length Analysis

0 0 0 0 0 0 0 0 1 … 1

• $S(n, k)$: length-$n$ vectors with no zeros run of length $k$; $s(n, k) = |S(n, k)|$

• $|C(n, k)| = s(n - k - 2, k)$

• The capacity of the $(0, k-1)$-RLL constraint: $E_{0,k-1} = \lim_{\ell \to \infty} \frac{\log s(\ell, k)}{\ell}$, for fixed $k$

Question: $s(n, \lceil \log n \rceil + a) \approx\ ?$ for $a \in \mathbb{Z}$

Non Fixed Zero Run Length Analysis

Question: $s(n, \lceil \log n \rceil + a) \approx\ ?$

• $S(n, k)$: length-$n$ vectors with no zeros run of length $k$; $s(n, k) = |S(n, k)|$

• Splitting a vector of $S(mn, k)$ into $m$ blocks of length $n$ gives $S(mn, k) \subseteq S(n, k)^m$, hence $s(mn, k) \le s(n, k)^m$

• $T(n, k)$: $S(n, k) \setminus \{$vectors that contain $\lceil k/2 \rceil$ zeros in the first or last indices$\}$; $t(n, k) = |T(n, k)|$

• Number of removed vectors $\le 2 \cdot 2^{n - \frac{k}{2} + 1}$, so $t(n, k) \ge s(n, k) - 2 \cdot 2^{n - \frac{k}{2} + 1}$

• Concatenating $m$ vectors of $T(n, k)$ gives $T(n, k)^m \subseteq S(mn, k)$, hence $\left(s(n, k) - 2 \cdot 2^{n - \frac{k}{2} + 1}\right)^m \le t(n, k)^m \le s(mn, k)$

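Non Fixed Zero Run Length Analysis

Question: $s(n, \lceil \log n \rceil + a) \approx\ ?$

• $\left(s(n, k) - 2 \cdot 2^{n - \frac{k}{2} + 1}\right)^m \le s(mn, k) \le s(n, k)^m$

• The capacity of the $(0, k-1)$-RLL constraint: $E_{0,k-1} = \lim_{\ell \to \infty} \frac{\log s(\ell, k)}{\ell} = \lim_{m \to \infty} \frac{\log s(mn, k)}{mn}$

• $2^{nE_{0,k-1}} \le s(n, k) \le 2^{nE_{0,k-1}} + 2^{n - \frac{k}{2} + 1}$

• $s(n, \lceil \log n \rceil + a) \approx 2^{nE_{0,k-1}}$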

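Non Fixed Zero Run Length Analysis

$s(n, \lceil \log n \rceil + a) \approx 2^{nE_{0, \lceil \log n \rceil + a - 1}}$

A. Kato and K. Zeger '05: $\lim_{k \to \infty} \frac{1 - E_{0,k}}{\log e \cdot 2^{-k-2}} = 1$

$2^{nE_{0, \lceil \log n \rceil + a - 1}} \approx \frac{2^n}{e^{2^{\Delta_n - a - 1}}}$, where $\Delta_n = \log n - \lceil \log n \rceil \in (-1, 0]$

$s(n, \lceil \log n \rceil + a) \approx \frac{2^n}{e^{2^{\Delta_n - a - 1}}}$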
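The approximation for $s(n, \lceil \log n \rceil + a)$ can be sanity-checked numerically. The sketch below counts $s(n, k)$ exactly with a standard recurrence (an assumption of this example, not something stated on the slides) and multiplies the exact fraction $s(n,k)/2^n$ by $e^{2^{\Delta_n - a - 1}}$; values near 1 indicate agreement. The function names are illustrative.

```python
from math import exp, log2


def count_no_zero_run(n: int, k: int) -> int:
    """s(n, k): number of binary words of length n with no run of k zeros."""
    s = [1]  # the empty word
    for m in range(1, n + 1):
        if m < k:
            s.append(2 ** m)
        else:
            # condition on the position j = 1..k of the first 1
            s.append(sum(s[m - j] for j in range(1, k + 1)))
    return s[n]


def ratio(n: int, a: int) -> float:
    """Exact s(n, ceil(log n) + a) divided by the approximation 2^n / e^(2^(delta - a - 1))."""
    ceil_log = (n - 1).bit_length()          # ceil(log2 n) for n >= 2
    k = ceil_log + a
    delta = log2(n) - ceil_log
    exact_fraction = count_no_zero_run(n, k) / 2 ** n
    return exact_fraction * exp(2 ** (delta - a - 1))


for n in (64, 256, 1000, 1024):
    print(n, [round(ratio(n, a), 4) for a in (-1, 0, 1)])
```

Working with the fraction $s(n,k)/2^n$ avoids forming $2^n$ as a floating-point number, which would overflow for $n$ around 1024.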

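Construction of MU Codes

For every $k = \lceil \log n \rceil + a$:

$|C(n, k)| \approx \frac{2^n}{n} 2^{f_a(\Delta_n)}$, where $\Delta_n = \log n - \lceil \log n \rceil$ and $f_a(\Delta) = \Delta - \frac{\log e}{2^{a+1}} 2^{\Delta} - a - 2$

$\max_k |C(n, k)| \approx \frac{2^n}{n} 2^{F(\Delta_n)} \le \frac{2^n}{2en}$, where $F(\Delta) = \Delta - \min\{2^{\Delta} \log e + 1,\ 2^{\Delta+1} \log e\}$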
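The exponent $f_a$ can be traced back to the run-length count. The following is a sketch of the calculation, assuming $|C(n,k)| = s(n-k-2, k)$, the approximation $s(n, \lceil \log n \rceil + a) \approx 2^n / e^{2^{\Delta_n - a - 1}}$, and that replacing $n$ by $n-k-2$ inside the exponential factor does not change the asymptotics. It uses $2^{-\lceil \log n \rceil} = 2^{\Delta_n}/n$.

```latex
\begin{align*}
|C(n,k)| &= s(n-k-2,\,k), \qquad k = \lceil \log n \rceil + a \\
  &\approx \frac{2^{\,n-k-2}}{e^{2^{\Delta_n - a - 1}}}
   = \frac{2^n}{2^{\lceil \log n \rceil + a + 2}} \cdot 2^{-\log e \cdot 2^{\Delta_n - a - 1}} \\
  &= \frac{2^n}{n} \cdot 2^{\,\Delta_n - a - 2 - \frac{\log e}{2^{a+1}} 2^{\Delta_n}}
   = \frac{2^n}{n}\, 2^{f_a(\Delta_n)} .
\end{align*}
```

Substituting $a = -1$ gives $f_{-1}(\Delta) = \Delta - (2^{\Delta} \log e + 1)$ and $a = -2$ gives $f_{-2}(\Delta) = \Delta - 2^{\Delta+1} \log e$, which are the two terms inside the minimum in $F(\Delta)$.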

Outline

• Motivation

• Mutually Uncorrelated Codes
  • Well-known construction analysis
  • Non-fixed Run Length Limited constraint
  • Efficient construction

• $(d_h, d_m)$-Mutually Uncorrelated Codes
  • Definition
  • Upper bound on cardinality
  • Efficient construction

• Ongoing and future work

$(d_h, d_m)$-Mutually Uncorrelated Codes

• Minimum Hamming distance $d_h$

• Each prefix of length $i \in [1, n-1]$ differs from each suffix of length $i$ in at least $\min\{i, d_m\}$ symbols

[Figure: two codewords $u_1 u_2 u_3 u_4 \dots u_n$ and $v_1 v_2 v_3 v_4 \dots v_n$]
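Both conditions can be tested directly on small codes. The sketch below assumes that the prefix and suffix condition applies to every ordered pair of codewords, including a codeword paired with itself, and that "differs by" means "differs in at least"; the names `hamming` and `is_dh_dm_mu` are hypothetical.

```python
def hamming(u: str, v: str) -> int:
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(u, v))


def is_dh_dm_mu(code, dh: int, dm: int) -> bool:
    words = list(code)
    n = len(words[0])
    # Condition 1: distinct codewords are at Hamming distance >= dh.
    for idx, u in enumerate(words):
        for v in words[idx + 1:]:
            if hamming(u, v) < dh:
                return False
    # Condition 2: every length-i prefix differs from every length-i suffix
    # in at least min(i, dm) positions, for i = 1..n-1.
    for u in words:
        for v in words:
            for i in range(1, n):
                if hamming(u[:i], v[-i:]) < min(i, dm):
                    return False
    return True


print(is_dh_dm_mu({"00101", "00111"}, 1, 1))  # True: with dm = 1 this is the plain MU condition
print(is_dh_dm_mu({"00101", "00111"}, 1, 2))  # False: prefix "00" and suffix "01" differ in one symbol
print(is_dh_dm_mu({"0001011"}, 1, 2))         # True
```

With $d_m = 1$ the second condition reduces to the ordinary mutually uncorrelated property, which is why the first call agrees with a plain MU check.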

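$(d_h, d_m)$-MU Codes Upper Bound

Theorem: Let $C$ be a $(d_h, d_m)$-MU code. Then

$|C| \le \frac{M(n, d)}{n / d_m}$, $d = \min\{d_h, 2d_m\}$,

where $M(n, d)$ is the size of a maximal code of length $n$ with Hamming distance $d$.

Proof: Let $C$ be a $(d_h, d_m)$-MU code. Generate $C' = \{a^{(i)} \mid a \in C,\ i = x \cdot d_m + 1\}$

[Figure: shifts of a codeword $a$ by multiples of $d_m$]

$a_1 \dots a_{d_m} \dots a_{2d_m} \dots a_{3d_m} \dots a_{4d_m}$

$a_{2d_m} \dots a_{3d_m} \dots a_{4d_m}\ a_1 \dots a_{d_m}$

$a_{3d_m} \dots a_{4d_m}\ a_1 \dots a_{d_m} \dots a_{2d_m}$

$a_{4d_m}\ a_1 \dots a_{d_m} \dots a_{2d_m} \dots a_{3d_m}$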

$(d_h, d_m)$-MU Codes Upper Bound

Let $a', b' \in C'$ s.t. $a'$ is a shift of $a \in C$ and $b'$ is a shift of $b \in C$.

• [Figure: $a' = a_{d_m+1} \dots a_1 \dots a_{d_m}$ and $b' = b_{d_m+1} \dots b_1 \dots b_{d_m}$] Hamming distance $\ge d_h$

• [Figure: $a' = a_{4d_m}\ a_1 \dots$ and $b' = b_{4d_m}\ b_1 \dots$] Hamming distance $\ge 2d_m$

$(d_h, d_m)$-MU Codes Upper Bound

Let $C$ be a $(d_h, d_m)$-MU code. Generate a new code: $C' = \{a^{(i)} \mid a \in C,\ i = x \cdot d_m + 1\}$

• $|C'| = \frac{n}{d_m} \cdot |C|$

• $d_{\min}(C') \ge d = \min\{d_h, 2d_m\}$

• $|C'| \le M(n, d)$, where $M(n, d)$ is the size of a maximal code with Hamming distance $d$

$|C| \le \frac{M(n, d)}{n / d_m}$, $d = \min\{d_h, 2d_m\}$

$(d_h, d_m)$-MU Codes Construction

0 0 0 0 0 0 0 0 u u u u u u 1 1 1 1 1 1 1 1

Figure labels: $k$ zeros; $u$ vector; $d_m$ ones; $d_m$ ones

• A code of minimum distance $d_h$

• The weight of every length-$k$ window is $\ge d_m$

Recall the requirements:

• Minimum Hamming distance $d_h$

• Each prefix of length $i \in [1, n-1]$ differs from each suffix of length $i$ in at least $\min\{i, d_m\}$ symbols

Window Weight Limited Encoding

Problem: encode a length-$n$ vector into $n + d_m$ bits s.t. every window of length $< \lceil \log n \rceil + (d_m - 1) \log \log n + C$ has weight $\ge d_m$.

[Figure: 1 1 1 1 1 0 1, annotated with: index of the window ($\log n$ bits); $(d_m - 1)$ indexes, of $\log \log n$ bits each, of the ones within the window]

$(d_h, d_m)$-MU Codes Efficient Construction

0 0 0 0 0 0 0 0 u u u u u u 1 1 1 1 1 1 1 1

• A code of minimum distance $d_h$

• The weight of every length-$k$ window is $\ge d_m$

Encoding pipeline:

$x \in \mathbb{F}_2^{n'}$ → Window Weight Limited encoding → $y \in \mathbb{F}_2^{n' + d_m}$ → Systematic BCH → $z$ of length $n' + d_m + \frac{d_h - 1}{2} \log n'$

Codeword: $0^k\, u\, 1^{d_m}\, z\, 1^{d_m}$

Theorem: There exists a $(d_h, d_m)$-MU code with redundancy $\frac{d_h + 1}{2} \log n + (d_m - 1) \log \log n + O(1)$ and linear time and space complexity.

$(d_h, d_m)$-Mutually Uncorrelated Codes

             | Lower bound (for $2d_m \ge d_h$)    | Efficient construction
Redundancy   | $\frac{d_h + 1}{2} \log n + O(1)$   | $\frac{d_h + 1}{2} \log n + (d_m - 1) \log \log n + O(1)$

Ongoing and Future Work

• Strengthen bounds of $(d_h, d_m)$-MU codes

• Explore the factor-2 gap between the MU upper and lower bounds: the upper bound by Levenshtein, '70, is $\frac{2^n}{en}$, while the construction cardinality for $n = 2^i$ is $\frac{2^n}{2en}$

• Analyze the window weight limited constraint with non-fixed window length (the weight of every length-$k$ window is $\ge d_m$)

• Extend to additional DNA-motivated constraints such as balanced codes and edit distance

THANK YOU

Thanks to Ryan Gabrys and Olgica Milenkovic for helpful discussions on DNA storage