A two-error-detecting arithmetical check system for decimal identification numbers

378 European Journal of Operational Research 52 (1991) 378-381 North-Holland

Theory and Methodology

A two-error-detecting arithmetical check system for decimal identification numbers

Patrick Stevens

Generale Bank, Warandeberg 3, B-1000 Brussels, Belgium

Received October 1989

Abstract: This paper proposes an arithmetical check computation, applicable to any decimal identification number consisting of at most twenty information digits.

We first describe the check system for identification numbers of ten or fewer information digits; afterwards we generalize the idea to identification numbers consisting of more than ten but at most twenty information digits. In the respective cases the number of redundant check digits to be added is two or three, so that the information rate is always about 85%. The computed check digits detect the following types of errors which may corrupt the identification number as it passes through the operating system: any single error, any double error, and any combination of one altered information digit and two interchanged information digits.

We believe that this check system is of practical interest and worth considering whenever a new application involving identification numbers is set up.

Keywords: Information, control, error detection

1. Introduction

In recent years, many administrative services have come to add redundant check digits to their information numbers in order to detect errors which may corrupt an identification number. These numbers are applied, for instance, to identify persons (banking, hospitals, Sick Fund, Pension Fund), goods (books, consumer products, chemicals, parcels) or immovables (Land Registry Office).

In [1], some methods of computing check digits are compared, and it turns out that some of them have rather poor error-detecting capacities. An effective scheme is the ISBN code, described in [4], which detects all single errors and all transpositions of two digits by appending a single check digit to a nine-digit book number. As the method involves modulo 11 arithmetic, the character X is used to represent a check value equal to 10. This will also be true in the system that we propose in subsequent sections.
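As an illustration of the ISBN scheme mentioned above, the following minimal sketch computes the check character of a ten-character ISBN as the weighted sum of the nine book-number digits (weights 1 to 9) modulo 11. The function names are hypothetical and chosen for this example only.

```python
def isbn10_check_char(first_nine: str) -> str:
    """Return the ISBN-10 check character for a nine-digit book number."""
    if len(first_nine) != 9 or not (first_nine.isascii() and first_nine.isdigit()):
        raise ValueError("expected exactly nine decimal digits")
    # Weighted sum with weights 1..9, reduced modulo 11.
    total = sum(i * int(d) for i, d in enumerate(first_nine, start=1))
    value = total % 11
    # The value 10 is written as the character X.
    return "X" if value == 10 else str(value)


def isbn10_is_valid(isbn: str) -> bool:
    """Check a ten-character ISBN (no hyphens) against its check character."""
    return len(isbn) == 10 and isbn10_check_char(isbn[:9]) == isbn[9]


print(isbn10_check_char("030640615"))   # check character of 0-306-40615-?
print(isbn10_is_valid("0304606152"))    # two adjacent digits transposed
```

The second call shows a transposition of two adjacent digits being rejected, in line with the detection property described above.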

Recently, Gumm [2] discovered a group-theoretical method, involving the dihedral group of order 10 and a well-chosen permutation, that uses a single check digit and is foolproof in detecting any single error or any transposition of two digits. Moreover, Gumm's method detects a great many double errors as well.

0377-2217/91/$03.50 © 1991 - Elsevier Science Publishers B.V. (North-Holland)

A method that uses two check digits is applied in banking in Belgium. An account number is composed of a ten-digit user number $N$, followed by a two-digit check number $c$ ($0 \le c \le 96$), which is simply the remainder of dividing $N$ by 97, so that $N \equiv c \pmod{97}$. All single errors and transpositions of any two distinct digits are detected in this way, and only a very small percentage of double errors are undetectable by this modulo 97 control. In [3] we present two minor modifications to the current system such that the number of possible undetectable double errors decreases further, although some of them inevitably remain undetectable.
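The modulo 97 control can be sketched as follows, taking the check number exactly as described in the text (the remainder of $N$ divided by 97). The function names and the sample user number are assumptions made for this illustration.

```python
def mod97_check(user_number: int) -> int:
    """Two-digit check number: remainder of the ten-digit user number mod 97."""
    if not 0 <= user_number < 10**10:
        raise ValueError("user number must have at most ten digits")
    return user_number % 97


def format_account(user_number: int) -> str:
    """Ten-digit user number followed by the two-digit check number."""
    return f"{user_number:010d}{mod97_check(user_number):02d}"


def is_consistent(account: str) -> bool:
    """True when the last two digits equal the first ten digits mod 97."""
    if len(account) != 12 or not (account.isascii() and account.isdigit()):
        return False
    return int(account[:10]) % 97 == int(account[10:])


print(format_account(1234567890))
# Double error: digit 7 -> 8 and digit 9 -> 6 change N by 1000 - 30 = 970 = 97 * 10,
# so the check number is unchanged and the error goes unnoticed.
print(is_consistent("123456886002"))
```

The second call illustrates why some double errors escape a single modulus: any pair of digit changes whose combined effect on $N$ is a multiple of 97 leaves the remainder untouched.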

In this paper we introduce a check system which is capable of detecting any single error, any double error, and any triple error which is a combination of one altered information digit and two interchanged information digits. We present two subcases. In the first case, an identification number may consist of ten (or fewer) information digits followed by two check digits, such that the information rate is at most $R = 10/12 \approx 0.83$. In the second case, an identification number may consist of twenty (or fewer) information digits followed by three check digits, such that the information rate is at most $R = 20/23 \approx 0.87$. Throughout this paper, a check digit $c_p$ ($1 \le p \le 3$) is computed by some arithmetical rule modulo 11, implying that $0 \le c_p \le 10$. Consequently, we accept the character X to represent the integer 10, just as is done in the ISBN code.

2. Identification numbers composed of 10 information digits and 2 check digits

We denote the information digits by $a_i$, $1 \le i \le 10$, $0 \le a_i \le 9$, and the check digits by $c_p$, $1 \le p \le 2$, $0 \le c_p \le 10$. They are defined as follows:

$$c_1 \equiv \sum_{i=1}^{10} i\,a_i \pmod{11},$$

$$c_2 \equiv \sum_{i=1}^{10} a_i \pmod{11}.$$

We will discuss the different cases of one, two or three errors which may corrupt a number $a_1 a_2 \ldots a_{10} c_1 c_2$, and we will ascertain that the proposed check system detects all of them.
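The two check digits defined in this section can be computed with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, and it requires exactly ten information digits because the text does not specify how shorter numbers are aligned.

```python
def _to_char(value: int) -> str:
    """Write a check value 0..10 as a character, using X for 10."""
    return "X" if value == 10 else str(value)


def check_digits_10(info: str) -> str:
    """Return c1 and c2 (as two characters) for ten information digits."""
    if len(info) != 10 or not (info.isascii() and info.isdigit()):
        raise ValueError("expected exactly ten decimal digits")
    digits = [int(ch) for ch in info]
    c1 = sum(i * a for i, a in enumerate(digits, start=1)) % 11  # weights 1..10
    c2 = sum(digits) % 11                                        # unweighted sum
    return _to_char(c1) + _to_char(c2)


def is_consistent(number: str) -> bool:
    """True when the last two characters match the recomputed check digits."""
    return len(number) == 12 and check_digits_10(number[:10]) == number[10:]


print("1234567890" + check_digits_10("1234567890"))
print(is_consistent("2134567890X1"))  # first two digits transposed
```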

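(a) Single error

Let $a_k$ be changed into $a'_k$ ($1 \le k \le 10$), denoted by $a_k \to a'_k$. It is easily verified that $c_1$ as well as $c_2$ detects this single error: since $1 \le |a'_k - a_k| \le 9$ and $1 \le k \le 10$, neither $k(a'_k - a_k)$ nor $a'_k - a_k$ is divisible by 11.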

(b) Transposition of two digits

Let $a_k$ and $a_h$ be transposed ($1 \le k < h \le 10$, $a_k \ne a_h$), denoted by $a_k \leftrightarrow a_h$. This is detected by check digit $c_1$, as the congruence

$$k a_k + h a_h \equiv k a_h + h a_k \pmod{11},$$

being equivalent to

$$(k - h)(a_k - a_h) \equiv 0 \pmod{11},$$

has no solutions for $1 \le |k - h|, |a_k - a_h| \le 9$.

When $a_k$ and $c_1$ (respectively $c_2$) are transposed, this kind of error can be regarded as a single error $a_k \to a'_k = c_1$ (respectively $c_2$), and is thus detected by $c_2$ (respectively $c_1$).

When $c_1$ and $c_2$ are transposed, this is detected as both check sums no longer balance.

(c) Double error

When $a_k$ and $c_1$ (resp. $c_2$) are corrupted, this kind of error is viewed as a single error $a_k \to a'_k$ in the information part of the number; it is thus detected by $c_2$ (resp. $c_1$) and possibly, though not necessarily, also by the corrupted $c_1$ (resp. $c_2$).

When $c_1$ and $c_2$ are corrupted, this is detected as both check sums no longer balance.

We now consider the principal case where $a_k$ and $a_h$ are corrupted: $a_k \to a'_k$, $a_h \to a'_h$, $1 \le k < h \le 10$. We set

$$\alpha_k = a'_k - a_k, \qquad \alpha_h = a'_h - a_h,$$

so $1 \le |\alpha_k|, |\alpha_h| \le 9$.

This kind of double error is not detected by $c_1$ whenever

$$k\alpha_k + h\alpha_h \equiv 0 \pmod{11},$$

and is not detected by $c_2$ whenever

$$\alpha_k + \alpha_h \equiv 0 \pmod{11}.$$

As the set of congruences

$$k\alpha_k + h\alpha_h \equiv 0 \pmod{11},$$

$$\alpha_k + \alpha_h \equiv 0 \pmod{11},$$

$$1 \le k < h \le 10, \qquad 1 \le |\alpha_k|, |\alpha_h| \le 9,$$

has no solutions (see lemma below), all these double errors are detected by the check digits $c_1$ and/or $c_2$.

We conclude that the proposed check system is capable of detecting all single and double errors.
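The claim for double errors in the information digits can also be confirmed by exhaustive search. Because both check sums are linear, detection depends only on the two positions and the two digit differences, so the sketch below enumerates those instead of all ten-digit numbers. The function name is hypothetical.

```python
from itertools import combinations

MODULUS = 11
# Possible nonzero differences a' - a between two decimal digits.
DIFFERENCES = [d for d in range(-9, 10) if d != 0]


def undetected_double_errors():
    """List (k, h, alpha_k, alpha_h) that neither c1 nor c2 would detect."""
    missed = []
    for k, h in combinations(range(1, 11), 2):  # positions 1 <= k < h <= 10
        for alpha_k in DIFFERENCES:
            for alpha_h in DIFFERENCES:
                c1_blind = (k * alpha_k + h * alpha_h) % MODULUS == 0
                c2_blind = (alpha_k + alpha_h) % MODULUS == 0
                if c1_blind and c2_blind:
                    missed.append((k, h, alpha_k, alpha_h))
    return missed


print(undetected_double_errors())  # an empty list means every double error is caught
```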


Lemma. Let $a$, $b$, $c$, $d$ be four integers, all relatively prime to the integer $m$. The set of congruences

$$ax + by \equiv 0 \pmod{m},$$

$$cx + dy \equiv 0 \pmod{m},$$

has solutions $(x, y)$ where $x \not\equiv 0 \pmod{m}$ and $y \not\equiv 0 \pmod{m}$ iff $ad \equiv bc \pmod{m}$.

The proof is straightforward and is omitted here.
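For the modulus actually used in this paper, $m = 11$, the lemma can be checked by brute force. The following sketch covers only that prime modulus and is not a substitute for the omitted proof; the function names are hypothetical.

```python
M = 11  # the only modulus examined in this sketch


def has_nonzero_solution(a: int, b: int, c: int, d: int, m: int = M) -> bool:
    """True if some (x, y) with x, y not divisible by m solves both congruences."""
    return any(
        (a * x + b * y) % m == 0 and (c * x + d * y) % m == 0
        for x in range(1, m)
        for y in range(1, m)
    )


def lemma_holds_mod_11() -> bool:
    """Compare solvability with the condition ad = bc (mod 11) for all units."""
    units = range(1, M)  # residues relatively prime to 11
    return all(
        has_nonzero_solution(a, b, c, d) == ((a * d - b * c) % M == 0)
        for a in units
        for b in units
        for c in units
        for d in units
    )


print(lemma_holds_mod_11())
```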

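(d) One altered information digit and two transposed information digits

It is not difficult to verify that this particular type of triple error is detected by the check sum $c_2$: a transposition leaves the unweighted sum of the information digits unchanged, so only the altered digit affects $c_2$, exactly as in the single-error case.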

3. Identification numbers composed of 20 information digits and 3 check digits

We denote the information digits by $a_i$, $1 \le i \le 20$, $0 \le a_i \le 9$, and the check digits by $c_p$, $1 \le p \le 3$, $0 \le c_p \le 10$. They are defined as follows:

$$c_1 \equiv \sum_{i=1}^{10} i\,(a_i + a_{10+i}) \pmod{11},$$

$$c_2 \equiv \sum_{j=1}^{20} a_j \pmod{11},$$

$$c_3 \equiv \sum_{i=1}^{10} i\,(a_i - a_{10+i}) \pmod{11}.$$

A general representation is

$$c_p \equiv \sum_{j=1}^{20} \gamma_j^{(p)} a_j \pmod{11},$$

where the weight coefficients $\gamma_j^{(p)}$, $1 \le p \le 3$, $1 \le j \le 20$, are defined as follows:

$$\gamma_j^{(1)} = \begin{cases} j & \text{if } 1 \le j \le 10, \\ j - 10 & \text{if } 11 \le j \le 20, \end{cases}$$

$$\gamma_j^{(2)} = 1, \qquad 1 \le j \le 20,$$

$$\gamma_j^{(3)} = \begin{cases} j & \text{if } 1 \le j \le 10, \\ 21 - j & \text{if } 11 \le j \le 20. \end{cases}$$

We will discuss the different cases of one, two or three errors which may corrupt a number $a_1 a_2 \ldots a_{19} a_{20} c_1 c_2 c_3$, and we will ascertain that the proposed check system detects all of them. The notations to denote an error at some digit or a transposition of two digits are taken from the foregoing section.
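The three check digits of this section can be computed from the weight coefficients in the general representation. The sketch below is illustrative: the function names are hypothetical, and it requires exactly twenty information digits because the text does not specify how shorter numbers are aligned.

```python
def weight(p: int, j: int) -> int:
    """Weight coefficient for check digit p (1..3) at position j (1..20)."""
    if not 1 <= j <= 20:
        raise ValueError("position must lie between 1 and 20")
    if p == 1:
        return j if j <= 10 else j - 10
    if p == 2:
        return 1
    if p == 3:
        return j if j <= 10 else 21 - j
    raise ValueError("check digit index must be 1, 2 or 3")


def check_values_20(info: str) -> list:
    """Return [c1, c2, c3] as integers 0..10 for twenty information digits."""
    if len(info) != 20 or not (info.isascii() and info.isdigit()):
        raise ValueError("expected exactly twenty decimal digits")
    digits = [int(ch) for ch in info]
    return [
        sum(weight(p, j) * a for j, a in enumerate(digits, start=1)) % 11
        for p in (1, 2, 3)
    ]


def check_digits_20(info: str) -> str:
    """Return the three check digits as characters, using X for the value 10."""
    return "".join("X" if v == 10 else str(v) for v in check_values_20(info))


print("12345678901234567890" + check_digits_20("12345678901234567890"))
```

Working with the weights rather than the three separate sums keeps the code close to the general representation, which is also the form used in the error analysis.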

(a) Single error

Any single error $a_k \to a'_k$, $1 \le k \le 20$, is detected by each of $c_1$, $c_2$ and $c_3$, as the congruence $\gamma_k^{(p)}(a'_k - a_k) \equiv 0 \pmod{11}$, $a'_k \ne a_k$, has no solutions.

(b) Transposition of two digits

When $a_k$ and $c_p$, $1 \le k \le 20$, $1 \le p \le 3$, are transposed, this kind of error can be regarded as a single error $a_k \to a'_k = c_p$, and is thus detected by the other two check digits $c_q$, $q \ne p$, $1 \le q \le 3$.

When $c_p$ and $c_q$, $c_p \ne c_q$, $1 \le p < q \le 3$, are transposed, this is detected as both weighted check sums no longer balance.

When $a_k$ and $a_h$ are transposed, $1 \le k < h \le 20$, $a_k \ne a_h$, this is not detected by $c_p$ when

$$(\gamma_k^{(p)} - \gamma_h^{(p)})(a_k - a_h) \equiv 0 \pmod{11},$$

or, as $a_k - a_h \not\equiv 0 \pmod{11}$, when

$$\gamma_k^{(p)} \equiv \gamma_h^{(p)} \pmod{11}.$$

For $p = 1$: $\gamma_k^{(1)} \equiv \gamma_h^{(1)} \pmod{11}$ iff $1 \le k \le 10$ and $h = k + 10$, meaning that the check digit $c_1$ does not detect the transpositions $a_k \leftrightarrow a_{k+10}$, $1 \le k \le 10$.

For $p = 2$: $\gamma_k^{(2)} = \gamma_h^{(2)} = 1$ for all $k$, $h$, implying that $c_2$ does not detect any transposition.

For $p = 3$: $\gamma_k^{(3)} \equiv \gamma_h^{(3)} \pmod{11}$ iff $k + h = 21$, meaning that the check digit $c_3$ does not detect the transpositions $a_k \leftrightarrow a_{21-k}$, $1 \le k \le 10$.

Neither $c_1$ nor $c_3$ would be able to detect a transposition $a_k \leftrightarrow a_h$ iff $h = 10 + k = 21 - k$ for some $k$, $1 \le k \le 10$, which is contradictory. We conclude that either $c_1$ or $c_3$ (or both) does indeed detect any transposition $a_k \leftrightarrow a_h$.
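The blind spots of each check digit with respect to transpositions can be listed directly from the weight coefficients. The sketch below restates the weights of this section and collects, for each $p$, the position pairs with congruent weights; the function names are hypothetical.

```python
from itertools import combinations


def weight(p: int, j: int) -> int:
    """Weight coefficient for check digit p (1..3) at position j (1..20)."""
    if p == 1:
        return j if j <= 10 else j - 10
    if p == 2:
        return 1
    if p == 3:
        return j if j <= 10 else 21 - j
    raise ValueError("check digit index must be 1, 2 or 3")


def blind_transpositions(p: int) -> set:
    """Position pairs (k, h), k < h, whose transposition check digit p misses."""
    return {
        (k, h)
        for k, h in combinations(range(1, 21), 2)
        if (weight(p, k) - weight(p, h)) % 11 == 0
    }


print(sorted(blind_transpositions(1)))  # pairs (k, k + 10)
print(sorted(blind_transpositions(3)))  # pairs (k, 21 - k)
print(blind_transpositions(1) & blind_transpositions(3))  # no common pair
```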

(c) Double error

When $a_k$ and $c_p$, $1 \le k \le 20$, $1 \le p \le 3$, are corrupted, this kind of error is viewed as a single error $a_k \to a'_k$ in the information part of the number; it is thus detected by $c_q$, $q \ne p$, $1 \le q \le 3$, and possibly, though not necessarily, also by the corrupted $c_p$.


When $c_p$ and $c_q$, $1 \le p < q \le 3$, are corrupted, this is detected as these weighted check sums no longer balance.

We now consider the principal case where $a_k$ and $a_h$ are corrupted: $a_k \to a'_k$, $a_h \to a'_h$, $1 \le k < h \le 20$. We set

$$\alpha_k = a'_k - a_k, \qquad \alpha_h = a'_h - a_h,$$

so $1 \le |\alpha_k|, |\alpha_h| \le 9$. This kind of double error is not detected by $c_p$ whenever

$$\gamma_k^{(p)} \alpha_k + \gamma_h^{(p)} \alpha_h \equiv 0 \pmod{11}.$$

Unlike in the check system that we described in the foregoing section, there now exist some patterns of double errors which are detected neither by $c_1$ nor by $c_2$. This occurs whenever the set of congruences

$$\gamma_k^{(1)} \alpha_k + \gamma_h^{(1)} \alpha_h \equiv 0 \pmod{11},$$

$$\alpha_k + \alpha_h \equiv 0 \pmod{11},$$

has solutions $(\alpha_k, \alpha_h)$, $\alpha_k \ne 0$, $\alpha_h \ne 0$. According to the above-mentioned lemma, the necessary and sufficient condition for this is $\gamma_k^{(1)} \equiv \gamma_h^{(1)} \pmod{11}$, i.e. $h = k + 10$, $1 \le k \le 10$.

However, the double errors that are detected neither by $c_1$ nor by $c_2$ are detected by $c_3$ (this is the reason why we have to add a third check digit in order to establish a two-error-detecting check system). Indeed, the set of congruences

$$\gamma_k^{(3)} \alpha_k + \gamma_{k+10}^{(3)} \alpha_{k+10} \equiv 0 \pmod{11}, \qquad 1 \le k \le 10,$$

$$\alpha_k + \alpha_{k+10} \equiv 0 \pmod{11},$$

is equivalent to

$$k\,\alpha_k + (11 - k)\,\alpha_{k+10} \equiv 0 \pmod{11},$$

$$\alpha_k + \alpha_{k+10} \equiv 0 \pmod{11},$$

and has no solutions $(\alpha_k, \alpha_{k+10})$, $\alpha_k \not\equiv 0 \pmod{11}$, $\alpha_{k+10} \not\equiv 0 \pmod{11}$, as the congruence $k \equiv 11 - k \pmod{11}$ has no solution $k$ with $1 \le k \le 10$.

We have shown that either $c_1$ or $c_2$ or $c_3$ (or two of them, or all three of them) does indeed detect any double error.

We conclude that the proposed check system is capable of detecting all single and double errors.
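The role of the third check digit can be made visible by exhaustive search over positions and digit differences. The sketch below restates the weights of this section and reports which position pairs admit a double error missed by a given set of check digits; the function names are hypothetical.

```python
from itertools import combinations

# Possible nonzero differences a' - a between two decimal digits.
DIFFERENCES = [d for d in range(-9, 10) if d != 0]


def weight(p: int, j: int) -> int:
    """Weight coefficient for check digit p (1..3) at position j (1..20)."""
    if p == 1:
        return j if j <= 10 else j - 10
    if p == 2:
        return 1
    if p == 3:
        return j if j <= 10 else 21 - j
    raise ValueError("check digit index must be 1, 2 or 3")


def pairs_with_missed_double_error(checks) -> set:
    """Position pairs (k, h) having a double error that all given checks miss."""
    missed = set()
    for k, h in combinations(range(1, 21), 2):
        for alpha_k in DIFFERENCES:
            for alpha_h in DIFFERENCES:
                if all(
                    (weight(p, k) * alpha_k + weight(p, h) * alpha_h) % 11 == 0
                    for p in checks
                ):
                    missed.add((k, h))
    return missed


print(sorted(pairs_with_missed_double_error((1, 2))))     # the pairs (k, k + 10)
print(sorted(pairs_with_missed_double_error((1, 2, 3))))  # nothing is missed
```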

(d) One altered information digit and two transposed information digits

It is not difficult to verify that this particular type of triple error is detected by the check sum $c_2$. Indeed, $a_k \leftrightarrow a_h$, $a_m \to a'_m$, is not detected by $c_p$ when

$$(\gamma_k^{(p)} - \gamma_h^{(p)})(a_h - a_k) + \gamma_m^{(p)}(a'_m - a_m) \equiv 0 \pmod{11},$$

which is contradictory for $p = 2$: the first term vanishes because $\gamma_k^{(2)} = \gamma_h^{(2)} = 1$, while $a'_m - a_m \not\equiv 0 \pmod{11}$.
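A concrete check of this argument is sketched below for one sample number. It applies every transposition of two information digits together with every alteration of a third information digit and records the cases in which the unweighted sum $c_2$ is unchanged. The function names and the sample number are assumptions, as is the restriction that the altered position differs from the two transposed ones.

```python
from itertools import combinations


def c2(digits) -> int:
    """Unweighted check sum c2 of the information digits, modulo 11."""
    return sum(digits) % 11


def triple_errors_missed_by_c2(digits) -> list:
    """List (k, h, m, new_digit) triple errors that leave c2 unchanged.

    Positions are 0-based here; m is assumed distinct from k and h.
    """
    reference = c2(digits)
    missed = []
    n = len(digits)
    for k, h in combinations(range(n), 2):
        for m in range(n):
            if m in (k, h):
                continue
            for new_digit in range(10):
                if new_digit == digits[m]:
                    continue  # not an alteration
                corrupted = list(digits)
                corrupted[k], corrupted[h] = corrupted[h], corrupted[k]
                corrupted[m] = new_digit
                if c2(corrupted) == reference:
                    missed.append((k, h, m, new_digit))
    return missed


sample = [int(ch) for ch in "31415926535897932384"]
print(triple_errors_missed_by_c2(sample))  # an empty list: c2 catches them all
```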

4. Concluding remarks

We have introduced two similar arithmetical check systems for decimal identification numbers. Both methods are capable of detecting any single error, any double error and those triple errors which are a combination of one altered information digit and two interchanged information digits. The maximal information rates of the two presented systems are respectively $10/12 \approx 83\%$ and $20/23 \approx 87\%$.

The first method, applicable to any identification number consisting of at most ten information digits, is more efficient than the current check system used in banking in Belgium.

The second method, applicable to any identification number consisting of at most twenty information digits, is a valuable error-detecting check system for future applications in which rather long identification numbers will be involved. We think for example of new international identification numbers that will possibly be established towards 1992, when far-reaching European standardization will probably give rise to the need for uniform identification numbers in various domains.

References

[1] Gallian, J.A., and Winters, S., "Modular arithmetic in the marketplace", American Mathematical Monthly 95 (1988) 548-551.

[2] Gumm, H.P., "A new class of check-digit methods for arbitrary number systems", IEEE Transactions on Information Theory 31 (1985) 102-105.

[3] Stevens, P., "Two suggestions to improve on the efficiency of the check computations in the banking system in Belgium", European Journal of Operational Research 42 (1989) 52-58.

[4] Tuchinsky, P.M., "International Standard Book Numbers", The UMAP Journal 5 (1985) 41-54.
