

528 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 2, FEBRUARY 2006

Vectorial Boolean Functions and Induced Algebraic Equations

Jovan Dj. Golić, Member, IEEE

Abstract: A general mathematical framework behind algebraic cryptanalytic attacks is developed. The framework relates to finding algebraic equations induced by vectorial Boolean functions and, in particular, equations of low algebraic degree. The equations may involve only a subset of input variables and may or may not be conditioned on the values of output variables. In addition, the equations may have a constrained form interesting for the so-called fast algebraic attacks. A possible divide-and-conquer effect is pointed out and the notion of algebraic immunity order, naturally extending the notion of correlation immunity order, is defined. An application of general results to stream ciphers known as combiners with or without memory, with possibly multiple outputs, is studied in particular detail and the concept of divide-and-conquer algebraic attacks is introduced. Special properties of combiners with finite input memory, such as nonlinear filter generators, are also established. It is also pointed out that Gröbner basis algorithms may be used for finding low-degree induced algebraic equations.

Index Terms: Algebraic attacks, algebraic equations, combiners with memory, Gröbner basis algorithms, vectorial Boolean functions.

I. INTRODUCTION

ALGEBRAIC attacks are cryptanalytic attacks that reconstruct the secret key in cryptosystems by manipulating and solving the underlying algebraic equations. Solving algebraic equations is generally a difficult problem, but it has been demonstrated that for particular ciphers, finding solutions faster than by exhaustive search may sometimes be possible. This is especially the case if the algebraic equations have a relatively low algebraic degree when considered as multivariate polynomial equations. If the number of such equations is sufficiently large, then it may be possible to apply the well-known linearization algorithm, which consists of replacing the products of variables, that is, the monomials, by new variables and of solving the resulting system of linear equations in the new variables, for example, by Gaussian elimination. It is shown in [17] that even if the number of low-degree equations is not large enough for the linearization algorithm to work, one can derive new equations of increased degree by multiplying the original equations by monomials of a limited degree so that, provided that the original system is overdefined, the new system may contain sufficiently many linearly independent equations for the linearization algorithm to work. This algorithm is known as the XL algorithm. It is demonstrated in [4] that the linearization and XL algorithms may even be applicable if the low-degree algebraic equations do not hold with certainty, but with probabilities sufficiently close to 1.

Manuscript received January 18, 2005; revised July 15, 2005. This work was supported in part by the Italian Ministry for Education and Research (MIUR) within the framework of the PRIMO FIRB project.

The author is with Security Innovation, Telecom Italia, 10148 Turin, Italy (e-mail: [email protected]).

Communicated by E. Okamoto, Associate Editor for Complexity and Cryptography.

Digital Object Identifier 10.1109/TIT.2005.862101

It is shown in [5] that the XL algorithm can be applied to block ciphers by associating low-degree algebraic equations with S-boxes and that the algorithm can be improved for sparse systems of low-degree equations. For product block ciphers, the complexity then grows polynomially with the number of rounds, but for real ciphers such as the Advanced Encryption Standard (AES) the algorithm is on the borderline of being effective, because of the large number of intermediate variables introduced and because of possible linear dependences among the derived equations. In [3], it is shown that the XL algorithm can be viewed as a special case of the Gröbner basis algorithms and that these algorithms may be more efficient than the XL algorithm.

In [6], it is pointed out that algebraic attacks may be effective against stream ciphers known as memoryless combiners and nonlinear filter generators, in which the output is produced by a Boolean function applied to the internal state sequence generated by an autonomous linear finite state machine, for example, by a number of linear feedback shift registers (LFSRs). In this case, the degree of the algebraic equations cannot increase when the inputs to the Boolean function are expressed as linear functions of the secret key and may even decrease if these functions are not linearly independent. Note that an application of the linearization algorithm for cryptanalyzing a stream cipher of a similar type is given in [16]. The main points of [6] are that a Boolean function of a given algebraic degree may induce algebraic equations of lower degrees and that for any Boolean function of variables, there exists an induced algebraic equation of degree at most . As a consequence, if the number of such algebraic equations induced by known keystream bits is sufficiently large, then the secret key can be recovered by the linearization algorithm.
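The linearization step mentioned in the Introduction can be sketched in a few lines. The following is a minimal illustration, not an implementation taken from any of the cited works: the function names, the bitmask encoding of equations, and the toy system built from every nonzero combination of monomials are all assumptions made for the example, and the solver assumes a consistent system.

```python
from itertools import combinations

def monomials(n, d):
    """Monomials of degree 1..d in n variables, as tuples of variable indices."""
    return [c for k in range(1, d + 1) for c in combinations(range(n), k)]

def solve_gf2(rows, ncols):
    """Gauss-Jordan elimination over GF(2) for a consistent system.

    Each row is an int: bit j (j < ncols) is the coefficient of unknown j and
    bit ncols is the right-hand side. Returns the solution as a list of bits,
    or None when the equations do not determine every unknown.
    """
    rows = list(rows)
    for col in range(ncols):
        pivot = next((i for i in range(col, len(rows)) if rows[i] >> col & 1), None)
        if pivot is None:
            return None  # too few linearly independent equations
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for i in range(len(rows)):
            if i != col and rows[i] >> col & 1:
                rows[i] ^= rows[col]
    return [rows[j] >> ncols & 1 for j in range(ncols)]

def linearize_and_solve(equations, n, d):
    """Treat every monomial as a fresh unknown and solve the linear system.

    equations: (mask, rhs) pairs, bit j of mask selecting the j-th monomial.
    """
    unknowns = len(monomials(n, d))
    solution = solve_gf2([mask | (rhs << unknowns) for mask, rhs in equations], unknowns)
    # Degree-1 monomials come first, so the first n unknowns are the key bits.
    return None if solution is None else solution[:n]

# Hypothetical toy instance: 4 key bits, quadratic equations, overdefined system.
n, d = 4, 2
secret = [1, 0, 1, 1]
values = [int(all(secret[i] for i in m)) for m in monomials(n, d)]
equations = [
    (mask, sum(v for j, v in enumerate(values) if mask >> j & 1) % 2)
    for mask in range(1, 1 << len(values))
]
recovered = linearize_and_solve(equations, n, d)
```

The sketch makes the cost of linearization visible: the number of unknowns is the number of monomials of degree at most `d`, so the attack needs at least that many linearly independent equations.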

This result is extended to binary combiners with inputs and bits of memory in [1], where it is proven that there exists an induced algebraic equation over consecutive -bit inputs of degree , when conditioned on consecutive output bits. A generalization to combiners with memory and multiple outputs is given in [8], but the existence of nontrivial equations involving only the input variables is not guaranteed. This is due to the specific unconditional scenario considered, where the algebraic equations are allowed to involve the output variables, with no restriction regarding their algebraic degree.

0018-9448/$20.00 © 2006 IEEE


The low-degree algebraic equations induced by consecutive output bits can be found by solving the underlying system of linear equations, for example, by Gaussian elimination, with complexity . Some complexity reductions are proposed in [15].

It may be interesting to point out the analogy between finding low-degree induced algebraic equations and finding linear correlations in combiners with memory, where the objective is to find linear equations, i.e., algebraic equations of degree 1, that are allowed to hold with probabilities different from one half. For example, see [14], [10], and [11], whereas for combiners with multiple binary outputs, see [12].

In [7], it is shown that algebraic attacks can be faster if a low-degree equation found in the input and output variables has a special form, that is, if the monomials with the highest degrees in input variables do not depend on output variables. As a result, these monomials can be eliminated by using the linear complexity properties of the corresponding stream cipher, so that the algebraic degree is effectively reduced. Some further complexity clarifications and reductions can be found in [13].

An approach to formalizing the existing results regarding the existence of low-degree algebraic equations for combiners with a single binary output, with or without memory, and for S-boxes in block ciphers, in terms of the so-called annihilators [15] of Boolean functions, along with some algorithmic considerations, can be found in [2].

The first objective of this paper is to provide a general, unified, and simple mathematical framework for finding induced algebraic equations of low algebraic degree, for arbitrary vectorial Boolean functions in the conditional, unconditional, and constrained unconditional scenarios, where a subset of input variables may be required to be eliminated from the equations, thus achieving a divide-and-conquer effect. The conditions relate to whether the output of the function considered is assumed to be known or not, whereas the constraints relate to fast algebraic attacks. The framework is based on the concept of algebraic equations induced by subsets of binary vectors.

The second objective is to use the divide-and-conquer effect of induced algebraic equations for generalizing the notion of correlation immunity order [18] to that of algebraic immunity order, for any vectorial Boolean function, and to study its properties. The third objective is to apply the general results to combiners with or without memory and thus obtain new results relating to the three scenarios considered, to combiners with finite input memory, to the influence of multiple outputs, and to the divide-and-conquer effect achievable.

Sections II and III are devoted to the first two objectives, respectively. In Section IV, an application of the theory of ideals and Gröbner basis algorithms for finding induced algebraic equations is outlined. The third objective is dealt with in Sections V and VI, for general combiners with memory and for combiners with finite input memory, respectively. The results given in Sections V and VI relate to the general case allowing a divide-and-conquer effect as well as to the specific case without a divide-and-conquer effect. Conclusions are given in Section VII. Compact proofs of the basic mathematical results are provided in Sections II and III, whereas the proofs of the derived results from the remaining sections are left to the reader.

II. INDUCED ALGEBRAIC EQUATIONS

Consider a vectorial Boolean function

denoted as . Let , where

denotes the range of . Further, let

and

Also, let , where

(Here and throughout, the logarithms are taken to the base 2.)

Our objective is to study algebraic equations induced by , that is, by the equation . In the conditional scenario, is fixed and the algebraic equations involve only , whereas in the unconditional scenario, is variable and the algebraic equations involve both and . Accordingly, in neither of the two scenarios, is allowed to be involved in the algebraic equations.

An algebraic equation is defined by a multivariate polynomial required to be equal to zero. In the binary case, there is a one-to-one correspondence between Boolean functions and multivariate polynomials with binary coefficients in which the exponents of the power terms take integer values 0 or 1. Namely, any Boolean function can be uniquely represented by its algebraic normal form. For induced algebraic equations, Boolean functions are required to be equal to zero on appropriate subsets of the values of involved variables.
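The one-to-one correspondence between Boolean functions and their algebraic normal forms can be made concrete with the binary Möbius transform. The sketch below is illustrative only; the truth-table layout (bit `i` of the index holds the value of the `i`-th variable) and the function names are assumptions of this example.

```python
def anf(truth_table):
    """Binary Moebius transform of a truth table of length 2^n.

    Returns coef, where coef[u] == 1 iff the monomial whose variables are
    the set bits of u appears in the algebraic normal form.
    """
    coef = list(truth_table)
    n = len(coef).bit_length() - 1
    for i in range(n):
        for x in range(len(coef)):
            if x >> i & 1:
                coef[x] ^= coef[x ^ (1 << i)]
    return coef

def degree(coef):
    """Algebraic degree: largest monomial size; -1 for the zero polynomial."""
    return max((bin(u).count("1") for u, c in enumerate(coef) if c), default=-1)

or_table = [0, 1, 1, 1]    # x0 OR x1
xor_table = [0, 1, 1, 0]   # x0 XOR x1
or_anf = anf(or_table)     # x0 + x1 + x0*x1
xor_anf = anf(xor_table)   # x0 + x1
```

Applying `anf` twice returns the original truth table, which is one way to see that the representation is unique.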

More precisely, for any nonempty subset , an algebraic equation induced by is an equation of the form , , where or denotes a multivariate polynomial in , i.e., a Boolean function specified by its algebraic normal form. We equivalently say that such an algebraic equation is satisfied on . We also say that defines an algebraic equation on . An induced algebraic equation is called trivial if is the zero polynomial, and nontrivial otherwise. Given a subset , the set of all such or, equivalently, the set of all induced algebraic equations is a vector space, as it is closed under addition.

In particular, we are interested in algebraic equations of low (algebraic) degree, which are defined by multivariate polynomials of degree at most , where . The set of all such polynomials, i.e., the corresponding algebraic equations is also a vector space, for any given and . Let denote the set of all the multivariate polynomials in variables of degree at most . Any such polynomial can be characterized as a linear combination of (linearly independent) monomials of degree at most . Let be an matrix whose rows and columns are indexed by the vectors from and by the monomials from , respectively, and whose entries are defined by evaluating the monomials on these vectors, respectively.

The following lemmas are essentially behind all the theorems to be derived in the sequel, which in fact correspond to particular subsets . As such, they are also behind the known results on combiners with or without memory from [6], [1], [7], and [8].

Lemma 1: For any , there exists a nontrivial algebraic equation on iff .

Proof: By definition, a nontrivial algebraic equation on is defined by a nonzero Boolean function , , such that , . Such a function exists iff the set is not empty.

Lemma 2: For any and , defines a nontrivial algebraic equation on iff the columns of corresponding to the monomials in add up to zero.

Proof: As each is a linear combination of different monomials from , it follows that the equations and , , are equivalent. The claim then follows from the definition of the matrix .

Lemma 3: For any , there exists a that defines a nontrivial algebraic equation on if

(1)

Proof: If (1) is true, then the number of columns exceeds the number of rows in . As a consequence, the columns are linearly dependent and the claim then follows from Lemma 2.
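The counting step in this proof can be written out explicitly. The symbols below are assumptions introduced for illustration, since the notation of the statement is not recoverable here: $\mathcal{X} \subseteq \{0,1\}^n$ stands for the inducing subset, $d$ for the degree bound, and $M$ for the monomial evaluation matrix with one row per vector of $\mathcal{X}$ and one column per monomial of degree at most $d$.

```latex
\[
\underbrace{\sum_{i=0}^{d}\binom{n}{i}}_{\text{columns of } M}
\;>\;
\underbrace{|\mathcal{X}|}_{\text{rows of } M}
\quad\Longrightarrow\quad
\operatorname{rank} M \;\le\; |\mathcal{X}| \;<\; \sum_{i=0}^{d}\binom{n}{i}.
\]
```

Under these assumptions the columns cannot all be linearly independent, so some nonempty set of columns adds up to zero, and by Lemma 2 the corresponding sum of monomials defines a nontrivial equation of degree at most $d$.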

If the binary coefficients defining any as a linear combination of different monomials of degree at most are represented as a binary one-column matrix , then the vector space of all algebraic equations induced by can be obtained by solving the system of linear equations in , for example, by Gaussian elimination with time complexity

and space complexity

The dimension of this vector space equals
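The monomial evaluation procedure described above can be sketched as follows. This is a hedged illustration rather than the paper's algorithm verbatim: it builds each column of the evaluation matrix as a bitmask over the rows and keeps track of which combinations of columns add up to zero, which is one of several equivalent ways to run Gaussian elimination over GF(2). The function names and the example set are assumptions.

```python
from itertools import combinations, product

def monomials_up_to(n, d):
    """Monomials of degree 0..d in n variables, as tuples of variable indices."""
    return [c for k in range(d + 1) for c in combinations(range(n), k)]

def induced_equations(points, n, d):
    """Basis of the polynomials of degree <= d that vanish on `points`.

    Each polynomial is returned as a list of monomials (index tuples); the
    empty tuple is the constant 1.
    """
    monos = monomials_up_to(n, d)
    basis = []    # (reduced column, combination of original columns) pairs
    kernel = []
    for j, mono in enumerate(monos):
        col = 0
        for r, p in enumerate(points):
            if all(p[i] for i in mono):
                col |= 1 << r            # monomial evaluates to 1 on row r
        combo = 1 << j
        for bcol, bcombo in basis:
            if col ^ bcol < col:         # leading bit of bcol is set in col
                col ^= bcol
                combo ^= bcombo
        if col:
            basis.append((col, combo))
        else:
            # These columns add up to zero, so their monomials sum to a
            # polynomial vanishing on every point (Lemma 2).
            kernel.append([monos[t] for t in range(len(monos)) if combo >> t & 1])
    return kernel

# Hypothetical inducing set: all 3-bit vectors with x0 * x1 == 0.
points = [p for p in product((0, 1), repeat=3) if not (p[0] and p[1])]
quadratic = induced_equations(points, 3, 2)
linear = induced_equations(points, 3, 1)
```

In this example the set has 6 points and there are 7 monomials of degree at most 2, so the counting argument of Lemma 3 already guarantees one quadratic equation; the sketch finds it to be the single monomial with indices (0, 1).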

A. Conditional Algebraic Equations

In the conditional scenario, given a vectorial Boolean function , the objective is to find algebraic equations over when is assumed to be fixed and known. Variables in should thus be eliminated. This scenario is particularly interesting for the cryptanalysis of stream ciphers in the known keystream sequence scenario, provided that is used as an output function producing a binary keystream sequence from an internal state sequence generated by a next-state function.

A nontrivial conditional algebraic equation over induced by and any given is defined by a nonzero multivariate polynomial such that , for .

Theorem 1: For any vectorial Boolean function and any given , there exists a nontrivial algebraic equation over iff , and a polynomial defines a nontrivial algebraic equation over iff the columns of corresponding to the monomials in add up to zero. There exists a value of inducing a nontrivial algebraic equation over of degree at most if and

(2)

and, in particular, if and

(3)

If and is an integer, then (3) holds if .

Proof: The first claim immediately follows from Lemma 1, by setting and . By the same token, the second claim follows from Lemma 2.

The third claim is proven by applying Lemma 3. Namely, the total number of rows in all the matrices corresponding to different values of is exactly . Therefore, among them there must exist a value of defining a matrix whose number of rows is at most . By setting , for such a , and by applying Lemma 3, we then obtain that there exists a nontrivial algebraic equation on of degree at most if (2) is satisfied, because holds for such a . In addition, for (2) to hold, it is necessary that .

The fourth claim is true due to , whereas the last claim follows from the fact that if and is an integer, then , so that .
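The pigeonhole step of this proof can be illustrated numerically. The sketch below makes simplifying assumptions that are not in the text: every input variable is kept (no variables are eliminated), the function is given as a Python callable on bit tuples, and the helper names are hypothetical. It partitions the inputs by output value, picks the output value with the fewest preimages, and returns the smallest degree for which the counting condition of Lemma 3 holds for that value.

```python
from itertools import product
from math import comb

def preimage_classes(f, n):
    """Group all n-bit inputs by the output value of f."""
    classes = {}
    for x in product((0, 1), repeat=n):
        classes.setdefault(f(x), []).append(x)
    return classes

def guaranteed_degree(f, n):
    """Smallest d with (number of monomials of degree <= d) > (fewest preimages).

    Returns None for a constant function, where the only class is the whole
    input space and no nontrivial equation exists (Lemma 1).
    """
    smallest = min(len(v) for v in preimage_classes(f, n).values())
    if smallest == 2 ** n:
        return None
    d, count = 0, 1
    while count <= smallest:
        d += 1
        count += comb(n, d)
    return d

def majority(x):
    return int(sum(x) >= 2)

d_majority = guaranteed_degree(majority, 3)        # Boolean function of 3 variables
d_identity = guaranteed_degree(lambda x: x, 3)     # invertible function
d_constant = guaranteed_degree(lambda x: 0, 3)     # constant function
```

For the 3-variable majority function both classes have 4 elements, and 1 + 3 + 3 = 7 monomials of degree at most 2 exceed 4, so a conditional equation of degree at most 2 is guaranteed, while 1 + 3 = 4 monomials of degree at most 1 do not suffice for the counting argument.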

B. Unconditional Algebraic Equations

In the unconditional scenario, given a vectorial Boolean function , the objective is to find algebraic equations over and . Variables in should thus be eliminated. This scenario is particularly interesting for the cryptanalysis of block and stream ciphers where is an internal function whose output is unknown, e.g., corresponding to an S-box of an intermediate round of a product block cipher or to an intermediate stage of an iterated construction of an output function in a stream cipher.


A nontrivial unconditional algebraic equation over and induced by is defined by a nonzero multivariate polynomial such that , for .

Theorem 2: For any vectorial Boolean function , there exists a nontrivial algebraic equation over and iff , and a polynomial defines a nontrivial algebraic equation over and iff the columns of corresponding to the monomials in add up to zero. There exists an induced nontrivial algebraic equation over and of degree at most if and

(4)

and, in particular, if and

(5)

Also, if , then (5) holds if .

Proof: Now, we set and , where . Then the first three claims directly follow from Lemmas 1–3, respectively. Note that for (4) to hold, it is necessary that . The fourth claim is due to , whereas the last claim is a consequence of .

In a special case, by assuming that is empty ( ), we get and , and Theorem 2 is then about induced algebraic equations over the output variables, in .
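For S-boxes, the counting argument behind the unconditional scenario reduces to simple arithmetic. The sketch below assumes a function with `n` input bits and `m` output bits in which no input variable is eliminated, so that the evaluation matrix has exactly `2**n` rows (one per input) and one column per monomial of degree at most `d` in the `n + m` input and output variables. The function name and parameters are hypothetical.

```python
from math import comb

def guaranteed_equations(n, m, d):
    """Lower bound, from counting alone, on the number of linearly independent
    equations of degree <= d over n input and m output variables."""
    columns = sum(comb(n + m, i) for i in range(d + 1))
    rows = 2 ** n
    return max(columns - rows, 0)

small_sbox = guaranteed_equations(4, 4, 2)   # 4-bit S-box, quadratic equations
byte_sbox = guaranteed_equations(8, 8, 2)    # 8-bit S-box, quadratic equations
```

For any 4-bit S-box there are 1 + 8 + 28 = 37 monomials against 16 rows, so at least 21 linearly independent quadratic equations exist. For an 8-bit S-box the 137 monomials do not exceed the 256 rows, so counting alone gives no guarantee; any quadratic equations then have to come from the structure of the particular S-box.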

C. Constrained Unconditional Algebraic Equations

In the constrained unconditional scenario, given a vectorial Boolean function , the objective is to find algebraic equations over and having certain specific properties. As before, variables in should be eliminated. In particular, we look for algebraic equations having any degree in such that the monomials with the highest degrees in do not depend on . This scenario is interesting for the so-called fast algebraic attacks on stream ciphers [7], as the high-degree monomials in can be eliminated by using the linear complexity properties of the corresponding stream cipher.

More precisely, let denote the set of all the multivariate polynomials in and , , of degree at most in and of any degree in such that the monomials whose degrees in are larger than do not depend on , where . A nontrivial constrained unconditional algebraic equation over and induced by is defined by a nonzero multivariate polynomial such that , for . Let denote a matrix whose rows and columns are indexed by the vectors from and by the monomials from , respectively, and whose entries are defined by evaluating the monomials on these vectors, respectively.

Theorem 3: For any vectorial Boolean function and , a polynomial defines a nontrivial algebraic equation over and iff the columns of corresponding to the monomials in add up to zero. There exists an induced nontrivial algebraic equation over and defined by a polynomial if and

(6)

and, in particular, if and

(7)

Furthermore, if , then for any such , there exists a value of for which is a nonzero polynomial in .

Proof: As in the proof of Theorem 2, we set and , where . However, instead of considering the algebraic equations defined by , we now consider the algebraic equations defined by , where the number of monomials in is equal to

Then the first two claims follow from analogs of Lemmas 2 and 3 pertaining to and , respectively. As in Theorem 2, for (6) to hold, it is necessary that , whereas the third claim is due to . The last claim is proven by contraposition. Namely, if and for each , a polynomial is equal to zero for every , then it follows that is the zero polynomial in and , which contradicts the assumption.

III. ALGEBRAIC IMMUNITY ORDER

Consider a vectorial Boolean function , denoted as , and let , where is the range of . Here, and can be regarded as sets of binary variables.

Let denote a generic subset of variables from of size and let denote the subset of the remaining variables from having size . For simplicity, we can then write , with an appropriate ordering of variables, and consider algebraic equations involving that are induced by .

Lemma 4: There are no nontrivial algebraic equations over induced by any value of iff the set does not depend on , that is, iff for each fixed value of , the set of output values of , for all different values of , is the same and hence equal to .

Proof: Theorem 1 implies that for any given , there exists a nontrivial algebraic equation over induced by iff . Consequently, there are no nontrivial algebraic equations over induced by any value of iff for every , which in turn holds iff for each and , there exists a value of such that . Further, this is true iff for every .

If the property characterized by Lemma 4 holds for some , then it also holds for any subset of . Lemma 4 thus justifies the following definition.

Definition 1: The maximal such that for every subset of size , the set is the same for each value of is called the algebraic immunity order of . In particular, if , that is, if the range of is maximal, then such is called the algebraic resiliency order.

Recall that the correlation immunity/resiliency order of is defined analogously [18], with a difference that is regarded as a multiset, so that it is the distribution of output values that matters, not only the output values themselves. In particular, for the correlation resiliency order, this multiset is required to be balanced, i.e., uniformly distributed over .
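Definition 1 can be checked by brute force for small functions. The sketch below is a direct, exponential-time reading of the definition and nothing more: for each size `k` it fixes every subset of `k` input variables to every value, collects the set of outputs over the remaining variables, and compares it with the full range. The function names and the test functions, including the 3-variable multiplexer (which is not necessarily the one used in the paper's example), are assumptions of this illustration.

```python
from itertools import combinations, product

def algebraic_immunity_order(f, n):
    """Largest k such that fixing any k input variables to any value leaves
    the set of attainable outputs equal to the full range of f."""
    full_range = {f(x) for x in product((0, 1), repeat=n)}
    order = 0
    for k in range(1, n + 1):
        for fixed in combinations(range(n), k):
            free = [i for i in range(n) if i not in fixed]
            for values in product((0, 1), repeat=k):
                outputs = set()
                for rest in product((0, 1), repeat=n - k):
                    x = [0] * n
                    for i, v in zip(fixed, values):
                        x[i] = v
                    for i, v in zip(free, rest):
                        x[i] = v
                    outputs.add(f(tuple(x)))
                if outputs != full_range:
                    return order     # the property fails at size k
        order = k
    return order

xor3 = algebraic_immunity_order(lambda x: x[0] ^ x[1] ^ x[2], 3)
and2 = algebraic_immunity_order(lambda x: x[0] & x[1], 2)
const = algebraic_immunity_order(lambda x: 0, 3)
mux = algebraic_immunity_order(lambda x: x[2] if x[0] else x[1], 3)
```

Only the set of outputs is compared here, not how often each output occurs, which is exactly the difference from the correlation immunity order noted above. Stopping at the first failing size is justified because the property for a subset carries over to all of its subsets.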

Proposition 1: If is the algebraic immunity order of and , then there exists a subset of of size and an output value such that there is a nontrivial algebraic equation over induced by .

Proposition 2: If is the algebraic immunity order of a vectorial Boolean function of variables, then and is greater than or equal to the correlation immunity order of . Further, iff , i.e., iff is a constant function.

A Boolean function of variables is said to effectively depend on all the variables if it is not degenerate in any of the variables, that is, if for any variable, there exist values of the remaining variables such that changes if that variable changes. Equivalently, the algebraic normal form of contains all variables. It follows that any such is nonconstant, i.e., that .

Proposition 3: If is the algebraic immunity order of a Boolean function of variables, then iff is a nonconstant affine function effectively depending on all variables, i.e., iff , where is a binary constant.

Proof: Assume that . Then Proposition 2 implies that . If we fix the values of any subset of variables and vary the remaining variable, then the corresponding restriction of attains both output values iff it is nonconstant affine in this variable. Further, if for any fixed value of the chosen subset of variables, the corresponding restriction of is nonconstant affine in the remaining variable, then, due to the Shannon decomposition, is nonconstant affine in this variable. As this holds for any variable, it follows that is a nonconstant affine function effectively depending on all variables, i.e., that , where is a binary constant.

Conversely, assume that is a nonconstant affine function effectively depending on all variables. If we fix the values of any subset of variables and vary the remaining variable, then the corresponding restriction of attains both output values, since it is nonconstant affine in this variable. Hence, .

Note that is nonconstant affine iff its algebraic degree is equal to 1. So, if a Boolean function of variables is not affine, that is, if the algebraic degree of is at least 2, then . Except in this particular case, there is no tradeoff between the algebraic immunity order and algebraic degree of similar to that between correlation immunity order and algebraic degree of [18], due to a less restrictive definition. This can easily be seen from counter-examples.

Example 1: The so-called multiplexer function

is balanced and has algebraic degree . Its correlation resiliencyorder is , while its algebraic resiliency order is .

Definition 2: If is the algebraic immunity order of a vectorial Boolean function of variables, then for each , we can find the minimal degree of induced nontrivial algebraic equations over all subsets of size and over all output values . The resulting sequence is nonincreasing and is called the induced algebraic degree profile of . The minimal value , that is, the minimal degree of nontrivial induced algebraic equations over all and , is called the induced algebraic degree of .

The name induced algebraic degree appears to be more appropriate than the name algebraic immunity introduced in [15]. In the analogy between algebraic and correlation attacks, nontrivial induced algebraic equations correspond to nonzero linear equations holding with a probability different from one half, when conditioned on the output values. Note that the deviation from one half is measured by the so-called correlation coefficients. In this analogy, algebraic immunity/resiliency order corresponds to correlation immunity/resiliency order, with respect to the divide-and-conquer effect, whereas the degrees of induced algebraic equations and the correlation coefficients play analogous roles influencing the effectiveness of the corresponding algebraic and correlation attacks, respectively. In particular, the induced algebraic degree corresponds to the maximal absolute value of the correlation coefficients.

From Theorem 1, we obtain the following upper bound on the induced algebraic degree.

Proposition 4: The induced algebraic degree is smaller than or equal to the minimal value of satisfying

(8)

For nonconstant Boolean functions, and hence [6], [15]. For invertible functions, and hence . This is natural as any known output value uniquely determines the input value, and that can be expressed by linear equations in the input variables. In general, an application of the well-known inequality

(9)

holding for , where is the binary entropy function, then results in the following numerical approximation to the upper bound on , for :

(10)

On the other hand, it directly follows that cannot be greater than the minimum of the algebraic degrees of component Boolean functions of .
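The two special cases stated after Proposition 4 can be checked by hand under an assumed form of (8). The sketch below supposes that (8) reads $\sum_{i=0}^{d}\binom{n}{i} > 2^{n}/|\mathcal{Z}|$, where $n$ is the number of input variables, $d$ the degree, and $|\mathcal{Z}|$ the number of distinct output values. This form and these symbols are a reconstruction from the counting argument in the proof of Theorem 1, not a quotation of the original formula.

```latex
\begin{align*}
\text{nonconstant Boolean function } (|\mathcal{Z}| = 2):\quad
  & \sum_{i=0}^{d}\binom{n}{i} > 2^{\,n-1}
  \iff d \ge \lceil n/2 \rceil, \\
\text{invertible function } (|\mathcal{Z}| = 2^{n}):\quad
  & \sum_{i=0}^{d}\binom{n}{i} > 1
  \iff d \ge 1 .
\end{align*}
```

As a numerical check of the first line, for $n = 5$ the partial sums of binomial coefficients are 1, 6, 16, 26, and the threshold $2^{4} = 16$ is first strictly exceeded at $d = 3 = \lceil 5/2 \rceil$. The second line matches the remark that a known output of an invertible function yields linear equations in the input variables.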

For cryptographic functions, it is reasonable to require that the algebraic immunity order is close to maximum, , that the induced algebraic degree is also close to maximum, which is the minimal value of satisfying (8), and that the induced algebraic degree profile is close to being strictly decreasing. Further study of the algebraic immunity order is an interesting area for future investigations.

IV. IDEALS AND GRÖBNER BASIS ALGORITHMS

For any nonempty subset , if is a binary multivariate polynomial defining an induced algebraic equation on , then, for any binary multivariate polynomial , the polynomial also defines an induced algebraic equation on . As the set of all such is closed under addition, it follows that this set is an ideal in the ring of binary multivariate polynomials. For the basic concepts regarding ideals and related algorithms, see [9]. To keep a one-to-one correspondence between the polynomials and Boolean functions, the polynomials should be considered modulo the ideal generated by the polynomials defining the binary field equations over variables. (Note that in the binary field subtraction is the same as addition.) The modular reduction is performed by replacing positive integer-valued exponents of the power terms by 1. The ideal of the resulting polynomials (i.e., Boolean functions in the algebraic normal form) that define all the algebraic equations induced by is denoted by .
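The reduction modulo the binary field equations can be sketched in a few lines. The representation used here is an assumption made for illustration: a polynomial over the binary field is a collection of monomials, each given as a tuple of nonnegative integer exponents, and a monomial that occurs an even number of times cancels.

```python
from typing import FrozenSet, Iterable, Tuple

Monomial = Tuple[int, ...]


def reduce_mod_field_equations(poly: Iterable[Monomial]) -> FrozenSet[Monomial]:
    """Replace every positive exponent by 1 and add coinciding monomials mod 2."""
    reduced = set()
    for mono in poly:
        squarefree = tuple(1 if e > 0 else 0 for e in mono)
        reduced ^= {squarefree}  # toggling membership is addition over GF(2)
    return frozenset(reduced)


# x0^2*x1 + x0*x1 + x1^3 in two variables: the first two terms cancel
# after reduction, leaving the single monomial x1.
example = [(2, 1), (1, 1), (0, 3)]
reduced_example = reduce_mod_field_equations(example)
```

After this reduction every polynomial is square-free in each variable, so it coincides with the algebraic normal form of the Boolean function it defines.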

The vector subspace of all of degree at most is denoted by , , where consists of the zero polynomial. The induced algebraic degree of , , is defined as the minimal such that the dimension of is positive.

It follows that is a principal ideal generated by the indicator function (polynomial) of defined by , , and , , that is,

(11)

It also follows that

(12)

or, in terms of ideals associated with sets

(13)

where the sum of ideals and is defined as

Note that (11) and (13) hold because the polynomials are considered modulo the ideal . In general, we would have and .
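The principal-ideal property can be verified exhaustively for a small number of variables when Boolean functions are represented by truth tables. The sketch below assumes that the generator is the indicator function of the complement of the set on which the equations hold (equivalently, one plus the indicator of the set itself); the set X and the helper name are hypothetical.

```python
from itertools import product
from typing import Set, Tuple


def check_principal_ideal(X: Set[int], n: int) -> Tuple[bool, int]:
    """Compare the functions vanishing on X with the multiples of one generator.

    Points of F_2^n are encoded as integers 0 .. 2^n - 1 and Boolean
    functions as truth-table tuples indexed by these integers.
    """
    size = 1 << n
    g0 = tuple(0 if x in X else 1 for x in range(size))  # indicator of the complement
    all_functions = list(product((0, 1), repeat=size))
    ideal = {g for g in all_functions if all(g[x] == 0 for x in X)}
    multiples = {tuple(h[x] & g0[x] for x in range(size)) for h in all_functions}
    return ideal == multiples, len(ideal)


same, ideal_size = check_principal_ideal({0b001, 0b010, 0b111}, 3)
```

The pointwise product of truth tables corresponds to polynomial multiplication followed by reduction modulo the field equations, which is why the check is carried out on truth tables rather than on unreduced polynomials.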

For example, according to Section II, consider a vectorial Boolean function , , , and algebraic equations over induced by any given value of . The set of all the polynomials defining such equations is then , and in view of (11) and (13), we then have

(14)

The characterization of the set of induced algebraic equations by ideals does not have any impact on the main results of this paper, which are about the existence of induced algebraic equations with certain properties. However, it may have an impact on the algorithms for finding such equations. Recall that the basic, monomial evaluation algorithm described in Section II for finding low-degree induced algebraic equations is based on Lemma 2, uses Gaussian elimination, and requires the set to be precomputed.
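A minimal sketch of a monomial evaluation algorithm of this kind is given below. It is not reproduced from Section II; it assumes that the precomputed set is a list of points of F_2^n encoded as integers (bit i is the value of variable i), evaluates every monomial of degree at most d on these points, and uses Gaussian elimination over the binary field to find the linear combinations of monomials that vanish on the whole set. All names are hypothetical.

```python
from itertools import combinations
from typing import Dict, Iterable, List, Tuple


def monomials_up_to_degree(n: int, d: int) -> List[int]:
    """Bit masks of all monomials of degree <= d in n variables."""
    return [sum(1 << i for i in vs)
            for k in range(d + 1) for vs in combinations(range(n), k)]


def evaluate(poly: List[int], x: int) -> int:
    """Value at x of a polynomial given as a list of monomial masks."""
    return sum(1 for u in poly if x & u == u) % 2


def vanishing_polynomials(X: Iterable[int], n: int, d: int) -> List[List[int]]:
    """Basis of the polynomials of degree <= d that vanish on every point of X."""
    pts = sorted(set(X))
    monos = monomials_up_to_degree(n, d)
    basis: Dict[int, Tuple[int, int]] = {}  # pivot -> (evaluation vector, combination)
    result = []
    for j, u in enumerate(monos):
        vec = 0
        for r, x in enumerate(pts):
            if x & u == u:
                vec |= 1 << r
        combo = 1 << j
        while vec:
            pivot = vec.bit_length() - 1
            if pivot not in basis:
                basis[pivot] = (vec, combo)
                break
            bvec, bcombo = basis[pivot]
            vec ^= bvec
            combo ^= bcombo
        else:
            # The evaluation vector reduced to zero: combo is a vanishing polynomial.
            result.append([monos[i] for i in range(len(monos)) if combo >> i & 1])
    return result


# Hypothetical set: the inputs on which x0*x1 + x2 takes the value 1.
X = [0b011, 0b100, 0b101, 0b110]
```

For this toy set no nonzero polynomial of degree at most 1 vanishes, while a three-dimensional space of polynomials of degree at most 2 does. The cost is dominated by the number of monomials times the number of points, which is what makes the explicit precomputation of the set a limitation.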

If the set is defined implicitly, e.g., in terms of a vectorial Boolean function specified by a set of polynomials, then some of the many different polynomial bases generating the ideal may be useful for finding low-degree induced algebraic equations. In particular, the Gröbner bases with respect to the graded (reverse) lexicographic ordering of monomials appear to be very interesting in this regard. Note that given a monomial ordering, the so-called reduced Gröbner basis of a given polynomial ideal is unique.

There exist a number of algorithms for computing Gröbner bases, starting from the well-known Buchberger’s algorithm (e.g., see [9] and [3]). The ones that can produce the Gröbner basis elements in order of increasing degree, possibly via a homogenization of the underlying ideal, are especially interesting, since low-degree polynomials in can be obtained as linear combinations, with polynomial coefficients, of low-degree elements of a Gröbner basis. One may thus obtain the sets in order of increasing degree as well as the induced algebraic degree . To apply Gröbner basis algorithms, the polynomials should not be reduced modulo the ideal , which means that the ideal should be considered instead. For example, instead of given by (14), one should consider the ideal . Note that the Gröbner basis elements produced by such an algorithm do not contain power terms, in variables , with exponents bigger than , so that they do not have to be reduced modulo . It is likely that such algorithms may be more efficient than the basic, monomial evaluation algorithm.

V. COMBINERS WITH MEMORY

A binary combiner with inputs, outputs, and bits of memory is a nonautonomous finite-state machine defined by and , , where is a next-state vectorial Boolean function, is an output vectorial Boolean function, is a state (memory) vector at time , is an initial state, is an input vector at time , and is the output vector at time . The input sequences are generated by a linear autonomous finite-state machine, typically, by LFSRs.

One objective may be to find induced algebraic equations over the input sequences when the output sequences are assumed to be known, in the conditional scenario, and over the input and output sequences, in the unconditional scenario. In both scenarios, the state sequence has to be eliminated from the equations. This can be achieved by studying the associated vectorial Boolean function , for a generic , that defines a block of consecutive outputs as a function of the corresponding block of consecutive inputs and the preceding state, that is, . Namely, if and denote blocks of consecutive output and input vectors starting at time , respectively, then it follows that , , where is independent of . This function is used in [10] and [11] for studying the linear correlations in general combiners with memory.
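The associated block function can be made concrete on a toy example. The sketch below assumes the convention that the output at a given time and the next state are both functions of the current input and the current state, and it uses a summation-style combiner with two inputs, one output, and one bit of memory (a carry) purely as an illustration; the combiner and all names are assumptions, not objects defined in the text.

```python
from itertools import product
from typing import Set, Tuple

Bits = Tuple[int, ...]


def next_state(s: Bits, x: Bits) -> Bits:
    """Carry update: majority of the two input bits and the current carry."""
    (c,), (a, b) = s, x
    return ((a & b) ^ (a & c) ^ (b & c),)


def output(s: Bits, x: Bits) -> Bits:
    """Output bit: sum modulo 2 of the two input bits and the current carry."""
    (c,), (a, b) = s, x
    return (a ^ b ^ c,)


def block_function(M: int, s0: Bits, xs: Tuple[Bits, ...]) -> Tuple[Bits, ...]:
    """Map the preceding state and M consecutive inputs to M consecutive outputs."""
    assert len(xs) == M
    s, ys = s0, []
    for x in xs:
        ys.append(output(s, x))
        s = next_state(s, x)
    return tuple(ys)


def block_range(M: int) -> Set[Tuple[Bits, ...]]:
    """All output blocks reachable over every preceding state and input block."""
    inputs = list(product((0, 1), repeat=2))
    return {block_function(M, (c,), xs)
            for c in (0, 1) for xs in product(inputs, repeat=M)}
```

Because the same function is applied at every time, the block function does not depend on the starting time of the block, and its range can be enumerated once and reused.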

More generally, if a binary combiner with or without memory consists of a number of different LFSRs, then the objective of algebraic attacks may be to reconstruct the initial states of a chosen subset of LFSRs, , by using the induced algebraic equations involving only the input sequences corresponding to this subset, where the remaining input sequences have to be eliminated from the equations. This divide-and-conquer effect may thus significantly reduce the complexity of algebraic attacks. Accordingly, it is interesting to study the conditions under which such algebraic equations exist by applying the results from Section II, where, for simplicity, we concentrate only on the sufficient conditions given in Theorems 1–3. However, we also bear in mind the necessary and sufficient conditions from these theorems, which show that the induced algebraic equations may also exist if these sufficient conditions are not satisfied.

To this end, we consider the associated vectorial Boolean function , with an appropriate partition of input variables. Namely, let denote the vector of input variables corresponding to the chosen subset of input sequences, where , and let denote the vector of the remaining input variables. We can then write and consider algebraic equations involving that are induced by , where both and have to be eliminated from the equations.

Theorems from Section II are then applied to the vectorial Boolean function , by setting , , , and . Consequently, , , and , while the value of denoted as can depend on in various ways. In particular, to satisfy for any , it is necessary that . Induction over immediately shows that to satisfy for each , it is sufficient that the output function achieves all values for any fixed value of the state vector. This is a generalization of the condition from [1], which holds for . Note that for and linear algebraic equations, the set , the matrix , and an application of the Gaussian elimination method are proposed in [12], when considering combiners with memory and single or multiple binary outputs.

Theorem 4: For any binary combiner with inputs, outputs, and bits of memory having the associated vectorial Boolean function , where , there exists a value of inducing a nontrivial algebraic equation over of degree at most if and

(15)

If , , and , then (15) holds if .

Theorem 5: For any binary combiner with inputs, outputs, and bits of memory having the associated vectorial Boolean function , where , there exists an induced nontrivial algebraic equation over and of degree at most if , , and

(16)

In particular, if and , then (16) holds if .

Theorem 6: For any binary combiner with inputs, outputs, and bits of memory having the associated vectorial Boolean function , where , and for , there exists an induced nontrivial algebraic equation over and defined by a polynomial of degree at most in and of any degree in , such that the monomials whose degrees in are larger than do not depend on , if , , and

(17)

Furthermore, if and , then there exists a value of for which such a polynomial is a nonzero polynomial in .

With respect to the sufficient conditions, it thus turns out that multiple binary outputs ( ) generally reduce the number of consecutive outputs required and the algebraic degree induced. Moreover, they also enable a divide-and-conquer effect which facilitates algebraic attacks. In particular, pay attention to the condition involved, where is the number of inputs to be eliminated. However, even if a combiner has a single binary output ( ), a divide-and-conquer effect may still be achievable. For example, it is shown in Section III that if a Boolean function of input variables is not affine, then there must exist a subset of input variables allowing a nontrivial algebraic equation.

In a special case when a divide-and-conquer effect is not required, we set , and Theorems 4–6 then assume special forms shown below as Corollaries 1–3, respectively. Note that in this case, the condition is satisfied for any .

Corollary 1: For any binary combiner with inputs, outputs, and bits of memory having the associated vectorial Boolean function , there exists a value of inducing a nontrivial algebraic equation over of degree at most if and

(18)

If , , and , then (18) holds if .

Corollary 2: For any binary combiner with inputs, outputs, and bits of memory having the associated vectorial Boolean function , there exists an induced nontrivial algebraic equation over and of degree at most if and

(19)

In particular, if , then (19) holds if .

Corollary 3: For any binary combiner with inputs, outputs, and bits of memory having the associated vectorial Boolean function and for , there exists an induced nontrivial algebraic equation over and defined by a polynomial of degree at most in and of any degree in , such that the monomials whose degrees in are larger than do not depend on , if and

(20)

Furthermore, if and , then there exists a value of for which such a polynomial is a nonzero polynomial in .

Corollary 1 is a generalization of the corresponding theorem from [1], which holds for and . This theorem itself generalizes an earlier theorem from [6], which holds for . Note that the corresponding theorem from [8], holding for any and , is not a proper generalization of the theorem from [1] as it holds in the unconditional scenario where the algebraic degree in can be arbitrary and as such does not guarantee the existence of a value of yielding a nontrivial algebraic equation only in . In this regard, Corollary 1 in fact verifies in a mathematically precise way the conjecture from [8] that a nontrivial algebraic equation exists if “the output bits are fully independent and not related by some algebraic equation and if the output takes all possible values.”

Corollary 3 is a generalization of the corresponding theorem from [7], which holds for , , and , that is, for a memoryless combiner with a Boolean output function. In this case, for (20) to hold, it is sufficient that . Note that the last claim in Corollary 3 is new and is important as it establishes the existence of nontrivial algebraic equations only in .
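The conditional scenario can be explored by brute force on very small combiners. The following sketch is only an illustration and does not implement the bounds of the corollaries: it assumes a toy summation-style combiner with two inputs, one output, and one carry bit, collects for a given output block all input blocks that are consistent with it for at least one preceding state, and then searches for the smallest degree at which a nonzero polynomial in the input bits vanishes on that set. All names are hypothetical.

```python
from itertools import combinations, product
from typing import List, Optional, Sequence, Tuple


def step(c: int, a: int, b: int) -> Tuple[int, int]:
    """One clock of the toy combiner: returns (output bit, next carry)."""
    return a ^ b ^ c, (a & b) ^ (a & c) ^ (b & c)


def output_block(c0: int, inputs: Sequence[int]) -> Tuple[int, ...]:
    """inputs is the flat tuple (a_0, b_0, a_1, b_1, ...)."""
    c, ys = c0, []
    for t in range(0, len(inputs), 2):
        y, c = step(c, inputs[t], inputs[t + 1])
        ys.append(y)
    return tuple(ys)


def consistent_inputs(M: int, y_block: Tuple[int, ...]) -> List[Tuple[int, ...]]:
    """Input blocks producing y_block for at least one preceding state."""
    return [x for x in product((0, 1), repeat=2 * M)
            if any(output_block(c0, x) == y_block for c0 in (0, 1))]


def nullity(points: Sequence[Sequence[int]], n: int, d: int) -> int:
    """Dimension of the space of polynomials of degree <= d vanishing on points."""
    monos = [m for k in range(d + 1) for m in combinations(range(n), k)]
    basis = {}  # pivot position -> reduced evaluation vector
    for m in monos:
        vec = 0
        for r, x in enumerate(points):
            if all(x[i] for i in m):
                vec |= 1 << r
        while vec:
            p = vec.bit_length() - 1
            if p not in basis:
                basis[p] = vec
                break
            vec ^= basis[p]
    return len(monos) - len(basis)


def minimal_degree(points: Sequence[Sequence[int]], n: int) -> Optional[int]:
    """Smallest degree of a nonzero vanishing polynomial, or None if none exists."""
    for d in range(n + 1):
        if nullity(points, n, d) > 0:
            return d
    return None
```

For this toy combiner a single output bit imposes no constraint on the two current input bits, since the unknown carry can absorb any value, whereas every block of two consecutive output bits induces an equation of degree 2 in the four input bits.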

VI. COMBINERS WITH FINITE INPUT MEMORY

A combiner is said to have finite input memory if the state vector is composed of a finite number of consecutive preceding bits in each of the input sequences, so that current output bits can be expressed as a function of the current input bits and a finite number of preceding bits in each of the input sequences. Examples include a well-known binary nonlinear filter generator, which can be regarded as a binary combiner with finite input memory and with one input and one or more outputs and, more generally, a binary combiner composed of a number of LFSRs in which the outputs of a number of stages from each of the LFSRs are taken to the input of a vectorial Boolean function to produce the output.
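For concreteness, a nonlinear filter generator can be sketched as follows. The LFSR length, the feedback recurrence s_{t+5} = s_t + s_{t+2}, the tap positions, and the filter function are all illustrative assumptions and do not come from the text.

```python
from typing import Iterator, List, Sequence


def lfsr(state: Sequence[int], feedback_taps: Sequence[int]) -> Iterator[List[int]]:
    """Yield successive states of a Fibonacci LFSR over the binary field."""
    s = list(state)
    while True:
        yield list(s)
        new_bit = 0
        for i in feedback_taps:
            new_bit ^= s[i]
        s = s[1:] + [new_bit]


def filter_function(a: int, b: int, c: int) -> int:
    """Hypothetical nonlinear filter: a*b + c over the binary field."""
    return (a & b) ^ c


def keystream(state: Sequence[int], feedback_taps: Sequence[int],
              filter_taps: Sequence[int], length: int) -> List[int]:
    """Apply the filter function to selected stages of each LFSR state."""
    gen = lfsr(state, feedback_taps)
    out = []
    for _ in range(length):
        s = next(gen)
        out.append(filter_function(*(s[i] for i in filter_taps)))
    return out


# Five-stage register; the output depends on a window of the LFSR sequence,
# i.e., on the newest bit and a finite number of preceding bits.
bits = keystream([1, 0, 0, 0, 0], feedback_taps=[0, 2], filter_taps=[0, 1, 3], length=5)
```

Each output bit is a fixed function of a sliding window of the single input sequence, which is exactly the finite input memory structure: the state is nothing more than the preceding bits of that sequence.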

Consider a binary combiner with inputs, outputs, and bits of finite input memory with a general objective to find induced algebraic equations involving only a chosen subset of input sequences . In this case, the algebraic equations can involve the , , state bits corresponding to these input sequences. Namely, we can then write

where , , and both and have to be eliminated from the equations. Since the induced algebraic equations are allowed to involve a number of state bits, their degrees can become smaller in comparison with the case when the combiner memory is not finite with respect to the input. In view of (9) and (10), an estimate of the reduced degree in terms of the binary entropy function is also provided.

Theorem 7: For any binary combiner with inputs, outputs, and bits of finite input memory having the associated vectorial Boolean function , where and , there exists a value of inducing a nontrivial algebraic equation over and of degree at most if and

(21)

If , , and , then (21) holds if as well as if

(22)

provided that .

Theorem 8: For any binary combiner with inputs, outputs, and bits of finite input memory having the associated vectorial Boolean function , where and , there exists an induced nontrivial algebraic equation over , , and of degree at most if , , and

(23)

If and , then (23) holds if as well as if

(24)

provided that .

Theorem 9: For any binary combiner with inputs, outputs, and bits of finite input memory having the associated vectorial Boolean function , where and , and for , there exists an induced nontrivial algebraic equation over , , and defined by a polynomial of degree at most in and and of any degree in , such that the monomials whose degrees in and are larger than do not depend on , if , , and

(25)

Furthermore, if and , then there exists a value of for which such a polynomial is a nonzero polynomial in and .

The divide-and-conquer effect provided by Theorems 7–9 has interesting practical implications for combiners with finite input memory and multiple binary outputs ( ). For example, if , then any subset of at most LFSRs can be eliminated from algebraic equations, regardless of how many inputs to the combining function are effectively taken from each of the involved LFSRs.

In a special case, when a divide-and-conquer effect is not required, we set and , and Theorems 7–9 then assume special forms shown below as Corollaries 4–6, respectively. In this case, nontrivial equations exist for any and their degrees may be reduced.

Corollary 4: For any binary combiner with inputs, outputs, and bits of finite input memory having the associated vectorial Boolean function such that and for any , there exists a value of inducing a nontrivial algebraic equation over and of degree at most if

(26)

and, in particular, if . If and , then (26) holds if

(27)

provided that .

Corollary 5: For any binary combiner with inputs, outputs, and bits of finite input memory having the associated vectorial Boolean function and for any , there exists an induced nontrivial algebraic equation over , , and of degree at most if

(28)

and, in particular, if . Also, (28) holds if

(29)

provided that .

Corollary 6: For any binary combiner with inputs, outputs, and bits of finite input memory having the associated vectorial Boolean function , , and for any , there exists an induced nontrivial algebraic equation over , , and defined by a polynomial of degree at most in and and of any degree in , such that the monomials whose degrees in and are larger than do not depend on , if

(30)

Furthermore, if and , then there exists a value of for which such a polynomial is a nonzero polynomial in and .

Corollary 4 can be considered as a generalization of the corresponding theorem from [6] when applied to nonlinear filter generators, which can be obtained by setting and , where is equal to the number of inputs to the combining output function if these inputs are taken from consecutive stages of the underlying LFSR. In particular, Corollary 4 enables one to consider a nonlinear filter generator ( ) for and thus analyze the impact of a number of output bits, greater than one, on the algebraic equations induced.

VII. CONCLUSION

Some general concepts and results on vectorial Boolean functions and induced algebraic equations are presented. The conditional scenario considered is more useful for stream ciphers, whereas the unconditional scenario may be useful for product block ciphers and for stream ciphers whose output function depends on a large number of state variables and is obtained by an iterated construction. The constrained unconditional scenario is useful for the so-called fast algebraic attacks on stream ciphers. A divide-and-conquer effect of induced algebraic equations is pointed out and the notion of correlation immunity/resiliency order [18] is accordingly extended to that of algebraic immunity/resiliency order and its basic properties are derived.

The new contributions on combiners with or without memory include differentiating among the three scenarios, showing the importance of the range of the underlying vectorial Boolean function, proving that a finite input memory (e.g., in a nonlinear filter generator) can considerably reduce the induced algebraic degree, and introducing a possible divide-and-conquer effect, especially in the case of multiple binary outputs, which can significantly reduce the complexity of algebraic attacks.

It is pointed out that the polynomials defining induced algebraic equations form an ideal in the ring of binary multivariate polynomials and some basic properties are established. It is then argued that Gröbner basis algorithms may be useful for finding induced algebraic equations of low degree. This is an interesting topic for future investigations and may lead to more efficient algorithms than the basic algorithm using the evaluation of low-degree monomials. It is also interesting to look for algorithms in a more general case where the equations are allowed to hold with a high probability rather than with certainty.

ACKNOWLEDGMENT

The author wishes to thank anonymous referees for useful suggestions.

REFERENCES

[1] F. Armknecht and M. Krause, “Algebraic attacks on combiners with memory,” in Advances in Cryptology – Crypto 2003 (Lecture Notes in Computer Science). Berlin, Germany: Springer-Verlag, 2003, vol. 2729, pp. 162–176.

[2] F. Armknecht, “On the existence of low-degree equations for algebraic attacks,” Cryptology ePrint Archive, Rep. 2004/185, 2004.

[3] G. Ars, J.-C. Faugère, H. Imai, M. Kawazoe, and M. Sugita, “Comparison between XL and Gröbner basis algorithms,” in Advances in Cryptology – Asiacrypt 2004 (Lecture Notes in Computer Science). Berlin, Germany: Springer-Verlag, 2004, vol. 3329, pp. 338–353.

[4] N. Courtois, “Higher order correlation attacks, XL algorithm and cryptanalysis of Toyocrypt,” in Information Security and Cryptology – ICISC 2002 (Lecture Notes in Computer Science). Berlin, Germany: Springer-Verlag, 2002, vol. 2587, pp. 182–199.

[5] N. Courtois and J. Pieprzyk, “Cryptanalysis of block ciphers with overdefined systems of equations,” in Advances in Cryptology – Asiacrypt 2002 (Lecture Notes in Computer Science). Berlin, Germany: Springer-Verlag, 2002, vol. 2501, pp. 267–287.

[6] N. Courtois and W. Meier, “Algebraic attacks on stream ciphers with linear feedback,” in Advances in Cryptology – Eurocrypt 2003 (Lecture Notes in Computer Science). Berlin, Germany: Springer-Verlag, 2003, vol. 2656, pp. 345–359.

[7] N. Courtois, “Fast algebraic attacks on stream ciphers with linear feedback,” in Advances in Cryptology – Crypto 2003 (Lecture Notes in Computer Science). Berlin, Germany: Springer-Verlag, 2003, vol. 2729, pp. 176–194.

[8] N. Courtois, “Algebraic attacks on combiners with memory and several outputs,” in Information Security and Cryptology – ICISC 2004 (Lecture Notes in Computer Science). Berlin, Germany: Springer-Verlag, 2005, vol. 3506, pp. 3–20.

[9] D. A. Cox, J. B. Little, and D. O’Shea, Ideals, Varieties, and Algorithms. Berlin, Germany: Springer-Verlag, 1996.

[10] J. Dj. Golić, “Correlation via linear sequential circuit approximation of combiners with memory,” in Advances in Cryptology – Eurocrypt ’92 (Lecture Notes in Computer Science). Berlin, Germany: Springer-Verlag, 1993, vol. 658, pp. 113–123.

[11] J. Dj. Golić, “Correlation properties of a general binary combiner with memory,” J. Cryptol., vol. 9, no. 2, pp. 111–126, 1996.

[12] J. Dj. Golić, “Conditional correlation attack on combiners with memory,” Electron. Lett., vol. 32, no. 24, pp. 2193–2195, 1996.

[13] P. Hawkes and G. Rose, “Rewriting variables: The complexity of fast algebraic attacks on stream ciphers,” in Advances in Cryptology – Crypto 2004 (Lecture Notes in Computer Science). Berlin, Germany: Springer-Verlag, 2004, vol. 3152, pp. 390–406.

[14] W. Meier and O. Staffelbach, “Correlation properties of combiners with memory in stream ciphers,” J. Cryptol., vol. 5, no. 1, pp. 67–86, 1992.

[15] W. Meier, E. Pasalic, and C. Carlet, “Algebraic attacks and decomposition of Boolean functions,” in Advances in Cryptology – Eurocrypt 2004 (Lecture Notes in Computer Science). Berlin, Germany: Springer-Verlag, 2004, vol. 3027, pp. 474–491.

[16] S. Petrovic and A. Fúster-Sabater, “Cryptanalysis of the A5/2 Algorithm,” Cryptology ePrint Archive, Rep. 2000/52, 2000.

[17] A. Shamir, J. Patarin, N. Courtois, and A. Klimov, “Efficient algorithms for solving overdefined systems of multivariate polynomial equations,” in Advances in Cryptology – Eurocrypt 2000 (Lecture Notes in Computer Science). Berlin, Germany: Springer-Verlag, 2000, vol. 1807, pp. 392–407.

[18] T. Siegenthaler, “Correlation immunity of nonlinear combining functions for cryptographic applications,” IEEE Trans. Inf. Theory, vol. IT-30, no. 5, pp. 776–780, Sep. 1984.