Signatures of Strings

Stephen G. Hartke and A.J. Radcliffe

Abstract. The parity of a permutation $\pi$ can be defined as the parity of the number of inversions in $\pi$. The signature $\epsilon(\pi)$ of $\pi$ is an encoding of the parity in a multiplicative group of order 2: $\epsilon(\pi) = (-1)^{\operatorname{inv}(\pi)}$. It is also well known that half of the permutations of a finite set are even and half are odd.

In this paper, we explore the natural notion of parity for larger moduli; that is, we define the $m$-signature of a permutation $\pi$ to be the number of inversions of $\pi$, reduced modulo $m$. Unlike with the 2-signatures of permutations of sets, the $m$-signatures of the permutations of a multiset are not typically equi-distributed among the modulo $m$ residue classes, though the distribution is close to uniform. We present a recursive method of calculating the distribution of $m$-signatures of permutations of a multiset, develop properties of this distribution, and present sufficient conditions for the distribution to be uniform.

Mathematics Subject Classification (2010). 05A05 (05A15).

Keywords. inversion number, permutation of multiset, parity of permutation, signature.

1. Introduction

The notion of the parity of a permutation is well known. One way to define the parity of a permutation $\pi$ of $\{1, 2, \dots, n\}$ is by counting inversions in the string $\pi_1 \pi_2 \dots \pi_n$. A pair $(j, k)$ of positions with $j < k$ is an inversion if $\pi_j > \pi_k$. Letting $\operatorname{inv}(\pi)$ denote the number of inversions of $\pi$, the parity of $\pi$ is simply $\operatorname{inv}(\pi) \bmod 2$. Additionally, the signature $\epsilon(\pi)$ of $\pi$ is an encoding of the parity in the multiplicative group of order 2: $\epsilon(\pi) = (-1)^{\operatorname{inv}(\pi)}$.

It is well known that half of the permutations of $\{1, 2, \dots, n\}$ are even and half are odd. In fact, the even permutations form the alternating group $A_n$. In this paper, we explore the natural notion of parity for larger moduli; that is, we define the $m$-signature of a permutation $\pi$ to be the number of inversions of $\pi$, reduced modulo $m$. Additionally, we consider permutations of multisets, not just sets. Unlike with the 2-signature of permutations of sets, the $m$-signatures of permutations of a multiset are not typically equi-distributed among the $m$ residues. However, this distribution is close to uniform. We present here a method of calculating the $m$-signature distribution of the permutations of a multiset, develop properties of this distribution, and present sufficient conditions for the distribution to be uniform.

The first author was supported in part by National Science Foundation grant DMS-0914815.

The distribution of the inversion numbers of a multiset has been studied before. MacMahon ([7], see also Stanley [9, pp. 26–27]) showed that the generating function for the number of permutations of a multiset with given inversion number is the $q$-multinomial coefficient (also known as the Gaussian multinomial coefficient). Famously, MacMahon showed that the $q$-multinomial coefficient is also the generating function for the major index statistic for permutations; Foata [5] gave a bijective proof of this result.

Conger and Viswanath [4] showed that, with appropriate scaling, the inversion numbers of permutations of a multiset are approximately normally distributed. Brunetti, Del Lungo, and Del Ristoro [3, 1, 2] used a cyclic shift operator to obtain results about $m$-signature distributions of multisets of size $m$. We use this shift operator in some of our proofs.

In Section 2 we introduce the notation we will use throughout the paper. In Section 3 we give a direct proof that the $m$-signature distribution of the permutations of $\{1^{s_1}, 2^{s_2}, \dots, \ell^{s_\ell}\}$ is unaffected by permuting the $s_i$. This allows us to completely determine the 2-signature distribution of any multiset.

In Section 4 we describe the cyclic shifting technique of Brunetti, Del Lungo, and Del Ristoro, and apply it to prove that certain natural collections of permutations have periodic signature distributions. In Section 5 we calculate exactly the $m$-signature distribution of the multiset $\{1^{s_1}, 2^{s_2}, \dots, \ell^{s_\ell}\}$ where $m$ is any prime. In particular we characterize when this distribution is equi-distributed (i.e., constant); when it is not, we give a heuristic estimate for the deviation from equi-distribution, showing that it is very small.

The problem of determining $m$-signature distributions when $m$ is composite is substantially harder. Our results focus on the case where the length of the permutation is a multiple of the modulus $m$. In Section 6 we give a recursive expression for the exact distribution. Although this expression is substantially simpler than a direct enumeration of the distribution, it is unfortunately computationally intractable for multisets of size much larger than 50.

In Section 7 we prove some properties of these distributions. We use the recursive expression proved in Section 6 to show that signature distributions are always ruler functions, meaning that the number of permutations with $m$-signature $a$ is a function of $\gcd(a, m)$, and moreover that this function is increasing on the divisor lattice of $m$. The reason for the name is that the histogram of a ruler function is reminiscent of the tick marks on a ruler (see Figure 1). As a consequence we show that for any multiset whose size is a multiple of $m$ and any residue $a$, the number of permutations with $m$-signature $a$ is at most the number with $m$-signature 0 and at least the number with $m$-signature 1.

In the final section we discuss some avenues for future work.

2. Notation

We begin with a description of the notation and terminology that we will use throughout the paper.

Definition 1. For convenience we use the following standard notation. If $A \subseteq \{1, 2, \dots, n\}$ is a set of positions we write $\bar{A}$ for the complement $\{1, 2, \dots, n\} \setminus A$. Also we write $[b, c]$ for the interval of integers $\{b, b + 1, \dots, c\}$.

Definition 2. A permutation $\pi$ of a finite multiset $S$ is a sequence of elements of $S$ such that each element of $S$ occurs exactly as many times in $\pi$ as it does in $S$. We write $\Sigma(S)$ for the set of all permutations of $S$. If $\pi$ is a permutation of length $n$ and $A \subseteq \{1, 2, \dots, n\}$ we write $\pi[A]$ for the sequence of values of $\pi$ appearing in positions indexed by $A$.

The definition of an inversion of a permutation of a multiset is formally identical to that in the case of sets.

Definition 3. An inversion of a permutation $\pi$ of a multiset $S$ (with integer elements) is a pair $(j, k)$ of positions in $\pi$ such that $j < k$ and $\pi_j > \pi_k$. We define

$$\operatorname{inv}(\pi) = \#\{\, (j, k) : (j, k) \text{ is an inversion of } \pi \,\}.$$

Sometimes we will want to count inversions according to where they appear. To this end we also define, for $J, K \subseteq \{1, 2, \dots, n\}$,

$$\operatorname{inv}_J(\pi) = \operatorname{inv}(\pi[J]) = \#\{\, (j, k) : (j, k) \text{ is an inversion of } \pi \text{ and } j, k \in J \,\},$$

$$\operatorname{inv}_{J,K}(\pi) = \#\{\, (j, k) : (j, k) \text{ is an inversion of } \pi \text{ and } j \in J,\ k \in K \,\}.$$

It is traditional to define the 2-signature of a permutation to be an element of the multiplicative group $\{\pm 1\}$. For larger $m$ we feel it is clearer for the $m$-signature to be an element of a group written additively. Thus we define the $m$-signature $\epsilon_m(\pi)$ of a permutation for any $m > 1$ to be the element of $\mathbb{Z}_m$ given by

$$\epsilon_m(\pi) = \operatorname{inv}(\pi) \bmod m.$$

Here and elsewhere we write $i \bmod m$ for the image of $i$ under the natural quotient map $\mathbb{Z} \to \mathbb{Z}_m$. In this paper we always use the notation "mod $m$" to stand for this function rather than the equivalence relation on $\mathbb{Z}$.
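The two definitions above translate directly into code. The following is a minimal sketch, not part of the original development; the function names `inv` and `signature` are our own, and sequences are represented as Python tuples of integers.

```python
def inv(pi):
    """Count the pairs of positions j < k with pi[j] > pi[k]."""
    n = len(pi)
    return sum(1 for j in range(n) for k in range(j + 1, n) if pi[j] > pi[k])


def signature(pi, m):
    """The m-signature: inv(pi) reduced modulo m, as an integer in 0..m-1."""
    return inv(pi) % m


# A permutation of the multiset {1^4, 2^2, 3^3}, written as a tuple.
example = (1, 2, 2, 1, 1, 3, 3, 1, 3)
print(inv(example), signature(example, 2), signature(example, 3))
```

The quadratic-time count is deliberate: it mirrors the definition term by term, which matters more here than speed.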

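Definition 4. For a multiset $S$, an integer $m > 1$, and $a \in \mathbb{Z}_m$, we define

$$\Phi_m^a(S) = \#\{\, \pi \in \Sigma(S) : \epsilon_m(\pi) = a \,\}.$$

We define the $m$-signature distribution of (permutations of) $S$ (or simply its signature distribution) to be the function on $\mathbb{Z}_m$

$$\Phi_m(S) = \bigl(\Phi_m^a(S)\bigr)_{a \in \mathbb{Z}_m}.$$

In slightly more generality we consider subsets $\mathcal{P} \subseteq \Sigma(S)$, and think about the $m$-signature distribution of $\mathcal{P}$. We write

$$\Phi_m^a(\mathcal{P}) = \#\{\, \pi \in \mathcal{P} : \epsilon_m(\pi) = a \,\}, \qquad \Phi_m(\mathcal{P}) = \bigl(\Phi_m^a(\mathcal{P})\bigr)_{a \in \mathbb{Z}_m}.$$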

Recall that a composition of $n$ of length $\ell$ is a vector $s = (s_1, s_2, \dots, s_\ell) = (s_i)_1^\ell$ of non-negative integers such that $\sum_i s_i = n$.

Definition 5. If $S$ is a multiset of cardinality $n$ whose elements come from $\{1, 2, \dots, \ell\}$ and $i$ appears $s_i$ times in $S$, then $s = (s_i)_1^\ell$ is a composition of $n$. We refer to the $s_i$ as the repetition numbers of $S$. We write

$$S = \{1^{s_1}, 2^{s_2}, \dots, \ell^{s_\ell}\} = \{\underbrace{1, 1, \dots, 1}_{s_1}, \underbrace{2, 2, \dots, 2}_{s_2}, \dots, \underbrace{\ell, \ell, \dots, \ell}_{s_\ell}\},$$

and $\Sigma(s)$ for $\Sigma(\{1^{s_1}, 2^{s_2}, \dots, \ell^{s_\ell}\})$. Since $S$ and its sequence of repetition numbers $s$ contain the same information we think of $\Phi_m$ as a function of the sequence $s$ rather than the multiset $S$, and we write $\Phi_m(s)$ for the $m$-signature distribution.
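For small multisets the distribution $\Phi_m(s)$ can be tabulated by direct enumeration. The sketch below is an illustration under our own naming (`multiset`, `signature_distribution`); it enumerates all distinct permutations, so it is only practical for multisets with fewer than about ten elements.

```python
from itertools import permutations


def inv(pi):
    """Count the pairs of positions j < k with pi[j] > pi[k]."""
    n = len(pi)
    return sum(1 for j in range(n) for k in range(j + 1, n) if pi[j] > pi[k])


def multiset(s):
    """Expand repetition numbers (s_1, ..., s_l) into the sorted tuple 1^{s_1} ... l^{s_l}."""
    return tuple(i for i, s_i in enumerate(s, start=1) for _ in range(s_i))


def signature_distribution(s, m):
    """Return [Phi_m^0(s), ..., Phi_m^{m-1}(s)] by brute-force enumeration."""
    dist = [0] * m
    # set(...) discards the repeated tuples produced by equal entries.
    for pi in set(permutations(multiset(s))):
        dist[inv(pi) % m] += 1
    return dist


print(signature_distribution((2, 2), 4))
```

For $s = (2, 2)$ the six permutations of $\{1^2, 2^2\}$ have inversion numbers 0, 1, 2, 2, 3, 4, which gives the distribution $(2, 1, 2, 1)$ modulo 4.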

3. Symmetry and the case $m = 2$

MacMahon's result, that the generating function for the inversion numbers is given by the $q$-multinomial coefficient, shows that $\Phi_m(s)$ is symmetric in the $s_i$. Though this result is well known, we give here a direct combinatorial proof that conveys the flavor of our other arguments. The proof is based on a "double flip" operation that we introduce now.

Definition 6. Given an interval $[b, c]$ of integers and a permutation $\sigma$ of some multiset with entries from $[b, c]$ we define the double flip $D_{[b,c]}(\sigma)$ of $\sigma$ to be the sequence obtained by first reversing $\sigma$ and then reflecting the values of $\sigma$ about $(b + c)/2$. Thus, if $|\sigma| = n$,

$$D_{[b,c]}(\sigma) = \bigl(b + c - \sigma_{n+1-i}\bigr)_{i=1}^{n}.$$

More generally if $\sigma$ is a sequence of integers then we define $D_{[b,c]}(\sigma)$ by applying $D_{[b,c]}$ to the places where $\sigma_i \in [b, c]$, and leaving all other entries of $\sigma$ unchanged. Set $B = \sigma^{-1}([b, c])$; recall that $\sigma[B]$ is the subsequence of $\sigma$ consisting of those entries with values in $[b, c]$ (thought of as indexed by $1, 2, \dots, |B|$). We define

$$D_{[b,c]}(\sigma)[B] = D_{[b,c]}(\sigma[B]), \qquad D_{[b,c]}(\sigma)[\bar{B}] = \sigma[\bar{B}].$$


For instance,

$$D_{[4,6]}(5, 6, 6, 6, 4, 4) = (6, 6, 4, 4, 4, 5),$$

$$D_{[1,6]}(5, 6, 6, 6, 4, 4) = (3, 3, 1, 1, 1, 2),$$

$$D_{[2,3]}(1, 3, 4, 4, 3, 3, 7, 6, 2, 3) = (1, 2, 4, 4, 3, 2, 7, 6, 2, 2).$$

Note that for all $\sigma$ and $[b, c]$ we have $D_{[b,c]}(D_{[b,c]}(\sigma)) = \sigma$.

Lemma 7. For all multiset permutations $\sigma$ and intervals $[b, c]$ we have

$$\operatorname{inv}(D_{[b,c]}(\sigma)) = \operatorname{inv}(\sigma).$$

Proof. We first prove the special case in which $\sigma$ is a sequence of values from $[b, c]$ of length $n$. In that case we have that $(j, k)$ is an inversion of $D_{[b,c]}(\sigma)$ if and only if

$$j < k \quad\text{and}\quad b + c - \sigma_{n+1-j} > b + c - \sigma_{n+1-k}.$$

Thus $(j, k)$ is an inversion iff $j < k$ and $\sigma_{n+1-k} > \sigma_{n+1-j}$. This says exactly that $(n + 1 - k, n + 1 - j)$ is an inversion of $\sigma$. So the inversions of $\sigma$ and those of $D_{[b,c]}(\sigma)$ are in one-to-one correspondence.

For the general case we have, setting $J = \sigma^{-1}[b, c] = \bigl(D_{[b,c]}(\sigma)\bigr)^{-1}[b, c]$,

$$\operatorname{inv}(D_{[b,c]}(\sigma)) = \operatorname{inv}_J(D_{[b,c]}(\sigma)) + \operatorname{inv}_{\bar{J}}(D_{[b,c]}(\sigma)) + \operatorname{inv}_{J,\bar{J}}(D_{[b,c]}(\sigma)) = \operatorname{inv}_J(\sigma) + \operatorname{inv}_{\bar{J}}(\sigma) + \operatorname{inv}_{J,\bar{J}}(\sigma).$$

The equality holds term-wise: the first terms agree by the argument above; the second terms agree because $D_{[b,c]}(\sigma)$ and $\sigma$ agree on $\bar{J}$; and the final terms agree because the question of whether $(j, k)$, with $j \in J$, $k \in \bar{J}$, is an inversion depends solely on $\sigma_k = \bigl(D_{[b,c]}(\sigma)\bigr)_k$, since all values in $[b, c]$ have the same order relationship with $\sigma_k$. □
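The double flip and the statement of Lemma 7 can be checked mechanically on the three examples given after Definition 6. The sketch below uses our own function name `double_flip`; it is an illustration only.

```python
def inv(pi):
    """Count the pairs of positions j < k with pi[j] > pi[k]."""
    n = len(pi)
    return sum(1 for j in range(n) for k in range(j + 1, n) if pi[j] > pi[k])


def double_flip(sigma, b, c):
    """Apply D_[b,c]: reverse and reflect the entries with values in [b, c]."""
    positions = [i for i, v in enumerate(sigma) if b <= v <= c]
    # Reverse the selected subsequence, then reflect each value about (b + c) / 2.
    flipped = [b + c - sigma[i] for i in reversed(positions)]
    out = list(sigma)
    for i, v in zip(positions, flipped):
        out[i] = v
    return tuple(out)


sigma = (1, 3, 4, 4, 3, 3, 7, 6, 2, 3)
tau = double_flip(sigma, 2, 3)
print(tau, inv(sigma), inv(tau))
```

Entries outside $[b, c]$ keep both their positions and their values, which is exactly what makes the cross terms in the proof of Lemma 7 agree.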

It is a simple consequence of this lemma that $\Phi_m(s)$ is a symmetric function of the $s_i$.

Theorem 8. If $s$ and $s'$ are compositions of $n$ of length $\ell$ and $s'$ is a permutation of $s$, then $\Phi_m(s') = \Phi_m(s)$ for all $m \ge 1$.

Proof. We prove the result directly when $s'$ is obtained from $s$ by reversing a contiguous subsequence. These operations, applied successively, clearly allow us to reach any rearrangement of $s$, and so the result will be proved. Suppose then that for some $1 \le b < c \le \ell$ we are reversing the subsequence in places $[b, c]$ of $s$, and so have

$$s = (s_1, s_2, \dots, s_{b-1}, s_b, s_{b+1}, \dots, s_{c-1}, s_c, s_{c+1}, \dots, s_\ell),$$

$$s' = (s_1, s_2, \dots, s_{b-1}, \underbrace{s_c, s_{c-1}, \dots, s_{b+1}, s_b}_{\text{reversed}}, s_{c+1}, \dots, s_\ell).$$

Lemma 7 shows that $D_{[b,c]}$ is an inversion-number preserving bijection between $\Sigma(s)$ and $\Sigma(s')$. Hence $\Phi_m(s) = \Phi_m(s')$ for all $m \ge 1$. □
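MacMahon's generating function, cited at the start of this section, also gives a way to compute $\Phi_m(s)$ without enumerating permutations. The sketch below is our own illustration and rests on two standard facts that are not proved in this paper: the $q$-multinomial coefficient factors as $\prod_i \binom{s_1 + \dots + s_i}{s_i}_q$, and Gaussian binomials satisfy the $q$-Pascal recurrence $\binom{n}{k}_q = \binom{n-1}{k-1}_q + q^k \binom{n-1}{k}_q$. All exponents are reduced modulo $m$ as we go; the function names are hypothetical.

```python
from itertools import permutations


def qbinomial_mod(n, k, m):
    """Coefficients of the Gaussian binomial [n choose k]_q, exponents folded mod m."""
    row = [[0] * m for _ in range(k + 1)]   # row[j] represents [i choose j]_q
    row[0][0] = 1
    for i in range(1, n + 1):
        new = [[0] * m for _ in range(k + 1)]
        new[0][0] = 1
        for j in range(1, min(i, k) + 1):
            for e in range(m):
                # q-Pascal: [i, j] = [i-1, j-1] + q^j [i-1, j]
                new[j][e] = row[j - 1][e] + row[j][(e - j) % m]
        row = new
    return row[k]


def cyclic_product(u, v):
    """Multiply two polynomials whose exponents live in Z_m."""
    m = len(u)
    out = [0] * m
    for c in range(m):
        for d in range(m):
            out[(c + d) % m] += u[c] * v[d]
    return out


def signature_distribution(s, m):
    """Phi_m(s) read off from the q-multinomial coefficient."""
    dist = [1] + [0] * (m - 1)
    n = 0
    for s_i in s:
        n += s_i
        dist = cyclic_product(dist, qbinomial_mod(n, s_i, m))
    return dist


# Theorem 8: every rearrangement of s gives the same distribution.
print({tuple(signature_distribution(p, 5)) for p in permutations((3, 1, 2))})
```

The symmetry asserted by Theorem 8 shows up as a single element in the printed set.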

This preparation is already sufficient to completely analyze the 2-signature distribution for any multiset.


Theorem 9. Let $s = (s_i)_1^\ell$ be a composition of $n$. If $s$ contains at least two odd entries then

$$\Phi_2^0(s) = \Phi_2^1(s) = \frac{1}{2} \binom{n}{s_1, s_2, \dots, s_\ell}.$$

Otherwise, setting

$$M = \binom{n}{s_1, s_2, \dots, s_\ell} - \binom{\lfloor n/2 \rfloor}{\lfloor s_1/2 \rfloor, \lfloor s_2/2 \rfloor, \dots, \lfloor s_\ell/2 \rfloor},$$

we have

$$\Phi_2^0(s) = \frac{M}{2} + \binom{\lfloor n/2 \rfloor}{\lfloor s_1/2 \rfloor, \lfloor s_2/2 \rfloor, \dots, \lfloor s_\ell/2 \rfloor}, \qquad \Phi_2^1(s) = \frac{M}{2}.$$

Proof. Suppose first that $s$ has at least two odd entries. By Theorem 8 we may assume that $s_1$ and $s_2$ are odd. We will define a sign-reversing involution on $\Sigma(s)$. Given a permutation $\pi$ of $S$ we consider the pairs $(\pi_1, \pi_2), (\pi_3, \pi_4), \dots$. The image $\pi'$ is obtained from $\pi$ by reversing the first such pair, $(\pi_{2i-1}, \pi_{2i})$ say, for which $\pi_{2i-1} \ne \pi_{2i}$. (There must be such a pair, else all the entries of $\pi$ are paired up, except possibly $\pi_n$ if $n$ is odd; this precludes having both $s_1$ and $s_2$ odd.) The only pair of positions that is an inversion in one of $\pi$, $\pi'$ and not in the other is exactly $(2i - 1, 2i)$, so $\epsilon_2(\pi') = \epsilon_2(\pi) + 1$. (Recall that this calculation occurs in $\mathbb{Z}_2$.) Moreover this map is an involution, since the first non-constant pair in $\pi'$ is again at positions $(2i - 1, 2i)$. Thus there are as many even permutations as odd ones, and the claim in the theorem is proved.

If, on the other hand, we are in the case in which there is at most one odd entry, let us suppose (by Theorem 8) that $s_\ell$ is odd if any entry is. The involution above can be applied unless $\pi_1 = \pi_2$, $\pi_3 = \pi_4$, etc. (If $n$ is odd then there is no additional condition on $\pi_n$.) Again, for permutations in the domain of the involution the sign of $\pi'$ is the opposite of the sign of $\pi$; we have either added one inversion or removed one inversion. Now we claim that all permutations not in the domain of the involution are even. For such permutations there are no inversions within pairs; inversions between pairs occur in multiples of 4; and inversions involving the leftover entry (if there is one) come in pairs. Thus there is an excess of even permutations over odd ones of exactly

$$\binom{\lfloor n/2 \rfloor}{\lfloor s_1/2 \rfloor, \lfloor s_2/2 \rfloor, \dots, \lfloor s_\ell/2 \rfloor}.$$ □

For the case $\ell = 2$, we have

$$\Phi_2^0(s_1, n - s_1) = \frac{1}{2} \binom{n}{s_1} + \begin{cases} 0 & \text{if } n \text{ is even and } s_1 \text{ is odd,} \\ \dfrac{1}{2} \dbinom{\lfloor n/2 \rfloor}{\lfloor s_1/2 \rfloor} & \text{otherwise.} \end{cases}$$

The numbers $a(n, s_1) = \Phi_2^0(s_1, n - s_1)$ are the entries in Lozanić's Triangle¹ ([6], see also [8]), a doubly-indexed sequence $a(n, k)$ that counts the number of equivalence classes of $\{0, 1\}$-sequences of length $n$, having $k$ 1's, under the reversal map. Lozanić originally investigated these numbers in the context of the chemical structure of paraffins.

¹ Losanitsch is the German form of the original Serbian name Lozanić, and was the spelling that appeared on the published article.
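The closed form in Theorem 9 is easy to compare against enumeration for small multisets. The following sketch is our own check, with hypothetical function names; `theorem9` returns the pair $(\Phi_2^0(s), \Phi_2^1(s))$ predicted by the theorem and `brute` counts even and odd permutations directly.

```python
from itertools import permutations
from math import factorial


def inv(pi):
    n = len(pi)
    return sum(1 for j in range(n) for k in range(j + 1, n) if pi[j] > pi[k])


def multinomial(parts):
    """Multinomial coefficient with top entry sum(parts)."""
    result = factorial(sum(parts))
    for p in parts:
        result //= factorial(p)
    return result


def theorem9(s):
    """(even count, odd count) as given by the two cases of Theorem 9."""
    total = multinomial(s)
    if sum(1 for x in s if x % 2 == 1) >= 2:
        return (total // 2, total // 2)
    halves = multinomial([x // 2 for x in s])   # top entry is floor(n / 2)
    M = total - halves
    return (M // 2 + halves, M // 2)


def brute(s):
    """(even count, odd count) by enumerating distinct permutations."""
    word = tuple(i for i, s_i in enumerate(s, start=1) for _ in range(s_i))
    counts = [0, 0]
    for pi in set(permutations(word)):
        counts[inv(pi) % 2] += 1
    return tuple(counts)


for s in [(2, 2), (3, 2), (3, 1), (1, 1, 1), (2, 2, 1), (4, 2)]:
    print(s, theorem9(s), brute(s))
```

For two-letter multisets the first component is the corresponding entry $a(n, s_1)$ of Lozanić's Triangle; for example $s = (3, 2)$ gives $a(5, 3) = 6$.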

4. Cyclic shifting

Our further analysis of signature distributions is based on Lemma 11 below, first proved by Brunetti, Del Lungo, and Del Ristoro [3, 2]. For completeness we restate and reprove it here. We begin by defining a map on permutations of length $m$ that has the effect of cyclically shifting to the right the positions where 1 appears.

Definition 10. Let $s = (s_i)_1^\ell$ be a composition of $m$. We define a map

$$\operatorname{shift} : \Sigma(s) \to \Sigma(s)$$

as follows. If $\pi \in \Sigma(s)$ then we set $I = \pi^{-1}\{1\}$, and define $I'$ to be $I$ shifted cyclically to the right. That is, $I'$ is $I + 1$, except that if $m + 1$ appears in $I + 1$ then it is replaced by 1. Now $\pi' = \operatorname{shift}(\pi)$ is characterized by

$$\pi'[I'] = \pi[I], \quad\text{and}\quad \pi'[\bar{I'}] = \pi[\bar{I}].$$

(Note that of course $\pi[I]$ is a constant sequence of 1's.) For example with $m = 9$ and $\pi = 122113313$, shifting twice, we have

$$122113313 \;\to\; \underbrace{1\,\cdot\,\cdot\,1\,1\,\cdot\,\cdot\,1\,\cdot}_{I = \{1,4,5,8\}} \;\xrightarrow[\text{rotate } I]{\text{cyclically}}\; \underbrace{\cdot\,1\,\cdot\,\cdot\,1\,1\,\cdot\,\cdot\,1}_{I' = \{2,5,6,9\}} \;\xrightarrow[\text{in positions } \bar{I'}]{\text{put values from } \bar{I}}\; 212311331$$

$$212311331 \;\to\; \underbrace{\cdot\,1\,\cdot\,\cdot\,1\,1\,\cdot\,\cdot\,1}_{I = \{2,5,6,9\}} \;\xrightarrow[\text{rotate } I]{\text{cyclically}}\; \underbrace{1\,\cdot\,1\,\cdot\,\cdot\,1\,1\,\cdot\,\cdot}_{I' = \{1,3,6,7\}} \;\xrightarrow[\text{in positions } \bar{I'}]{\text{put values from } \bar{I}}\; 121231133.$$

Thus, $\operatorname{shift}(122113313) = 212311331$ and $\operatorname{shift}(212311331) = 121231133$. Note that the first sequence gains 4 inversions when shifted, since each 1 is in one more inversion. On the other hand, when shifting the second time we lose $5 = 9 - 4$ inversions since the 1 that wraps around is no longer in any inversions. These two cases show how the number of inversions can change when a sequence is shifted; we state this formally in the next lemma.

Lemma 11. Let $s = (s_i)_1^\ell$ be a composition of $m$. For each permutation $\pi$ of $S = \{1^{s_1}, 2^{s_2}, \dots, \ell^{s_\ell}\}$ the inversion number of $\operatorname{shift}(\pi)$ is congruent modulo $m$ to $\operatorname{inv}(\pi) + s_1$. Recalling that we write $(s_1 \bmod m)$ for the residue class of $s_1$, we have (as an equation in $\mathbb{Z}_m$)

$$\epsilon_m(\operatorname{shift}(\pi)) = \epsilon_m(\pi) + (s_1 \bmod m).$$

Proof. Let $\pi' = \operatorname{shift}(\pi)$. In the case where $I' = I + 1$ (i.e., the last element of $\pi$ is not 1) each 1 in $\pi$ participates in exactly one more inversion in $\pi'$ than it did in $\pi$. In the other case, where $\pi_m = 1$, each non-1 element in $\pi$ participates in exactly one fewer inversion with a 1 in $\pi'$ than it did in $\pi$. The number of inversions between non-1 elements is the same for $\pi'$ as for $\pi$. Thus either $\operatorname{inv}(\pi') = \operatorname{inv}(\pi) + s_1$ or $\operatorname{inv}(\pi') = \operatorname{inv}(\pi) - m + s_1$. In either case we have $\epsilon_m(\pi') = \epsilon_m(\pi) + (s_1 \bmod m)$. □
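The shift map and the congruence of Lemma 11 can be reproduced on the worked example from Definition 10. This is a minimal sketch with our own function name `shift`; positions are 0-based in the code, so the cyclic rotation of $I$ is simply $i \mapsto (i + 1) \bmod m$.

```python
from itertools import permutations


def inv(pi):
    n = len(pi)
    return sum(1 for j in range(n) for k in range(j + 1, n) if pi[j] > pi[k])


def shift(pi):
    """Rotate the positions of the 1's one step to the right, cyclically."""
    m = len(pi)
    new_ones = {(i + 1) % m for i, v in enumerate(pi) if v == 1}
    rest = [v for v in pi if v != 1]     # non-1 values, in their original order
    out, r = [], 0
    for i in range(m):
        if i in new_ones:
            out.append(1)
        else:
            out.append(rest[r])
            r += 1
    return tuple(out)


pi = (1, 2, 2, 1, 1, 3, 3, 1, 3)
once = shift(pi)
twice = shift(once)
print(once, inv(once) - inv(pi))
print(twice, inv(twice) - inv(once))
```

The two printed differences are $+s_1$ and $s_1 - m$, the two cases distinguished in the proof.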

We next show in Lemmas 12 and 14 that in certain special cases the signature distributions of collections of permutations must have a lot of periodicity. All periodicity is, of course, defined modulo $m$. For instance if $m = 6$ then $(4, 1, 2, 4, 1, 2)$ is a 3-periodic distribution. We start with the case of all permutations of a multiset of size $n = m$.

Lemma 12. Let $s = (s_i)_1^\ell$ be a composition of $m$. Then $\Phi_m(s)$ is $s_i$-periodic for each $i$.

Proof. By Lemma 11, the map

$$\operatorname{shift} : \{\, \pi \in \Sigma(s) : \epsilon_m(\pi) = a \,\} \to \{\, \pi \in \Sigma(s) : \epsilon_m(\pi) = a + (s_1 \bmod m) \,\}$$

is a bijection, and so $\Phi_m(s)$ is $s_1$-periodic.

For $i > 1$, pick an arbitrary permutation $\tau$ of $\{1, 2, \dots, \ell\}$ (thought of as a bijection $\tau : \{1, 2, \dots, \ell\} \to \{1, 2, \dots, \ell\}$) that takes $i$ to 1. Let $s'$ be the transformed repetition numbers $s'_j = s_{\tau^{-1}(j)}$. We can think of $\tau$ as inducing a bijection from $\Sigma(s)$ to $\Sigma(s')$ by applying $\tau$ elementwise to permutations in $\Sigma(s)$. By Lemma 11, $\Phi_m(s')$ is $s'_1$-periodic, but $\Phi_m(s') = \Phi_m(s)$ by Theorem 8 and $s'_1 = s_i$, so $\Phi_m(s)$ is $s_i$-periodic. □

Now we consider collections of permutations $\mathcal{P}$ such that, for some interval $J$ of length $m$, every permutation in $\mathcal{P}$ has the same repetition numbers before $J$, inside $J$, and after $J$, and that moreover we can freely substitute inside $J$ any permutation with the same repetition numbers.

Definition 13. If $\pi$ is a permutation of length $n$ and $I$ is an interval in $\{1, 2, \dots, n\}$ then we define $r_I(\pi)$ to be the repetition numbers of $\pi[I]$ (thought of as a multiset).

Lemma 14. Let $s = (s_i)_1^\ell$ be a composition of $n$, and let $I, J, K$ be a partition of $[n]$ into three consecutive intervals, where $J$ has length $m$ and $I, K$ may be empty. Suppose that $\mathcal{P} \subseteq \Sigma(s)$ is a collection of permutations of $S = \{1^{s_1}, 2^{s_2}, \dots, \ell^{s_\ell}\}$ satisfying

a) The repetition numbers $r_I(\pi)$, $r_J(\pi)$, and $r_K(\pi)$ are constant for $\pi \in \mathcal{P}$.
b) Whenever $\pi \in \mathcal{P}$ and $\pi' \in \Sigma(s)$ satisfy $\pi'[I \cup K] = \pi[I \cup K]$ this implies that $\pi' \in \mathcal{P}$.

If we write $b$ for the common repetition numbers $r_J(\pi)$ in interval $J$ of each $\pi \in \mathcal{P}$, then $\Phi_m(\mathcal{P})$ is $b_i$-periodic for all $1 \le i \le \ell$. In particular if $\gcd(m, b_1, b_2, \dots, b_\ell) = 1$ then $\Phi_m(\mathcal{P})$ is constant.

Proof. First note that it suffices to prove the result in the case where all the permutations in $\mathcal{P}$ are identical on $I \cup K$. In the general case $\mathcal{P}$ is a union of such collections, hence $\Phi_m(\mathcal{P})$ is a sum of such distributions. If each term in the sum is $b_i$-periodic, then so is $\Phi_m(\mathcal{P})$.

Suppose then that for all $\pi \in \mathcal{P}$ we have $\pi[I \cup K] = \rho$ for a fixed sequence $\rho$. The inversions in $\pi$ can be categorized into those arising inside $J$, of which there are $\operatorname{inv}_J(\pi)$, and the others:

$$\operatorname{inv}(\pi) = \operatorname{inv}_J(\pi) + \operatorname{inv}_{I \cup K}(\pi) + \operatorname{inv}_{I,J}(\pi) + \operatorname{inv}_{J,K}(\pi) = \operatorname{inv}_J(\pi) + \operatorname{inv}(\rho) + \operatorname{inv}_{I,J}(\pi) + \operatorname{inv}_{J,K}(\pi).$$

Thus $\operatorname{inv}(\pi)$ differs from $\operatorname{inv}_J(\pi)$ by a constant, since $\operatorname{inv}_{I,J}(\pi) + \operatorname{inv}_{J,K}(\pi)$ is determined by $\rho$ and $r_J$. Hence, if $\Phi_m(r_J)$ is $p$-periodic for any $p$, then so is $\Phi_m(\mathcal{P})$. In particular, by Lemma 12, $\Phi_m(r_J)$ is $b_i$-periodic for all $1 \le i \le \ell$, and thus so is $\Phi_m(\mathcal{P})$. This in turn implies that $\Phi_m(\mathcal{P})$ is periodic with period $\gcd(m, b_1, b_2, \dots, b_\ell)$. If this gcd is 1 then $\Phi_m(\mathcal{P})$ is 1-periodic, i.e., constant. □

The following corollary was proved by Brunetti, Del Lungo, and Del Ristoro in [1]. It concerns the case where $n = m$, i.e., we are reducing our inversion number modulo the length of the permutations, and, in addition, the gcd of $n$ and the repetition numbers is 1.

Corollary 15. If $s = (s_i)_1^\ell$ is a composition of $n$ and $\gcd(n, s_1, s_2, \dots, s_\ell) = 1$ then $\Phi_n(s)$ is constant, i.e.,

$$\Phi_n^0(s) = \Phi_n^1(s) = \dots = \Phi_n^{n-1}(s) = \frac{1}{n} \binom{n}{s_1, s_2, \dots, s_\ell}.$$

In particular, if $n$ is prime then $\Phi_n(s)$ is constant unless only one of the $s_i$ is nonzero, in which case $\Phi_n^0(s) = 1 = \binom{n}{0, 0, \dots, n, \dots, 0}$ and all the other $\Phi_n^a(s)$ are zero.

Proof. We apply the previous lemma with $\mathcal{P} = \Sigma(s)$ and $J = [1, n]$. □
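As a small sanity check of Corollary 15, one can enumerate a multiset whose repetition numbers are coprime to its size and compare with one where they are not. The sketch below is ours (the helper names are hypothetical) and uses brute-force enumeration, so it is limited to short sequences.

```python
from functools import reduce
from itertools import permutations
from math import gcd


def inv(pi):
    n = len(pi)
    return sum(1 for j in range(n) for k in range(j + 1, n) if pi[j] > pi[k])


def distribution_mod_length(s):
    """Phi_n(s) with n = sum(s), by enumerating distinct permutations."""
    word = tuple(i for i, s_i in enumerate(s, start=1) for _ in range(s_i))
    n = len(word)
    dist = [0] * n
    for pi in set(permutations(word)):
        dist[inv(pi) % n] += 1
    return dist


for s in [(3, 2, 1), (2, 2)]:
    g = reduce(gcd, s, sum(s))          # gcd(n, s_1, ..., s_l)
    print(s, g, distribution_mod_length(s))
```

For $s = (3, 2, 1)$ the gcd is 1 and all six residues receive $60/6 = 10$ permutations; for $s = (2, 2)$ the gcd is 2 and the distribution $(2, 1, 2, 1)$ is only 2-periodic, as Lemma 12 predicts.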

We can also deduce from Lemma 14 that permutations of $\{1, 2, \dots, n\}$ are equi-distributed modulo $m$ for all $m \le n$.

Corollary 16. If $1 \le m \le n$ then $\Phi_m(\underbrace{1, 1, \dots, 1}_{n\ 1\text{'s}})$ is constant.

Proof. Let $\Sigma = \Sigma(1, 1, \dots, 1)$ be the set of all permutations of $\{1, 2, \dots, n\}$. We split up $\Sigma$ according to the elements appearing in the first $m$ positions. If the set of values $A \subseteq \{1, 2, \dots, n\}$ has size $m$ we define

$$\mathcal{P}_A = \{\, \pi \in \Sigma : \pi[[1, m]] \text{ is a permutation of } A \,\}.$$

Thus $\mathcal{P}_A$ consists of all permutations of $\{1, 2, \dots, n\}$ which are a permutation of $A$ followed by a permutation of $\{1, 2, \dots, n\} \setminus A$. Applying Lemma 14 to $\mathcal{P}_A$ with $I = \emptyset$, $J = [1, m]$, and $K = [m + 1, n]$, proves that each $\mathcal{P}_A$ is equi-distributed modulo $m$, and therefore $\Sigma$ is also. □

The condition $n \ge m$ in Corollary 16 is clearly necessary. For instance if $m$ is prime and $n < m$ then $n! = |\Sigma|$ is not divisible by $m$.
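Corollary 16 and the remark on the necessity of $n \ge m$ can both be observed numerically. The sketch below is an illustration of ours; it relies on the standard fact (not proved in this paper) that the inversion generating function of the permutations of $\{1, \dots, n\}$ is $\prod_{i=1}^{n} (1 + q + \dots + q^{i-1})$, and the function name is hypothetical.

```python
def set_permutation_distribution(n, m):
    """m-signature distribution of the n! permutations of {1, ..., n}."""
    dist = [1] + [0] * (m - 1)
    for i in range(1, n + 1):
        # Multiply by 1 + q + ... + q^(i-1), folding exponents modulo m.
        new = [0] * m
        for e, count in enumerate(dist):
            for t in range(i):
                new[(e + t) % m] += count
        dist = new
    return dist


print(set_permutation_distribution(5, 4))   # n >= m
print(set_permutation_distribution(2, 3))   # n < m
```

In the second call $2! = 2$ is not divisible by 3, so the distribution cannot be constant.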


5. Prime values of m

The techniques of the previous section allow us to give a complete analysis of the $m$-signature distribution of a multiset $S$ when $m$ is prime. As in the case $m = 2$, the distribution is almost uniform, with a small correction term.

Theorem 17. Suppose $m$ is prime and let $s = (s_i)_1^\ell$ be a composition of $n$. For each $i$ let $q_i$ and $t_i$ be respectively the quotient and remainder when $s_i$ is divided by $m$. Thus $s_i = q_i m + t_i$ with $0 \le t_i < m$. Similarly write $n = qm + t$ with $0 \le t < m$.

1) If $\sum_1^\ell t_i \ge m$ then the permutations of the multiset $\{1^{s_1}, 2^{s_2}, \dots, \ell^{s_\ell}\}$ are equi-distributed; i.e.,

$$\Phi_m^0(s) = \Phi_m^1(s) = \dots = \Phi_m^{m-1}(s) = \frac{1}{m} \binom{n}{s_1, s_2, \dots, s_\ell}.$$

2) If $\sum_1^\ell t_i < m$ then $\sum_1^\ell q_i = q$ and $\sum_1^\ell t_i = t$ and, setting

$$M = \binom{n}{s_1, s_2, \dots, s_\ell} - \binom{q}{q_1, q_2, \dots, q_\ell} \binom{t}{t_1, t_2, \dots, t_\ell},$$

we have, for each $a$,

$$\Phi_m^a(s) = \frac{M}{m} + \binom{q}{q_1, q_2, \dots, q_\ell} \Phi_m^a(t_1, t_2, \dots, t_\ell).$$

Proof. We will partition most of $\Sigma(s)$ into equi-distributed parts. Partition $\{1, 2, \dots, n\}$ into as many consecutive intervals of length $m$ as possible:

$$I_1 = [1, m],\quad I_2 = [m + 1, 2m],\quad \dots,\quad I_q = [(q - 1)m + 1, qm],\quad I_{q+1} = [qm + 1, qm + t] = [qm + 1, n].$$

Note that we have $q$ intervals of length $m$, and the last interval $I_{q+1}$ has length $t < m$. We wish to apply Lemma 14, and there are many candidates for the interval $J$ and the collection $\mathcal{P}$ considered there. Our aim is to partition most of $\Sigma(s)$ into suitable collections $\mathcal{P}$ with associated intervals $J$. We start by associating each permutation $\pi$ to the first of the intervals $I_1, I_2, \dots, I_q$ on which it is not constant, if such an interval exists. Thus for $1 \le j \le q$ set

$$\mathcal{E}_j = \{\, \pi \in \Sigma(s) : \pi[I_j] \text{ is not constant, and } \pi[I_k] \text{ is constant for all } k < j \,\}.$$

Note that the $\mathcal{E}_j$ are disjoint. Now let

$$\mathcal{E} = \bigcup_{j=1}^{q} \mathcal{E}_j = \{\, \pi \in \Sigma(s) : \text{for some } j \le q \text{ the sequence } \pi[I_j] \text{ is not constant} \,\}, \qquad \mathcal{E}_0 = \Sigma(s) \setminus \mathcal{E}.$$

We need to split up each $\mathcal{E}_j$ even further in order to satisfy condition a) of Lemma 14, that $r_I(\pi)$, $r_J(\pi)$ and $r_K(\pi)$ are all independent of $\pi$. We split $\mathcal{E}_j$ up into parts $\mathcal{P}_{j,v,b}$ where $v$ specifies the values on the constant blocks before $I_j$, and $b$ gives the repetition numbers on $I_j$. Thus

$$\mathcal{E}_j = \bigcup_{v, b} \mathcal{P}_{j,v,b}$$

where $v = (v_p)_{p=1}^{j-1}$ is a sequence of values, $b = (b_i)_{i=1}^{\ell}$ is a composition of $m$, and

$$\mathcal{P}_{j,v,b} = \{\, \pi \in \mathcal{E}_j : \pi_k = v_p \text{ for all } k \in I_p \text{ with } p < j, \text{ and } r_{I_j}(\pi) = b \,\}.$$

Each of the $\mathcal{P}_{j,v,b}$ satisfies the hypotheses of Lemma 14. Since $m$ is prime and, because $\pi[I_j]$ is not constant, there is some $b_i$ such that $0 < b_i < m$, we have

$$\gcd(m, b_1, b_2, \dots, b_\ell) = 1,$$

so $\Phi_m(\mathcal{P}_{j,v,b})$ is constant. Combining these, we have that each $\Phi_m(\mathcal{E}_j)$ is constant, and hence $\Phi_m(\mathcal{E})$ is constant.

It remains only to consider the signature distribution of $\mathcal{E}_0$. If $\sum_i t_i \ge m$ then $\mathcal{E}_0$ is empty and we are done. If $\sum_i t_i < m$ then clearly $\sum_i q_i = q$ and $\sum_i t_i = t$. In this case sequences in $\mathcal{E}_0$ have all constant blocks up to the last, which has length $t < m$. Let us define

$$M = \{1, m + 1, 2m + 1, \dots, (q - 1)m + 1\}, \qquad L = [qm + 1, n] = I_{q+1},$$

$$Q = \{1^{q_1}, 2^{q_2}, \dots, \ell^{q_\ell}\}, \qquad T = \{1^{t_1}, 2^{t_2}, \dots, \ell^{t_\ell}\}.$$

The last block of $\pi$, that is $\pi[L]$, is a permutation of $T$, whereas $\pi[M]$ is a permutation of $Q$. Moreover

$$\epsilon_m(\pi) = \epsilon_m(\pi[L]). \tag{1}$$

This last follows from the fact that

$$\operatorname{inv}(\pi) = \operatorname{inv}_L(\pi) + \operatorname{inv}_{\bar{L}}(\pi) + \operatorname{inv}_{L,\bar{L}}(\pi),$$

and $\operatorname{inv}_{\bar{L}}(\pi) \bmod m = \operatorname{inv}_{L,\bar{L}}(\pi) \bmod m = 0$, as every inversion counted in these terms comes repeated $m$ or $m^2$ times. There is a bijection between $\mathcal{E}_0$ and $\Sigma(Q) \times \Sigma(T)$ defined by $\pi \mapsto (\pi[M], \pi[L])$. By (1) the second component determines the value of $\epsilon_m(\pi)$. Thus the distribution of $\epsilon_m(\pi)$ over $\pi \in \mathcal{E}_0$ is the same as that of $\Sigma(T)$, except that the multiplicity of each value is increased by a factor of

$$|\Sigma(Q)| = \binom{q}{q_1, q_2, \dots, q_\ell}.$$

This is precisely the claim of the theorem. □
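The two cases of Theorem 17 can be turned into a short routine and compared with enumeration for small multisets and small primes. The following is a sketch under our own naming (`theorem17`, `brute`); the small distribution $\Phi_m(t_1, \dots, t_\ell)$ is itself obtained by enumeration, which is cheap because the remainder multiset has fewer than $m$ elements in case 2.

```python
from itertools import permutations
from math import factorial


def inv(pi):
    n = len(pi)
    return sum(1 for j in range(n) for k in range(j + 1, n) if pi[j] > pi[k])


def brute(s, m):
    """Phi_m(s) by enumerating distinct permutations."""
    word = tuple(i for i, s_i in enumerate(s, start=1) for _ in range(s_i))
    dist = [0] * m
    for pi in set(permutations(word)):
        dist[inv(pi) % m] += 1
    return dist


def multinomial(parts):
    result = factorial(sum(parts))
    for p in parts:
        result //= factorial(p)
    return result


def theorem17(s, m):
    """Phi_m(s) for prime m, following cases 1) and 2) of Theorem 17."""
    total = multinomial(s)
    qs = [x // m for x in s]
    ts = [x % m for x in s]
    if sum(ts) >= m:
        return [total // m] * m                      # case 1: equi-distributed
    blocks = multinomial(qs)                         # |Sigma(Q)|
    M = total - blocks * multinomial(ts)
    small = brute(ts, m)                             # Phi_m(t_1, ..., t_l)
    return [M // m + blocks * small[a] for a in range(m)]


for s, m in [((3, 3), 3), ((4, 1), 3), ((4, 3), 3), ((5, 2), 5), ((2, 2, 1), 5)]:
    print(s, m, theorem17(s, m), brute(s, m))
```

For example, $s = (3, 3)$ with $m = 3$ gives $q = (1, 1)$, $t = (0, 0)$ and $M = 20 - 2 = 18$, hence the distribution $(8, 6, 6)$.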

Remark 18. The correction term in case 2 of the previous theorem, describing the deviation from a uniform distribution, is very small, at least for large $n$, in comparison to the main term. Recall that in this case $\sum_i q_i = q$ and $\sum_i t_i = t$. Using Stirling's approximation in the very crude form $n! \approx \left(\frac{n}{e}\right)^n$, we have

$$\binom{mq}{mq_1, mq_2, \dots, mq_\ell} \approx \binom{q}{q_1, q_2, \dots, q_\ell}^{m}.$$

Thus we have that

$$\left| \Phi_m^a(s) - \frac{1}{m} \binom{n}{s_1, s_2, \dots, s_\ell} \right| = \binom{q}{q_1, q_2, \dots, q_\ell} \left| \Phi_m^a(t_1, t_2, \dots, t_\ell) - \frac{1}{m} \binom{t}{t_1, t_2, \dots, t_\ell} \right|$$

$$\le \binom{q}{q_1, q_2, \dots, q_\ell} \binom{t}{t_1, t_2, \dots, t_\ell} \le \binom{q}{q_1, q_2, \dots, q_\ell}\, \ell^{t} \lesssim \ell^{t} \binom{n}{s_1, s_2, \dots, s_\ell}^{1/m}.$$

Hence, the deviation from a uniform distribution is (roughly) at most the $m$th root of the total number of permutations of $\{1^{s_1}, 2^{s_2}, \dots, \ell^{s_\ell}\}$.

6. Non-prime values of m

In this section we discuss the situation for non-prime values of $m$. We can learn a lot about the $m$-signature distribution from our previous results. Consider a composition $s = (s_i)_1^\ell$ of $n$. We will approach the problem of understanding $\Phi_m(s)$ by grouping elements of $\Sigma(s)$ according to their repetition numbers inside the intervals $I_1 = [1, m]$, $I_2 = [m + 1, 2m]$, $\dots$ as in Theorem 17. We will only prove results for the case where $n$ is divisible by $m$, so there is no "remainder" block. It seems likely that our results could be extended to cover the case where $m$ does not divide $n$, though such statements seem likely to be messy.

Definition 19. When $m$ divides $n$ we set $b = n/m$, which is the number of intervals $I_j$. If we have

$$s = \sum_{j=1}^{b} s^{(j)},$$

where each $s^{(j)}$ is a composition of $m$ with $\ell$ parts, then we set

$$\Sigma(s^{(1)}; s^{(2)}; \dots; s^{(b)}) = \bigl\{\, \pi \in \Sigma(s) : \pi[I_j] \text{ has repetition numbers } s^{(j)},\ j = 1, 2, \dots, b \,\bigr\}.$$

Similarly we will write $\Phi_m(s^{(1)}; s^{(2)}; \dots; s^{(b)})$ for the signature distribution of $\Sigma(s^{(1)}; s^{(2)}; \dots; s^{(b)})$.

We start by proving a lemma about the periodicity of this distribution.


Lemma 20. $\Phi_m(s^{(1)}; s^{(2)}; \dots; s^{(b)})$ is $d$-periodic, where

$$d = \gcd\bigl\{\, s_i^{(j)} : i = 1, 2, \dots, \ell;\ j = 1, 2, \dots, b \,\bigr\}.$$

Note that since $m = \sum_{i=1}^{\ell} s_i^{(1)}$, the gcd $d$ is a divisor of $m$.

Proof. The collection $\mathcal{P} = \Sigma(s^{(1)}; s^{(2)}; \dots; s^{(b)})$ satisfies the conditions of Lemma 14 for each interval $I_j$, $j = 1, 2, \dots, b$, and therefore $\Phi_m(\mathcal{P})$ is $s_i^{(j)}$-periodic for each $i = 1, 2, \dots, \ell$ and $j = 1, 2, \dots, b$. Hence $\Phi_m(\mathcal{P})$ is $d$-periodic. □

This periodicity allows us to deduce the $m$-signature distribution from the $d$-signature distribution. The process of going from one to the other is captured in the following definition.

Definition 21. Suppose that $d$ and $m$ are natural numbers such that $d$ divides $m$. If $f : \mathbb{Z}_d \to \mathbb{R}$ we define $f{\uparrow}^{m}$, the lift of $f$ to $\mathbb{Z}_m$, to be the $d$-periodic function on $\mathbb{Z}_m$ defined by

$$f{\uparrow}^{m}(a) = \frac{d}{m}\, f(a \bmod d).$$

Note that $\sum_{a \in \mathbb{Z}_m} f{\uparrow}^{m}(a) = \sum_{e \in \mathbb{Z}_d} f(e)$. For example if $m = 12$, $d = 3$, and $f = (8, 28, 4)$ (listing the function values in a vector) then $f{\uparrow}^{12} = (2, 7, 1, 2, 7, 1, 2, 7, 1, 2, 7, 1)$.
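The lift is a one-line operation on vectors. The sketch below, with our own function name `lift`, uses exact rational arithmetic so that the factor $d/m$ introduces no rounding.

```python
from fractions import Fraction


def lift(f, m):
    """Lift f, a vector indexed by Z_d, to the d-periodic vector f^m indexed by Z_m."""
    d = len(f)
    if m % d != 0:
        raise ValueError("d must divide m")
    # Each value is spread evenly over the m/d residues of Z_m lying above it.
    return [Fraction(d, m) * f[a % d] for a in range(m)]


lifted = lift([8, 28, 4], 12)
print([int(x) for x in lifted], sum(lifted))
```

The total mass is preserved, in line with the identity noted in Definition 21.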

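Theorem 22. If $\mathcal{P}$ is a family of permutations such that the $m$-signature distribution $\Phi_m(\mathcal{P})$ is $d$-periodic where $d$ is a divisor of $m$ then

$$\Phi_m(\mathcal{P}) = \bigl(\Phi_d(\mathcal{P})\bigr){\uparrow}^{m}.$$

Proof. If $a \in \mathbb{Z}_m$ then, since $\Phi_m(\mathcal{P})$ is $d$-periodic, we have $\Phi_m^a(\mathcal{P}) = \Phi_m^b(\mathcal{P})$ for all $b \in \mathbb{Z}_m$ with $a \bmod d = b \bmod d$. We also have, for $a \in \mathbb{Z}_m$ with $a \bmod d = e$,

$$\frac{m}{d}\, \Phi_m^a(\mathcal{P}) = \sum_{\substack{b \in \mathbb{Z}_m \\ b \bmod d = e}} \Phi_m^b(\mathcal{P}) = \Phi_d^e(\mathcal{P}),$$

so the result follows. □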

Since permutations in $\Sigma(s^{(1)}; s^{(2)}; \dots; s^{(b)})$ arise from making independent choices for $\pi[I_j]$ from the $\Sigma(s^{(j)})$, it is natural that the signature distribution $\Phi_m(s^{(1)}; s^{(2)}; \dots; s^{(b)})$ involves the convolution of the distributions $\Phi_m(s^{(j)})$. The convolution computes the number of inversions within blocks, to which we must add the number of inversions between blocks. However, it turns out that the distribution $\Phi_m(s^{(1)}; s^{(2)}; \dots; s^{(b)})$ is exactly the convolution of the $\Phi_m(s^{(j)})$.

Definition 23. Given vectors $v = (v_a)_{a=0}^{m-1}$ and $w = (w_a)_{a=0}^{m-1}$ indexed by $\mathbb{Z}_m$, we define their convolution to be

$$v \circledast w = \Bigl( \sum_{c + d = a} v_c w_d \Bigr)_{a=0}^{m-1}.$$

In general, given vectors $v^{(1)}, v^{(2)}, \dots, v^{(b)}$ their convolution is

$$\mathop{\circledast}_{j=1}^{b} v^{(j)} = \Bigl( \sum_{c_1 + c_2 + \dots + c_b = a}\ \prod_{j=1}^{b} v^{(j)}_{c_j} \Bigr)_{a=0}^{m-1}.$$

(Naturally, in this definition the arithmetic with indices is performed in $\mathbb{Z}_m$.)

Lemma 24. If $s^{(1)}, s^{(2)}, \dots, s^{(b)}$ are compositions of $m$ then

$$\Phi_m(s^{(1)}; s^{(2)}; \dots; s^{(b)}) = \mathop{\circledast}_{j=1}^{b} \Phi_m(s^{(j)}).$$

Proof. There is a natural bijection between $A = \Sigma(s^{(1)}; s^{(2)}; \dots; s^{(b)})$ and the Cartesian product $\Sigma(s^{(1)}) \times \Sigma(s^{(2)}) \times \dots \times \Sigma(s^{(b)})$. The distribution of $\sum_{j=1}^{b} \epsilon_m(\pi^{(j)})$, where $(\pi^{(j)})_{j=1}^{b} \in \prod_{j=1}^{b} \Sigma(s^{(j)})$, is exactly $\mathop{\circledast}_{j=1}^{b} \Phi_m(s^{(j)})$. Any permutation $\pi \in A$ has

$$\epsilon_m(\pi) = (k \bmod m) + \sum_{j=1}^{b} \epsilon_m(\pi^{(j)}), \tag{2}$$

where $k$, the number of interblock inversions, is the same for all $\pi \in A$. We will show that this shift by $k \bmod m$ has no effect on the distribution. Let $d = \gcd(s^{(1)}, s^{(2)}, \dots, s^{(b)})$. Since all the entries of all the $s^{(j)}$ are divisible by $d$, so also $k$ is divisible by $d$. By Lemma 20, $\Phi_m(s^{(1)}; s^{(2)}; \dots; s^{(b)})$ is $d$-periodic. Thus, for any $a \in \mathbb{Z}_m$,

$$\Phi_m^a(s^{(1)}; s^{(2)}; \dots; s^{(b)}) = \Phi_m^{a + (k \bmod m)}(s^{(1)}; s^{(2)}; \dots; s^{(b)}) = \Bigl( \mathop{\circledast}_{j=1}^{b} \Phi_m(s^{(j)}) \Bigr)(a),$$

by periodicity and (2). □
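Lemma 24 can be observed on small examples by enumerating $\Sigma(s^{(1)}; \dots; s^{(b)})$ directly and comparing with the cyclic convolution of the block distributions. The following is a sketch with hypothetical helper names; the enumeration filters all permutations of the full multiset, so it is only meant for a handful of short blocks.

```python
from itertools import permutations


def inv(pi):
    n = len(pi)
    return sum(1 for j in range(n) for k in range(j + 1, n) if pi[j] > pi[k])


def multiset(s):
    return tuple(i for i, s_i in enumerate(s, start=1) for _ in range(s_i))


def brute(s, m):
    dist = [0] * m
    for pi in set(permutations(multiset(s))):
        dist[inv(pi) % m] += 1
    return dist


def convolve(v, w):
    """Cyclic convolution of two vectors indexed by Z_m."""
    m = len(v)
    out = [0] * m
    for c in range(m):
        for d in range(m):
            out[(c + d) % m] += v[c] * w[d]
    return out


def block_distribution(blocks, m):
    """Phi_m(s^(1); ...; s^(b)): permutations whose j-th block has repetition numbers s^(j)."""
    s = [sum(column) for column in zip(*blocks)]
    dist = [0] * m
    for pi in set(permutations(multiset(s))):
        if all(tuple(sorted(pi[j * m:(j + 1) * m])) == multiset(blk)
               for j, blk in enumerate(blocks)):
            dist[inv(pi) % m] += 1
    return dist


blocks, m = [(2, 2), (2, 2)], 4
print(block_distribution(blocks, m))
print(convolve(brute(blocks[0], m), brute(blocks[1], m)))
```

Here the number of interblock inversions is $k = 2 \cdot 2 = 4$, which vanishes modulo 4, and both lines print $(10, 8, 10, 8)$.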

Of course $\Phi_m(s)$ has contributions from many $\Phi_m(s^{(1)}; s^{(2)}; \dots; s^{(b)})$. To be precise we get a contribution from all the vector compositions of $s$ using $b$ repetition vectors, each of sum $m$. The periodicity of a given splitting $s^{(1)}, s^{(2)}, \dots, s^{(b)}$ depends on the gcd of all the entries of all the $s^{(j)}$. Let us write $C_b^d(s)$ for the set of vector compositions of $s$ such that this gcd is $d$; i.e.,

$$C_b^d(s) = \bigl\{\, (s^{(1)}; \dots; s^{(b)}) : s^{(1)} + \dots + s^{(b)} = s,\ \gcd(s^{(1)}, \dots, s^{(b)}) = d \,\bigr\}.$$

Theorem 25. Suppose that $m > 1$, $n = bm$ for some integer $b$, and $s$ is a composition of $n$. Let $d_0 = \gcd(m, s_1, s_2, \dots, s_\ell)$. Then the $m$-signature distribution of $\Sigma(s)$ is $d_0$-periodic. Moreover the distribution satisfies

$$\Phi_m(s) = c + \sum_{\substack{d \mid d_0 \\ d > 1}}\ \sum_{(s^{(1)}; \dots; s^{(b)}) \in C_b^d(s)}\ \mathop{\circledast}_{j=1}^{b} \Phi_{\gcd(s^{(j)})}(s^{(j)}){\uparrow}^{m},$$

where $c$ is a constant dependent only on $s$.


Proof. We partition the set of all permutations of $\{1^{s_1}, 2^{s_2}, \dots, \ell^{s_\ell}\}$ into the various $C_b^d$. Note that if $(s^{(1)}; s^{(2)}; \dots; s^{(b)}) \in C_b^d(s)$ then in particular $m = \sum_{i=1}^{\ell} s_i^{(1)}$ is divisible by $d$. Similarly each $s_i$ is divisible by $d$; hence $d$ divides $d_0$. Thus the partitioning into the $C_b^d$ gives

$$\Phi_m(s) = \sum_{d \mid d_0}\ \sum_{(s^{(1)}; \dots; s^{(b)}) \in C_b^d(s)} \Phi_m(s^{(1)}; s^{(2)}; \dots; s^{(b)}).$$

By Lemma 20 each term arising from $C_b^d(s)$ is $d$-periodic, and hence $d_0$-periodic. Thus $\Phi_m(s)$ is $d_0$-periodic, as claimed in the theorem. Now, applying Lemma 24 we have

$$\Phi_m(s) = \sum_{d \mid d_0}\ \sum_{(s^{(1)}; \dots; s^{(b)}) \in C_b^d(s)}\ \mathop{\circledast}_{j=1}^{b} \Phi_m(s^{(j)}).$$

From Lemma 20 with $b = 1$ we know that each factor in the convolution is $\gcd(s^{(j)})$-periodic, so by Theorem 22,

$$\Phi_m(s) = \sum_{d \mid d_0}\ \sum_{(s^{(1)}; \dots; s^{(b)}) \in C_b^d(s)}\ \mathop{\circledast}_{j=1}^{b} \Phi_{\gcd(s^{(j)})}(s^{(j)}){\uparrow}^{m}.$$

Finally note that the term corresponding to $d = 1$ is 1-periodic, i.e., constant; this is our $c$. □

This theorem gives a relatively efficient recursive way of computing $\Phi_m(s)$. The main difficulty in using it is that the number of vector compositions of $s$ is large and the set $C_b^d(s)$ is difficult to analyze. Determining the constant $c$ is not a difficulty. The cardinality of $\Sigma(s)$ is simply a multinomial coefficient, so knowing the contributions from terms with $d > 1$ suffices.

Theorem 25 also immediately gives a simple sufficient condition for $\Phi_m(s)$ to be equi-distributed.

Corollary 26. Suppose that $m > 1$, $n$ is a multiple of $m$ and $s$ is a composition of $n$ with $\gcd(m, s_1, s_2, \dots, s_\ell) = 1$. Then $\Phi_m(s)$ is equi-distributed.
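The periodicity statement of Theorem 25 and the equi-distribution statement of Corollary 26 can be checked by brute force on small multisets. The following is only a sketch: the function names (`inversions`, `signature_distribution`, `is_periodic`) are illustrative and not from the paper, and the enumeration is exponential in $n$, so it is suitable only for tiny examples.

```python
from functools import reduce
from itertools import permutations
from math import gcd


def inversions(word):
    """Count pairs of positions j < k with word[j] > word[k]."""
    return sum(
        1
        for j in range(len(word))
        for k in range(j + 1, len(word))
        if word[j] > word[k]
    )


def signature_distribution(s, m):
    """Brute-force Phi_m(s): entry a counts the permutations of the multiset
    {1^s_1, 2^s_2, ...} whose inversion number is congruent to a mod m."""
    word = [i + 1 for i, count in enumerate(s) for _ in range(count)]
    dist = [0] * m
    for w in set(permutations(word)):  # set() removes repeated arrangements
        dist[inversions(w) % m] += 1
    return dist


def is_periodic(dist, d):
    """True if dist[a] == dist[(a + d) % m] for every residue a."""
    m = len(dist)
    return all(dist[a] == dist[(a + d) % m] for a in range(m))


if __name__ == "__main__":
    # Each pair (s, m) has m dividing n = sum(s), as Theorem 25 requires.
    for s, m in [((2, 1, 1), 4), ((2, 2), 4), ((3, 3), 3), ((3, 2, 1), 6)]:
        d0 = reduce(gcd, s, m)  # gcd(m, s_1, ..., s_l)
        dist = signature_distribution(s, m)
        print(s, m, "d0 =", d0, dist, "d0-periodic:", is_periodic(dist, d0))
```

In the cases with $d_0 = 1$ the printed distribution is constant, as Corollary 26 predicts; for $s = (2, 2)$ and $m = 4$ one has $d_0 = 2$ and the distribution is 2-periodic but not constant.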

7. Signature distributions are ruler functions

Looking at a typical signature distribution, certain features are immediately noticeable; see Figure 1. In particular, if $\gcd(a, m) = \gcd(b, m)$ then we have $\Phi_m^{a}(s) = \Phi_m^{b}(s)$. This is reminiscent of the heights of the tick marks at positions $i/16$ on a ruler, which are determined by $\gcd(i, 16)$. Additionally the height increases as the gcd increases. We make these conditions precise in the following definition.

Definition 27. For any integer $m > 1$ we define $\operatorname{Div}(m)$ to be the divisor lattice of $m$: the poset on $\{\, 1 \le d \le m : d \text{ divides } m \,\}$, with ordering given by $d \le d'$ if $d \mid d'$. If $f : \operatorname{Div}(m) \to \mathbb{R}$ is any function we define $\tilde f : \mathbb{Z}_m \to \mathbb{R}$ by

$$\tilde f(a) = f(\gcd(a, m)).$$

If $f$ is non-decreasing then we say that $\tilde f$ is a ruler function.

[Figure 1: histogram over the residues $a = 0, 1, \dots, 11$. Bar labels (deviation from the average), in order: +1612, −1574, +1504, −1472, +1504, −1574, +1612, −1574, +1504, −1472, +1504, −1574. Value printed on the plot: 2,498,640,144.]

Figure 1. The 12-signature distribution $\Phi_{12}(12, 6, 6)$ represented as a histogram. The dashed line marks the average value of the $\Phi_{12}^{a}(12, 6, 6)$, i.e., $\binom{24}{12,6,6}/12$. Each bar is labeled with its deviation from this average value. The fact that the distribution is 6-periodic is a consequence of Theorem 25.

As an example, $\Phi_{12}(12, 6, 6)$ in Figure 1 is a ruler function. Our aim is to prove that all signature distributions are ruler functions when $m$ divides $n$, the size of the multiset. This is not necessarily true when $m$ does not divide $n$, even if $m$ is prime. For instance the 3-signature distribution of $\{1^4, 2\}$ is $(2, 2, 1)$.

Given the statement of Theorem 25 it is natural to start by proving that scalar multiples, sums, and convolutions of ruler functions are ruler functions.
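The ruler-function condition is easy to test mechanically. The sketch below is illustrative only: the helper `is_ruler` and the constant `FIGURE_1_DEVIATIONS` are names introduced here, and the deviations are the bar labels read off Figure 1. Since adding a constant does not affect whether a function is a ruler function, checking the deviations is equivalent to checking $\Phi_{12}(12, 6, 6)$ itself.

```python
from math import gcd


def is_ruler(values):
    """Check whether values[0..m-1], viewed as a function on Z_m, is a ruler
    function: values[a] depends only on gcd(a, m), and the induced function
    on the divisors of m is non-decreasing with respect to divisibility."""
    m = len(values)
    by_gcd = {}
    for a, v in enumerate(values):
        d = gcd(a, m)  # gcd(0, m) == m, so residue 0 sits at the top divisor
        if by_gcd.setdefault(d, v) != v:
            return False  # two residues with the same gcd disagree
    divisors = sorted(by_gcd)
    return all(
        by_gcd[d] <= by_gcd[e]
        for d in divisors
        for e in divisors
        if e % d == 0
    )


# Deviations from the average, read off Figure 1 for residues 0..11.
FIGURE_1_DEVIATIONS = [1612, -1574, 1504, -1472, 1504, -1574,
                       1612, -1574, 1504, -1472, 1504, -1574]

if __name__ == "__main__":
    print(is_ruler(FIGURE_1_DEVIATIONS))  # the distribution in Figure 1
    print(is_ruler([2, 2, 1]))            # 3-signature distribution of {1^4, 2}
```

The second call fails the test because residues 1 and 2 both have $\gcd(a, 3) = 1$ but carry different values.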

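Lemma 28. Suppose that $\tilde f, \tilde g : \mathbb{Z}_m \to \mathbb{R}$ are ruler functions and $c \in \mathbb{R}$ is non-negative. Then $c\tilde f$, $\tilde f + \tilde g$, and the constant function $c$ are ruler functions. Moreover if $d$ is a divisor of $m$ and $\tilde h : \mathbb{Z}_d \to \mathbb{R}$ is a ruler function then $\tilde h{\uparrow}^{m}$ is a ruler function on $\mathbb{Z}_m$.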

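Proof. It is clear that $c\tilde f$ and $c$ are ruler functions since $cf : \operatorname{Div}(m) \to \mathbb{R}$ is non-decreasing and $c\tilde f = \widetilde{cf}$. Similarly, $\tilde f + \tilde g$ is a ruler function since $f + g : \operatorname{Div}(m) \to \mathbb{R}$ is non-decreasing and

$$\tilde f + \tilde g = \widetilde{f + g}.$$

Now, given $h : \operatorname{Div}(d) \to \mathbb{R}$ we define a function $k : \operatorname{Div}(m) \to \mathbb{R}$ by $k(e) = h(\gcd(e, d))$. Figure 2 illustrates these functions. Since $d$ divides $m$ it is easy to see that

$$\tilde h{\uparrow}^{m} = \tilde k$$

is a ruler function. $\square$

[Figure 2: commutative diagram on $\mathbb{Z}_m$, $\mathbb{Z}_d$, $\operatorname{Div}(m)$, $\operatorname{Div}(d)$, and $\mathbb{R}$, with the maps $a \mapsto \gcd(a, m)$, $a \mapsto \gcd(a, d)$, $a \mapsto a \bmod d$, $e \mapsto \gcd(e, d)$, and the functions $h$, $k$, $\tilde h$, $\tilde k$, $\tilde h{\uparrow}^{m}$.]

Figure 2. A commutative diagram showing the functions in Lemma 28.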

Lemma 29. If $\tilde f, \tilde g : \mathbb{Z}_m \to \mathbb{R}$ are ruler functions then so is $\tilde f \circledast \tilde g$.

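Proof. First observe that it suffices to prove the result when $\tilde f, \tilde g : \mathbb{Z}_m \to \mathbb{R}$ take only non-negative values, since adding a constant function to $\tilde f$ and to $\tilde g$ has the effect of adding a constant to $\tilde f \circledast \tilde g$.

We begin with the case where $m = p^r$ is a power of a prime. If $\tilde f$ and $\tilde g$ are $\{0, 1\}$-valued then the underlying functions $f, g : \operatorname{Div}(p^r) \to \mathbb{R}$ take the form

$$f(d) = \begin{cases} 1 & \text{if } p^s \mid d, \\ 0 & \text{otherwise,} \end{cases} \qquad g(d) = \begin{cases} 1 & \text{if } p^t \mid d, \\ 0 & \text{otherwise,} \end{cases}$$

for some $1 \le s, t \le r$. For these $f$ and $g$ it is straightforward to compute $\tilde f \circledast \tilde g$. Without loss of generality $t \le s$, and we have

$$(\tilde f \circledast \tilde g)(a) = \#\bigl\{\, b \in \mathbb{Z}_{p^r} : p^s \mid b \text{ and } p^t \mid (a - b) \,\bigr\} = \begin{cases} 0 & \text{if } p^t \nmid a, \\ p^{r} p^{-s} & \text{if } p^t \mid a, \end{cases}$$

which is a ruler function. The result for general $\tilde f, \tilde g : \mathbb{Z}_{p^r} \to \mathbb{R}$ follows since any non-negative ruler function can be written as a positive linear combination of $\{0, 1\}$-valued ruler functions, and $\circledast$ is bilinear.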
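The closed form for the convolution of two divisibility indicators on $\mathbb{Z}_{p^r}$ can be confirmed numerically. This is a sketch with illustrative helper names (`cyclic_convolution`, `divisibility_indicator`, `predicted`), none of which come from the paper; it compares a direct convolution with the value $p^{r} p^{-\max(s,t)}$ on multiples of $p^{\min(s,t)}$ and $0$ elsewhere.

```python
def cyclic_convolution(f, g):
    """(f * g)(a) = sum over b in Z_m of f(b) * g(a - b), indices mod m."""
    m = len(f)
    return [sum(f[b] * g[(a - b) % m] for b in range(m)) for a in range(m)]


def divisibility_indicator(p, r, s):
    """The {0,1}-valued function on Z_{p^r} that is 1 exactly when p^s | b."""
    return [1 if b % p**s == 0 else 0 for b in range(p**r)]


def predicted(p, r, s, t):
    """Closed form: 0 unless p^min(s,t) divides a, else p^(r - max(s,t))."""
    lo, hi = min(s, t), max(s, t)
    return [p**(r - hi) if a % p**lo == 0 else 0 for a in range(p**r)]


if __name__ == "__main__":
    p, r = 3, 3
    for s in range(1, r + 1):
        for t in range(1, r + 1):
            conv = cyclic_convolution(
                divisibility_indicator(p, r, s),
                divisibility_indicator(p, r, t),
            )
            assert conv == predicted(p, r, s, t)
    print("closed form confirmed for p = 3, r = 3")
```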

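We do the case of non-prime-power $m$ by induction. Suppose that $m = qm'$ where $q = p^r > 1$ is a prime power and $p \nmid m'$. We have a ring isomorphism (by the Chinese Remainder Theorem) $\mathbb{Z}_m \to \mathbb{Z}_q \times \mathbb{Z}_{m'}$ taking $e \in \mathbb{Z}_m$ to the pair of residue classes $(e \bmod q, e \bmod m')$. The inverse of this map we call $\varphi$. Similarly $\operatorname{Div}(m) \cong \operatorname{Div}(q) \times \operatorname{Div}(m')$, with the isomorphism given by $d \mapsto (\gcd(d, q), \gcd(d, m'))$. The inverse of this map is simply multiplication: $(p^i, d) \mapsto p^i d$. These isomorphisms are compatible in the sense that the upper right hand square of the diagram in Figure 3 commutes.

[Figure 3: commutative diagram relating $\mathbb{Z}_{m'}$, $\mathbb{Z}_q \times \mathbb{Z}_{m'}$, $\mathbb{Z}_m$, $\operatorname{Div}(m')$, $\operatorname{Div}(q) \times \operatorname{Div}(m')$, $\operatorname{Div}(m)$, and $\mathbb{R}$, with the maps $a \mapsto \gcd(a, m)$, $\varphi$, $(d_1, d_2) \mapsto d_1 d_2$, $b \mapsto (p^i, b)$, $(a, b) \mapsto (\gcd(a, q), \gcd(b, m'))$, $b \mapsto \gcd(b, m')$, $d \mapsto (p^i, d)$, and the functions $f$, $f_i$ and their lifts.]

Figure 3. The functions arising in Lemma 29 in the case of non-prime-power $m$.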
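The compatibility of the two isomorphisms amounts to the identity $\gcd(e, m) = \gcd(e \bmod q, q) \cdot \gcd(e \bmod m', m')$ for coprime $q$ and $m'$. A minimal sketch that checks this exhaustively for given moduli follows; the function name `check_crt_compatibility` is hypothetical and chosen only for illustration.

```python
from math import gcd


def check_crt_compatibility(q, m_prime):
    """For coprime q and m', confirm that e -> (e mod q, e mod m') is a
    bijection from Z_m onto Z_q x Z_m' (with m = q * m') and that
    gcd(e, m) == gcd(e mod q, q) * gcd(e mod m', m') for every e."""
    assert gcd(q, m_prime) == 1, "q and m' must be coprime"
    m = q * m_prime
    images = {(e % q, e % m_prime) for e in range(m)}
    bijective = len(images) == m
    gcds_agree = all(
        gcd(e, m) == gcd(e % q, q) * gcd(e % m_prime, m_prime)
        for e in range(m)
    )
    return bijective and gcds_agree


if __name__ == "__main__":
    # q is a prime power, m' is coprime to it.
    for q, m_prime in [(4, 3), (9, 4), (8, 15)]:
        print(q, m_prime, check_crt_compatibility(q, m_prime))
```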

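Define the $i$th section of $\tilde f$, where $0 \le i \le r$, to be the function $\tilde f_i : \mathbb{Z}_{m'} \to \mathbb{R}$ given by

$$\tilde f_i(b) = \tilde f(\varphi(p^i, b)),$$

and similarly for $\tilde g_i$. Notice that these sections are also ruler functions. Indeed $\tilde f_i$ is $\widetilde{(f_i)}$ where $f_i : \operatorname{Div}(m') \to \mathbb{R}$ is given by $f_i(d) = f(p^i d)$. Consider then $(\tilde f \circledast \tilde g)\bigl(\varphi(a, b)\bigr)$ where $a \in \mathbb{Z}_q$ and $b \in \mathbb{Z}_{m'}$:

$$\begin{aligned}
\bigl(\tilde f \circledast \tilde g\bigr)\bigl(\varphi(a, b)\bigr)
&= \sum_{(c, d) \in \mathbb{Z}_q \times \mathbb{Z}_{m'}} \tilde f\bigl(\varphi(c, d)\bigr)\, \tilde g\bigl(\varphi(a - c, b - d)\bigr) \\
&= \sum_{(c, d) \in \mathbb{Z}_q \times \mathbb{Z}_{m'}} f\bigl(\gcd(c, q) \cdot \gcd(d, m')\bigr)\, g\bigl(\gcd(a - c, q) \cdot \gcd(b - d, m')\bigr) \\
&= \sum_{c \in \mathbb{Z}_q} \sum_{d \in \mathbb{Z}_{m'}} f_{\chi(c)}\bigl(\gcd(d, m')\bigr)\, g_{\chi(a - c)}\bigl(\gcd(b - d, m')\bigr) \\
&= \sum_{c \in \mathbb{Z}_q} \bigl(\tilde f_{\chi(c)} \circledast \tilde g_{\chi(a - c)}\bigr)(b),
\end{aligned} \tag{3}$$

where we have written $\chi(c)$ for the exponent of the highest power of $p$ dividing $\gcd(c, q)$, so $\gcd(c, q) = p^{\chi(c)}$. Certainly then, $(\tilde f \circledast \tilde g)(a, b)$ is non-decreasing in $\gcd(b, m')$ for fixed $a$, since each term in the final sum is, by the inductive hypothesis, a ruler function on $\mathbb{Z}_{m'}$.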

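It is harder to prove that the convolution is a non-decreasing function of $\gcd(a, q) = p^{\chi(a)}$; indeed it is not even obvious that the right hand side of (3) is a function of $\chi(a)$ for fixed $b$. To this end, define $\Delta f_j = \tilde f_j - \tilde f_{j-1}$ for $1 \le j \le r$, and $\Delta f_0 = \tilde f_0$. We have then $\tilde f_i = \sum_{j=0}^{i} \Delta f_j$. Also $\Delta f_j(b) \ge 0$ for all $b \in \mathbb{Z}_{m'}$ since $\tilde f$ is a ruler function and therefore $\tilde f_j(b) \ge \tilde f_{j-1}(b)$. Similarly there exist non-negative functions $\Delta g_0, \Delta g_1, \dots, \Delta g_r : \mathbb{Z}_{m'} \to \mathbb{R}$ with $\tilde g_i = \sum_{j=0}^{i} \Delta g_j$ for all $i$. Now write

$$\sum_{c \in \mathbb{Z}_q} \tilde f_{\chi(c)} \circledast \tilde g_{\chi(a - c)} = \sum_{c \in \mathbb{Z}_q} \Biggl( \sum_{j=0}^{\chi(c)} \Delta f_j \Biggr) \circledast \Biggl( \sum_{k=0}^{\chi(a - c)} \Delta g_k \Biggr).$$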

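Defining

$$\lambda_a(j, k) = \#\{\, c \in \mathbb{Z}_q : \chi(c) \ge j \text{ and } \chi(a - c) \ge k \,\},$$

we get

$$\sum_{c \in \mathbb{Z}_q} \tilde f_{\chi(c)} \circledast \tilde g_{\chi(a - c)} = \sum_{0 \le j, k \le r} \lambda_a(j, k) \bigl( \Delta f_j \circledast \Delta g_k \bigr).$$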

Thus to show that $(\tilde f \circledast \tilde g)(a, b)$ is a non-decreasing function of $\gcd(a, q)$ it suffices to prove that $\lambda_a(j, k)$ is a non-decreasing function of $\gcd(a, q)$ for each fixed $0 \le j, k \le r$. This fact is precisely what we proved in the second paragraph:

$$\lambda_a(j, k) = \begin{cases} 0 & \text{if } p^{\min(j, k)} \nmid a, \\ p^{r} p^{-\max(j, k)} & \text{if } p^{\min(j, k)} \mid a. \end{cases}$$

This finishes the proof. $\square$
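As a numerical sanity check of Lemma 29 (not a substitute for the proof), one can convolve randomly generated ruler functions and test the result. The sketch below uses illustrative names throughout (`random_ruler`, `is_ruler`, `cyclic_convolution`, `check_lemma_29`); the random construction, which adds non-negative integer increments along the divisor lattice, is an assumption made here for the sake of the experiment.

```python
import random
from math import gcd


def divisors(m):
    """Divisors of m in increasing order."""
    return [d for d in range(1, m + 1) if m % d == 0]


def random_ruler(m, rng):
    """Build a ruler function on Z_m from a random function on Div(m) that is
    non-decreasing with respect to divisibility."""
    f = {}
    for d in divisors(m):  # increasing order, so proper divisors come first
        below = [f[e] for e in f if d % e == 0]
        f[d] = max(below, default=0) + rng.randint(0, 5)
    return [f[gcd(a, m)] for a in range(m)]


def is_ruler(values):
    """True if values[a] depends only on gcd(a, m) and is non-decreasing
    along divisibility of that gcd."""
    m = len(values)
    by_gcd = {}
    for a, v in enumerate(values):
        if by_gcd.setdefault(gcd(a, m), v) != v:
            return False
    return all(
        by_gcd[d] <= by_gcd[e]
        for d in by_gcd
        for e in by_gcd
        if e % d == 0
    )


def cyclic_convolution(f, g):
    """(f * g)(a) = sum over b in Z_m of f(b) * g(a - b), indices mod m."""
    m = len(f)
    return [sum(f[b] * g[(a - b) % m] for b in range(m)) for a in range(m)]


def check_lemma_29(m, trials=50, seed=0):
    """Convolve pairs of random ruler functions on Z_m and test each result."""
    rng = random.Random(seed)
    return all(
        is_ruler(cyclic_convolution(random_ruler(m, rng), random_ruler(m, rng)))
        for _ in range(trials)
    )


if __name__ == "__main__":
    for m in (8, 12, 30):
        print(m, check_lemma_29(m))
```

Integer-valued inputs keep the arithmetic exact, so the equality tests inside `is_ruler` are not affected by rounding.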

Theorem 30. If $s = (s_i)_{i=1}^{\ell}$ is a composition of $n$ and $m \ge 1$ divides $n$, the $m$-signature distribution $\Phi_m(s)$ is a ruler function.

Proof. The case $m = 1$ is trivial. For $m > 1$ we have, by Theorem 25,

$$\Phi_m(s) = c + \sum_{\substack{d \mid d_0 \\ d > 1}} \; \sum_{C_b^d(s)} \; \mathop{\circledast}_{j=1}^{b} \Phi_{\gcd(s^{(j)})}(s^{(j)}){\uparrow}^{m},$$

where $d_0 = \gcd(m, s_1, s_2, \dots, s_\ell)$. Each factor $\Phi_{\gcd(s^{(j)})}(s^{(j)}){\uparrow}^{m}$ is a ruler function: by induction on $m$ and Lemma 28 for lifts if $\gcd(s^{(j)}) < m$, and by direct calculation if $\gcd(s^{(j)}) = m$. Thus $\Phi_m(s)$ is a ruler function by Lemmas 28 and 29. $\square$

As a consequence of this theorem we can show that the maximum of $\Phi_m(s)$ is $\Phi_m^{0}(s)$ and the minimum is $\Phi_m^{1}(s)$. Thus the difference $\Phi_m^{0}(s) - \Phi_m^{1}(s)$ bounds the deviation from equi-distribution of $\Phi_m(s)$.

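Corollary 31. If $s = (s_i)_{1}^{\ell}$ is a composition of $n$ and $m > 1$ divides $n$ then for all $a \in \mathbb{Z}_m$ we have

$$\Phi_m^{1}(s) \le \Phi_m^{a}(s) \le \Phi_m^{0}(s).$$

Proof. $\Phi_m(s)$ is a ruler function and $m = \gcd(0, m)$ is the maximum element of $\operatorname{Div}(m)$ while $1 = \gcd(1, m)$ is the minimum. $\square$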
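For experiments with larger multisets, brute-force enumeration is impractical. The sketch below does not use the recursion of Theorem 25; instead it relies on the classical fact that the inversion generating function of the permutations of a multiset is the $q$-multinomial coefficient, and reduces it modulo $q^m - 1$. The function names are illustrative and not from the paper. It reproduces the deviations shown in Figure 1 and checks the inequality of Corollary 31 for $s = (12, 6, 6)$, $m = 12$.

```python
def cyclic_convolution(f, g):
    """Product of two polynomials reduced modulo q^m - 1 (coefficient lists)."""
    m = len(f)
    return [sum(f[b] * g[(a - b) % m] for b in range(m)) for a in range(m)]


def q_binomial_mod(n, k, m):
    """Coefficients of the Gaussian binomial [n choose k]_q reduced modulo
    q^m - 1: entry a collects all exponents congruent to a mod m."""
    one = [1] + [0] * (m - 1)
    # row[j] holds [i choose j]_q for the current i (initially i = 0).
    row = [one] + [[0] * m for _ in range(k)]
    for i in range(1, n + 1):
        new = [one] + [[0] * m for _ in range(k)]
        for j in range(1, min(i, k) + 1):
            # q-Pascal rule: [i, j] = [i-1, j-1] + q^j * [i-1, j]
            for a in range(m):
                new[j][a] = row[j - 1][a] + row[j][(a - j) % m]
        row = new
    return row[k]


def signature_distribution(s, m):
    """Phi_m(s) computed from the q-multinomial coefficient, written as a
    product of Gaussian binomials [s_1 + ... + s_i choose s_i]_q."""
    dist = [1] + [0] * (m - 1)
    total = 0
    for part in s:
        total += part
        dist = cyclic_convolution(dist, q_binomial_mod(total, part, m))
    return dist


if __name__ == "__main__":
    m = 12
    dist = signature_distribution((12, 6, 6), m)
    average = sum(dist) // m
    print("total permutations:", sum(dist))
    print("deviations:", [v - average for v in dist])
    # Corollary 31: residue 1 gives the minimum, residue 0 the maximum.
    assert all(dist[1] <= v <= dist[0] for v in dist)
```

Working with coefficient lists of length $m$ keeps every intermediate polynomial small, at the cost of losing the unreduced inversion counts.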

8. Future Directions

We have given several conditions that imply equi-distribution of $m$-signature distributions. In Corollary 16 we proved equi-distribution for permutations of sets if $n \ge m$. We have characterized, in Theorem 17, exactly when we have equi-distribution for prime values of $m$. If $m$ is composite and divides the size of the multiset we give a simple sufficient condition for equi-distribution in Corollary 26. We in fact conjecture that, when $m$ divides $n$, the $m$-signature distribution $\Phi_m(s)$ is equi-distributed exactly when $\gcd(m, s_1, s_2, \dots, s_\ell) = 1$.

It is natural to want to give a bound on how far from equi-distributed a given $\Phi_m(s)$ is, as in the remark after Theorem 17. By our results about ruler functions together with Corollary 31 we can measure the variation of $\Phi_m(s)$ using the difference $\Phi_m^{0}(s) - \Phi_m^{1}(s)$. Empirically this difference is tiny as a fraction of $\Phi_m^{0}(s)$. However, it is difficult to use Theorem 25 to give explicit bounds. This seems like a fertile area for future research.

The case when $m$ is composite and does not divide $n$ is not covered by our results, though our methods may be adaptable to this setting. It would be very interesting to have a fuller understanding of these distributions.

Acknowledgements

We thank Raghunath Tewari for introducing us to this problem. The referees gave many useful suggestions that have improved the presentation of the results. In particular one referee provided the diagrams illustrating Lemmas 28 and 29, for which we are very grateful.


Stephen G. Hartke
Department of Mathematics
University of Nebraska–Lincoln
Lincoln, NE

A.J. Radcliffe
Department of Mathematics
University of Nebraska–Lincoln
Lincoln, NE
