Tight parameterized preprocessing bounds · Tight Parameterized Preprocessing Bounds Sparsiﬁcation via Low-Degree Polynomials PROEFSCHRIFT ter verkrijging van de graad van doctor

Tight parameterized preprocessing bounds

Citation for published version (APA):Pieterse, A. (2019). Tight parameterized preprocessing bounds: sparsification via low-degree polynomials.Eindhoven: Technische Universiteit Eindhoven.

Document status and date:Published: 04/09/2019

Document Version:Publisher’s PDF, also known as Version of Record (includes final page, issue and volume numbers)

Please check the document version of this publication:

• A submitted manuscript is the version of the article upon submission and before peer-review. There can beimportant differences between the submitted version and the official published version of record. Peopleinterested in the research are advised to contact the author for the final version of the publication, or visit theDOI to the publisher's website.• The final author version and the galley proof are versions of the publication after peer review.• The final published version features the final layout of the paper including the volume, issue and pagenumbers.Link to publication

General rightsCopyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright ownersand it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.

• Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain • You may freely distribute the URL identifying the publication in the public portal.

If the publication is distributed under the terms of Article 25fa of the Dutch Copyright Act, indicated by the “Taverne” license above, pleasefollow below link for the End User Agreement:www.tue.nl/taverne

Take down policyIf you believe that this document breaches copyright please contact us at:[email protected] details and we will investigate your claim.

Download date: 11. Apr. 2020

https://research.tue.nl/en/publications/tight-parameterized-preprocessing-bounds(705dc505-c05b-48c6-96c4-21ca7706f2de).html

Tight Parameterized Preprocessing BoundsSparsification via Low-Degree Polynomials

Astrid Pieterse

1

Astrid Pieterse


Copyright © 2019 by Astrid Pieterse.

The work in this thesis has been supported by the Netherlands Organi-zation for Scientific Research NWO under project number 024.002.003(Networks).

Cover design by Astrid Pieterse.

The cover represents the method of kernelization by carefully remov-ing redundant constraints, as described in Chapter 3 of this thesis.

Printed by Ipskamp Printing, Enschede.

A catalogue record is available from the Eindhoven University ofTechnology Library.

ISBN: 978-90-386-4821-7.

9 789038 648217


PROEFSCHRIFT

ter verkrijging van de graad van doctor aan de Technische UniversiteitEindhoven, op gezag van de rector magnificus prof.dr.ir. F.P.T. Baaijens, vooreen commissie aangewezen door het College voor Promoties, in het openbaar

te verdedigen op woensdag 4 september 2019 om 16:00 uur

door

Astrid Pieterse

geboren te Weert

Dit proefschrift is goedgekeurd door de promotoren en de samenstelling van depromotiecommissie is als volgt:

Voorzitter: prof.dr. J.J. LukkienPromotor: prof.dr. M.T. de BergCo-promotor: dr. B.M.P. JansenLeden: prof.dr. N. Bansal

prof.dr. S. Kratsch (Humboldt-Universität zu Berlin)dr.hab. M. Pilipczuk (University of Warsaw)prof.dr. H.L. Bodlaender (Universiteit Utrecht)

Het onderzoek of ontwerp dat in dit proefschrift wordt beschreven is uitgevoerdin overeenstemming met de TU/e Gedragscode Wetenschapsbeoefening.

Acknowledgments

Astrid Pieterse,July 2019

I still remember when Mark introduced me and a couple of fellow students toBart, who would teach us a course about parameterized complexity. This led tothe decision to do my master’s thesis with Bart. In the first few weeks of thatproject, I realized how much fun research can be and after giving it a bit ofthought, I accepted a PhD position supervised by him.

As such, I would first of all like to thank Bart for his supervision. Thank youfor all the advice on research and writing, and for always being encouraging.Our weekly meetings were both useful and enjoyable, and I think almost allresults in this thesis spent some time on one of your whiteboards. When finishinga paper (a little too) shortly before the SODA deadline I learned that you arenot only a good supervisor, but also generally great to work with.

Secondly, I would like to thank my promotor Mark de Berg. You always hadan open door, for questions or to help with any practical affairs.

Next, I would like to thank all my other committee members, Stefan Kratsch,Nikhil Bansal, Marcin Pilipczuk, and Hans Bodlaender, for reviewing this thesis.Stefan, thank you for inviting me for a productive research visit to Berlin, it wasgreat to spend the week working with you and Eva-Maria. Marcin, thanks forinviting me for a great week of research at a workshop in Poland1.

I want to thank Hubie Chen for inviting Bart and me to visit him in London,and for visiting us in Eindhoven, each resulting in a chapter of this thesis. I’dlike to thank Gerhard Woeginger for supervising an honors project during mymaster’s that resulted in a publication during this PhD. I would also like to thankmy co-authors Sudeshna Kolay and Hans Bodlaender, for the project we startedat the Lorentz center, the results of which have been accepted to WADS.

Now this whole PhD adventure would have been significantly less fun with-out the people I shared an office with throughout these 4 years. Thanks tomy old office mates Sándor, Aleks2, Mehran, and Bruce, for the great office

1If people remember me muttering that it was in some unreachable location (Karpacz), I canonly say that the view of the snowy mountains was worth it.

2You’re welcome. We’ll meet again.

vi Acknowledgments

atmosphere. Thanks also to my new office mates Martijn and Henk with whomI shared an office during the final stages of this PhD project.

Furthermore I would like to thank all my colleagues for the friendly at-mosphere and daily joint lunches. Thanks to Mehran, Aleks, Sándor, Martijn,Henk, Ali, Max, Thom, Max, Dani, Huib, Arthur, Quirijn, Tim, Willem, Jules,Bram, Jorn, Sudeshna, Kevin, Bart, Herman, Kevin, Irina, Ignaz, Wouter, Marcel,Jesper, Hans, Mark, and Bettina.

Because I was part of the Networks project, I joined many of the NetworksTraining weeks that were held twice a year. Thank you to all Networks andnon-Networks members (I cannot name you all, but surely you know who youare) for making the training weeks enjoyable, for playing games in the evening,and going for walks in the forests. Furthermore, as part of the Networks projectI did a short “internship” at CWI with Pieter Kleer and Guido Schaefer. I reallyenjoyed working with you both.

During this PhD position, I got to travel to various conferences, meetings,and workshops. It took me to places I had always wanted to visit (Vienna,Canada), wonderful places I would otherwise never have thought of (Wrocław,Bergen (NO)) and places I had already been (London). Apart from listening toresearch presentations, it also gave me the opportunity to do some sightseeingand to meet (new or not so new) people there. Thanks to Tom, Marieke, Sándor,Lars, Eva-Maria, Huib, Jari, Bart, and a bunch of people I am forgetting tomention here, for fun conversations, joint lunches and dinners, and for thesightseeing we did together in the free time left by the conference schedule.

Among the many people I met at conferences during this PhD, one is specialto me, and not only because he just-so happened to start working at TU/e onthe day that we met. Huib, thanks for all the love and support, you’re amazing.

After spending nine years here at TU/e, first as a student, then as a PhDstudent, it is strange to leave. I’d like to thank the many teachers I had here forsparking my interest in algorithms, contributing to my (very helpful) backgroundin mathematics, or otherwise being great teachers. Furthermore, I would liketo thank everyone who contributed to making the TU/e feel like home for allthese years. I would like to especially thank those I studied with: Daan, Thom,Willem, Kay, Peter, Laurent, and Niels, for the many successful yet never tooserious group projects we did together, and all the silly jokes and fun.

A special thanks also to my friends outside of this university, for distractingme from research (and possibly listening to me complain about it some): Emmy,Inge, Nadia, Jelleke, Rose, and Saskia.

Last but foremost, I would like to thank my family3, in particular my parentsand my brother. You have been the best support I could possibly have hoped forduring this PhD and for all the years before that. Dank jullie wel!

3and a thanks to Nala for being a (cute, and occasionally hilarious) distraction from everything.

Contents

Acknowledgments v

1 Introduction 11.1 Kernelization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31.2 Polynomial-time sparsification . . . . . . . . . . . . . . . . . . . . . . . . 51.3 Graph coloring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81.4 Constraint satisfaction problems . . . . . . . . . . . . . . . . . . . . . . . 81.5 Thesis overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91.6 List of publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

2 Preliminaries 132.1 General notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132.2 Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142.3 Parameterized complexity and kernelization . . . . . . . . . . . . . . . . . 152.4 Kernelization lower bound framework . . . . . . . . . . . . . . . . . . . . 16

2.4.1 Complexity-theoretic assumptions . . . . . . . . . . . . . . . . . . 172.4.2 Linear-parameter transformations . . . . . . . . . . . . . . . . . . 182.4.3 Cross-compositions . . . . . . . . . . . . . . . . . . . . . . . . . . 21

2.5 (Linear) algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242.6 Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272.7 Constraint satisfaction problems . . . . . . . . . . . . . . . . . . . . . . . 282.8 Graph coloring problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

3 Sparsification Using Low-Degree Polynomials 353.1 Upper bound techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

3.1.1 Sparsification for Polynomial root CSP . . . . . . . . . . . . . . . 383.1.2 Sparsification for Polynomial non-root CSP . . . . . . . . . . . . . 45

3.2 Tightness of upper bound techniques . . . . . . . . . . . . . . . . . . . . 463.2.1 1-Polynomial root CSP over the rationals . . . . . . . . . . . . . . 463.2.2 Polynomial root CSP (modulo an integer) . . . . . . . . . . . . . 543.2.3 Polynomial non-root CSP . . . . . . . . . . . . . . . . . . . . . . 59

3.3 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

4 Sparsification Bounds for CSPs 634.1 Sparsification for Boolean CSPs captured by polynomials . . . . . . . . . 654.2 Lower bounds using cone-definability . . . . . . . . . . . . . . . . . . . . 674.3 Classification of CSPs with worst-case sparsification . . . . . . . . . . . . 724.4 Towards a classification of CSPs with linear sparsification . . . . . . . . . 754.5 Symmetric CSPs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 834.6 Full classification of arity-3 CSPs . . . . . . . . . . . . . . . . . . . . . . 864.7 Capturing polynomials versus compression via Maltsev embeddings . . . . 91

4.7.1 Maltsev embeddings and definitions . . . . . . . . . . . . . . . . . 914.7.2 Balanced constraint languages versus Maltsev embeddings . . . . . 924.7.3 Balanced constraint languages versus finite-domain Malt-

sev embeddings . . . . . . . . . . . . . . . . . . . . . . . . . . . . 944.7.4 Preservation by balanced or universal partial Maltsev operations . 984.7.5 Preservation by partial Maltsev operations versus cone-definability 99

4.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

5 Sparsification Lower Bounds for H-Coloring 1035.1 Sparsification lower bound for Cα-Coloring . . . . . . . . . . . . . . . . . 1045.2 Additional preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1155.3 Sparsification lower bounds for H-Coloring . . . . . . . . . . . . . . . . . 117

5.3.1 Defining a subdomain homomorphically equivalent to anodd cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

5.3.2 Obtaining a definable subdomain when an edge is in two α-cycles . 1265.3.3 Sparsification lower bound for H-Coloring . . . . . . . . . . . . . . 136

5.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137

6 An Optimal Kernel for H-Coloring 1396.1 Kernel for 3-Coloring parameterized by Vertex Cover . . . . . . . . . . . . 140

6.1.1 Existing kernel of cubic size . . . . . . . . . . . . . . . . . . . . . 1406.1.2 Improved kernel . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141

6.2 Kernel for H-Coloring parameterized by Twin-Cover . . . . . . . . . . . . 1446.2.1 Twin-Cover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1446.2.2 Modeling constraints as polynomial equalities . . . . . . . . . . . . 1466.2.3 Construction of polynomial equalities . . . . . . . . . . . . . . . . 1496.2.4 Reduction rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1526.2.5 Analysis of the kernelization . . . . . . . . . . . . . . . . . . . . . 154

6.3 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157

7 Concluding Remarks 1597.1 Linear-algebraic techniques in kernelization . . . . . . . . . . . . . . . . . 1597.2 Sparsification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1617.3 Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162

Bibliography 165

Summary 175

Curriculum Vitae 179

1

Chapter 1Introduction

In this thesis, we will study the effectiveness of preprocessing. To see whatkind of preprocessing we will consider, let us start by looking at a puzzle. InFigure 1.1 you see a set of points. The challenge is to cover all these points, bydrawing at most eight lines. The lines can be arbitrarily long, but they must bestraight. A point is only covered if you draw a line right through its center. Letus attempt to solve the puzzle!

The puzzle in Figure 1.1 has 48 points. With this many points to cover,how do you start solving the puzzle? Observe that if there is a line L you candraw that has at least nine points on it, you must use that line. Otherwise, youwould have to draw separate lines through each of the nine points. After all,any line containing at least two out of these nine points, corresponds to line Litself. Given that you cannot afford nine lines, you must indeed choose to drawthe line L in order to be able to solve the puzzle. In Figure 1.2, such a line isindicated. Now, you can continue from there, with only 7 lines remaining, andmany fewer points. Since we have only seven lines left, if we can find a linewith at least eight points on it we must choose that line. The puzzle depictedhere is such that, by simply continuing this reasoning, you can in fact solve theentire puzzle. Refer to Figure 1.3, for the solution. This of course need notalways be true, the above method is no more than a preprocessing rule: there isno reason to believe that it should always apply, and also no reason to believethat the solution is trivial in puzzles where it does not apply.

Yet we can observe something else. When the above rule no longer applies,

1

2 Introduction

Figure 1.1 Can you cover all points by drawing at most 8 straight lines?

we still have k lines left that we can use, for some k ≤ 8, and no line existsthat covers at least k + 1 points. Stated differently, every line you can draw willcover at most k of the remaining points, and you can only use k lines in total.But then, for the puzzle to be solvable, we cannot actually have many points. Ifthe puzzle has more than k · k = k2 points remaining, it cannot be solved at all.So even if our preprocessing step does not solve the puzzle, it does guaranteethat after preprocessing, the remaining puzzle is “small”!

While the above example is somewhat artificial, it does demonstrate ageneral solving strategy: before trying to solve a problem, first try to preprocessit by applying (simple) rules that reduce its size or its complexity. This strategycan be applied in a wide range of different scenarios, and preprocessing isindeed widely used in practical applications. For example, many real-worldscheduling problems can be modeled as mixed-integer programs, and thensolved by a solver such as the CPLEX optimizer. These solvers often start byusing a number of rules to reduce the solution space as well as to identifyredundant constraints, before solving the problem [2,5]. This preprocessing hasfor example been shown to greatly add to CPLEX’s ability to solve optimizationinstances [3, 12]. Disabling CPLEX’s presolving step was shown to increaseaverage solving time on the more challenging problems by a factor 11 [3].

Preprocessing techniques were also shown to be useful in the PACE competi-tion in recent years [30,31]. This is a competition in which teams design andimplement (parameterized) algorithms that solve a given problem as quicklyas possible on a variety of input instances. Here preprocessing is often applied.For example, the winner of one of the challenges of the 2016 edition [30] usedan efficient preprocessing algorithm for the FEEDBACK VERTEX SET problem as asubroutine [54].

As seen from the examples above, a good preprocessing algorithm mustbe efficient, as we want to speed up the overall computation. Furthermore, apreprocessing algorithm should be guaranteed to be correct, meaning that theanswer or solution to a preprocessed input instance can quickly be turned into asolution for the original instance. Finally, we would like to be able to give some

1

1.1 Kernelization 3

guarantee that the preprocessed instance is always “small”, or in some senseeasier than the original instance.

In this thesis, we will study the power of such preprocessing algorithms.In particular, we will study preprocessing algorithms that are required to runin polynomial time, such that the preprocessing step can be done efficiently.The problems we will consider in this thesis are NP-hard. As such, we cannotexpect to find polynomial-time preprocessing algorithms that always manage toreduce the size of the given instance. After all, such a preprocessing step couldthen be used repeatedly, until the instance is so small that it is trivial to solve.This would yield a polynomial-time algorithm for an NP-hard problem, whichis unlikely to exist. For this reason, we will measure the effectiveness of ourpreprocessing algorithm by some additional parameter, other than the inputsize, that (hopefully) measures the complexity of the given input instance.

To study the effectiveness of preprocessing, we therefore consider parame-terized problems, whose inputs consist of the “regular” input, together with anadditional parameter (an integer) that is a measure of the complexity of thegiven input. A preprocessing algorithm must now guarantee that the size of apreprocessed instance is bounded by some (hopefully small) function of thisadditional parameter.

1.1 Kernelization

To formally study this type of preprocessing, we will use the notion of kerneliza-tion. A kernelization algorithm (or kernel) is a polynomial-time preprocessingalgorithm that takes an instance of a parameterized problem, and outputs anequivalent instance to the same problem whose size is bounded by some func-tion of the considered parameter. This bound on the instance size in terms ofthe initial parameter value is known as the size of the kernel, and can onlydepend on the value of the initial parameter. In this thesis, we will only considerdecision problems, which are problems that only have a yes/no-answer. As such,a kernel is only required to preserve this yes/no-answer for the preprocessedinstance to be considered equivalent to the original one.

The example of a preprocessing algorithm above where we wanted to coverpoints with lines, contains all observations needed to provide a kernelizationalgorithm for the problem. Suppose we have an arbitrary instance asking if it ispossible to cover n points with k lines, and let k be the considered parameter. Thekernel repeatedly finds a line through at least k + 1 points, removes all relevantpoints, and decreases k by one. When this rule is no longer applicable, it checksif there are at most k2 remaining points. If yes, then this is the preprocessedinstance. If not, then the puzzle has no solution, and a trivial no-instance canbe returned. As such, POINT-LINE COVER (the problem of covering points with

1

4 Introduction

L

Figure 1.2 Can you cover all points not already covered by L by drawing at most 7additional straight lines?

lines) has a kernelization algorithm that always outputs an instance with atmost O(k2) points4.

One would like to distinguish parameterized problems that have “good”kernels, such as the one for POINT-LINE COVER given above, from those thatdo not. In this direction, one may ask which parameterized problems allow forso-called polynomial kernels, which are kernels whose size bound is polynomialin the parameter value. Such kernels are generally considered good to have,as their size grows not-too-fast when the parameter increases. Many (general)tools have been developed to find (polynomial) kernels for a wide variety ofproblems [15, 39, 67], and there are several books about the area [29, 40].Furthermore, methods have been developed to be able to rule out the existenceof polynomial kernels5.

The easiest method to rule out polynomial kernels is perhaps to provide acertain type of reduction [18] starting from a parameterized problem that isalready known not to have a polynomial kernel, to the problem under considera-tion. Note that, while NP-hardness reductions may indeed transfer kernelizationlower bounds, this is not always the case, as more care needs to be taken toensure that not only the size of the new instance is polynomial in the originalsize, but that also the new parameter is appropriately bounded. As usual, sucha reduction requires a suitable starting problem.

Another lower bound technique is that of cross-compositions, which was

4The size of this kernel is not necessarily bounded by O(k2), as the size of a kernel is measuredby the number of bits needed to represent the resulting instance. This depends on the representationof the input points, which we will not discuss for this example.

5All “non-existence” or lower bound results mentioned in this introduction are based on com-plexity theoretic assumptions, since for example P = NP would imply constant-size kernels for allproblems considered in this thesis. Refer to Chapter 2, in particular §2.4.1, for further details.

1

1.2 Polynomial-time sparsification 5

7

1

23

456 8

Figure 1.3 One possible way to cover all points given in Figure 1.1 with 8 straight lines.An order in which one could find these lines by repeatedly applying the preprocessingrule is indicated.

introduced by Bodlaender et al. [16] and relies on the non-existence of so-calleddistillation algorithms for problems that are NP-hard, which was proven byFortnow and Santhanam [41]. Using these methods, it has been shown that anumber of problems do not have polynomial kernels [14,18,68].

For those parameterized problems that do have a polynomial kernel, we canstill ask what the best bound is that can be achieved. For example, we haveseen that the POINT-LINE COVER problem has a kernel with O(k2) points, andone may ask whether this can be improved to O(k1.5), or even better. Suchquestions can sometimes be answered, at least in the negative direction, byusing a more refined definition of cross-compositions [16]. This allows us togive very precise bounds on the worst-case size of a best-possible kernel. One ofthe first results in this direction was obtained by Dell and Van Melkebeek [33].

In this thesis, we will be interested in answering the question: Given aparameterized problem, what is the smallest possible kernel? In particular, wewill consider a number of classic graph and logic problems.

1.2 Polynomial-time sparsification

Observe that a graph with n vertices can have (n2) = Θ(n2) edges, which is

quadratic in the number of vertices. One specific method of preprocessingfor a graph problem, would be trying to make the input graph less dense, byreducing the number of edges. Of course, this removal of edges should notchange the answer to the problem at hand. We will refer to such a procedure asa sparsification.

As an example, consider the graph depicted in Figure 1.4. We can ask

1

6 Introduction

u

a b

v

e

d

u

a b

v

e

d

Figure 1.4 Depicted is an instance for 3-COLORING (left). On the right is an equivalentinstance, obtained by removing a number of edges that (provably) do not change thecolorability of the graph. In fact, one may verify that both instances have the exact same3-colorings.

ourselves whether this graph can be colored with at most three colors, say red,green, and blue, such that the endpoints of every edge get distinct colors. Thisproblem is also known as the 3-COLORING problem. Since it is NP-hard [46], wedo not expect to find a polynomial-time algorithm to solve the problem. We canhowever try to eliminate certain edges in polynomial time. For example, sincevertices a and b are connected by an edge, a proper 3-coloring will assign themdifferent colors. Since vertex u connects to both a and b, it will be colored bythe only remaining color. Similarly, vertex v should get a color different from aand b and will thus receive the same color as vertex u.

Now, we can use the knowledge that u and v always receive the same color,to remove a number of edges from the depicted instance: vertex e is connectedto both u and v, but we can actually forget about its connection to v, as theremaining connection to u is enough to ensure that e and v receive distinctcolors. Similarly, we can eliminate the connection between d and u. We canfurthermore observe that vertices a and d always receive the same color, andremove the connection between e and a. This results in the sparsified instancedepicted in Figure 1.4 on the right.

The argumentation we used above to sparsify the instance ensures that thesolution does not change. In this thesis we will study whether or not we canfind such rules that are not only correct in this way, but also powerful enoughto give some non-trivial upper bound on the number of edges of the reducedinstance.

Sparsification can also be applied to logic problems. Consider as an examplethe d-CNF-SAT problem, which is one of the most well-known logic problems. Theinput to d-CNF-SAT is a formula in d-CNF form, thus a formula over n variablesconsisting of a conjunction of clauses, where each clause is a disjunction of dliterals. A literal is simply a variable or its negation. Consider for example thefollowing 3-CNF-SAT instance.

F = (x1 ∨ x2 ∨ ¬x3) ∧ (¬x1 ∨ x2 ∨ x4) ∧ (x2 ∨ ¬x3 ∨ x4).

The problem now asks whether there is a Boolean assignment to the variablesthat satisfies the formula. In the example above, letting x1 := x2 := true and

1

1.2 Polynomial-time sparsification 7

x3 := x4 := false would satisfy F .The goal of sparsification for logic problems is to efficiently reduce the

number of clauses (or constraints) in terms of the number of variables. If welook at the formula F given above, we may observe that removing the lastclause does not influence the set of satisfying assignments to F : any assignmentsatisfying the first two clauses, also satisfies the third. We will therefore say thatthis last clause is redundant. Define F ′ to be the formula consisting of only thefirst two clauses of F . We can see that F ′ and F are equivalent, as follows.Consider any satisfying assignment for F ′. Clearly, this assignment satisfies thefirst clause (x1 ∨ x2 ∨ ¬x3). Thus, the assignment will set either x1, x2 or ¬x3to true. If x2 is assigned true, this assignment also satisfies the last clause of F .Similarly, if ¬x3 is true, and thus if x3 is false, the assignment satisfies the lastclause. It remains to consider the case where only x1 is set to true. But then, tosatisfy the second clause, at least one of x2 and x4 is set to true (as ¬x1 is false)and again the last clause is satisfied. As such, we see that F ′ has exactly thesame satisfying assignments as the original formula!

An interesting challenge is to give general methods that efficiently find suchredundant clauses, and significantly reduce the number of clauses.

For both graph and logic problems, we can observe that we can easily obtaina sparsification that gives some upper bound on the number of edges or clauses,respectively. A simple graph on n vertices can have only (n

2) = O(n2) edges. Ad-CNF formula on n variables can have at most (2n)d = O(nd) distinct clauses.The simplest sparsification for d-CNF-SAT would therefore be to simply removeduplicate clauses. We will therefore say that graph problems for which theinput is a simple graph have a trivial sparsification of size O(n2) (do nothing),and a problem like d-CNF-SAT has a trivial sparsification of size O(nd) (removeduplicate clauses).

The question that is studied in this thesis is whether a non-trivial sparsifi-cation exists: a sparsification that achieves better bounds than the ones givenabove. For example, can we efficiently reduce the number of edges to O(n) fora given problem defined on general graphs? And can we reduce the numberof clauses in a logic problem with clauses of size d to O(nc) for some c < d inpolynomial time?

Many of the earlier results in this direction were negative [33,56,59], and inthis thesis we will add several (graph) problems to the list of problems that donot have a non-trivial sparsification. On the other hand, we will show a numberof surprising non-trivial sparsifications for logic problems, demonstrating thatnon-trivial sparsifications are possible for a remarkable number of such problems.Let us continue by looking at the problems that will be studied in this thesis inmore detail.

1

8 Introduction

1.3 Graph coloringIn this thesis, we will study a very general graph problem, for which it waspreviously not known whether a non-trivial sparsification was possible. A specialcase of the problem that we will study is the q-COLORING problem. It generalizesthe 3-COLORING problem discussed earlier. Given a graph G, it asks whetherwe can assign every vertex one of q colors, such that each edge has differentlycolored endpoints.

This is a very well-studied problem [63,73], with many applications [6,45,77]. For example, graph coloring can be used to solve scheduling problems,where jobs need to be assigned to machines. If we assume for simplicity thatall machines are identical and one machine cannot handle multiple jobs at thesame time, this can be seen as a coloring problem. We can model each job asa vertex, and each machine as one of the colors. We connect two jobs with anedge if they overlap in time. Coloring the resulting graph gives a solution to theoriginal scheduling problem, as it is ensured that each job is assigned a machine(color), and no machine does two tasks at the same time.

In the classic q-COLORING problem, vertices of the same color cannot beconnected by an edge, but there are no further restrictions on the coloring.In some cases, you may want to forbid other color combinations as well. Forexample, one could want to color the graph with five colors, such that everyedge has differently colored endpoints and furthermore, a pink vertex can neverbe connected to an orange one. To capture these kind of constraints, we willlook at the H-COLORING problem, which is defined for every fixed graph H.The colorset we now use is given by the vertex set of H, meaning every vertexof H corresponds to a color we may use. An input to the problem consists of anundirected graph G, and the question is to color each vertex of G with one ofthese colors. If we now look at any edge in G, we require that the coloring ofits endpoints is allowed by H, meaning that the corresponding vertices in H areconnected by an edge. To see that this is a generalization of the q-COLORING

problem, observe that H-COLORING corresponds to q-COLORING when we let Hbe a clique on q vertices.

In this thesis, we will show a sparsification lower bound for H-COLORING

under certain conditions on H. Furthermore, we will obtain a tight kernelizationbound for the problem when parameterized by the size of a vertex cover in theinput graph. We then extend this result to obtain the same kernel bound for asmaller parameter.

1.4 Constraint satisfaction problemsWhen introducing sparsification, we used the logic problem d-CNF-SAT as anexample. In the d-CNF-SAT problem, we are given a formula that consists of a

1

1.5 Thesis overview 9

number of clauses and the formula is satisfied if at least one literal evaluatesto true in each clause. It is easy to think of many variations to this problem, bychanging the type of clauses we use. For example, the case where the formulais satisfied when each clause contains exactly one literal that is set to truecorresponds to the problem that is known as EXACT SAT.

We can put these problems in a common framework by studying the CON-STRAINT SATISFACTION PROBLEM (CSP) over a certain domain D. Broadlyspeaking, the input for this problem consists of a set of variables and a numberof constraints over these variables. The question is whether it is possible toassign each variable a value from D, such that all constraints are satisfied. Inthis thesis we will mostly study the case where D = 0, 1 (or equivalently,D = false, true), known as so-called BOOLEAN CSPS.

In principle, a constraint could be almost anything: it could capture that“exactly one of the variables x,y,z is true”, or that “x = ¬y”, or perhaps that“x = true or y = z”. Clearly, the expressiveness of a CSP depends on thetype of constraints one allows. The type of allowed constraints is given bya so-called constraint language, often denoted by Γ. We can thus study theproblem CSP(Γ), which deals with input instances that only use constraintsallowed by Γ. This study of CSPs led to a complete dichotomy for BooleanCSPs: if Γ satisfies certain (verifiable) conditions, CSP(Γ) is polynomial-timesolvable, otherwise CSP(Γ) is NP-complete [83]. The hardness of CSP(Γ) overnon-Boolean domains remained open for many decades, until recently also inthis situation a complete dichotomy was obtained. The dichotomy was provenindependently by Bulatov and Zhuk [22,92].

We can obtain a more fine-grained view of the complexity of constraintsatisfaction problems, by studying how the sparsifiability of the problem dependson properties of Γ. The goal is then to identify relevant properties of Γ thatdetermine the best-possible size of a sparsification for CSP(Γ). While we arefar from a complete classification, we put a number of existing sparsificationresults into a common framework in this thesis. Furthermore, we give a widelyapplicable technique to sparsify CSPs and show that the obtained results aretight in several cases.

1.5 Thesis overviewWe will start by giving the necessary preliminaries (including the formal defini-tions of all concepts mentioned in this introduction) in Chapter 2.

In Chapter 3, we study the sparsifiability of a specific type of constraintsatisfaction problem. In particular, we consider CSPs that are given as equationsof polynomials over multiple variables over various rings and fields. In this case,we obtain sparsification results where the size of the sparsification depends onthe degree of the given polynomial equalities and the properties of the ring

1

10 Introduction

or field over which these polynomials are given. This method of sparsificationturns out to generalize a number of existing sparsification results for well-knownlogic problems.

We furthermore give lower bounds to show that the results obtained by thissparsification technique are best-possible, by providing constraint languages forwhich we obtain an optimal sparsification.

In Chapter 4 we continue our study of CSPs, by applying the methodologyintroduced in Chapter 3 to a number of constraint satisfaction problems. Westart by showing how to apply the sparsification method from Chapter 3 to awider range of CSPs. We continue by giving a general property for constraintlanguages that can be used to obtain sparsification lower bounds.

Using these results, we show that we can fully distinguish between BooleanCSPs that have a non-trivial sparsification and those that do not. We then give asufficient condition of a constraint language that ensures that the correspondingCSP has a linear sparsification. We use this result to fully classify all symmetricBoolean CSPs that have a linear sparsification. We conclude the chapter byfully classifying the sparsifiability of CSPs over constraint languages where eachconstraint concerns at most 3 variables.

After studying CSPs, we change our focus to a graph problem, namelythe H-COLORING problem, which is a generalization of the well-known q-COLORING problem. In Chapter 5, we show a sparsification lower bound forthis problem. Under complexity-theoretic assumptions, we show that for allgraphs H satisfying certain conditions, the H-COLORING problems does nothave a sparsification with O(n2−ε) bits, for any ε > 0. Among other things, thisimplies a sparsification lower bound for the well-known 3-COLORING problem.This chapter combines kernelization lower bound techniques, with techniquesfrom the algebraic analysis of constraint satisfaction problems.

In Chapter 6, we will study the kernelization of the H-COLORING problem,when parameterized by the size of a vertex cover in the input graph. A kernel forq-COLORING parameterized by the size of a vertex cover was previously known,but while this kernel had size O(kq), the best lower bound was a factor k smaller.We show how to improve the kernel to have O(kq−1) vertices and edges bycleverly using the sparsification for CSPs introduced in Chapter 3. We furthergeneralize this result to obtain a kernel for H-COLORING parameterized by thesize of a twin-cover.

1.6 List of publicationsThis thesis is based on the following publications. Chapter 3 is based on thefollowing publication.

Hubie Chen, Bart M. P. Jansen, and Astrid Pieterse. Best-case and worst-case sparsifiability of Boolean CSPs. In Proceedings of the 13th International

1

1.6 List of publications 11

Symposium on Parameterized and Exact Computation (IPEC 2018), volume115, pages 15:1–15:13, 2019. doi:10.4230/LIPIcs.IPEC.2018.15

Chapter 4 is based on the following publication.

Bart M. P. Jansen and Astrid Pieterse. Optimal sparsification for some binaryCSPs using low-degree polynomials. In Proceedings of the 41st InternationalSymposium on Mathematical Foundations of Computer Science (MFCS 2016),volume 58, pages 71:1–71:14, 2016. doi:10.4230/LIPIcs.MFCS.2016.71

Chapter 5 is based on the following manuscript.

Hubie Chen, Bart M. P. Jansen, and Astrid Pieterse. Sparsification lowerbounds for H-coloring. Submitted

Chapter 6 is based on the following publication.

Bart M. P. Jansen and Astrid Pieterse. Optimal data reduction for graphcoloring using low-degree polynomials. Algorithmica, 2019. Online First.doi:10.1007/s00453-019-00578-5

The following publications do not appear in this thesis.

Hans L. Bodlaender, Sudeshna Kolay, and Astrid Pieterse. Parameterizedcomplexity of conflict-free graph coloring. In Proceedings of the 16th Algo-rithms and Data Structures Symposium (WADS 2019), 2019. To appear. URL:https://arxiv.org/abs/1905.00305

Bart M. P. Jansen and Astrid Pieterse. Polynomial kernels for hitting forbid-den minors under structural parameterizations. In Proceedings of the 26thAnnual European Symposium on Algorithms (ESA 2018), volume 112, pages48:1–48:15, 2018. doi:10.4230/LIPIcs.ESA.2018.48

Bart M. P. Jansen and Astrid Pieterse. Sparsification upper and lower boundsfor graph problems and not-all-equal SAT. Algorithmica, 79(1):3–28, 2017.doi:10.1007/s00453-016-0189-9

Astrid Pieterse and Gerhard J. Woeginger. The subset sum game revisited. InProceedings of the 5th International Conference on Algorithmic Decision Theory(ADT 2017), volume 10576, pages 228–240, 2017. doi:10.1007/978-3-319-67504-6_16

Rafael G. Cano, Kevin Buchin, Thom Castermans, Astrid Pieterse, WillemSonke, and Bettina Speckmann. Mosaic drawings and cartograms. ComputerGraphics Forum, 34(3):361–370, 2015. doi:10.1111/cgf.12648

http://dx.doi.org/10.4230/LIPIcs.IPEC.2018.15

http://dx.doi.org/10.4230/LIPIcs.MFCS.2016.71

http://dx.doi.org/10.1007/s00453-019-00578-5

https://arxiv.org/abs/1905.00305

http://dx.doi.org/10.4230/LIPIcs.ESA.2018.48

http://dx.doi.org/10.1007/s00453-016-0189-9

http://dx.doi.org/10.1007/978-3-319-67504-6_16

http://dx.doi.org/10.1007/978-3-319-67504-6_16

http://dx.doi.org/10.1111/cgf.12648

1

2Chapter 2

Preliminaries

In this section we introduce the background material that will be used through-out this thesis. We start in Section 2.1 by introducing some basic notation.In Section 2.2, we give notation and definitions relating to graphs and graphproblems. We continue in Section 2.3 by introducing parameterized complexity,and giving the formal definition of kernelization. In Section 2.4, we describethe methods that will be used to obtain kernelization lower bounds, and thecomplexity-theoretic assumption under which we are able to obtain these lowerbounds. In Sections 2.5 and 2.6, we describe notation and definitions relating tolinear algebra, matrices, and polynomials. In Section 2.7, we formally introduceConstraint Satisfaction Problems, and provide some known results. Finally,in Section 2.8 we introduce graph coloring, graph homomorphism, and therelation between graph homomorphism and Constraint Satisfaction Problems.

2.1 General notationFor a positive integer n, we define [n] := 1, 2, . . . , n. Let Z := . . . ,−3,−2,−1, 0, 1, 2, 3, . . . be the set of all integers. We use N := 0, 1, 2, 3, . . . toindicate the non-negative integers. We use N+ := 1, 2, 3, . . . to denote the setof all positive integers.

For a set S we use the notation (Sk) := S′ ⊆ S | |S′| = k to denote the set of

all size-k subsets of S. We use the notation Sk := (s1, . . . , sk) | s1, . . . , sk ∈ S

2

14 Preliminaries

to denote the set of all k-tuples with elements from S. In particular, [n]2 denotesall 2-tuples of elements from [n].

We use the notation O to suppress polylogarithmic factors, thus O( f (n)) =O( f (n) · logc n) for some constant c.

2.2 Graphs

All graphs considered in this thesis are finite, simple, and undirected, unlessexplicitly mentioned otherwise. A graph G has vertex set V(G) and edgeset E(G). An edge is a nonempty subset of V(G) of size at most two; fornon-simple graphs the singleton sets correspond to self-loops (although we maysometimes denote a self-loop on vertex v by edge v, v for convenience). Ina simple graph, all edges have size two. For a simple graph G, let G denotethe complement of G, such that V(G) := V(G) and E(G) := u, v | u, v ∈V(G), u, v /∈ E(G).Definition 2.1 (Neighborhood) Let G be a (not necessarily simple) undirectedgraph. For a vertex v ∈ V(G), let NG(v) := u | u, v ∈ E(G) denote itsopen neighborhood. These are simply all vertices that are connected to v by anedge. A vertex v is contained in its own open neighborhood, if and only if thereis a self-loop at v. We let NG[v] := NG(v) ∪ v be its closed neighborhood.

Similarly, for a set of vertices S ⊆ V(G), we denote its closed neighborhoodby NG[S] :=

⋃v∈S NG[v]. In a simple graph G, its open neighborhood is given

by NG(S) := NG[S] \ S. In general, the open neighborhood of S is given byNG(S) := (NG[S] \ S) ∪ s | s ∈ S, s ∈ E(G). We may omit the subscript Gwhen it is clear from context.

For a vertex v ∈ V(G) in a graph G, we use the notation dG(v) to denote itsdegree in G. If G is a simple graph, then dG(v) := |NG(v)|. If G has self-loops,dG(v) := |NG(v)|+ 1 if v ∈ E(G) and dG(v) := |NG(v)| otherwise (such thata vertex with only a self-loop has degree 2). We use ∆(G) := maxv∈V(G) dG(v)to denote the maximum degree of any vertex in G.

For a graph G and a set S ⊆ V(G), we say S is a clique in G if u, v ∈ E(G)for all u, v ∈ S with u 6= v. We use ω(G) to denote the size of the largest cliquein G, known as the clique number of G. We say a set S ⊆ V(G) is an independentset if for all s, s′ ∈ S, it holds that s, s′ /∈ E(G). Observe that for a simplegraph G, the set S is an independent set in G if and only if S is a clique in G.

Let G be a graph and let S ⊆ V(G). We say S is a vertex cover of G if forevery edge e ∈ E(G) at least one of its endpoints is in S, thus e ∩ S 6= ∅. Fromthis definition it is easy to see that S is a vertex cover in a graph G, if and onlyif V(G) \ S is an independent set.

A set S ⊆ V(G) is a dominating set in G if for every vertex v /∈ S, at leastone of its neighbors is in S, meaning NG[v] ∩ S 6= ∅.

2

2.3 Parameterized complexity and kernelization 15

A graph H is a subgraph of G if V(H) ⊆ V(G) and E(H) ⊆ E(G). We sayH is a spanning subgraph of G if H is a subgraph of G such that V(H) = V(G).We sometimes also say that a graph H is a (spanning) subgraph of G if H isisomorphic to a subgraph G′ of G.

Let S be a subset of the vertices of G. We use G[S] to denote the subgraphof G induced by S, thus G[S] is the graph with vertex set S and edge sets, s′ | s, s′ ∈ E(G) ∧ s, s′ ∈ S. For a subset of the vertices S we use thenotation G− S do denote the graph G[V(G) \ S], which is the graph G afterremoving all vertices in S and their incident edges.

Definition 2.2 (Paths, walks, cycles) For a (not necessarily simple) graph G,a walk of length k in G is a sequence of vertices (x0, x1, . . . , xk) such thatxi−1, xi ∈ E(G) for all i ∈ [k]. A walk is closed if x0 = xk. A path is a walk onwhich all vertices are distinct. A cycle is a closed walk (x0, x1, . . . , xk) where allvertices are distinct except that x0 = xk

6. For vertices u, v ∈ V(G), a path orwalk of length k from u to v is a path or walk for which x0 = u and xk = v.

A path or cycle is odd if its length is odd. Note that a self-loop is an odd cycle(of length 1).

Definition 2.3 (Girth) The (odd) girth of a (not necessarily simple) graph G isthe length of its shortest (odd) cycle, or +∞ if no such cycle exists.

For an integer α ≥ 1, we use the notation Cα for the cycle on α vertices,such that V(Cα) := [α], and E(Cα) := x, x + 1 | x ∈ [α− 1] ∪ α, 1. Welet Kα be the complete graph (clique) on α vertices, such that V(Kα) := [α] andE(Kα) := x, y | x, y ∈ [α], x 6= y.

We say a graph G is bipartite if its vertex set can be partitioned into sets Sand T, such that S and T are independent sets in G, meaning the only edgesin G connect a vertex in S to a vertex in T. Observe that a graph is bipartite,if and only if it is 2-colorable. We use Kα,β to denote the complete bipartitegraph whose partite sets have size α and β, such that V(Kα,β) := S ∪ T withS := s1, . . . , sα and T := t1, . . . , tβ. Furthermore, E(Kα,β) := s, t | s ∈S, t ∈ T.

2.3 Parameterized complexity and kernelization

Let Σ be a fixed finite alphabet. Define Σ∗ as the set of all finite strings withelements from Σ (including the empty string). A parameterized problem is alanguage Q ⊆ Σ∗ ×N. This means that such a problem has input instances(x, k), where the second component is called the parameter. The parameter iscommonly referred to by the letter k. In this thesis, we will sometimes deviatefrom this convention. When the chosen parameter corresponds to the number

6By this definition, paths and cycles are necessarily simple paths and cycles.

2

16 Preliminaries

of variables in a logic problem, or the number of vertices in a graph problem,we will regularly denote it by n.

Now that we have defined parameterized problems, we can formalize thedefinition of kernelization that was presented in the introduction of this thesis.

Definition 2.4 ((Generalized) kernel) Let Q,Q′ ⊆ Σ∗ ×N be parameterizedproblems and let h : N → N be a computable function. A generalized kernelfor Q into Q′ of size h(k) is an algorithm that, on input (x, k) ∈ Σ∗ ×N, takestime polynomial in |x|+ k and outputs an instance (x′, k′) such that:

1. |x′| and k′ are bounded by h(k), and

2. (x′, k′) ∈ Q′ if and only if (x, k) ∈ Q.

The algorithm is a kernel for Q if Q′ = Q. It is a polynomial (generalized) kernelif h(k) is a polynomial. We use the words kernel and kernelization (algorithm)interchangeably.

Observe that, for a given parameterized problem, it is entirely possiblethat no kernelization exists. Intuitively, this happens when the parameterdoes not capture the complexity of the problem very well. As an extremeexample, choosing a parameter that is constant means that a kernel only existsif the problem is solvable in polynomial time. In this thesis we will studyparameterized problems that do have kernelization algorithms, and investigatewhat the smallest kernel is for these problems.

One of the other questions that parameterized complexity deals with iswhether a parameterized problem can be solved in time polynomial in |x| (butpossibly exponential in the parameter). More formally:

Definition 2.5 (FPT) We say that a parameterized problemQ is Fixed-ParameterTractable (FPT) if there is a computable function f and an algorithm to determinefor any (x, k) ∈ Σ∗ ×N if (x, k) ∈ Q in time f (k) · |x|c for a constant c.

It turns out that the problems that admit FPT algorithms, and those thathave kernels, coincide.

I Theorem 2.6 ([13, Theorem 1]) Let Q be a parameterized problem. Then Q isFPT if and only if Q is decidable and has a kernel. J

By this result, showing that a problem is (not) FPT can be used to show thatit does (not) have a kernel.

2.4 Kernelization lower bound framework

Now that we have seen the definition of a kernel, we will look at the methodsto prove kernelization lower bounds. We start by describing the necessarycomplexity-theoretic assumptions.

2

2.4 Kernelization lower bound framework 17

2.4.1 Complexity-theoretic assumptionsThe kernelization lower bounds obtained in this thesis, are proven undercomplexity-theoretic assumptions. This is necessary, since the problems weconsider are all contained in NP. As such, P = NP would imply that all con-sidered problems have a kernel of size O(1), obtained by solving the problemin polynomial time, and then outputting a constant-size yes- or no-instance,depending on the answer.

Thereby, we will have to assume P 6= NP to prove meaningful lower bounds.However, kernelization lower bounds are generally only obtained under aneven stronger assumption, namely NP * coNP/poly. We assume the reader isfamiliar with the complexity theory relating to the classes P, NP, and coNP.In the remainder of this section we will give an informal description of themeaning of the assumption that NP is not contained in coNP/poly.

The class coNP/poly is the class of problems that can be co-nondeterministic-ally decided by a Turing machine M, in polynomial time, when given polynomialadvice. Machine M co-nondeterministically decides a problem, if for a yes-instance, all computation paths lead to an accept. When given a no-instance,there must be at least one computation path leading to a reject.

This polynomial advice means that M may have an additional, read-only,input tape. An input to this machine is now a “normal” input x, together with aninput f (|x|) on the advice tape of length polynomial in the length of the input.Observe that this advice may only depend on the length of the given input x.

Alternatively, the class coNP/poly can be seen as those problems for whichthere is an infinite sequence of algorithms A1, A2, . . ., one for every inputlength n, such that An co-nondeterministically decides the inputs of size ex-actly n in time O(nc) for some constant c independent of n, and additionallythe description size of each algorithm An is bounded by O(nc).

Clearly, coNP ⊆ coNP/poly, but it is known that coNP is in fact a strictsubset of coNP/poly. The argument for this is not very complicated: We canshow that coNP/poly contains undecidable problems, while it is obvious thatcoNP does not. In fact, even P/poly contains problems that are not decidable.Consider for example the undecidable HALTING PROBLEM. The problem is, givena Turing machine, to decide whether the given Turing machine will halt (withina finite number of steps) when given an empty input string. Suppose we encodethe inputs to the halting problem in unary (thus, by only 1’s), thereby givingeach possible input instance a unique length. This can be done, by fixing anenumeration of all possible Turing machines. Then if an input for the HALTING

PROBLEM corresponds to the i’th Turing machine, we can encode i in unary.Even though the problem is now encoded in unary, the HALTING PROBLEM

remains undecidable, but a Turing machine with polynomial advice can nowtrivially solve this version of the HALTING PROBLEM in constant time. The advicestring can simply consist of a single bit, that indicates whether the unique

2

18 Preliminaries

instance of length n is a yes-instance.

Note however, that the above should not be taken as evidence that perhapsNP ⊆ coNP/poly holds. While it indeed shows that coNP/poly is in some sensesignificantly larger than coNP, this is mostly the case due to a very specificinput encoding. In fact, there are at least two different reasons to believeNP * coNP/poly.

First (and perhaps foremost), if NP ⊆ coNP/poly would hold, this has impli-cations for the polynomial hierarchy. The polynomial hierarchy is a hierarchyof complexity classes, all contained in PSPACE that starts with P at the lowestlevel. The level above contains NP (and also coNP). The hierarchy continueswith infinitely more levels. It is not only believed that P ( NP, but also thatevery other level is a strict subset of the levels above it. Just like proving P 6= NP,proving this remains an open question. However, if NP ⊆ coNP/poly holds, theentire hierarchy would collapse to the third level [91, Theorem 2], or in otherwords, NP ⊆ coNP/poly implies PH = Σp

3 .Secondly, the complexity classes NP and coNP are believed to be incompara-

ble due to the way they are defined. For problems in NP it is easy to verify ayes-instance, when given a (small) certificate, but potentially difficult to verify ano-instance. For problems in coNP however the opposite is true; for problemsin coNP verifying no-instances is easy. This is one of the reasons that NP andcoNP are believed to be distinct. A polynomial amount of advice that is onlyallowed to depend on the length of the given input, is not expected to changethis situation, because over a binary alphabet there are exponentially manyinputs of the same size.

2.4.2 Linear-parameter transformationsThe most straightforward way to obtain kernelization lower bounds, is bygiving an appropriately bounded transformation that starts from a problem forwhich a kernelization lower bound is known. We start by defining this type oftransformation.

Definition 2.7 (Linear-parameter transformation) Let Q,Q′ ⊆ Σ∗ ×N be twoparameterized problems. A linear-parameter transformation from Q to Q′ isa polynomial-time algorithm that, given an instance (x, k) ∈ Σ∗ ×N of Q,outputs an instance (x′, k′) ∈ Σ∗ ×N of Q′ such that the following holds:

1. (x, k) ∈ Q if and only if (x′, k′) ∈ Q′, and

2. k′ = O(k).

We denote the existence of such a transformation by Q ≤lpt Q′.The next theorem shows how linear-parameter transformations give kernel-

ization lower bounds. In particular, the contraposition of the theorem statesthat if it is known that Q′ does not admit a generalized kernel of size O(kd) for

2


some d (possibly under additional assumptions), then Q does not admit such akernel either (under the same assumptions).

I Theorem 2.8 Let Q and Q′ be parameterized problems with Q ≤lpt Q′. If Q′admits a generalized kernel of size O(kd), then Q admits a generalized kernel ofsize O(kd).

Proof. This result has been used several times [16,18], we give a short proofhere for completeness. Suppose Q′ admits a generalized kernel of size O(kd),we show how to obtain a generalized kernel of the same size for Q. Let (x, k)be an instance for Q. Use the linear-parameter transformation from Q to Q′to obtain an instance (x′, k′) for Q′ such that k′ = O(k) and (x, k) ∈ Q if andonly if (x′, k′) ∈ Q′. Apply the assumed generalized kernel for Q′ to obtainan instance (x′′, k′′) for some problem Q′′ such that (x′, k′) ∈ Q′ if and only if(x′′, k′′) ∈ Q′′, |x′′| = O((k′)d) and k′′ = O((k′)d).

Clearly hereby (x′′, k′′) ∈ Q′′ if and only if (x, k) ∈ Q and |x′′| = O((k′)d) =O(kd) and k′′ = O(kd). As such, this gives a generalized kernel for Q ofsize O(kd). J

As a starting point for linear-parameter transformations, we will regularlyuse the following kernelization lower bound for d-CNF-SAT, that was proven byDell and Van Melkebeek [33]. Let us start by formally defining d-CNF-SAT. Wesay a formula F is in d-CNF if it is a conjunction of clauses, where every clauseis a disjunction of d literals. A literal is simply a variable or its negation.

d-CNF-SAT

Input: A d-CNF formula F on variables x1, . . . , xn.Parameter: The number of variables n.Question: Does there exist an assignment τ : x1, . . . , xn → 0, 1 satisfy-ing F? Here F is satisfied if every clause contains at least one literal thatevaluates to 1.

I Theorem 2.9 ([33, Theorem 1]) Let d ≥ 3 be an integer. Then d-CNF-SAT

parameterized by the number of variables n does not have a generalized kernel ofsize O(nd−ε) for any ε > 0, unless NP ⊆ coNP/poly. J

Another problem we will use as the starting problem for linear-parametertransformations is the VERTEX COVER problem defined below.

VERTEX COVER

Input: An undirected graph G and an integer k.Parameter: The number of vertices n.Question: Does G have a vertex cover of size at most k?

2

20 Preliminaries

u

v

w

x

uin

vin

win

xin

xout

uout

wout

vout

G G′

Figure 2.1 Linear-parameter transformation from a VERTEX COVER instance (left) to aFEEDBACK ARC SET instance (right). A minimum vertex cover (respectively, feedback arcset) is indicated in blue.

I Theorem 2.10 ([33, Theorem 2]) VERTEX COVER parameterized by the numberof vertices n does not have a generalized kernel of size O(n2−ε) for any ε > 0,unless NP ⊆ coNP/poly. J

From the definition of a linear-parameter transformation, it is clear that onsome occasions, kernelization lower bounds for a certain NP-hard problem Qmay follow directly from its NP-hardness proof. This is the case if an appropriatelower bound is known for the starting problem for the reduction, and further-more the parameter only increases linearly. Below, we describe an examplewhere this is indeed the case.

I Example 2.11 Consider the following graph problem.

FEEDBACK ARC SET

Input: A directed graph G on n vertices and an integer k.Parameter: The number of vertices n.Question: Is is possible to remove at most k arcs from G, such that theresulting graph is acyclic?

We can show by a linear-parameter transformation from VERTEX COVER, thatFEEDBACK ARC SET parameterized by the number of vertices n does not have ageneralized kernel of size O(n2−ε) for any ε > 0, unless NP ⊆ coNP/poly.

The linear-parameter transformation will correspond to the many-one re-duction given by Karp [65], that is used to prove the NP-hardness of FEEDBACK

ARC SET. Let us start by repeating the construction here for completeness.Let an instance (G, k) for VERTEX COVER be given, we create an instance

(G′, k′) for FEEDBACK ARC SET as follows. Let k′ := k. Let V(G′) := vin, vout |v ∈ V(G). For every edge u, v ∈ E(G), add the arcs (uout, vin) and (vout, uin)to G′. For every vertex v ∈ V(G), add the arc (vin, vout) to G′. An illustrationof this construction can be found in Figure 2.1.

It is straightforward to show that G′ has a feedback arc set of size k′, if andonly if G has a vertex cover of size k, we leave this as an exercise to the reader.Furthermore, we can observe that |V(G′)| = 2|V(G)| = O(n), and thereby we

2


have given a linear-parameter transformation from VERTEX COVER to FEEDBACK

ARC SET parameterized by the number of vertices n. The result follows fromTheorems 2.8 and 2.10. J

2.4.3 Cross-compositionsAnother way to obtain a kernelization lower bound for a parameterized prob-lem Q, is by giving a cross-composition from an NP-hard problem P to Q.There are two types of cross-compositions, namely AND-cross-compositions andOR-cross-compositions. All kernelization lower bounds in this thesis will beobtained via OR-cross-compositions, and we will often omit the prefix “OR”. Wewill introduce the notion of cross-compositions in this section.

Informally speaking, a cross-composition is a polynomial-time algorithm thatis given a t instances for P , for some large number t. It outputs a single instancefor Q such that this instance acts as the logical OR of the given input instances.Furthermore, the parameter value of this instance should be appropriatelybounded.

To formally define cross-compositions, we first need some additional defini-tions.

Definition 2.12 (Polynomial equivalence relation, [16, Def. 3.1]) An equivalencerelation R on Σ∗ is called a polynomial equivalence relation if the followingconditions hold.

• There is an algorithm that, given two strings x, y ∈ Σ∗, decides whether xand y belong to the same equivalence class in time polynomial in |x|+ |y|.

• For any finite set S ⊆ Σ∗ the equivalence relation R partitions the elementsof S into a number of classes that is polynomially bounded in the size of thelargest element of S.

As mentioned, the input for a cross-composition consists of t instances ofa chosen starting problem. It is allowed to assume that these instances areequivalent under a polynomial equivalence relation of our choice. We can usethis to ensure that the inputs given for a cross-composition are somewhat similar.For example, for a graph problem, it can be used to ensure that all inputs havethe same number of vertices, by letting input instances with the same numberof vertices be equivalent. If the inputs have a vertex set that is partitionedinto for example red vertices and blue vertices, one can even use a polynomialequivalence relation to ensure that inputs have the same number of red (andblue) vertices. A polynomial equivalence relation can also in many cases beused to ensure that the inputs all ask for a solution of the same size.

Definition 2.13 (or-cross-composition, [16, Def. 3.3]) Let L ⊆ Σ∗ be a language,let R be a polynomial equivalence relation on Σ∗, let Q ⊆ Σ∗ ×N be aparameterized problem, and let f : N → N be a function. An OR-cross-com-position of L into Q (with respect to R) of cost f (t) is an algorithm that, given t

2

22 Preliminaries

instances x1, x2, . . . , xt ∈ Σ∗ of L belonging to the same equivalence class of R,takes time polynomial in ∑t

i=1 |xi| and outputs an instance (y, k) ∈ Σ∗×N suchthat:

• The parameter k is bounded by O( f (t) · (maxi |xi|)c), where c is some con-stant independent of t, and

• instance (y, k) ∈ Q if and only if there is an i ∈ [t] such that xi ∈ L.

The lower bound result obtained by a cross-composition depends on the costof the given cross-composition. It is known that cross-compositions of cost to(1)

can be used to rule out the existence of polynomial kernels [16, Corollary 3.9].In this thesis however we focus on giving tight kernelization lower bounds forproblems that do have a polynomial kernel. We will use the following result.

I Theorem 2.14 ([16, Theorem 3.8]) Let L ⊆ Σ∗ be a language, let Q ⊆ Σ∗×N

be a parameterized problem, and let d, ε be positive reals. If L is NP-hard underKarp reductions, has an OR-cross-composition into Q with cost f (t) = t1/d+o(1),where t denotes the number of instances, and Q has a polynomial (generalized)kernelization with size bound O(kd−ε), then NP ⊆ coNP/poly. J

For d ∈N we will refer to an OR-cross-composition of cost f (t) = t1/d+o(1)

as a degree-d cross-composition. By Theorem 2.14, a degree-d cross-compositioncan be used to rule out generalized kernels of size O(kd−ε). Note that whenstudying sparsification, we use the number of vertices or variables in the instance(which is usually denoted by n) as the parameter value.

Let us describe the “standard” approach to giving a degree-2 OR-cross-composition, when dealing with a graph problem Q parameterized by thenumber of vertices n. This approach is based on the table-based approachintroduced by Dell and Marx [32].

As an input for the cross-composition, we are given t instances of a suitablychosen input problem P . It will turn out that for the construction of a degree-2cross-composition, it is often convenient to assume that

√t is an integer, and

it is sometimes even desirable that log t and log√

t are integers. We can easilyensure this, by duplicating one of the input instances until we have t′ ≥ t inputinstances of P , with t′ an integer such that log

√t′ is integer. Observe that, in

particular, for any c ∈ N we have that 22c satisfies this condition, since√

22c

= 2c and log√

22c = c. As such, there exists an appropriate t′ with t′ ≤ 4t:Let t′′ be the smallest power of two that is larger than t, such that t′′ = 2d.Observe that t′′ ≤ 2t. If d is even, then t′ := t′′ satisfies our requirements. Ifhowever d is odd, then t′ := 2t′′ = 2d+1 satisfies the requirements. In bothcases, t′ ≤ 2t′′ ≤ 4t.

To select a suitable NP-hard starting problem P , it can be helpful to considera restricted variant of the goal problem Q. Often, it is convenient to have astarting problem that is defined on a restricted set of graphs, for which the

2


L1 L2 L3

R1 R2 R3

G1,1 G1,2 G1,3

√t

√t

# Vertices in all inputs: 9 · 5 = 45 # Vertices in G′:√

9 · 5 = 15

G2,1 G2,2 G2,3

G3,1 G3,2 G3,3

Figure 2.2 Basic structure of a degree-2 cross-composition. Inputs for a problem Pdefined on bipartite graphs are depicted on the left. The result of applying the table-likestructure for cross-compositions is depicted on the right. The figure is adapted from anearlier publication [81].

vertices can be partitioned into two sets, say L and R, such that G[L] and G[R]always have a known structure. An example would be bipartite graphs, whereG[L] and G[R] are always independent sets, but one can also think of caseswhere G[L] is for example a path, or a union of disjoint triangles, etcetera.

Next, one may define a polynomial equivalence relation that lets inputsbe equivalent if the sizes of L and R are the same. So, suppose we are givent instances of P with |L| = m and |R| = n. We now need to encode theseinstances into one single instance of Q, whose number of vertices can beapproximately O(

√t · poly(n + m)). The challenge is now that we do not have

the budget to copy every vertex of every input instance. We show how to resolvethis, using a table-like structure.

As mentioned, we can assume that√

t is integer. As such, we can label the tinput graphs as Gi,j for i, j ∈ [

√t]. By the choice of our equivalence relation

and input problem, all Gi,j have partite sets Li,j and Ri,j and all Gi,j[Li,j] areisomorphic. The same holds for Gi,j[Ri,j]. We now show how to create aninstance G′ for Q. Start by creating

√t copies of G1,1[L1,1] and label them Li for

i ∈ [√

t]. Similarly, create√

t copies of G1,1[R1,1] labeled Rj for j ∈ [√

t]. Now,we can represent the edges of all input instances, by connecting vertices in Lito Rj in such a way, that G′[Li ∪ Rj] is isomorphic to graph Gi,j. Observe thatG′ at this point has

√t · (n + m) vertices, which is appropriately bounded for a

degree-2 cross-composition. Refer to Figure 2.2 for an example.The created graph G′ now represents all input instances. It remains to

ensure that G′ is a yes-instance for Q if and only if there exist i and j such thatGi,j is a yes-instance for P . This is generally done by adding additional gadgets,that select one index i and one index j. For example, one could add gadgetssuch that the solution in all-but-one Li is “for free”, and add similar gadgets forRj. Of course, the actual design of these gadgets and their precise working is

2

24 Preliminaries

highly problem-dependent.

As mentioned earlier, apart from OR-cross-compositions, there also existAND-cross-compositions. These are defined similarly except that the resultinginstance should be a logical AND of the given input instances. It follows from aresult by Drucker [35] that they result in the same kernelization lower boundsas OR-cross-compositions.

2.5 (Linear) algebra

For an integer q, we let Z/qZ denote the integers modulo q. These form a fieldif q is prime, and a ring otherwise. We will use x ≡q y to denote that x and y arecongruent modulo q, and x 6≡q y to denote that they are incongruent modulo q.

We denote the greatest common divisor of two integers x and y as gcd(x, y)and their least common multiple as lcm(x, y). The two concepts are closelyrelated, since lcm(x, y) = |x · y|/ gcd(x, y) for x or y nonzero.

I Theorem 2.15 (Bézout’s identity) Let x, y, z ∈ Z with gcd(x, y) = z. Thereexist integers a and b such that ax + by = z. J

We will use x | y to indicate that x divides y (over the integers) and x - y toindicate that it does not.

For definitions and basic properties of rings, fields, vector spaces, and relatedconcepts, we refer the reader to [72]. We will recall the most relevant definitionshere, and introduce some notation.

Definition 2.16 (Span) Given a set S = s1, . . . , sn of k-ary vectors over aringR, we define spanR(S) as the set of all vectors y inRk for which there existα1, . . . , αn ∈ R such that y = ∑i∈[n] αisi. We may omit the subscript indicatingthe relevant ring, if it is clear from the context.

For a positive integer q, we use spanq(S) as a shorthand for spanZ/qZ(S).

For an m× n matrix A, we commonly use ai for i ∈ [m] to denote the i’throw of A and ai,j to denote the element in row i and column j.

Definition 2.17 (Row space) Let A be an m× n matrix over a ring R. Then therow space of A is defined as the span of the set of rows of A, or equivalently, asspanR(a1, . . . , am).Definition 2.18 (Basis) A basis of a vector space V is defined as a set B ⊆ Vsuch that span(B) = V and the vectors in B are linearly independent.

It is well-known that for a vector space, its dimension is defined as the sizeof a basis of this vector space, which is well-defined since any basis of a vectorspace has the same size. Furthermore, for a matrix over a field, it is known thatthe dimension of its row space equals the dimension of its column space. Hereby,the dimension of the row space of such a matrix, is bounded by its number

2

2.5 (Linear) algebra 25

of columns (and vice versa), which is useful information when consideringmatrices with m n.

Observe that when considering matrices over a ring, the situation is morecomplex. Consider for example the 2× 1 matrix A over the integers modulo 6,given by

A :=(

23

).

Now, since A has only one column, you might hope to find a size-1 subset ofthe rows that spans the entire row span of A. This is however not possible, thevector (1) is in the row space of A, as it is the result of subtracting the firstrow from the second. It is however not a multiple of either of the two rows.Thus, there is no single row that spans the entire row space of A. This shows animportant difference between matrices with elements from a field, and matriceswith elements from a ring.

Definition 2.19 (Diagonal matrix) We say an m × n matrix A is a diagonalmatrix if all entries ai,j with i 6= j are zero. Thus, all non-zero elements occuron the diagonal. Note that under this definition a matrix can be diagonal evenif m 6= n.

Definition 2.20 (Unimodular matrix [48, Definition 367]) An n× n matrix U overZ is called unimodular if its determinant is 1 or −1.

The relevant property of unimodular matrices that we will use is that aunimodular matrix over Z is invertible over Z. This means that if U is aunimodular matrix over Z, then there exists a matrix U−1 over Z such thatU ·U−1 = U−1 ·U = I where I is the identity matrix (a diagonal matrix withall-ones on the diagonal). In Chapters 3 and 4 we will use two normal forms ofcertain types of matrices in our proofs, namely the Smith Normal Form and theHowell Normal Form. We introduce them below.

Definition 2.21 (Smith normal form [48, Theorem 368]) Let A be an m× n matrixover the integers. There exist unimodular matrices U and V over the integers,such that U is an m×m matrix and V is an n× n matrix, and a diagonal m× nmatrix S over Z such that

A = USV,

the diagonal entries of S are d1, d2, . . . , dr, 0, . . . , 0, each di a positive integer,and di|di+1 for i ∈ [r− 1]. We will call U, S, and V the Smith decomposition ofA. Finally, S is called the Smith Normal Form of A.

The Smith normal form can be useful when solving systems of linear equa-tions over the integers. In particular, when one wants to solve Ax = b for x,one can alternatively solve for Sx′ = U−1b. Observe that this system is easierto solve since S is a diagonal matrix. If this gives a solution x′, then x := V−1x′

is a solution for Ax = b, as is easily verified:

Ax = AV−1x′ = USVV−1x′ = U(Sx′) = U(U−1b) = b.

2

26 Preliminaries

We will see this in more detail in some of the proofs in Chapter 4.

I Example 2.22 Consider the matrix

A =

1 2 31 2 14 0 12 4 4

,

then its Smith decomposition looks as follows:

U =

1 2 1 01 0 0 04 −3 −1 02 2 1 1

, S =

1 0 00 1 00 0 160 0 0

, V =

1 2 10 8 10 −1 0

.

One may verify that U and V are indeed invertible with

U−1 =

0 1 0 0−1 5 −1 03 −11 2 0−1 −1 0 1

, and V−1 =

1 −1 −60 0 −10 1 8

.

J

I Theorem 2.23 ([64, Theorem 4]) The Smith decomposition of a matrix can becomputed in polynomial time. J

We will furthermore use the Howell form of a matrix, which was firstintroduced by Howell [51]. The following definition is based on [87].

Definition 2.24 (Howell form) Let q ∈ N be an integer and let A be an m× nmatrix over Z/qZ. Then there exists an invertible n× n matrix U over Z/qZ,and an m × n matrix H over Z/qZ, such that A = UH and the followingstatements hold.

• Let r be the number of non-zero rows of H. Then the first r rows of H arenon-zero.

• For i ∈ [r], let hi,ji be the first nonzero entry in row i. Then j1 < j2 < . . . < jr.

• hi,ji |q for all i ∈ [r].

• For 1 ≤ k < i ≤ r, we have 0 ≤ hk,ji < hi,ji .

• When x ∈ spanq(h1, . . . , hm) has zeros as its first ji − 1 components, thenx ∈ spanq(hi, . . . , hm).

We say that H is in Howell form. Furthermore, we may call H the Howell formof matrix A (which is known to be unique [51]).

2

2.6 Polynomials 27

For a matrix to be in Howell form, the above five statements must all hold.For our proofs however, we are mostly interested in the first two properties, theremaining properties will not be used. The next theorem shows that the Howellform can be efficiently computed.

I Theorem 2.25 ([87, §3]) Given a matrix A over Z/qZ, we can computematrices U and H in polynomial time, such that U is invertible over Z/qZ,A = UH, and H is in Howell form. J

2.6 PolynomialsWe continue by providing the definitions and results that we will use consideringpolynomials and polynomial equalities.

A multivariate polynomial (over a ring R) is a function p : Rk → R of thetype

p(x1, . . . , xk) = ∑(d1,...,dk)∈Nk

αd1,...,dk ∏i∈[k]

(xi)di

with coefficients αd1,...,dk∈ R that are non-zero for a finite number of choices for

(d1, . . . , dk) ∈Nk. A term of the type ∏i∈[k](xi)di is referred to as a monomial.

The degree of a monomial is simply ∑i∈[k] di. The degree of a polynomial isthe maximum of the degrees of all monomials in this polynomial that have anon-zero coefficient.

We say that a monomial is multilinear if for all i ∈ [k], we have di ∈ 0, 1.In other words, a monomial is multilinear if it corresponds to a multiplicationof distinct variables. A polynomial is multilinear if all its monomials that have anon-zero coefficient are multilinear.

I Example 2.26 Consider the following polynomial over Q:

p(x, y, z) := 5 · x2y +12· xyz3 − x + 3y.

This is a polynomial of degree 5 (achieved by the second monomial), that is notmultilinear, since its first two monomials are not multilinear. J

We will often need to bound the number of (multilinear) monomials of adegree-d polynomial. We use the following lemma.

I Lemma 2.27 There are at most nd + 1 multilinear monomials of degree atmost d over a set of n variables.

Proof. The number of multilinear monomials over n variables of degree atmost d is equal to ∑d

i=0 (ni ). We will show that ∑d

i=1 (ni ) ≤ nd. The left side

counts all non-empty subsets of [n] of size at most d. Each of these can bemapped to a distinct d-tuple containing numbers from [n], by repeating anarbitrary element. Since there are nd possible d-tuples, the claim follows. J

2

28 Preliminaries

Observation 2.28 Let F be a field, and let S ⊆ F be a set of size k. Then there isa degree-k polynomial p in one variable, such that p(x) = 0 if and only if x ∈ S;and the polynomial defined by

p(x) := ∏s∈S

(x− s)

satisfies these conditions.

2.7 Constraint satisfaction problemsWe continue by formally introducing constraint satisfaction problems, as were(informally) introduced in Chapter 1.

Definition 2.29 ((Boolean) relation) A relation over the set D is a subset of Dk;here, k is a natural number called the arity of the relation. A relation of arity kmay be called a k-ary relation. A Boolean relation is a relation over 0, 1. A2-ary relation is sometimes referred to as a binary relation.

When considering the Boolean domain, we will regularly refer to the 0-element with false, and to the 1-element with true.

Definition 2.30 (k-or) For each k ≥ 1, we use k-OR to denote the relation0, 1k \ (0, . . . , 0).By the above definition, 2- OR = (1, 1), (1, 0), (0, 1).Definition 2.31 (R+) For a binary relation R, let R+ denote the transitiveclosure of R. For n ∈N let Rn be defined as

(u, v) | ∃x1, . . . , xn+1 : ∀i∈[n](xi, xi+1) ∈ R ∧ u = x1 ∧ v = xn+1.

Observe that for any binary relation R over universe D, we have R0 = (x, x) |x ∈ D.Definition 2.32 (Constraint language) A constraint language over D is a finiteset of relations over D; a Boolean constraint language is a constraint languageover 0, 1.

For a constraint language Γ over a domain D, we define CSP(Γ) as follows.CSP(Γ)

Input: A tuple (C, V), where C is a finite set of constraints, V is a finiteset of variables, and each constraint is a pair R(x1, . . . , xk) for R ∈ Γ andx1, . . . , xk ∈ V.Parameter: n = |V|.Question: Does there exist a satisfying assignment, that is, an assignmentf : V → D such that for each constraint R(x1, . . . , xk) ∈ C it holds that( f (x1), . . . , f (xk)) ∈ R?

2

2.7 Constraint satisfaction problems 29

There are three well-known logic problems, that can be seen as constraintsatisfaction problems, that we will regularly encounter. First of all, d-CNF-SAT,as introduced previously. Furthermore, d-EXACT SAT and d-NAE-SAT, which aredefined below.

d-NAE-SAT

Input: A d-CNF formula F , where every clause has size at most d.Parameter: The number of variables n.Question: Does there exist a satisfying assignment to F , such that eachclause of F contains at least one false and at least one true literal?

d-EXACT SAT

Input: A d-CNF formula F , where every clause has size at most d.Parameter: The number of variables n.Question: Does there exist a satisfying assignment to F , such that eachclause of F contains exactly one true literal?

We define CNF-SAT, EXACT SAT, and NAE-SAT as in definitions above, exceptthat for these problems there is no bound on the clause length.

I Example 2.33 We can define 3-EXACT SAT as a constraint satisfaction problemas follows. Consider the relations

R0 := (0, 0, 1), (0, 1, 0), (1, 0, 0)R1 := (1, 0, 1), (1, 1, 0), (0, 0, 0)R2 := (1, 1, 1), (1, 0, 0), (0, 1, 0)R3 := (1, 1, 0), (1, 0, 1), (0, 1, 1).

Let Γ := R0, R1, R2, R3, then 3-EXACT SAT corresponds to CSP(Γ), as follows.The satisfying assignments of clauses with no negated variables are given by R0.Clauses with one negated variable can be rewritten to a constraint using R1by swapping the literals such that the negated variable is in the first position.Similarly, clauses with two or three negated literals can be represented using R2and R3 respectively. For example, the 3-EXACT SAT formula

(x ∨ y ∨ z) ∧ (¬x ∨ ¬z ∨ w) ∧ (x ∨ v ∨ ¬w)

gives the CSP(Γ) instance (C, V) with:

C = R0(x, y, z), R2(x, z, w), R1(w, x, v), V = v, w, x, y, z. J

We define a Boolean operation as a mapping from 0, 1k to 0, 1, where kis said to be the arity of the operation. From here, we define a partial Booleanoperation as a mapping from a subset of 0, 1k to 0, 1.

2

30 Preliminaries

Definition 2.34 (Idempotent, self-dual) We say that a (partial) Boolean operationof arity k is idempotent if f (0, . . . , 0) = 0 and f (1, . . . , 1) = 1. We say theoperation is self-dual if for all a1, . . . , ak ∈ 0, 1, when f (a1, . . . , ak) is defined,then f (¬a1, . . . ,¬ak) is defined and f (a1, . . . , ak) = ¬ f (¬a1, . . . ,¬ak).

We now introduce the notion of polymorphisms of a constraint language. Inthe study of CSP(Γ), it turned out that studying the polymorphisms of Γ is auseful tool in understanding the complexity of CSP(Γ).Definition 2.35 (Polymorphism) Let f be a (partial) operation of arity k on adomain D, and let R ⊆ Dn be a relation. We say that f is a polymorphismof R when for any tuples t1 = (t1,1, . . . , t1,n), . . . , tk = (tk,1, . . . , tk,n) ∈ R, if allentries of the tuple ( f (t1,1, . . . , tk,1), . . . , f (t1,n, . . . , tk,n)) are defined, then thistuple is in R. We say f is a polymorphism of a constraint language Γ if f is apolymorphism of each relation in Γ.

We say that a relation R is preserved by a (partial) operation f if f is apolymorphism of R, similarly we say a constraint language Γ is preserved by fif f is a polymorphism of Γ.

Let f : Dk → D be a k-ary (partial) operation and let t1, . . . , tk ∈ Dn betuples. We use the following notational convention:

f (t1, . . . , tk) := ( f (t1,1, . . . , tk,1), . . . , f (t1,n, . . . , tk,n)).

I Example 2.36 Consider the 2-ary operation f : 0, 12 → 0, 1 correspond-ing to the binary OR, meaning that f is given by f (a1, a2) := a1 ∨ a2. Considerthe 3-ary Boolean relation

R := (0, 0, 1), (0, 1, 1), (1, 0, 1), (1, 1, 0), (1, 1, 1).

We can for example compute that

f ((0, 0, 1), (1, 1, 0)) = ( f (0, 1), f (0, 1), f (1, 0)) = (0∨ 1, 0∨ 1, 1∨ 0)= (1, 1, 1).

In the example above, we see that for the tuples (0, 0, 1), (1, 1, 0) ∈ R, we getf ((0, 0, 1), (1, 1, 0)) = (1, 1, 1) ∈ R. We leave it as an exercise to the reader toshow that for any two tuples t1, t2 ∈ R it holds that f (t1, t2) ∈ R. It followsthat f is a polymorphism of R; or equivalently that R is preserved by f . J

For b ∈ 0, 1, let ub : 0, 1 → 0, 1 be the unary operation defined byub(0) = ub(1) = b; let major : 0, 13 → 0, 1 be the operation defined bymajor(x, y, z) := (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z), hence, the value of major(x, y, z)is the Boolean that occurs at the majority of the positions of x, y, z. Letminor : 0, 13 → 0, 1 be the operation defined by minor(x, y, z) := x⊕ y⊕ z,where ⊕ denotes exclusive OR. Observe that hereby minor(1, 1, 1) = 1 andminor(0, 0, 0) = 0.

2

2.7 Constraint satisfaction problems 31

The following is a well-known result, it is a rephrasing of the originaltheorem that was proven by Schaefer [83]. See [25] for a proof; in particular,refer there to the proof of Theorem 3.21.

I Theorem 2.37 Let Γ be a Boolean constraint language. The problem CSP(Γ)is polynomial-time decidable when Γ is preserved by one of the six followingoperations: u0, u1,∧,∨,major,minor. Otherwise, CSP(Γ) is NP-complete. J

As such, we will say a Boolean constraint language Γ is tractable when itis preserved by one of the operations u0, u1,∧,∨,major,minor, and intractableotherwise.

I Example 2.38 Recall the constraint language used for 3-EXACT SAT in Exam-ple 2.33. We can show that 3-EXACT SAT is NP-complete using Theorem 2.37by verifying that Γ = R0, R1, R2, R3 is not preserved by any of the six oper-ations above. We will show this by showing that R0 is not preserved by theseoperations.

Since u0(1, 0, 0) = (0, 0, 0) and (1, 0, 0) ∈ R0, while (0, 0, 0) /∈ R0, it followsthat R0 is not preserved by u0. Similarly, since u1(1, 0, 0) = (1, 1, 1) /∈ R0, weobtain that R0 is not preserved by u1. Furthermore, while (0, 0, 1), (0, 1, 0) ∈ R0,we have (0 ∨ 0, 0 ∨ 1, 1 ∨ 0) = (0, 1, 1) /∈ R0. Similarly, (0 ∧ 0, 0 ∧ 1, 1 ∧ 0) =(0, 0, 0) /∈ R0. Therefore, R0 is not preserved by operations ∧ and ∨. Finally,one may observe that applying the majority operation to all tuples in R0 resultsin the tuple (0, 0, 0) /∈ R0, while taking the minority operation results in thetuple (1, 1, 1) /∈ R0. We conclude that Γ is intractable (observe that we mayeven conclude that the constraint language Γ′ := R0 is intractable). J

Definition 2.39 (Γ∗) When Γ is a constraint language over D, we use Γ∗ todenote the expansion of Γ where each element of D appears as a relation. Thatis, we define Γ∗ := Γ ∪ (d) | d ∈ D.

Effectively, the added relations in Γ∗ make it possible to enforce via aconstraint that a variable must be assigned a fixed value by any satisfyingassignment.

Definition 2.40 (pp-definability [25]) A relation T ⊆ Dk is pp-definable (shortfor primitive positive definable) from a constraint language Γ if for some m ≥ 0there exists a finite conjunction C consisting of constraints over Γ and equalitiesover variables v1, . . . , vk, x1, . . . , xm such that

T(v1, . . . , vk) ≡ ∃x1, . . . , ∃xm : C.

That is, for each map f : v1, . . . , vk → 0, 1, it holds that f can be extendedto a satisfying assignment of C if and only if ( f (v1), . . . , f (vk)) ∈ T.

I Example 2.41 Recall the constraint language Γ = R0, R1, R2, R3 used todefine 3-EXACT SAT in Example 2.33. We can show that Γ pp-defines the 3-OR

relation given by T = (v1, v2, v3) ∈ 0, 13 | v1 ∨ v2 ∨ v3. Consider the

2

32 Preliminaries

v1 v2 v3 x1 x2 x3 x4

1 1 1 1 0 0 11 1 0 1 0 0 01 0 1 0 1 0 11 0 0 0 1 0 0

v1 v2 v3 x1 x2 x3 x4

0 1 1 0 0 0 10 1 0 0 0 0 00 0 1 0 0 1 00 0 0 7 7 7 7

Table 2.1 All assignments to v1, v2, and v3, together with an extension that satisfiesEquation (2.1) if (v1, v2, v3) ∈ 3-OR.

following pp-definition:

C := R1(v1, x1, x2) ∧ R0(v2, x2, x3) ∧ R1(v3, x3, x4), (2.1)

over variable set V = v1, v2, v3, x1, x2, x3, x4. To show that ∃x1, . . . , x4 : Cis a pp-definition of T, we need to show that an assignment to v1, v2, v3 canbe extended to an assignment satisfying all constraints in C if and only if(v1, v2, v3) ∈ 3-OR.

Suppose the assignment is such that (v1, v2, v3) ∈ 3-OR, thus there existsi ∈ [3] such that vi = 1. One may verify (see Table 2.1) that in this case, theassignment can be extended to a satisfying assignment of the entire formula.Suppose the assignment is such that (v1, v2, v3) /∈ 3-OR, thus v1 = v2 = v3 = 0.Then the first constraint in C implies that x1 = x2 = 0, and the third constraintimplies that x3 = x4 = 0, in any assignment satisfying C. However, this does notsatisfy the second constraint, since (0, 0, 0) /∈ R0, which is a contradiction. J

The next theorem gives an important relation between pp-definability andpolymorphisms.

I Theorem 2.42 ([20, 47]) Let Γ be a constraint language over domain D. Anon-empty relation R over domain D is pp-definable over Γ if and only if eachpolymorphism of Γ is a polymorphism of R. J

The following proposition is a known fact.

I Proposition 2.43 ([25, Exercise 5.3]) If Γ is an intractable Boolean constraintlanguage, then every Boolean relation is pp-definable from Γ∗.

Proof. The idea is as follows. It follows from the intractability of Γ and Theo-rem 5.1 of [25], that all polymorphisms of Γ are “essentially unary”. This meansthat if f is a k-ary polymorphism of Γ, there exists i ∈ [k] and g : 0, 1 → 0, 1such that f (x1, . . . , xk) = g(xi) for all (x1, . . . , xk) ∈ 0, 1k. It follows from thisthat the only polymorphisms of Γ∗ are in fact constant. Since any relation hasany constant function as a polymorphism, it follows from Theorem 2.42 thatany Boolean relation is pp-definable from Γ∗. J

We now give a result about the existence of linear-parameter transforma-tions. Recall that Γ∗ is defined as the relation Γ together with all constants, by

2

2.8 Graph coloring problems 33

Definition 2.39.

I Theorem 2.44 (Follows from [23]) Let Γ be a constraint language over a finiteset D such that each unary operation u : D → D that preserves Γ is a bijection.Then, there exists a linear-parameter transformation from CSP(Γ∗) parameterizedby the number of variables to CSP(Γ) parameterized by the number of variables.

Note that in particular, an intractable Boolean constraint language can onlybe preserved (recall Definition 2.35) by unary operations that are bijections,as otherwise it is preserved by u0 or u1 and thus the constraint language istractable by Theorem 2.37. Hence the above theorem implies that for intractableBoolean Γ, there is a linear-parameter transformation from CSP(Γ∗) to CSP(Γ).

Proof of Theorem 2.44. The desired transformation is the final polynomial-timereduction given in the proof of Theorem 4.7 of [23]. This reduction translatesan instance of CSP(Γ∗) with n variables to an instance of CSP(Γ ∪ =D)with n + |D| variables; here, =D denotes the equality relation on domain D.Each constraint of the form =D (v, v′) may be removed (while preservingsatisfiability) by taking one of the variables v, v′, and replacing each instance ofthat variable with the other. The resulting instance of CSP(Γ) has ≤ n + |D|variables. J

2.8 Graph coloring problems

A proper q-coloring of a graph G is a function f : V(G) → [q] such that forall u, v ∈ E(G) it holds that f (u) 6= f (v). Based on this, the q-COLORING

problem for a fixed integer q is defined as follows.

q-COLORING

Input: A graph G.Question: Does G have a proper q-coloring?

A generalization of the q-COLORING problem is the H-COLORING problem.To define this problem, we first need the notion of homomorphisms.

Definition 2.45 (Homomorphism) Let G and H be (not necessarily simple)graphs. A homomorphism from G to H is a mapping f : V(G) → V(H) suchthat f (u), f (v) ∈ E(H) for all u, v ∈ E(G). We denote the existence of ahomomorphism from G to H by G → H. If G → H we say G is homomorphicto H, and if both G → H and H → G we say G and H are homomorphicallyequivalent, which we denote by G ↔ H.

For a fixed graph H, we can now define the H-COLORING problem as fol-lows.

2

34 Preliminaries

H-COLORING

Input: A graph G.Question: Does there exist a homomorphism from G to H?

When there is a homomorphism f from G to H we will often say G isH-colorable, and for a vertex v ∈ V(G) we may refer to f (v) as the color ofvertex v, as is consistent with the terminology for q-COLORING. Observe thatthe q-COLORING problem is equivalent to the problem of Kq-COLORING, whereKq is the clique on q vertices.

When H is a simple graph, H-COLORING can be viewed as equivalent to a con-straint satisfaction problem over the domain V(H). In particular, H-COLORING

corresponds to CSP(H) where H is defined as the constraint language contain-ing the single relation F, which is the symmetric binary relation given by F =(u, v) | u, v ∈ E(H). A mapping f : V(G) → V(H) is a homomorphismfrom a graph G to H if and only if it is a satisfying assignment of the CSP(F)instance on variables V(G) and constraints F(u′, v′) | u′, v′ ∈ E(G).I Example 2.46 Let H be the graph defined on three vertices a, b, and c, thatform a triangle, such that H is the graph K3. Then the constraint languageH := F is defined by

F := (a, b), (b, a), (a, c), (c, a), (b, c), (c, b).

If now G is an arbitrary graph, checking whether G is homomorphic to His equivalent to checking if the CSP(H)-instance (V, C) given by V := V(G)and C := F(u, v) | u, v ∈ E(G) is satisfiable over domain D = a, b, c. Inthis example, observe that any satisfying assignment simply corresponds to a3-coloring of G with colors a, b, and c. J

Definition 2.47 (H∗) In correspondence with Definition 2.39, we define H∗ asthe constraint language that contains the relation F = (u, v) | u, v ∈ E(H)and, for each v ∈ V(H), the arity-1 relation (v). The arity-1 relations in H∗

effectively allow the value of a variable to be forced to a constant value, sothat CSP(H∗) corresponds to the PRE-H-COLORING EXTENSION problem.

Definition 2.48 (Core) A graph H is a core if there exists no homomorphismfrom H to an induced proper subgraph of H. We may say that a graph H′ is thecore of a graph H if H′ is an induced subgraph of H such that H′ is a core, andH′ ↔ H. Observe that for any graph H, its core is unique up to isomorphism.

3Chapter 3Sparsification Using Low-Degree

Polynomials

In this chapter we provide techniques to obtain sparsifications for certain Con-straint Satisfaction Problems. Here we aim to reduce the number of constraints,without changing the satisfiability of the formula. In particular, we want toobtain guarantees on the number of remaining constraints, in terms of thenumber of variables. To this end, we will study CSPs where for each constraint,the satisfying assignments can be captured by low-degree polynomials in anappropriate sense. Let us start by giving a simple example of this technique,before studying the problem in full generality, by considering the EXACT SAT

problem (as defined in Section 2.7).Let an instance of EXACT SAT with m clauses over variable set V with |V| = n

be given. Any constraint of an instance of EXACT SAT is satisfied by an assignmentτ : V → 0, 1 if and only if it contains exactly one true literal, where we equatetrue with the value 1 and false with 0. This is equivalent to saying that a clause(x1, . . . , xi,¬xi+1, . . . ,¬xd) is satisfied by τ if and only if the following equationholds:

i

∑j=1

τ(xj) +d

∑j=i+1

(1− τ(xj)) = 1.

To find redundant clauses, transform each of the m input clauses into a linearequality, to obtain a system of equalities Ax = b where A is an m× n matrix,

3

36 Sparsification Using Low-Degree Polynomials

x is the column vector (x1, . . . , xn), and b is an integer column vector. UsingGaussian elimination, we can efficiently compute a basis B for the row space ofthe extended matrix (A|b). So, B is a set of equalities such that every equalitycorresponding to a clause can be written as a linear combination of equalitiesin B. One can even compute B such that it is a subset of the set of originalequalities. Since (A|b) has n + 1 columns, it follows that it has rank at mostn + 1 and thus B contains at most n + 1 equalities. To perform the sparsification,remove all clauses from the EXACT SAT instance for which the correspondingequality does not appear in B.

Since we only removed clauses, it is clear that any assignment satisfying theoriginal instance, will satisfy the sparsified instance. The other direction followsfrom the fact that if x satisfies f1(x) = b1 and f2(x) = b2, then it will also satisfyf1(x) + f2(x) = b1 + b2. Using that any equality we are interested in can bewritten as a linear combination of equalities in B gives us that any assignmentsatisfying the sparsified instance, also satisfies the original input instance. Thus,we have given a sparsification for EXACT SAT with O(n) clauses.

In terms of kernelization, the measure for kernel size is not the number ofclauses of the sparsified instance, but the number of bits required to store it. Ifclauses are assumed to have size at most d for some constant d, the instancecan be stored in O(n log n) bits. This can for example be done by encodingeach clause individually by storing the d variables it contains. Since there are nvariables, we can represent a single variable by log n bits, resulting in a totalsize of O(d · n log n), which equals O(n log n) for d constant.

When the number of literals in a clause can be as large as O(n), it mayhowever take Ω(n) bits to encode a single clause. After reducing the number ofclauses in EXACT SAT to n + 1, it may therefore still take Θ(n2) bits to encodethe instance. Observe that obtaining an encoding of bitsize O(n2) instead ofO(n2 log n) is in this case possible: we can safely assume that each clausecontains every literal at most twice. Since there are 2n literals, we can representeach clause by storing for each literal whether it occurs zero, one, or two timesin the clause, which can be done in O(n) bits. This leads to an encoding of sizeO(n2) when we have n + 1 clauses.

The change from O(n log n) bits for bounded clause length to O(n2) bitsfor unbounded clause length turns out to be unavoidable: we prove that EXACT

SAT has no kernel of size O(n2−ε) for any ε > 0, unless NP ⊆ coNP/poly inCorollary 3.23.

The example given above uses that d-EXACT SAT can be expressed usingdegree-1 polynomials over the rationals. We show that d-NAE-SAT and d-CNF-SAT

can be expressed using equalities of polynomial expressions of degree d− 1and d. We therefore study the following problem, where F is a field. Refer toSection 2.6 for relevant definitions on polynomials.

3

37

d-POLYNOMIAL ROOT CSP over F

Input: A list L of polynomial equalities over variables V = x1, . . . , xn.An equality is of the form f (x1, . . . , xn) = 0, where f is a multivariatepolynomial over F of degree at most d.Parameter: The number of variables n.Question: Does there exist an assignment of the variables τ : V → 0, 1satisfying all equalities in L?

When interpreting truth assignments as elements of a ring or field, we equatethe value true with the 1 element in the ring (multiplicative identity), and thevalue false with the 0 element (additive identity). Consequently, for a Booleanvariable x its negation ¬x corresponds to (1− x). Note that over a field, we canassume this without loss of generality: for any a and b in F with a 6= b , thereexists a linear bijection that maps 0 to b and 1 to a, and this bijection does notchange the degree of the resulting polynomial.

Overview We show in Section 3.1.1 that using a generalization of the argu-ment presented for EXACT SAT, the number of constraints in an instance ofd-POLYNOMIAL ROOT CSP can efficiently be reduced to O(nd), even when thenumber of variables that occur in a constraint is not restricted. Assuming thateach constraint can be encoded in O(n) bits, this allows us to compress instancesof d-POLYNOMIAL ROOT CSP to bitsize O(nd+1). This reduction argument infact works over arbitrary fields F. We then extend our results to obtain similarresults over Z/mZ. When m is not prime, the resulting structure is not a fieldand this imposes additional technical difficulties.

In Section 3.1.2, we consider Boolean CSPs whose constraints are formed bydisequalities, rather than equalities, of degree-d polynomials. This leads to thefollowing generic problem:

d-POLYNOMIAL NON-ROOT CSP over F

Input: A list L of polynomial disequalities over variables V = x1, . . . , xn.A disequality is of the form f (x1, . . . , xn) 6= 0, where f is a multivariatepolynomial over F of degree at most d.Parameter: The number of variables n.Question: Does there exist an assignment of the variables τ : V → 0, 1satisfying all disequalities in L?

This problem formulation is related to the weak representation of Booleanfunctions by polynomials: a polynomial f weakly represents a Boolean func-tion g when f (x1, . . . , xn) 6= 0 if and only if g(x1, . . . , xn) 6= 0 (cf. [7,11]). Thismeans that effectively, the d-POLYNOMIAL NON-ROOT CSP problem asks to findan assignment that satisfies a list of constraints which are weakly representedby degree-d polynomials. For a prime p, we show that the number of constraints

3


Problem d-POLY. ROOT CSP d-POLY. NON-ROOT CSP

Lower bound1 Upper bound2 Lower bound1 Upper bound2

Q Ω(nd+1−ε) O(nd+1) SuperpolynomialZ/pZ Ω(nd+1−ε) O(nd+1) Ω(nd(p−1)−ε) O(nd(p−1)+1)

Z/mZ Ω(nd+1−ε) O(nd+1) Ω(n(d/2)r−ε) ?

Theorem 3.26, 3.27 3.1, 3.6, 3.12 3.28, 3.30, 3.29 3.131 The lower bounds hold for any ε > 0, for the problems that are not polynomial-time

solvable and under the assumption that NP * coNP/poly.2 The upper bounds hold when each n-variate polynomial constraint in the input can

be encoded in O(n) bits.

Table 3.1 Summary of the kernel upper and lower bounds obtained in this chapter,expressed in the number of bits. The bounds depend on whether the polynomialsdefining the constraints are over the rationals Q, the integers modulo a prime p, or theintegers modulo a composite m. The integer r denotes the number of distinct primedivisors of m. The values of p, m, r, and d are treated as constants in these bounds.

for d-POLYNOMIAL NON-ROOT CSP over Z/pZ can be efficiently reduced toO(nd(p−1)), resulting in an instance of bitsize O(nd(p−1)+1).

We conclude this chapter (Section 3.2) by showing a number of kernelizationlower bounds that (nearly) match the upper bounds obtained in Section 3.1. Weshow that for d-POLYNOMIAL ROOT CSP no kernel of size O(nd+1−ε) is possible,unless NP ⊆ coNP/poly. For d-POLYNOMIAL NON-ROOT CSP over Z/pZ fora prime p, we obtain a lower bound of O(nd(p−1)−ε), which leaves a factor-ngap with the upper bound. For general d-POLYNOMIAL NON-ROOT CSP, weshow that it does not admit a polynomial kernel unless NP ⊆ coNP/poly. Ford-POLYNOMIAL NON-ROOT CSP over Z/mZ we give a lower bound. We do nothave a polynomial upper bound for this problem, the difficulties in obtaining apolynomial kernel are discussed in Section 3.3. An overview of the upper andlower bound results obtained for d-POLYNOMIAL ROOT CSP and d-POLYNOMIAL

NON-ROOT CSP can be found in Table 3.1.

3.1 Upper bound techniques

3.1.1 Sparsification for Polynomial root CSPWe start by showing how to reduce the number of constraints in instances ofd-POLYNOMIAL ROOT CSP, by extending the argument given for EXACT SAT inthe introduction. Let a field F be efficient if the field operations and Gaussianelimination can be done in polynomial time in the size of a reasonable inputencoding. By this definition, Q and Z/pZ for a prime p are examples of efficient

3

3.1 Upper bound techniques 39

fields.

I Theorem 3.1 There is a polynomial-time algorithm that, given an instance(L, V) of d-POLYNOMIAL ROOT CSP over an efficient field F, outputs an equivalentinstance (L′, V) with at most nd + 1 constraints such that L′ ⊆ L.

Proof. Given a list L of polynomial equalities over variables V for d-POLYNOMIAL

ROOT CSP, we use linear algebra to find redundant constraints. Observethat (xi)

c = xi for all 0/1-assignments and c ≥ 1. As constraints are evaluatedover 0/1-assignments, we may assume without loss of generality that themonomials in each of the polynomials are multilinear: each monomial consistsof a coefficient from F multiplied by distinct variables. Note that we also needthe constant monomial 1.

Create a matrix A with |L| rows and a column for every multilinear mono-mial of degree at most d over variables from V. Let position ai,j in A bethe coefficient of the monomial corresponding to column j in the polynomialequality corresponding to row i.

Compute a basis B of the row space of matrix A, for example using Gaussianelimination [50], and let L′ consist of the equalities in L whose correspondingrow appears in the basis. Since L′ ⊆ L, it follows that if the original instancehas a satisfying assignment, the reduced instance has a satisfying assignment aswell. The crucial part of the correctness proof is to establish the converse.

B Claim 3.2 If an assignment τ : V → 0, 1 of the variables in V satisfies theequalities in L′, then it satisfies all equalities in L.

Proof. Consider any equality ( f (x) = 0) ∈ L \ L′, and assume it corresponds tothe i’th matrix row. Let f j(x) be the polynomial represented in the j’th row ofmatrix A for j ∈ [|L|]. Without loss of generality, let the basis of A correspondto its first m rows a1, . . . , am. We then have i > m, and by the definition ofbasis there exist β1, . . . , βm ∈ F such that ai = ∑m

j=1 β jaj . Let t be the columnvector containing, for each multilinear monomial of degree ≤ d in variablesx1, . . . , xn, the evaluation under τ. For example, for monomial x1x3 it containsτ(x1) · τ(x3). By using the same order of monomials as in the construction ofA, we obtain for all j ∈ [|L|] that f j(τ(x1), . . . , τ(xn)) = aj t, the inner productof aj and t. It follows that aj t = 0 for all j ∈ [m], since satisfying L′ impliesf j(τ(x1), . . . , τ(xn)) = 0. Now observe that

fi(x) = ait =m

∑j=1

(β jaj)t =m

∑j=1

β j(aj t) =m

∑j=1

β j · 0 = 0,

which proves the claim. C

We now prove an upper bound on the number of constraints in the resultingkernel.

3


B Claim 3.3 The number of constraints in the resulting kernel is bounded bynd + 1.

Proof. The size of a basis of any matrix over a field equals its rank, which isbounded by the number of columns. As there is exactly one column for eachmultilinear monomial of degree at most d, it follows from Lemma 2.27 thatthere are at most nd + 1 columns and the claim follows. C

This concludes the proof of Theorem 3.1. J

When each constraint can be encoded in O(n) bits, for example when eachpolynomial can be represented as an arithmetic circuit of sizeO(n), Theorem 3.1gives a kernelization of size O(nd+1). When constraints can be encoded in O(1)bits, which may occur when constraints have constant arity, we obtain kernelsof bitsize O(nd).

Example application Let us consider the following problem, that generalizesthe d-EXACT SAT and d-NAE-SAT problems. Optionally, a prime p may be chosen.

GENERALIZED d-SAT (MOD p)

Input: A set of clauses C over variables V := x1, . . . , xn, and for eachclause a set Si ⊂N with |Si| ≤ d. Each clause is a set of distinct literals ofthe form xi or ¬xi.Parameter: |V| = n.Question: Does there exist a truth assignment τ : V → 0, 1 such that thenumber of satisfied literals in clause i modulo p lies in Si for all i?

We can now use the results obtained for d-POLYNOMIAL ROOT CSP, to give akernelization for the above problem.

I Corollary 3.4 GENERALIZED d-SAT and GENERALIZED d-SAT MOD p both havea kernel with nd + 1 clauses, such that the kernelized instance can be encodedin O(nd+1 log n) bits. Furthermore, the clauses of the kernelized instance are asubset of the clauses of the original instance.

Proof. To reduce the number of clauses using Theorem 3.1, we only have toprovide a polynomial of degree at most d to represent each constraint. Considera clause involving k variables xi1 , . . . , xik , with set S`. Let tj = xij if variable xij

occurs positively in the clause, and let tj = (1 − xij) if the variable occursnegatively. Then the number of satisfied literals in the clause is given by thedegree-1 polynomial f (xi1 , . . . , xik ) := ∑k

j=1 tj. Let F(x) be a polynomial withroot set S` (mod p) of degree at most |S`|, which exists by Observation 2.28.We obtain that F( f (x)) ≡p 0 if and only if x satisfies the clause. Note that thedegree of F( f (x)) is at most |S`| ≤ d.

3


Applying Theorem 3.1 to the resulting instance of d-POLYNOMIAL ROOT CSPidentifies a subset of at most nd + 1 constraints which preserve the answer tothe SAT problem. Each clause contains at most 2n literals, which can be encodedin O(log n) bits each. Additionally, for each clause we need to store the set S`

of at most d integers, which have value at most 2n in relevant inputs. As d is aconstant, the instance can be encoded in O(nd+1 log n) bits. J

Corollary 3.4 yields a new way to get a non-trivial compression for d-NAE-SAT, which is conceptually simpler than the existing approach which requiresan unintuitive lemma by Lovász [75]. The new approach gives the same sizebound as given earlier [59, Theorem 6].

I Corollary 3.5 d-NAE-SAT has a kernel with nd−1 + 1 clauses, such that thekernelized instance can be encoded in O(nd−1 log n) bits.

Proof. A clause of size k ≤ d is not-all-equal satisfied if and only if the numberof satisfied literals lies in S := 1, . . . , k− 1. Using Corollary 3.4 we can reducethe number of clauses to nd−1 + 1. Each clause has d ∈ O(1) variables and canthus be encoded in O(log n) bits. J

We can generalize Theorem 3.1 to also obtain a sparsification for d-POLY-NOMIAL ROOT CSP over the integers modulo a non-prime. We give two differentapproaches for sparsifying such problems. The first approach gives the small-est number of constraints after reduction, but has the disadvantage that theresulting list of constraints is not necessarily a subset of the original list ofconstraints. The second approach results in a larger (but still bounded) numberof constraints, which form a subset of the original constraints. We first givesome linear-algebraic background.

Consider an instance (L, V) of d-POLYNOMIAL ROOT CSP over a ring Rwith n variables and m constraints. We consider the matrix A over R with mrows and ∑d

i=0 (nd) columns, in which the i’th row contains the coefficients of the

multilinear monomials in the polynomial for the i’th constraint. The satisfiabilityof the constraints by a 0/1-assignment then comes down to the followingquestion: is there a 0/1-assignment to the variables, such that the vector xconsisting of all multilinear monomial evaluations of the variables x1, . . . , xnsatisfies Ax = 0 over R? The key insight for the sparsification is that anymatrix B for which the row-space over R is equal to that of A, satisfies Ax =0⇔ Bx = 0. (Recall that the row-space over R consists of the vectors that canbe written as a linear combination of the rows, with coefficients from R.) Hencewe can obtain an encoding of an equivalent problem by selecting a matrix Bwhose row-space equals that of A. When working over a field we can justextract a basis for the row-space to obtain B, which is exactly what happenedin Theorem 3.1. When working over the integers modulo m for composite m,the existence of a basis is not guaranteed (for more details, see the discussionafter Definition 2.18). For our first approach we therefore use the Howell

3


normal form of the matrix, which is a canonical matrix form which has the samerow-space.

I Theorem 3.6 There is a polynomial-time algorithm that, given an instance(L, V) of d-POLYNOMIAL ROOT CSP over Z/mZ for some integer m ≥ 2, outputsan equivalent instance (L′, V) of d-POLYNOMIAL ROOT CSP over Z/mZ with atmost nd + 1 constraints.

Proof. In a similar way as in Theorem 3.1, we use linear algebra to find redun-dant constraints. Let a list L of polynomial equalities over variable set V begiven. We again assume without loss of generality that the monomials in eachof the polynomials are multilinear. Construct a matrix A with |L| rows anda column for every multilinear monomial of degree at most d over variablesfrom V.

We now compute the Howell form H of matrix A (Definition 2.24), such thatA = PH, where P is invertible over Z/mZ. This can be done in polynomialtime by Theorem 2.25. Let H′ be the matrix H with all zero rows removed. LetL′ contain the polynomial equations where the left-hand sides are given by therows of H′ and the right-hand sides are all 0. We now prove the correctness ofthis procedure.

B Claim 3.7 An assignment τ : V → 0, 1 of the variables in V satisfies theequalities in L′, if and only if it satisfies the equalities in L.

Proof. (⇒) Suppose assignment τ : V → 0, 1 satisfies all equalities in L′.Consider the vector x with the assignment given to the j’th monomial on positionj. Then

H′x = 0⇔ Hx = 0⇒ PHx = P0⇒ Ax = 0,

which implies that τ is also a satisfying assignment for L.(⇐) Suppose assignment τ : V → 0, 1 satisfies all equalities in L. Consider

the vector x with the assignment given to the j’th monomial on position j. Then

Ax = 0⇔ PHx = 0⇒ P−1PHx = P−10⇒ Hx = 0⇒ H′x = 0,

which implies that τ is also a satisfying assignment for L′. C

B Claim 3.8 The number of constraints in the resulting kernel L′ is bounded bynd + 1.

Proof. The number of constraints in L′ equals the number of rows in H′. We willuse the following properties of a matrix in Howell form (recall Definition 2.24)to give an upper bound on the number of non-zero rows in H.

• Let r be the number of non-zero rows of H. Then the first r rows of H arenon-zero.

3


• For 1 ≤ i ≤ r let the first non-zero entry in row i of H be in column ji. Thenj1 < j2 < . . . < jr.

By these two properties any matrix in Howell form has at most as many non-zerorows as it has columns. Thereby there are at most nd + 1 polynomial equationsin L′. C

This concludes the proof of Theorem 3.6. J

Compared to Theorem 3.1, the sparsification of Theorem 3.6 has the disad-vantage that it may output polynomials (representing constraints) that were notpart of the input. If the input polynomials had an efficient encoding, for exampleas an arithmetic circuit, this property may be lost in the transformation. Ingeneral, to represent an output polynomial one may have to store all its O(nd)coefficients individually. We present an alternative approach that alleviates thisissue by ensuring that the set of constraints in the output instance is a subsetof the original constraints. However, it comes at the expense of increasing thenumber of constraints. The following lemma captures the key linear-algebraicinsight behind the approach.

I Lemma 3.9 Let m ≥ 2 be an integer with r distinct prime divisors. Forany S ⊆ Z/mZ there exists a subset S′ ⊆ S of size at most r such that anyelement in S can be written as a linear combination over Z/mZ of elementsin S′. For any fixed m, one can compute S′ and expressions of all a ∈ S as linearcombinations of S′ in polynomial time.

Proof. Let p1, . . . , pr be the distinct prime divisors of m, which can be found inconstant time for fixed m. For a prime p and positive integer a, define:

µp(a) := maxk ∈N | pk divides a.νp(a) := maxk ∈N | pk divides both a and m.

Observe that νp(a) ≤ µp(a) for all a. For any a that divides m we have νp(a) =µp(a).

Using these notions we construct the set S′ as follows. For each i ∈ [r] selectan element a ∈ S that minimizes νpi (a) and add this element to S′. Since m isconstant this can be done in polynomial time. The resulting set S′ has size atmost r. We prove it spans S using the following claim.

B Claim 3.10 Let d be the largest integer that simultaneous divides m and allelements of S′. For any b ∈ S \ S′, the integer d divides b. Equivalently:

gcd(m, S′) | b.

Proof. If d = 1 then the claim is trivial. Suppose all prime factors p of d arealso prime factors of b with µp(d) ≤ µp(b). Then the factorization of b can be

3


written as the factorization of d multiplied by remaining factors. Hence d | b,and the claim follows.

Now suppose there is a prime factor p of d with µp(d) > µp(b). Since pis a factor of d = gcd(m, S′), we know p is a factor of m and was there-fore considered during the construction of S′. Since d divides m we knowthat µp(d) = νp(d). Combined with the fact that νp(b) ≤ µp(b) it followsthat νp(b) ≤ µp(b) < µp(d) = νp(d). Since d divides all members of S′, itfollows that νp(b) < νp(d) ≤ νp(a) for all a ∈ S′. But then b should have beenadded to S′ during its construction; a contradiction. C

To conclude the proof, we use Claim 3.10 to show that any b ∈ S \ S′ canefficiently be written as a linear combination of S′ over Z/mZ. By Bézout’sidentity (Theorem 2.15), the greatest common divisor of a set of integerscan be written as an integer linear combination of the elements in that set.Such a combination can efficiently be found using the extended Euclideanalgorithm. Hence there are integer coefficients αi such that d = gcd(m, S′) =αm ·m + ∑a∈S′ αa · a. Let b′ := b/ gcd(m, S′), which is integral by Claim 3.10.But then

b = b′ · gcd(m, S′) = (b′ · αm)m + ∑a∈S′

(b′ · αa)a,

which implies that

b ≡m ∑a∈S′

(b′ · αa)a ≡m ∑a∈S′

((b′ · αa) mod m)a

is a linear combination over Z/mZ resulting in b. J

The following lemma follows from a procedure similar to Gaussian elimina-tion, using Lemma 3.9 as a subroutine.

I Lemma 3.11 Let m ≥ 2 be an integer with r distinct prime divisors. For anymatrix A over Z/mZ in which k ≥ 1 columns contain a nonzero element, there isa subset B of r · k rows of A that spans the row-space of A. For any fixed m, sucha subset B can be found in polynomial time.

Proof. Proof by induction on k. Consider the first column ci of A that contains anonzero and let S be the elements appearing in that column. Using Lemma 3.9,compute a subset S′ ⊆ S of size at most r that spans S, and find the correspond-ing linear combinations. For each element a ∈ S′ select one row with value ain column ci and add it to B1. If ci is the only column containing a nonzero,then it is easy to see that B := B1 is a valid output for the procedure. Otherwise,since all elements of S are linear combinations of elements of S′, by subtractingthe relevant linear combinations of rows of B1 from rows in A we can obtainzeros at all positions in column ci, without introducing nonzeros in earliercolumns. Let A′ be the resulting matrix, which therefore has at most k − 1

3


nonzero columns. Apply induction to find a spanning subset B′ of the rows of A′

of size at most r · (k− 1). Let B2 be the rows of A corresponding to rows B′

in A′. Then B1 ∪ B2 consists of at most r + (k− 1)r = rk rows of A. It is easyto verify that these rows indeed span the row-space of A. The inductive proofdirectly translates into a polynomial-time recursive algorithm, using the factthat the procedure of Lemma 3.9 provides the required linear combinations. J

Using the lemmas above, we can now give our sparsification procedure ford-POLYNOMIAL ROOT CSP over Z/mZ that outputs a subset of the originalconstraints.

I Theorem 3.12 There is a polynomial-time algorithm that, given an instance(L, V) of d-POLYNOMIAL ROOT CSP over Z/mZ for some fixed integer m ≥2 with r distinct prime divisors, outputs an equivalent instance (L′, V) of d-POLYNOMIAL ROOT CSP over Z/mZ with at most r · (nd + 1) constraints suchthat L′ ⊆ L.

Proof. We proceed similarly as in the proof of Theorem 3.6. Consider an in-put (L, V) of d-POLYNOMIAL ROOT CSP over Z/mZ with n := |V| variables.Let A be the matrix with |L| rows and ∑d

i=0 (ni ) columns, containing the coeffi-

cients of the multilinear monomials that form the constraints for each of the |L|constraint polynomials. A 0/1-assignment to the variables satisfies all con-straints if and only if the vector x of all monomial evaluations satisfies Ax = 0.Use Lemma 3.11 to compute a subset B of at most r · ∑d

i=0 (ni ) ≤ r · (nd + 1)

rows of A that span the row-space of A. Let L′ contain the constraints whosecorresponding row appears in B and output the instance (L′, V) as the resultof the procedure. Using the guarantee of Lemma 3.11 this procedure runsin polynomial time for fixed m. Since L′ ⊆ L, the instance (L′, V) can besatisfied if (L, V) can. For the reverse direction, consider a satisfying assign-ment for (L′, V) and the corresponding vector x of evaluations of multilinearmonomials of degree at most d. Then Bx = 0 since the assignment satisfies allconstraints in L′. As any row in A can be written as a linear combination ofrows in B, it follows that Ax = 0, showing that (L, V) is satisfiable and hencethat the output instance is equivalent to the input. J

3.1.2 Sparsification for Polynomial non-root CSPWe will now change focus to constraint satisfaction problems with constraintsgiven by polynomial disequalities. Therefore, we consider the d-POLYNOMIAL

NON-ROOT CSP problem. In Section 3.2.3 we will show that, over the fieldof rational numbers, the problem cannot be compressed to size polynomialin n, unless NP ⊆ coNP/poly. We therefore consider the field Z/pZ of integersmodulo a prime p.

3


I Theorem 3.13 There is a polynomial-time algorithm that, given an instance(L, V) of d-POLYNOMIAL NON-ROOT CSP over Z/pZ, outputs an equivalentinstance (L′, V) with at most nd(p−1) + 1 constraints such that L′ ⊆ L.

Proof. Suppose we are given a list of polynomial disequalities L over variables V.Observe that a disequality f (x) 6≡p 0 is equivalent to f (x) mod p ∈ 1, . . . , p−1. Recall that Fermat’s little theorem states that ap ≡p a for any integer aand prime p, which implies that a(p−1) ≡p 1 if and only if a 6= 0. Hencethe disequality f (x) 6≡p 0 can equivalently be stated as ( f (x))p−1 − 1 ≡p 0.Therefore, L can be written as an instance of d(p − 1)-POLYNOMIAL ROOT

CSP by replacing every polynomial disequality f (x) 6≡p 0 by the equality( f (x))p−1 − 1 ≡p 0. By Theorem 3.1, the theorem statement follows. J

In Section 3.2.3 we will establish a nearly-matching lower-bound counterpartto Theorem 3.13. We do not have upper bounds for d-POLYNOMIAL NON-ROOT

CSP modulo a composite number m, the difficulty of obtaining these is discussedin the conclusion of this Chapter (Section 3.3).

3.2 Tightness of upper bound techniquesWe now turn our attention to lower bounds and show that the results ob-tained for d-POLYNOMIAL ROOT CSP and d-POLYNOMIAL NON-ROOT CSP inSections 3.1.1 and 3.1.2 are (almost) tight.

3.2.1 1-Polynomial root CSP over the rationalsWe will start with d-POLYNOMIAL ROOT CSP over Q and over Z/mZ. We startby proving that EXACT RED-BLUE DOMINATING SET does not have generalizedkernels of bitsize O(n2−ε) for any ε > 0, unless NP ⊆ coNP/poly. The samelower bound for both variants of 1-POLYNOMIAL ROOT CSP will follow by alinear-parameter transformation. We then show how to generalize this result tod-POLYNOMIAL ROOT CSP. As a starting problem for the cross-composition wewill use the NP-hard RED-BLUE DOMINATING SET (RBDS) problem [34,65].

RED-BLUE DOMINATING SET (RBDS)

Input: A bipartite graph G = (R ∪ B, E) containing red (R) and blue (B)vertices, and an integer k.Question: Does there exist a set D ⊆ R with |D| ≤ k such that every vertexin B has at least one neighbor in D?

EXACT RED BLUE DOMINATING SET (ERBDS) is defined similarly, except thatevery vertex in B must have exactly one neighbor in D. Furthermore we willnot bound the size of such a set, but merely ask for the existence of any ERBDS.

3

3.2 Tightness of upper bound techniques 47

Finally, we define a weakening of the notion of an ERBDS of a graph, called aSEMI-ERBDS. Given a bipartite graph G and set S ⊆ B, a set X ⊆ V(G) is aSEMI-ERBDS of G with respect to S if it is a RBDS of G and furthermore, any bluevertex x /∈ S has exactly one neighbor in X. Vertices from S may be dominatedmultiple times.

The following lemma gives a degree-2 cross-composition from RBDS to SEMI-ERBDS, which will be used to prove Theorem 3.22. It is proven separatelybecause the construction will also be used in the proofs of Theorems 3.26and 3.27, which is also the reason we require part (4) of the lemma statement.

I Lemma 3.14 There exists a polynomial-time algorithm that, given t instancesof RBDS with

√t ∈ N, labeled X`1,`2 with `1, `2 ∈ [

√t], which all ask for a

solution of size k and all have mR red and mB blue vertices, constructs a bipartitegraph G′ with vertices partitioned into red (R) and blue (B) vertices, and a subsetV of the blue vertices, such that the following holds:

1. |R|+ |B| ≤ O(√

t · (mR + mB)3).

2. If there exist `1, `2 ∈ [√

t] such that X`1,`2 has a RBDS of size k, then G′ has anERBDS.

3. If G′ has a SEMI-ERBDS with respect to V, then there exist `1, `2 ∈ [√

t] suchthat X`1,`2 has a RBDS of size k.

4. There are at most 2 vertices in B \V with degree more than mR + k + 2.

In particular, the lemma shows how to embed a series of t size-n instancesX`1,`2 = (G`1,`2 , k) for `1, `2 ∈ [

√t] that share the same target value k, into a

single graph G′ with O(√

t · poly(n)) vertices such that G′ has an ERBDS if andonly if some input instance has a size-k RBDS. This straightforwardly gives akernelization lower bound for ERBDS: since the number of output vertices isroughly

√t, by choosing a suitable polynomial equivalence relation we get a

degree-2 cross composition. Now the actual lemma statement is even strongerthan the statement “some input has a RBDS ⇔ G′ has an ERBDS”, because the(⇐) implication already holds when G′ has a SEMI-ERBDS. The fact that itis only required to be exact on a set of vertices B \ V that has almost onlysmall-degree vertices, will be used later. Later constructions “pay extra” forchecking exactness of large-degree vertices, and the bound in (4) guaranteesthis does not happen too often.

Before proving the lemma, let us give the main ideas. The standard approachto give a degree-2 cross composition (see Section 2.4.3) is to have a table-likestructure with sets of vertices U` consisting of mR vertices and V` consistingof mB vertices for all ` ∈ [

√t]. In this way we can add connections between U

and V such that G′[U`1 ∪V`2 ] is isomorphic to G`1,`2 , thereby embedding theadjacency information of all t individual inputs while only needing

√t · (mR +

mB) vertices in the graph. Selector gadgets are then used to ensure that the

3


part of a (SEMI)-ERBDS in G′ in U`1 for some `1 corresponds to a RBDS of size kin G`1,`2 for some `2. In the case of ERBDS however, difficulties arise when wetry to use this type of construction. Given a RBDS for some input instance G`1,`2 ,finding an ERBDS in G′ can be problematic. The issue is that adding the verticesin U`1 corresponding to a solution in G`1,`2 to an ERBDS in G′, may dominatesome of the vertices from V multiple times. This is not easy to avoid, as there issimply no guarantee on how many times a vertex in the set V` with ` 6= `2 willbe dominated by this choice of red vertices.

To resolve this problem, every set U` and V` has k copies of each vertex.Connections are made such that the i’th copy of a vertex may only connect tothe i’th copy of another vertex, such that G[U`1 ∪V`2 ] contains k disjoint copiesof G`1,`2 . To translate a RBDS in G`1,`2 to a ERBDS in G′, we take at most onevertex from the i’th set of copies in U`1 . Hereby, any vertex in V is dominatedat most once. Furthermore, for each vertex in V`2 , at least one of its copies isdominated. We add additional gadgets to ensure that the remaining verticescan also be dominated.

Proof of Lemma 3.14. Let instance X`1,`2 have graph G`1,`2 , with red verticesR`1,`2 and blue vertices B`1,`2 . For each input graph G`1,`2 enumerate the redvertices as r1, . . . , rmR and the blue vertices as b1, . . . , bmB , arbitrarily. Create agraph G′ by the following steps. Figure 3.1 shows a sketch of G′.

1. Create√

t sets U1, . . . , U√t each consisting of k ·mR red vertices, with U` :=u`

i,j | i ∈ [k], j ∈ [mR] for each ` ∈ [√

t]. Let U be the union of all sets

U`, for ` ∈ [√

t].

2. Similarly create√

t sets V1, . . . , V√t, each consisting of k ·mB blue vertices,

and define V` := vì,j′ | i ∈ [k], j′ ∈ [mB] for all ` ∈ [√

t]. Let V be theunion of all sets V`. Note that a SEMI-ERBDS wrt. V must dominate all bluevertices that are created in the remainder of the construction exactly once.

3. For each i ∈ [k] add the edge from u`1i,j to v`2

i,j′ if rj, bj′ is an edge in instance

X`1,`2 with `1, `2 ∈ [√

t], j ∈ [mR], and j′ ∈ [mB].

By Steps 1 to 3, the subgraph of G′ induced by the vertices in U`1 ∪V`2 consistsof k vertex-disjoint copies of G`1,`2 . The next steps are used to ensure thatthere are exactly k vertices from U in any SEMI-ERBDS, which must all belong tothe same set U`. These vertices will correspond to a RBDS in one of the inputinstances.

4. Create blue vertices dì for ` ∈ [√

t] and i ∈ [k]. Connect vertex dì to allvertices u`

i,j with j ∈ [mR]. Define D := dì | ` ∈ [√

t], i ∈ [k]. These bluevertices ensure that a SEMI-ERBDS wrt. V, which dominates each vertex of D

3


a3,12,1

U1 U2 U3

V1 V2 V3

d11

d12

d21

d22

d31

d32

z′′1 z′′2 z′′3

s

Add edges if b1, r3 ∈ E(G1,1)

a1,21,1

a1,31,1

a1,12,1 a2,1

2,1a1,11,1

y1 y2 y3

s′

a3,32,1

V

z1 z2 z3z′1 z′2 z′3

Figure 3.1 The graph G′ created in the proof of Lemma 3.14, for k = 2, mR = 5, mB = 4,and t = 9. Edges between U and V are left out for simplicity. Of the 24 gadgets in Conly c`1,1 and c`2,1 are shown for all ` ∈ [

√t]. Vertices in R are shown in red and vertices

in B are shown in dark blue. The set V of vertices that may be dominated multiple timesby a SEMI-ERBDS wrt. V is highlighted by a rectangle.

exactly once, cannot contain two vertices uì,j and u`

i,j′ belonging to the samerow of the same set U`.

5. Add blue vertex s and add the vertices Z := z`, z′`, z′′` | ` ∈ [√

t]. Let zànd z′′` be red and let z′` be blue for all ` ∈ [

√t]. Connect z′′` to dì for i ∈ [k]

and ` ∈ [√

t]. Add the edges z`, z′` and z′`, z′′` for all ` ∈ [√

t]. Connecteach vertex z` to s for ` ∈ [

√t], thereby ensuring that exactly one vertex z`

is contained in a SEMI-ERBDS wrt. V. Intuitively, the index `1 for which z`1belongs to a SEMI-ERBDS controls the first index of the input instance X`1,`2to which the solution corresponds.

The next steps ensure that some of the blue vertices in one set V`2 need to bedominated by vertices from U, while all other vertices in V can be dominated“for free”. This will control the second index of the input instance X`1,`2 to whichthe solution corresponds.

7. Add sets of gadgets C` for ` ∈ [√

t]. Each set C` consists of mB · k selectorgadgets cì,j′ for i ∈ [k], j′ ∈ [mB]. Selector gadget cì,j′ consists of k + 1 red

3


vertices labeled a`,1i,j′ , . . . , a`,k+1

i,j′ that are all connected to a blue vertex bì,j′that is the only blue vertex inside the gadget. Furthermore, for j′ ∈ [mB],` ∈ [√

t] and i ∈ [k], in gadget cì,j′ the vertex a`,xi,j′ for x ∈ [k] is connected to

v`x,j′ . We refer to the vertex set of gadget cì,j′ by V(cì,j′).

By Step 7 of the construction a SEMI-ERBDS uses at most one red vertex from eachgadget, which can be used to dominate one vertex from V. Using vertex a`,k+1

i,j′ ofa gadget, the blue vertex of that gadget can be dominated without dominatingany other blue vertices. Using the k gadgets introduced for j′ ∈ [mB], ` ∈ [

√t],

we can thus precisely dominate all vertices in vì,j′ | i ∈ [k]. Now we will

ensure that there is a `2 ∈ [√

t] such that in V`2 , for each j′ ∈ [mB], one of the

vertices v`2i,j′ | i ∈ [k] is not dominated by a gadget and must therefore be

dominated by a vertex from U.

8. Add red vertices Y := y1, . . . , y√t. For each ` ∈ [√

t] connect y` to theblue vertices of the gadgets c`1,j′ for all j′ ∈ [mB], thereby making gadget

cì,j′ special for i = 1. Connect y1, . . . , y√t to the new blue vertex s′, whichensures that exactly one vertex y`2 ∈ Y belongs to any SEMI-ERBDS wrt. V.

This concludes the construction of graph G′, with red vertices

R := U ∪Y ∪ z`, z′′` | ` ∈ [√

t]∪ a`,x

i,j′ | ` ∈ [√

t], x ∈ [k + 1], i ∈ [k], j′ ∈ [mB],

and blue vertices

B := V ∪ D ∪ s, s′ ∪ bì,j′ | ` ∈ [√

t], i ∈ [k], j′ ∈ [mB] ∪ z′` | ` ∈ [√

t].

The following observation follows immediately from the construction above.

Observation 3.15 Let x /∈ V be a blue vertex in G′. The neighborhood of xdepends only on mR, mB, t, and k; it is independent of the structure of the giveninput instances.

Furthermore, we can show that requirement 4 of this lemma is satisfied.

B Claim 3.16 There are at most 2 vertices in B \V with degree more than mR +k + 2.

Proof. We list all vertices in B \ V, together with an upper bound on theirdegree.

Vertices s and s′: It follows from Steps 5 and 8 that these two vertices bothhave large degree, namely

√t.

3


Vertices in D: It follows from Steps 4 and 5 that these vertices have degreemR + 1.

Vertices in Z ∩B: It follows from Step 5 that vertex z′` has degree two for all` ∈ [√

t].

Vertices in gadgets: The blue vertex of any gadget has degree at most k + 2,the incident edges are added in Steps 7 and 8.

Thus there are at most 2 vertices of degree larger than mR + k + 2 in B \V. C

We now continue by providing a number of relevant properties of thegraph G′, that will be used to prove that the construction satisfies require-ment 3 of the lemma statement.

B Claim 3.17 For any SEMI-ERBDS E of G′ wrt. V, there exists an index `1 ∈ [√

t]such that U` ∩ E = ∅ for all ` 6= `1 ∈ [

√t] and |E ∩ u`1

i,j | j ∈ [mR]| = 1 for alli ∈ [k].

Proof. By Step 5, blue vertex s /∈ V has neighborhood z` | ` ∈ [√

t]. Since thesemi-exact RBDS is exact on blue vertices outside V, exactly one neighbor of sis contained in E; let this be z`1 . Thereby, for all ` ∈ [

√t] with ` 6= `1 we obtain

z` /∈ E. Since blue vertex z′` has neighborhood exactly NG′(z′`) = z`, z′′` , itfollows that z′′` ∈ E for all ` 6= `1 with ` ∈ [

√t].

Let ` ∈ [√

t] with ` 6= `1, we show that no vertex in U` is in E. Considervertex u`

i,j with i ∈ [k], j ∈ [mR]. Then uì,j ∈ NG′(dì ) for blue vertex dì . Since

z′′` ∈ NG′(dì ) and z′′` ∈ E, it follows that uì,j /∈ E.

It remains to show that |E ∩ u`1i,j | j ∈ [mr]| = 1 for all i ∈ [k]. Since

NG′(z′`1) = z`1 , z′′`1

and z`1 ∈ E, it follows that z′′`1/∈ E. As d`1

i ∈ B \V and E

is an exact RBDS on vertices outside V, it follows that |E ∩ NG′(d`1i )| = 1 for all

i ∈ [k]. Since z′′`1/∈ E, it thereby follows that |E ∩ u`1

i,j | j ∈ [mR]| = 1 for alli ∈ [k]. C


t]such that E ∩V(c`2

1,j′) = ∅ for all j′ ∈ [mB].

Proof. By Step 8, blue vertex s′ has neighborhood y` | ` ∈ [√

t]. Since s′ /∈ V,exactly one of these vertices is contained in E; let this be y`2 . It is connected

to the blue vertex of all gadgets c`21,j′ for j′ ∈ [mB]. Since all red vertices in

a gadget c`21,j′ for j′ ∈ [mB] have the blue neighbor b`2

1,j′ that is also adjacent

to y`2 ∈ E, the red vertices in these gadgets are not present in E, as b`21,j′ has

exactly one red neighbor in E. C

3



t]such that for every j′ ∈ [mB] at least one of the vertices in v`2

i,j′ | i ∈ [k] has aneighbor in E ∩U.

Proof. By Claim 3.18 there exists `2 ∈ [√

t] such that E ∩ V(c`21,j′) = ∅ for all

j′ ∈ [mB]. Consider an arbitrary j′ ∈ [mB]. The k vertices in v`2i,j′ | i ∈ [k] are

connected to vertices of the k gadgets c`21,j′ , c`2

2,j′ , . . . , c`2k,j′ , and to some vertices

in U. From each gadget, at most one red vertex is in E, since the red verticeshave a common blue neighbor that is not in V. Any red gadget vertex isconnected to only one vertex in V. Since no vertex of gadget c`2

1,j′ is in E, at most

k− 1 of the vertices in v`2i,j′ | i ∈ [k] have a neighbor in E ∩ C`2 . Consequently,

at least one of these vertices has a neighbor in E ∩U for each j′ ∈ [mB]. C

We can now prove that G′ and V fulfill requirement 3 of the lemma statement.

B Claim 3.20 If G′ has a SEMI-ERBDS wrt. V, then some input X`1,`2 has a RBDS

of size at most k.

Proof. Assume G′ has a SEMI-ERBDS wrt. V, say E. By Claim 3.19, there exists`2 ∈ [

√t], such that for every j′ ∈ [mB] at least one of the vertices in v`2

i,j′ | i ∈[k] has a neighbor in E ∩U. By Claim 3.17, there exists `1 ∈ [

√t] such that for

all ` 6= `1 we have U` ∩ E = ∅, so these neighbors lie in U`1 .We now construct a RBDS E′ for instance X`1,`2 . For each j ∈ [mR], add rj to

E′ if E ∩ u`1i,j | i ∈ [k] 6= ∅. By Claim 3.17, it follows that E′ has size at most k,

as required. It remains to show that every vertex in B`1,`2 has a neighbor in E′.If some vertex bj′ from B`1,`2 does not have a neighbor in E′, then none of the

vertices v`2i,j′ | i ∈ [k] have a neighbor in E ∩U`1 . This contradicts our choice

of `2. Hence E′ is an RBDS of size at most k for instance X`1,`2 . C

Furthermore we show that requirement 2 is fulfilled in the following claim.

B Claim 3.21 If some input instance has a RBDS of size at most k, then G′ has anERBDS.

Proof. Suppose instance X`1,`2 has a RBDS E′ of size k consisting of verticesri1 , . . . , rik ⊆ R`1,`2 . We construct an ERBDS E for G′. Start by choosing vertices

u`1x,ix

for x ∈ [k], so for every vertex in E′ we pick one vertex in the ERBDS for G′.Add the red vertex z`1 and the vertices z′′` for all ` 6= `1 to E. Furthermore, welet the vertex y`2 be in E.

To exactly dominate the blue vertices in V, we use the gadgets in C asfollows. For ` 6= `2 ∈ [

√t], add red vertex a`,x

x,j′ of gadget c`x,j′ if vertex v`x,j′ does

3


not yet have a neighbor in E, for j′ ∈ [mB] and x ∈ [k]. Else, add vertex a`,k+1x,j′

of gadget c`x,j′ to E, in order to exactly dominate the blue vertex of this gadget.To exactly dominate the vertices in V`2 we apply a similar procedure, except

that gadget c`21,j′ cannot be used since its blue vertex b`2

1,j′ is already dominatedby y`2 . Since E′ is a RBDS of instance X`1,`2 , for each j′ ∈ [mB] at least one vertex

from set v`2i,j′ | i ∈ [k] has a neighbor in E ∩U. As such, the k− 1 remaining

gadgets can be used to each dominate one of the k− 1 remaining vertices inthis set, if they do not already have a neighbor in E ∩U. If no red vertex of agadget c`2

x,j′ is needed to dominate, we choose vertex a`2,k+1x,j′ of the gadget in E

to dominate the blue vertex in the gadget.It is straight-forward to verify that this results in an ERBDS for G′. C

From Claims 3.20 and 3.21 it follows that graph G′ has a SEMI-ERBDS wrt. Vif and only if at least one of the input instances has a RBDS of size at most k. Thegraph G′ has O(

√t · (mR +mB)

3) vertices and can be constructed in polynomialtime. J

Using the lemma above, we now prove the kernel lower bound for ERBDS.

I Theorem 3.22 EXACT RED-BLUE DOMINATING SET parameterized by thenumber of vertices n does not have a generalized kernel of size O(n2−ε) for anyε > 0, unless NP ⊆ coNP/poly.

Proof. We will prove this result by giving a degree-2 cross-composition fromRBDS to ERBDS. We start by giving a polynomial equivalence relation R oninputs of RBDS. Let two instances of RBDS be equivalent under R if they havethe same number of red vertices mR, the same number of blue vertices mB, andthe same maximum size k of a RBDS. It is easy to check that R is a polynomialequivalence relation.

Assume we are given t instances of RBDS, labeled X`1,`2 for `1, `2 ∈ [√

t],from the same equivalence class of R. If the number of instances given is nota square, we duplicate one of the input instances until a square number isreached. Since this changes the number of inputs by at most a factor four, thisdoes not influence the cross-composition. Call the number of red vertices inevery instance mR, the number of blue vertices mB, and the required size of thedominating set k. By Lemma 3.14, we can in polynomial time construct graphG′ such that

• |V(G)| ≤√

t · poly(mB + mR) and

• G′ has an ERBDS if and only if at least one input instance has a RBDS. Thisfollows from requirements 2 and 3 from Lemma 3.14, and the fact that anyERBDS is also a SEMI-ERBDS.

3


Thereby we have given a degree-2 cross-composition and the lower boundfollows from Theorem 2.14. J

Using Theorem 3.22 we provide lower bounds for constraint satisfactionproblems. It is easy to give a linear-parameter transformation from ERBDS toboth 1-POLYNOMIAL ROOT CSP and EXACT SAT, by introducing a variable foreach red vertex and adding a constraint for each blue vertex such that exactlyone of its neighbors is chosen in any assignment. Using Theorem 2.8, this resultsin the following corollary.

I Corollary 3.23 The problems EXACT SAT and 1-POLYNOMIAL ROOT CSP over Q,parameterized by the number of variables n, do not have a generalized kernel ofsize O(n2−ε) for any ε > 0, unless NP ⊆ coNP/poly. J

3.2.2 Polynomial root CSP (modulo an integer)In order to also establish a lower bound for 1-POLYNOMIAL ROOT CSP over theintegers modulo m, we will need the following lemma. It allows us to enforcea linear equality constraint over Q using constraints over Z/mZ, through theuse of auxiliary 0/1-dummy variables. Since ∑i xi = 1 implies ∑i xi ≡m 1, thenon-trivial part is to add extra constraints which, together with ∑i xi ≡m 1, alsoimply ∑i xi = 1.

I Lemma 3.24 Let m ≥ 3 be an integer. Given a linear equality ∑i∈[N] xi = 1over Q, there exists a system S of linear equalities over Z/mZ using the variablesxi | i ∈ [N] and at most 4N additional variables, such that

1. Any 0/1-solution to the system S sets exactly one of the variables xi | i ∈ [N]to 1,

2. any assignment to xi | i ∈ [N] setting exactly one variable xi to 1 can beextended to a 0/1-solution of S, and

3. S can be constructed in polynomial time.

Proof. Given the linear equality ∑i∈[N] xi = 1, first of all add the equation

∑i∈[N]

xi ≡m 1

to S. Any choice of x1, . . . , xN satisfying ∑i∈[N] xi = 1 also satisfies the equal-ity modulo m. Furthermore, any 0/1-assignment of x1, . . . , xN satisfying theequation ∑i∈[N] xi ≡m 1 ensures that at least one variable xi is set to 1.

To ensure that at most one of these variables is set to 1, we add additionalconstraints in the following way. Construct a complete binary tree with N′ :=2dlog Ne leaves, implying N ≤ N′ < 2N. Identify the first N leaves with variables

3


x1, . . . , xN and introduce dummy variables for all other vertices. For every non-leaf d in the tree with children d` and dr, each corresponding to a uniquevariable, add the equation

d` + dr ≡m d.

It is clear that this construction can be done in polynomial time, thus Property 3holds. To show that Properties 1 and 2 hold, we prove the following claim.

B Claim 3.25 Let a 0/1-assignment satisfying all equalities in S be given. Thevalue assigned to any variable x corresponds to the number of leaves in the subtreerooted in x that are assigned value 1.

Proof. We prove this by induction on the height of the tree rooted in x. If x isa leaf, the result is obvious. Suppose the tree has height larger than one andlet x` and xr be the left and right child of x. By the induction hypothesis, thevalues of x` and xr correspond to the number of leaves in the left (respectively,right) subtree that were assigned 1. Since x`, xr ∈ 0, 1 and x ≡m x` + xr withm > 2, the result follows. C

Suppose we are given any 0/1-assignment satisfying all equalities in S.Hence the variable corresponding to the root r of the binary tree has value 0or 1. By Claim 3.25, it follows that the number of leaves (and thus the numberof variables in x1, . . . , xN) that are assigned the value 1 is at most one. Aswe have seen earlier, at least one variable xi is set to 1, to fulfill ∑i∈[N] xi ≡m 1,and thus ∑i∈[N] xi = 1. Hence Property 1 holds.

Given a 0/1-assignment to x1, . . . , xN such that ∑i∈[N] xi = 1, it can beextended to a satisfying assignment of S by setting all dummy leaves to 0. Forevery other dummy vertex, let its value be the number of variables correspondingto leaves in its subtree, that are set to 1. Note that this number is always either 0or 1 since there is only one leaf whose corresponding variable is set to 1.Therefore Property 2 holds as well. J

For m = 2, an input to the problem 1-POLYNOMIAL ROOT CSP over theintegers mod m only consists of linear equations over the two-element field0, 1 and is thus polynomial time solvable by Schaefer’s dichotomy theorem(Theorem 2.37). For larger moduli, we use Lemma 3.24 to prove the followingresult.

I Theorem 3.26 Let m ≥ 3 be an integer. The problem 1-POLYNOMIAL ROOT

CSP over Z/mZ, parameterized by the number of variables n, does not have ageneralized kernel of size O(n2−ε) for any ε > 0, unless NP ⊆ coNP/poly.

Proof. We will use the graph constructed in Lemma 3.14, by transforming theconstructed instance G′ of (SEMI)-ERBDS of size O(

√t · poly(mR + mB)) to an

instance I of 1-POLYNOMIAL ROOT CSP over Z/mZ withO(√

t ·poly(mR +mB))variables. In this way we obtain a degree-2 cross-composition from RBDS to1-POLYNOMIAL ROOT CSP over Z/mZ, proving the lower bound.

3


Suppose we are given t instances of RBDS, such that√

t is integer and suchthat every instance has mB blue vertices and mR red vertices and asks for a RBDS

of size k ≤ mR. This can be assumed by choosing an appropriate polynomialequivalence relation. Apply Lemma 3.14 to obtain graph G′ and V ⊆ V(G′).By requirements 2 and 3 of Lemma 3.14, it is sufficient to ensure that G′ has aSEMI-ERBDS with respect to V if I is satisfiable, and that I is satisfiable if G hasan ERBDS, to obtain the cross-composition.

Recall that a SEMI-ERBDS of G′ with respect to V contains at least oneneighbor of each blue vertex, and contains exactly one neighbor of each bluevertex in V(G′) \V.

First of all introduce a variable vr for every red vertex r in G′. For everyblue vertex b, we add the following equation to ensure that it has at least oneneighbor in the SEMI-ERBDS:

∑r∈NG′ (b)

vr ≡m 1. (3.1)

For every blue vertex b /∈ V, we add a number of linear equations thatensure b has exactly one neighbor in a SEMI-ERBDS, using at most 4 · |NG′(b)|additional variables. This is done by applying Lemma 3.24 to the equation∑r∈NG′ (b)

vr = 1.

This completes the construction. If G′ has an ERBDS, then I can be satisfiedby setting the variables corresponding to the ERBDS to 1 and all other variablescorresponding to vertices to 0. The dummy variables can then be chosen in sucha way that all equations are satisfied according to Lemma 3.24.

For the opposite direction, suppose I has a satisfying assignment. Defineset Y to contain the vertices whose corresponding variable is set to 1. FromEquation (3.1) it follows that every blue vertex has at least one neighbor in theset Y. Furthermore every blue vertex not in V has exactly one neighbor in Y byLemma 3.24. It follows that Y is a SEMI-ERBDS of G′.

It remains to bound the number of used variables. The key idea is that wehave only few variables outside of V whose corresponding vertex has a largeneighborhood, and for which the number of dummy variables added dependson√

t. Furthermore there are many variables whose corresponding verticeshave small neighborhoods, with size depending only on mB + mR. Note thatthe degree of any vertex, and the total number of vertices, is bounded by theorder of the graph O(

√t · (mR + mB)

3).For every blue vertex in V(G) \V with a degree larger than mR + k + 2 we

add O(√

t(mB + mR)3) dummy variables. By requirement 4 of Lemma 3.14,

there are at most 2 such vertices. Furthermore for any vertex with a degreesmaller than mR + k + 2 we add O(mR + k) dummy vertices. This togetherresults in using O(

√t · poly(mB + mR)) variables, which is properly bounded

for a degree-2 cross-composition. J

3


We now generalize this result to polynomial equalities of higher degree.

I Theorem 3.27 Let m ≥ 2, and d ≥ 2 be integers. The problems d-POLYNOMIAL

ROOT CSP over Z/mZ and d-POLYNOMIAL ROOT CSP over Q parameterized bythe number of variables n do not have a generalized kernel of size O(nd+1−ε) forany ε > 0, unless NP ⊆ coNP/poly.

Proof. We will only provide the proof over Z/mZ, the result for d-POLYNOMIAL

ROOT CSP over Q can be obtained in the same way (using the same equationswithout the moduli). Let m ≥ 2, d ≥ 2 be given. The result will be proven by adegree-(d + 1) cross-composition from RBDS, using Lemma 3.14. Suppose weare given t = rd+1 instances of RBDS, all having mR red vertices, mB blue vertices,and the same target size k. By a similar padding argument as before, we mayassume r is an integer. Split the inputs into rd−1 groups of size r2 each and applythe algorithm given by Lemma 3.14 to each group. We obtain rd−1 instancesof (SEMI)-ERBDS with O(r · (mR + mB)

3) vertices each, such that the answer toeach composed instance is the logical OR of the answers to the RBDS instancesin its group. Label the instances resulting from the group compositions Xi1,...,id−1

with i1, . . . , id−1 ∈ [r]. Let instance Xi1,...,id−1have graph Gi1,...,id−1

and let the seton which the RBDS is not required to be exact be Vi1,...,id−1

. All produced graphshave the same number of red and blue vertices; let the number of red verticesin each graph be N ≤ |V(Gi1,...,id−1

)| ≤ O(r · (mR + mB)3). We create N new

variables and identify each red vertex x with one variable vx. It is essential forthe remaining part of this proof that vertices from the produced (SEMI)-ERBDS

instances that had the same label are mapped to the same new variable and viceversa. Since the set Vi1,...,id−1

that is produced by Lemma 3.14 does not dependon the structure of the input graphs, only on their size, all produced graphshave the same labeled vertices in the set Vi1,...,id−1

. Hence we can treat it as asingle set V of vertex labels. Create an instance for d-POLYNOMIAL ROOT CSP asfollows.

1. Add sets Y1, . . . , Yd−1 of r variables each, where Yi := yi,j | j ∈ [r]. Addthe requirement ∑j∈[r] yi,j ≡m 1 to L′ for each i ∈ [d− 1].

2. Consider each graph Gi1,...,id−1for i1, . . . , id−1 ∈ [r]. For each blue vertex b in

this instance, add the following equation to L′: ∑x∈NGi1,...,id−1

(b)vx

· ∏z∈[d−1]

yz,iz ≡m ∏z∈[d−1]

yz,iz . (3.2)

Furthermore, if b is not an element of V, then for every pair of distinctvertices x, x′ ∈ N(b) add the following constraint to L′:

vx · vx′ ≡m 0. (3.3)

3


Note that, by Observation 3.15, the neighborhood of a blue vertex b 6∈ Vdoes not depend on the graphs to which Lemma 3.14 is applied, but only onthe number of red and blue vertices and the target size of the RBDS. As theseare identical for all applications of the lemma, it does not matter in which ofthe graphs we evaluate N(b) when finding relevant pairs x, x′.

The polynomial equalities have degree ≤ d as d is at least two. The numberof variables, which is the parameter of the CSP, is suitably bounded for adegree-(d + 1) cross-composition:

N + (d− 1) · r ∈ O(r · (d + (mR + mB)3)) = O(t1/(d+1)(mR + mB)

3).

As the construction can easily be performed in polynomial time, it remainsto show that the constraints in L′ can be satisfied if and only if one of the inputinstances of RBDS has a solution of size k. First assume that some input instanceof RBDS indeed has a solution of size k. Consider the indices i1, . . . , id−1 ofthe group containing the satisfiable RBDS instance. Then Lemma 3.14 ensuresthat Gi1,...,id−1

has an ERBDS. Set the variables corresponding to vertices in theERBDS of Gi1,...,id−1

to 1 and the others to 0. Furthermore, set variables yz,iz forz ∈ [d− 1] to 1. Set all other variables to 0. Thereby the sum of variables in eachset Yi is 1, as required. Furthermore, each equation defined by (3.2) is satisfiedin the following way. If it was defined for Xi1,...,id−1

, it is satisfied since the largesummation equals one (exactly one neighbor is in the exact dominating set)and the product term is one on both sides. Equations belonging to any otherinstance are trivially satisfied since the term ∏z yz,jz is zero on both sides ifthere exists z ∈ [d− 1] for which jz 6= iz. It remains to show that the equationsdefined by (3.3) are satisfied. This is follows from Observation 3.15 and the factthat an ERBDS contains at most one neighbor of each blue vertex.

For the reverse direction, suppose the constraints in L′ are satisfied by some0/1-assignment to the variables. Then from each set Yi with i ∈ [d− 1], at leastone variable is set to 1. So suppose variables yz,iz are set to 1 for z ∈ [d− 1], iz ∈[r]. We show instance Xi1,...,id−1

has a SEMI-ERBDS wrt. V consisting of thevertices whose corresponding variable is set to 1. Since the product ∏z∈[d−1] yz,izis 1 on both sides of the equations defined by (3.2) for Gi1,...,id−1

, for each bluevertex b in the graph we have:

∑x∈NGi1,...,id−1

(b)vx ≡m 1

implying all blue vertices have at least one neighbor in the SEMI-ERBDS. Fur-thermore if x /∈ V, we know that it has at most one neighbor in the SEMI-ERBDS since the multiplication of any two of its neighbors yields zero by (3.3).Hence Gi1,...,id−1

has a SEMI-ERBDS wrt. V. By Lemma 3.14, this implies the groupof RBDS instances from which it was constructed contained a satisfiable instance.Hence there was a yes-instance among the inputs of the cross-composition. J

3


Observe that the polynomials constructed in Theorem 3.27 have a simpleform: each polynomial is a product of (d− 1) Y-variables multiplied by a sumof variables corresponding to red vertices, or simply a multiplication of twovariables corresponding to red vertices. Each polynomial can therefore beencoded in O(n) bits, where n is the number of variables in the constructed CSP.The sparsification of Theorem 3.1 therefore encodes such instances in O(nd+1)

bits. The lower bound shows that this is optimal up to no(1) factors.

3.2.3 Polynomial non-root CSPWe start our lower bound discussion for d-POLYNOMIAL NON-ROOT CSP byconsidering polynomials over the rationals. Using existing kernel lower boundsfor CNF-SAT parameterized by the number of variables, we first show that 1-POLYNOMIAL NON-ROOT CSP over Q does not have a generalized kernel of sizebounded by any polynomial in n, unless NP ⊆ coNP/poly.

I Theorem 3.28 1-POLYNOMIAL NON-ROOT CSP over Q parameterized by thenumber of variables n does not have a generalized kernel of polynomial size unlessNP ⊆ coNP/poly.

Proof. We present a linear-parameter transformation from CNF-SAT with un-bounded clause length parameterized by the number of variables. Existingresults [33,41] imply that this problem does not have a generalized kernel ofpolynomial size. The linear-parameter transformation will transfer this lowerbound to 1-POLYNOMIAL NON-ROOT CSP over Q.

A clause in conjunctive normal form can directly be translated into a non-root constraint of a degree-1 polynomial over Q. For example, the clause (x1 ∨¬x3 ∨ x4) is satisfied by a 0/1-assignment if and only if x1 + (1− x3) + x4 6= 0over Q. More generally, a clause (xi1 ∨ . . . ∨ xik ∨ ¬xik+1

∨ . . . ∨ ¬xi`) translatesinto the constraint (∑k

j=1 xij) + (∑`j=k+1(1− xij)) 6= 0. Hence the system of

disequalities derived by transforming all clauses in a CNF-formula is satisfiableif and only if the formula is. As the number of variables is preserved by thistransformation, the theorem follows. J

We now turn our attention to d-POLYNOMIAL NON-ROOT CSP over finite ringsand fields. In Theorem 3.13 we provided a kernel for d-POLYNOMIAL NON-ROOT

CSP over Z/pZ for primes p. It is natural to ask whether similar results canbe obtained when working with polynomials modulo an arbitrary integer m.When m is composite, our kernelization fails. We can show that this is not ashortcoming of our proof strategy, but a necessity due to the fact that constraintsexpressed by degree-d polynomials modulo composite numbers can model morecomplex constraints than degree-d polynomials modulo a prime. For example,it is known (cf. [7, §2]) that there is a degree-3 polynomial f over the integers

3


modulo 6 which represents a logical OR of size 27 in the following way:

f (x1, . . . , x27) 6≡6 0⇔ (x1 ∨ . . . ∨ x27). (3.4)

By this expressibility of a size-27 OR by a polynomial of degree 3 over Z/6Z us-ing the same variables, one easily constructs a linear-parameter transformationfrom 27-CNF-SAT to 3-POLYNOMIAL NON-ROOT CSP over Z/6Z by mimick-ing the proof of Theorem 3.28. Since 27-CNF-SAT does not have a kernel ofsize O(n27−ε) for any ε > 0 unless NP ⊆ coNP/poly (Theorem 2.9), this linear-parameter transformation rules out kernels of size O(n27−ε) for 3-POLYNOMIAL

NON-ROOT CSP over Z/6Z under the same conditions. Plugging in the degreeof 3 and modulus 6 into the bound of Theorem 3.13 would give a reductionto O(n3·(6−1)) = O(n15) constraints and would contradict the lower bound.The example therefore shows that the problem is more complex for compositemoduli: the bound for the prime case cannot be matched. In particular, wewill see that the exponent in the kernel size may depend super-linearly on thedegree d of the CSP. For general non-primes, we give a lower bound using aconstruction by Bhowmick et al. [11] of low-degree polynomials representingOR in the sense of Equation (3.4).

I Theorem 3.29 Let m be a non-prime with a prime factorization consistingof r distinct primes, such that m = ∏i∈[r] pi. Let d be an even integer. Thend-POLYNOMIAL NON-ROOT CSP over Z/mZ parameterized by the number ofvariables n does not have a generalized kernel of size O(n(d/2)r−ε) for any ε > 0,unless NP ⊆ coNP/poly.

Proof. For any integer N ≥ 1, Bhowmick et al. [11, Appendix A] provide a wayto construct a polynomial f of degree 2dN1/re such that for all x1, . . . , xN ∈0, 1,

f (x1, . . . , xN) 6≡m 0⇔ (x1 ∨ · · · ∨ xN).

This implies that for even values of d and N = (d/2)r, we can find a polyno-mial f of degree d satisfying the above equation. As such, d-POLYNOMIAL NON-ROOT CSP can express a logical OR of size (d/2)r without introducing auxiliaryvariables. As in the proof of Theorem 3.28, this gives a linear-parameter transfor-mation from (d/2)r-CNF-SAT to d-POLYNOMIAL NON-ROOT CSP. By Theorem 2.9,the latter problem does not have a generalized kernel of size O(n(d/2)r−ε) forany ε > 0, unless NP ⊆ coNP/poly. Hence the same lower bound applies to theCSP. J

In case m does not have a prime factorization in which all primes are distinct,it is possible to obtain weaker a lower bound using a result by Barrington etal. [8], which proves that there exists a polynomial of degree O(`N1/r) thatrepresents a logical OR when taken modulo m. Here ` is the largest prime factorof m. For prime moduli, the following result provides a lower bound almostmatching the upper bound in Theorem 3.13.

3

3.3 Conclusion 61

I Theorem 3.30 Let p be a prime. Then d-POLYNOMIAL NON-ROOT CSPover Z/pZ parameterized by the number of variables n does not have a gen-eralized kernel of size O(nd(p−1)−ε) for any ε > 0, unless NP ⊆ coNP/poly.

Proof. We use a linear-parameter transformation from d(p− 1)-CNF-SAT. Weproceed similarly as in the proof of Theorem 3.29. It is known (cf. [9, Theorem24]) that for each prime p and integer d, there is a polynomial f of degree dmodulo p, such that for any x1, . . . , xd(p−1) ∈ 0, 1 we have:

f (x1, . . . , xd(p−1)) 6≡p 0⇔ (x1 ∨ x2 ∨ · · · ∨ xd(p−1)).

This allows the linear-parameter transformation to be carried out as in Theo-rem 3.29. J

3.3 ConclusionWe have given upper and lower bounds on the kernelization complexity ofBoolean CSPs that can be represented by polynomial (dis)equalities, obtaining(nearly) tight sparsification bounds in several cases. For d-POLYNOMIAL NON-ROOT CSP over the integers modulo a prime, there is a factor n differencebetween the upper and lower bound. It would be interesting to see whether thiscan be resolved.

Our main conceptual contribution is to analyze constraints on Booleanvariables based on the minimum degree of multivariate polynomials whoseroots, or non-roots, capture the satisfying assignments. The ultimate goal of thisline of research is to characterize the optimal sparsification size of a BooleanCSP based on easily accessible properties of the constraint language. To reachthis goal, several significant hurdles have to be overcome.

One such issue is that we do not have upper bounds for d-POLYNOMIAL

NON-ROOT CSP modulo a composite number m. For example, for d-POLYNOMIAL

NON-ROOT CSP over the integers modulo 6, we do not know of any way toreduce the number of constraints to polynomial in n. This difficulty is connectedto longstanding questions regarding the minimum degree of a multivariatepolynomial modulo 6 that represents the OR-function of n variables in the senseof Equation (3.4). In the lower bound construction in Theorem 3.29, we willuse that if the OR-function with g(d) inputs can be represented by polynomi-als of degree d, then d-POLYNOMIAL NON-ROOT CSP cannot be compressedto size O(ng(d)−ε) unless NP ⊆ coNP/poly. By contraposition, a kernelizationwith size bound O(nh(d)) implies a lower bound of h−1(d) on the degree of apolynomial representing an OR of arity h(d), assuming NP * coNP/poly. Kernelupper bounds where h(d) is polynomially bounded in d, would therefore estab-lish lower bounds of the form Ω(nα) on the degree of polynomials representingan n-variable OR modulo 6, for some α > 0. However, the current-best degree

3


lower bound [88] is only Ω(log n), which has not been improved in nearly twodecades (cf. [11, §1.4]).

A simple example of a CSP whose kernelization complexity is currentlyunclear has constraints of the form “the number of satisfied literals is one or two,modulo six”. The approach of Theorem 3.1 fails, since there is no polynomialmodulo six with root set 1, 2.

On the other hand, we will see in the next chapter that this frameworkdoes give a number of insights into the sparsifiability of Boolean CSPs, asit for example allows us to classify which Boolean CSPs have a non-trivialsparsification.

Fully classifying the sparsifiability of all CSPs furthermore requires under-standing of CSPs over larger domains. Our upper bound techniques easilyextend to CSPs over domains of size k > 2. Suppose we have to determinewhether there is an assignment τ : V → 0, . . . , k− 1 to a set of variables V,such that f (τ(x1), . . . , τ(xn)) = 0 for all degree-d polynomial equalities f on agiven list L. Using the same technique as in Theorem 3.1, one can efficientlyfind a subset of O(nd) constraints that preserve the answer to the problem.This bound is slightly worse than over the Boolean domain, because we can nolonger assume all monomials to be multilinear.

The real challenge in analyzing CSPs over any domain is to find low-degreepolynomials that represent constraints of interest. We will see the usefulness ofthis technique in Chapter 6, in which constraints of the form “variables x1, . . . , xkdo not all receive distinct values from 1, ..., k” are represented by polynomials ofdegree k− 1 to compress k-coloring problems on graphs.

Finally, we mention that all our results extend to the setting of min-onesand max-ones CSPs, in which one has to find a satisfying assignment that setsat least, or at most, a given number of variables to true. For example, ourresults easily imply that EXACT HITTING SET parameterized by the numberof variables n has a sparsification of size O(n2), which cannot be improvedto O(n2−ε) unless NP ⊆ coNP/poly.

4

Chapter 4Sparsification Bounds for CSPs

In Chapter 3, we have given a way to sparsify CSP instances for which theconstraints are given as equalities of low-degree polynomials. In this chapterwe will further study the sparsifiability of CSP(Γ) (Sec. 2.7) for finite Booleanconstraint languages Γ, applying the main results of the previous chapter.

We start by extending the results obtained in the previous chapter in Sec-tion 4.1. We will show for a fixed constraint language Γ, that if Γ can becaptured by polynomials in an appropriate sense, the results in the previouschapter allow us to obtain a kernelization whose size will depend on the degreeof these polynomials. Clearly, the upper bound results carry over if CSP(Γ) isequivalent to d-POLYNOMIAL ROOT CSP for some d, but it turns out that we canstill apply our results under somewhat weaker conditions on Γ.

In Section 4.2, we will give a general lower bound technique for CSP(Γ).We give a simple condition that, if satisfied by Γ, leads to a kernelizationlower bound for CSP(Γ). The lower bounds are obtained by linear-parametertransformations from VERTEX COVER (for quadratic lower bounds) and d-CNF-SAT (for other polynomial lower bounds).

Using the results from Sections 4.1 and 4.2, we continue in Section 4.3 witha dichotomy theorem. We fully classify for which Boolean constraint languagesCSP(Γ) allows for a non-trivial sparsification, assuming NP * coNP/poly. Itturns out that, contrary to the pessimistic picture that arose during the initialinvestigation of sparsifiability, the phenomenon of non-trivial sparsification iswidespread and occurs for almost all Boolean CSPs! When considering constraint

4

64 Sparsification Bounds for CSPs

languages whose largest constraint has arity d, a trivial sparsification withO(nd)constraints can be obtained for CSP(Γ). Dell and Van Melkebeek [33] showedthat d-CNF-SAT does not allow for a non-trivial sparsification. However, it turnsout that the d-OR relation is special in this sense: if Γ is a constraint languagewhose largest constraint has arity-d, then the only reason that CSP(Γ) does nothave a non-trivial sparsification, is that it contains a relation that is essentially ad-OR relation. In all other cases, sparsification is possible.

The polynomial-based framework also resulted in some linear sparsifications.As seen in the introductory example in Chapter 3, the d-EXACT SAT problemhas a sparsification with O(n) constraints for each constant d. This prompteda detailed investigation into linear sparsifications for CSPs by Lagerkvist andWahlström [70], who used the toolkit of universal algebra in an attempt toobtain a characterization of the Boolean CSPs with a linear sparsification. Theirresults give a necessary and sufficient condition on the constraint language of aCSP for having a so-called Maltsev embedding over an infinite domain. They alsoshow that when a CSP has a Maltsev embedding over a finite domain, then thiscan be used to obtain a linear sparsification. Alas, it remains unclear whetherMaltsev embeddings over infinite domains can be exploited algorithmically, anda characterization of the linearly-sparsifiable CSPs is currently not known.

In Section 4.4, we give a necessary and sufficient condition for a relationto be captured by degree-1 polynomials. We introduce the notion of balancedoperations, and say that a relation is balanced if and only if it is preservedby all balanced operations. We prove that if a Boolean relation R is balanced,then it can efficiently be captured by a degree-1 polynomial and the numberof constraints that are applications of this relation can be reduced to O(n).Hence when all relations in a constraint language Γ are balanced—we call sucha constraint language balanced—then CSP (Γ) has a sparsification with O(n)constraints. We also show that, on the other hand, if a Boolean relation Ris not balanced, then there does not exist a degree-1 polynomial over anyring that captures R in the sense required for application of the polynomialframework. The property of being balanced is (as defined) a universal-algebraicproperty; these results thus tightly bridge universal algebra and the polynomialframework. In Section 4.7 we compare our universal-algebraic approach to theapproach proposed by Lagerkvist and Wahlström [70].

In Section 4.5, we continue our investigation of CSPs that have linear sparsi-fication by considering Boolean symmetric constraint satisfaction problems. Aconstraint satisfaction problem is symmetric if the satisfiability of a constraintonly depends on the number of true variables, and not on their positions (seeDefinition 4.30 for a complete definition). Using the results from Section 4.4,we obtain that CSP(Γ) has a linear sparsification when Γ is balanced. In thecase of symmetric CSPs, we can furthermore show that when Γ is intractableand not balanced, then there must exist a relation in Γ that cone-defines 2-OR.Using the results from Section 4.2, this implies a quadratic lower bound on the

4

4.1 Sparsification for Boolean CSPs captured by polynomials 65

sparsification size. Consequently, we obtain a characterization of the sparsi-fication complexity of NP-complete Boolean CSPs whose constraint languageconsists of symmetric relations: there is a linear sparsification if and only if theconstraint language is balanced. This yields linear sparsifications in several newscenarios that were not known before.

In Section 4.6, we combine the results on non-trivial sparsification obtainedin Section 4.3 with the results on linear sparsification obtained in Section 4.4.This allows us to obtain an exact characterization of the optimal sparsificationsize for all Boolean CSPs where each relation has arity at most three. For aBoolean constraint language Γ consisting of relations of arity at most three, wecharacterize the sparsification complexity of Γ as an integer k ∈ 1, 2, 3 thatrepresents the largest OR that Γ can cone-define. Then we prove that CSP(Γ)has a sparsification of size O(nk), but no sparsification of size O(nk−ε) forany ε > 0, giving matching upper and lower bounds. Hence for all BooleanCSPs with constraints of arity at most three, the polynomial-based frameworkgives provably optimal sparsifications.

4.1 Sparsification for Boolean CSPs captured bypolynomials

In this section, we will consider the problem CSP(Γ) for fixed Γ. We showthat under certain conditions on Γ, the problem allows for a sparsification. Wewill rely heavily on the results from Section 3.1.1. We will need the followingdefinition.

Definition 4.1 Let R be a k-ary Boolean relation (recall Definition 2.29). Wesay that a polynomial pu over a ring Eu captures an unsatisfying assignmentu ∈ 0, 1k \ R with respect to R, if the following two conditions hold over Eu.

pu(x1, . . . , xk) = 0 for all (x1, . . . , xk) ∈ R, and (4.1)

pu(u1, . . . , uk) 6= 0. (4.2)

We will say a k-ary relation R is captured by degree-d polynomials if for allu ∈ 0, 1k \R there exists a ring Eu ∈ Q∪Z/muZ | mu > 1 and mu ∈Nand polynomial pu over Eu of degree at most d that captures u with respectto R.

In the previous chapter, we obtained a sparsification in the case that eachconstraint was given by a degree-d polynomial over the same field or ring. Inthis section we will generalize this result by allowing the use of polynomialsover different rings, and by allowing multiple polynomials per constraint. Wefirst show the result for constraint languages consisting of a single relation inthe next lemma, and then generalize this to arbitrary finite constraint languagesin Theorem 4.6

4


I Lemma 4.2 Let R ⊆ 0, 1k be a fixed k-ary relation that is captured bydegree-d polynomials. There exists a polynomial-time algorithm that, given a set ofconstraints C over R over n variables, outputs C ′ ⊆ C with |C ′| = O(nd), suchthat any Boolean assignment satisfies all constraints in C if and only if it satisfiesall constraints in C ′.

Proof. Let C be a set of constraints over R and let V be the set of variablesused. We will create |0, 1k \ R| instances of d-POLYNOMIAL ROOT CSP withvariable set V. For each u ∈ R \ 0, 1k, we create an instance (Lu, V) ofd-POLYNOMIAL ROOT CSP over Eu, as follows. Choose a ring Eu ∈ Q ∪Z/muZ | mu > 1 and mu ∈N and a polynomial pu over Eu such that (4.1)and (4.2) are satisfied for u. For each constraint (x1, . . . , xk) ∈ C, add theequality pu(x1, . . . , xk) = 0 to the set Lu; note that these are equations over thering Eu. Let L :=

⋃u/∈R Lu be the union of all created sets of equalities. From

this construction, we obtain the following property.

B Claim 4.3 Any Boolean assignment f that satisfies all equalities in L, satisfiesall constraints in C.

Proof. Let f be a Boolean assignment that satisfies all equalities in L. Suppose fdoes not satisfy all equalities in C, thus there exists (x1, . . . , xk) ∈ C, such that( f (x1), . . . , f (xk)) /∈ R. Let u := ( f (x1), . . . , f (xk)). Since u /∈ R, the equationpu(x1, . . . , xk) = 0 was added to Lu ⊆ L. However, it follows from (4.2) thatpu( f (x1), . . . , f (xk)) 6= 0, which contradicts the assumption that f satisfies allequalities in L. C

For each instance (Lu, V) of d-POLYNOMIAL ROOT CSP over Eu with Eu 6= Q,apply Theorem 3.6 to obtain an equivalent instance (L′u, V) with L′u ⊆ Luand |L′u| = O(rund), where ru is the number of distinct prime divisors of mu.Similarly, for each instance (Lu, V) of d-POLYNOMIAL ROOT CSP over Eu withEu = Q, apply Theorem 3.1 and obtain an equivalent instance (L′u, V) withL′u ⊆ Lu and |L′u| = O(nd). Let L′ :=

⋃L′u. By this definition, any Boolean

assignment satisfies all equalities in L, if and only if it satisfies all equalities inL′. Construct C ′ as follows. For any (x1, . . . , xk) ∈ C, add (x1, . . . , xk) to C ′ ifthere exists u ∈ 0, 1k \ R such that pu(x1, . . . , xk) = 0 ∈ L′. Hereby, C ′ ⊆ C.The following two claims show the correctness of this sparsification procedure.

B Claim 4.4 Any Boolean assignment f satisfies all constraints in C ′, if and onlyif it satisfies all constraints in C.

Proof. Since C ′ ⊆ C, it follows immediately that any Boolean assignment satis-fying the constraints in C also satisfies all constraints in C ′. It remains to provethe opposite direction.

Let f be a Boolean assignment satisfying all constraints in C ′. We showthat f satisfies all equalities in L′. Let pu(x1, . . . , xk) = 0 ∈ L′. Thereby,

4

4.2 Lower bounds using cone-definability 67

(x1, . . . , xk) ∈ C ′ and since f is a satisfying assignment, ( f (x1), . . . , f (xk)) ∈ R.It follows from property (4.1) that pu( f (x1), . . . , f (xk)) = 0 as desired.

Since f satisfies all equalities in L′, it satisfies all equalities in L by the choiceof L′. It follows from Claim 4.3 that thereby f satisfies all constraints in C. C

B Claim 4.5 |C ′| = O(nd).

Proof. By the construction of C ′, it follows that |C ′| ≤ |L′|. Let r be the maxi-mum of all ru for u ∈ 0, 1k \ R. Since R is fixed, r is a constant. We know|L′| = ∑u/∈R |L′u| ≤ ∑u/∈RO(r · nd) ≤ 2k · O(r · nd) = O(nd), as |R| ≤ 2k and kis a constant. C

Claims 4.4 and 4.5 complete the proof of Lemma 4.2. J

By applying the above lemma repeatedly, we obtain a sparsification for finiteBoolean constraint languages consisting of multiple relations.

I Theorem 4.6 Let Γ be a finite Boolean constraint language such that everyR ∈ Γ is captured by degree-d polynomials. Then there exists a polynomial-timealgorithm that, given a set of constraints C over Γ over n variables, outputs C ′ ⊆ Cwith |C ′| = O(nd), such that any Boolean assignment satisfies all constraints in Cif and only if it satisfies all constraints in C ′.

Proof. Suppose we are given an instance of CSP(Γ), with set of constraints C.We show how to define C ′ ⊆ C. For a k-ary relation R ∈ Γ, let CR contain allconstraints in C of the form R(x1, . . . , xk). For all R ∈ Γ, apply Theorem 4.2 toobtain C ′R ⊆ CR such that |C ′R| = O(nd) and any Boolean assignment satisfiesall constraints in C ′R if and only if it satisfies the constraints in CR. Let C ′ be theunion of all C ′R. It is easy to verify that |C ′| ≤ |Γ| · O(nd) = O(nd). J

4.2 Lower bounds using cone-definabilityIn this section we will show that if one of the relations in Γ can be used to definea k-OR in a specific way, then CSP(Γ) does not have a kernel of size O(nk−ε)unless NP ⊆ coNP/poly. To formalize this, we need the following definition.

Definition 4.7 Let us say that a Boolean relation T of arity m is cone-definablefrom a Boolean relation U of arity n if there exists a tuple (y1, . . . , yn) where:

• for each j ∈ [n], it holds that yj is an element of 0, 1 ∪ x1, . . . , xm ∪¬x1, . . . ,¬xm;

• for each i ∈ [m], there exists j ∈ [n] such that yj ∈ xi,¬xi; and,

• for each f : x1, . . . , xm → 0, 1, it holds that ( f (x1), . . . , f (xm)) ∈ T if andonly if ( f (y1), . . . , f (yn)) ∈ U. Here, f denotes the natural extension of fwhere f (0) = 0, f (1) = 1, and f (¬xi) = ¬ f (xi).

4


(The prefix cone indicates the allowing of constants and negation.)Let us say that two Boolean relations T, U are cone-interdefinable if each is

cone-definable from the other.

I Example 4.8 Let R = (0, 0), (0, 1) and let S = (0, 1), (1, 1). We havethat R is cone-definable from S via the tuple (¬x2,¬x1); also, S is cone-definablefrom R via the same tuple. J

The following is a key property of cone-definability; it states that relationsthat are cone-definable from a constraint language Γ may be simulated bythe constraint language, and thus used to prove hardness results for CSP(Γ).Recall that Γ∗ is the constraint language Γ together with all constants (as inDefinition 2.39).

I Proposition 4.9 Suppose that Γ is an intractable Boolean constraint language,and that ∆ is a constraint language such that each relation in ∆ is cone-definablefrom a relation in Γ. Then, there exists a linear-parameter transformation fromCSP(Γ∗ ∪ ∆) to CSP(Γ), parameterized by the number of variables.

Proof. It suffices to give a linear-parameter transformation from CSP(Γ∗ ∪ ∆)to CSP(Γ∗), by Theorem 2.44. Let (C, V) be an instance of CSP(Γ∗ ∪ ∆), andlet n denote |V|. We generate an instance (C ′, V′) of CSP(Γ∗) as follows.

• For each variable v ∈ V, introduce a primed variable v′. By Proposition 2.43,the relation 6= (that is, the relation (0, 1), (1, 0)) is pp-definable (Defini-tion 2.40) from Γ∗. Fix such a pp-definition, and let d be the number ofvariables in the definition. For each v ∈ V, include in C ′ all constraints in thepp-definition of 6=, but where the variables are renamed so that v and v′ arethe distinguished variables, and the other variables are fresh.

The number of variables used so far in C ′ is nd.

• For each b ∈ 0, 1, introduce a variable zb and add the constraint (b)(zb)to C ′. This is equivalent to requiring that z0 = 0 and z1 = 1 in any satisfyingassignment. Observe that for this step, it is required that we can use relationsfrom Γ∗, instead of just Γ.

• For each constraint T(v1, . . . , vk) in C such that T ∈ Γ∗, add the constraintto C ′.• For each constraint T(v1, . . . , vk) in C such that T ∈ ∆ \ Γ∗, we use the

assumption that T is cone-definable from a relation in Γ to add a constraintto C ′ that has the same effect as T(v1, . . . , vk). In particular, assume that T iscone-definable from U ∈ Γ via the tuple (y1, . . . , y`), and that U has arity `.Add to C ′ the constraint U(w1, . . . , w`), where, for each i ∈ [`], the entry wiis defined as follows:

wi =

vj if yi = xj,v′j if yi = ¬xj,

z0 if yi = 0, andz1 if yi = 1.

4


See Example 4.10 for an example of this whole construction. The set V′ ofvariables used in C ′ is the union of V ∪ v′ | v ∈ V ∪ z0, z1 with the othervariables used in the copies of the pp-definition of 6=. We have |V′| = nd + 2. Itis straightforward to verify that an assignment f : V → 0, 1 satisfies C if andonly if there exists an assignment f ′ : V′ → 0, 1 of f that satisfies C ′. J

I Example 4.10 We give an example of the linear-parameter transformationused in the proof of Proposition 4.9. Let us consider R = (0, 0, 1), (0, 1, 0),(1, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0) and let Γ = R. Thus, a three-tuple is in Runless all three variables are equal.

Define R′ = (0, 0), (1, 0), (1, 1) and ∆ := R′. Observe that Γ is anintractable constraint language, as can be shown using Theorem 2.37. We cancone-define R′ from R by the following tuple:

(x1,¬x2, 0).

Verify that indeed (x1, x2) ∈ R′ if and only if (x1,¬x2, 0) ∈ R. As such, thepreconditions of the proposition are satisfied, we now show the linear-parametertransformation from CSP(Γ∗ ∪ ∆) to CSP(Γ∗), based on an example. Considerthe input

V = v, w, x, y, z, C = R′(x, y), R′(v, x), R(w, v, z), (0)(w)

for CSP(Γ∗ ∪ ∆). We create an instance (V′, C ′) of CSP(Γ∗). Initialize

V′ := v, v′, w, w′, x, x′, y, y′, z, z′, z0, z1.

In the first step, we want to add the inequality requirements on the variables andtheir primed counterparts. For this, we need a pp-definition of (0, 1), (1, 0)from Γ∗. For example, we may use the following definition.

∃b, c : R(x1, x2, b) ∧ (0)(b) ∧ R(x1, x2, c) ∧ (1)(c).

As such, we add variables bv, bw, bx, by, bz, cv, cw, cx, cy, cz to V′ and initializeC ′ as

R(v, v′, bv), (0)(bv), R(w, w′, bw), (0)(bw),

R(x, x′, bx), (0)(bx), R(y, y′, by), (0)(by),

R(z, z′, bz), (0)(bz) ∪R(v, v′, cv), (1)(cv), R(w, w′, cw), (1)(cw),

R(x, x′, cx), (1)(cx), R(y, y′, cy), (1)(cy),

R(z, z′, cz), (1)(cz).

In this particular case, we could have re-used the variables for b and c, asthey are constant, but this is not generally true. In the next step, we add theconstraints in

(0)(z0), (1)(z1).

4


Then, we add all constraints that are using relations from Γ∗, thus adding theconstraints in

R(w, v, z), (0)(w)

to C ′. Finally, we use the cone-definability result for ∆, and add the constraintsin

R(x, y′, z0), R(v, x′, z0),

concluding the construction of C ′. J

The following theorem combines existing ideas for kernelization lowerbounds due to several authors [33,35,70] with the formalism of cone-definition.

I Theorem 4.11 Let Γ be an intractable Boolean constraint language, and letk ≥ 1. If there exists R ∈ Γ such that R cone-defines k-OR, then CSP(Γ) does nothave a generalized kernel of size O(nk−ε), unless NP ⊆ coNP/poly.

Proof. We do a case distinction on k.

(k = 1) Suppose that there exists ε > 0 such that CSP(Γ) has a (gener-alized) kernel of size O(n1−ε) into a parameterized decision problem L.Let L := x#1` | (x, `) ∈ L denote the classical (non-parameterized) problemcorresponding to L, in which the parameter is written in unary and appendedto the end of each input string. Here 1 is an arbitrary character in the encodingalphabet, and # is a separator character that is freshly added to the alphabet.

Using this hypothetical generalized kernel, one could obtain a polynomial-time algorithm that takes as input a series of instances (C1, V1), . . . , (Ct, Vt) ofCSP(Γ), and outputs in polynomial time an instance x of L such that:

• x ∈ L if and only if (Ci, Vi) is a yes-instance of CSP(Γ) for all i ∈ [t], and

• x has bitsize O(N1−ε), where N := ∑ti=1 |Vi|.

To obtain such an AND-compression algorithm from a hypothetical generalizedkernel of CSP(Γ) into L, it suffices to do the following:

1. On input a series of instances (C1, V1), . . . , (Ct, Vt) of CSP(Γ), form a newinstance (C∗ :=

⋃ti=1 Ci, V∗ :=

⋃ti=1 Vi) of CSP(Γ). Hence we take the dis-

joint union of the sets of variables and the sets of constraints, and it followsthat the new instance has answer yes if and only if all the inputs (Ci, Vi)have answer yes.

2. Run the hypothetical generalized kernel on (C∗, V∗), which has |V∗| = Nvariables and is therefore reduced to an equivalent instance (x∗, k∗) of Lwith size and parameter bounded by O(N1−ε). Let x be the classicalinstance corresponding to (x∗, k∗).

4


If we apply this AND-compression scheme to a sequence of t1(m) := mα in-stances of m bits each (which therefore have at most m variables each), theresulting output has O(|V∗|1−ε) = O((m ·mα)1−ε) = O(m(1+α)(1−ε)) bits. Bypicking α large enough that it satisfies (1 + α)(1− ε) ≤ α, we therefore com-press a sequence of t1(m) instances of bitsize m into one instance expressingthe logical AND, of size at most t2(m) ≤ O(m(1+α)(1−ε)) ≤ C · t1(m) for somesuitable constant C. Drucker [35, Theorem 5.4] has shown that an error-freedeterministic AND-compression algorithm with these parameters for an NP-complete problem into a fixed decision problem L, implies NP ⊆ coNP/poly.Hence the lower bound for k = 1 follows since CSP(Γ) is NP-complete.

(k ≥ 2) For k ≥ 2, we prove the lower bound using a linear-parameter trans-formation (recall Definition 2.7). Let ∆ be the set of k-ary relations given by∆ := 0, 1k \ u | u ∈ 0, 1k. In particular, note that ∆ contains the k-OR

relation. Since R cone-defines k-OR, it is easy to see that by variable negations,R cone-defines all relations in ∆. Thereby, it follows from Proposition 4.9 thatthere is a linear-parameter transformation from CSP(Γ∗ ∪∆) to CSP(Γ). Thus,to prove the lower bound for CSP(Γ), it suffices to prove the desired lowerbound for CSP(Γ∗ ∪ ∆). We do a further case distinction on k.

Case (k = 2) If k = 2, we do a linear-parameter transformation from VERTEX

COVER to CSP(Γ∗ ∪ ∆). Since it is known that VERTEX COVER parameterizedby the number of vertices n has no generalized kernel of size O(n2−ε) for anyε > 0, unless NP ⊆ coNP/poly (Theorem 2.10), the result will follow.

Suppose we are given a graph G = (V, E) on n vertices and integer k ≤ n,forming an instance of the VERTEX COVER problem. The question is whetherthere is a set S of k vertices, such that each edge has at least one endpointin S. We create an equivalent instance (C, V′) of CSP(Γ∗ ∪ ∆) as follows. Weintroduce a new variable xv for each v ∈ V. For each edge u, v ∈ E, we addthe constraint 2-OR(xu, xv) to C.

At this point, any vertex cover in G corresponds to a satisfying assignment, andvice versa. It remains to ensure that the size of the vertex cover is bounded byk. Let Hn,k be the n-ary relation given by

Hn,k = (x1, . . . , xn) | xi ∈ 0, 1 for all i ∈ [n] and ∑i∈[n]

xi = k.

By Proposition 2.43, we obtain that Γ∗ pp-defines all Boolean relations. Itfollows from [70, Lemma 17] that Γ∗ pp-defines Hn,k using O(n + k) con-straints and O(n + k) existentially quantified variables. We add the constraintsfrom this pp-definition to C, and add the existentially quantified variablesto V′. This concludes the construction of C. It is easy to see that C has asatisfying assignment if and only if G has a vertex cover of size k. Furthermore,we used O(n + k) ∈ O(n) variables and thereby this is a linear-parametertransformation from VERTEX COVER to CSP(Γ∗ ∪ ∆).

4


Case (k ≥ 3) In this case there is a trivial linear-parameter transformationfrom CSP(∆) to CSP(Γ∗ ∪ ∆). It is easy to verify that CSP(∆) is equivalent tok-CNF-SAT. The result now follows from the fact that for k ≥ 3, k-CNF-SAT hasno kernel of size O(nk−ε) for any ε > 0, unless NP ⊆ coNP/poly (Theorem2.9). J

4.3 Classification of CSPs with worst-casesparsification

It is well known that k-CNF-SAT allows no non-trivial sparsification, for eachk ≥ 3 (Theorem 2.9). This means that we cannot efficiently reduce the numberof clauses in such a formula to O(nk−ε). The k-OR relation is special, in thesense that there is exactly one k-tuple that is not contained in the relation. Weshow in this section that when considering k-ary relations for which there ismore than one k-tuple not contained in the relation, a non-trivial sparsificationis always possible. In particular, the number of constraints of any input canefficiently be reduced to O(nk−1).

We start by showing that if R is a relation for which at least two k-tuples arenot contained in R, then R is captured by degree-(k− 1) polynomials. Recallfrom Section 4.1, that d-NAE-SAT was captured by polynomials of degree d− 1.In the next lemma, we will reuse some of the ideas that construction. The readermay verify that d-NAE-SAT can indeed be represented by a constraint languagefor which every relation contains 2d − 2 tuples.

Since relations with |R| = 2k have a sparsification of size O(1), as con-straints over such relations are satisfied by any assignment, this will allow us toobtain the desired sparsification for k-ary relations with |0, 1k \ R| 6= 1 usingTheorem 4.6.

I Lemma 4.12 Let R be a k-ary Boolean relation with |R| < 2k − 1. The relationR is captured by degree-(k− 1) polynomials.

Proof. We will prove this by showing that for every u ∈ 0, 1k \R, there exists ak-ary polynomial pu over Q of degree at most k− 1 satisfying requirements (4.1)and (4.2) from Definition 4.1, such that the result follows.

We will prove the existence of such a polynomial by induction on k. Fork = 1, the lemma statement implies that R = ∅. Thereby, for any u /∈ R, wesimply choose pu(x1) := 1. This polynomial satisfies the requirements, and hasdegree 0.

Let k > 1 and let u = (u1, . . . , uk) ∈ 0, 1k \ R. Since |R| < 2k − 1, we canchoose w = (w1, . . . , wk) such that w ∈ 0, 1k \ R and w 6= u. Choose such warbitrarily, we now do a case distinction.

(There exists no i ∈ [k] for which ui = wi) This implies ui = ¬wi for all i.One may note that for u = (0, . . . , 0) and w = (1, . . . , 1) this situation corre-

4

4.3 Classification of CSPs with worst-case sparsification 73

sponds to MONOTONE k-NAE-SAT. We show that there exists a polynomial pusuch that pu(u1, . . . , uk) 6= 0, and pu(x1, . . . , xk) = 0 for all (x1, . . . , xk) ∈ R.Hereby pu satisfies conditions (4.1) and (4.2) for u. For i ∈ [k], defineri(x) := (1− x) if ui = 1 and ri(x) := x if ui = 0. It follows immediately fromthis definition that ri(ui) = 0 and ri(wi) = 1 for all i ∈ [k]. Define

pu(x1, . . . , xk) :=k−1

∏i=1

(i−

k

∑j=1

rj(xj)

).

By this definition, pu has degree k− 1. It remains to verify that pu has thedesired properties. First of all, since ∑k

j=1 rj(uj) = 0 by definition, it followsthat

pu(u1, . . . , uk) =k−1

∏i=1

i 6= 0,

as desired. Since ri(wi) = 1 for all i, we obtain pu(w1, . . . , wk) = ∏k−1i=1 (i−

k) 6= 0, which is allowed since w /∈ R. It is easy to verify that in all other cases,∑k

j=1 rj(xj) ∈ 1, 2, . . . , k− 1 and thereby one of the terms of the product iszero, implying that pu(x1, . . . , xk) = 0.

(There exists i ∈ [k], such that ui = wi) Let u′ and w′ be defined as the re-sults of removing coordinate i from u and w respectively. Note that u′ 6= w′.Define

R′ := (x1, . . . , xi−1, xi+1, . . . , xk) | (x1, . . . , xi−1, ui, xi+1, . . . , xk) ∈ R.

By this definition, u′, w′ /∈ R′ and thereby R′ is a (k − 1)-ary relation with|R′| < 2k−1 − 1. By the induction hypothesis, there exists a polynomial pu′ ofdegree at most k− 2, such that pu′(u′1, . . . , u′k−1) 6= 0 and pu′(x′1, . . . , x′k−1) = 0for all x′ ∈ R′. Now define

pu(x1, . . . , xk) := (1− xi − ui) · pu′(x1, . . . , xi−1, xi+1, . . . , xk).

We show that pu has the desired properties. By definition, pu has the degree ofpu′ plus one. Since pu′ has degree k− 2 by the induction hypothesis, it followsthat pu has degree k− 1. Let (x1, . . . , xk) ∈ R. We do a case distinction on thevalue taken by xi.

• xi 6= ui. In this case, (1− xi − ui) = 0, and thereby pu(x1, . . . , xk) = 0, thussatisfying condition (4.1).

• xi = ui. Since x = (x1, . . . , xi−1, ui, xi+1, . . . , xk) ∈ R, it follows that(x1, . . . , xi−1, xi+1, . . . , xk) ∈ R′. By definition of pu′ , it follows that

pu′(x1, . . . , xi−1, xi+1, . . . , xk) = 0

and thus pu(x1, . . . , xk) = 0, showing (4.1).

4


It remains to show that pu(u1, . . . , uk) 6= 0. This follows from (1− ui − ui) ∈−1, 1, and pu′(u1, . . . , ui−1, ui+1, . . . , uk) 6= 0, showing that (4.2) holds.

Since we have shown for all u ∈ 0, 1k \ R that there exists a polynomial puover Q of degree at most k − 1 satisfying (4.1) and (4.2), R is captured bydegree-(k− 1) polynomials. J

The next lemma formalizes the idea that any k-ary relation with |0, 1k \R| = 1 is equivalent to k-OR, up to negation of variables. The dichotomy resultwill follow from Lemma 4.12, together with the next lemma and Theorem 4.11.

I Lemma 4.13 If R is a Boolean k-ary relation with |R| = 2k − 1, then Rcone-defines k-OR.

Proof. Let u = (u1, . . . , uk) be the unique k-tuple not contained in R. Definethe tuple (y1, . . . , yk) as follows. Let yi := xi if ui = 0, and let yi := ¬xiotherwise. Clearly, this satisfies the first two conditions of cone-definability.It remains to prove the last condition. Let f : x1, . . . , xk → 0, 1. Sup-pose ( f (x1), . . . , f (xk)) ∈ k-OR. We show ( f (y1), . . . , f (yk)) ∈ R. It followsfrom ( f (x1), . . . , f (xk)) ∈ k-OR, that there exists at least one i ∈ [k] such thatf (xi) 6= 0. Thereby, f (yi) 6= ui and thus ( f (y1), . . . , f (yk)) 6= u, implying( f (y1), . . . , f (yk)) ∈ R.

Suppose ( f (x1), . . . , f (xk)) /∈ k-OR, implying f (xi) = 0 for all i ∈ [k]. Butthis implies f (yi) = ui for all i ∈ [k] and thus ( f (y1), . . . , f (yk)) = u /∈ R. J

Using Lemmas 4.12 and 4.13, the next theorem will give a complete classifi-cation of the constraint languages that admit a non-trivial sparsification.

I Theorem 4.14 Let Γ be a finite intractable Boolean constraint language. Let kbe the maximum arity of any relation R ∈ Γ. The following dichotomy holds.

• If for all R ∈ Γ it holds that |R| 6= 2k − 1, then CSP(Γ) has a kernel withO(nk−1) constraints that can be stored in O(nk−1 log n) bits.

• If there exists R ∈ Γ with |R| = 2k − 1, then CSP(Γ) has no generalized kernelof bitsize O(nk−ε) for any ε > 0, unless NP ⊆ coNP/poly.

Proof. Suppose that for all R ∈ Γ, it holds that |R| 6= 2k − 1. We give thefollowing kernelization procedure. Suppose we are given an instance of CSP(Γ),with set of constraints C. We show how to define C ′ ⊆ C. For each constraintR(x1, . . . , x`) ∈ C where R is a relation of arity ` < k, add one such constraintto C ′ (thus removing duplicate constraints). Note that this adds at most O(n`)constraints for each `-ary relation R ∈ Γ.

Let Γk ⊆ Γ contain all relations of arity k for which |R| < 2k − 1. Itfollows from Lemma 4.12 that every relation in Γk is captured by degree-(k− 1)polynomials. Let Ck ⊆ C contain all constraints using a relation from Γk.Use Theorem 4.6 to obtain C ′k ⊆ Ck such that any assignment satisfying all

4

4.4 Towards a classification of CSPs with linear sparsification 75

constraints in C ′k satisfies all constraints in Ck, and |Ck| = O(nk−1). Add theconstraints in C ′k to C ′. This concludes the construction of C ′. Note that theprocedure removes constraints of the form R(x1, . . . , xk) with |R| = 2k, as theseare always satisfied. It is easy to verify that |C ′| ≤ |Γ| · O(nk−1) = O(nk−1).Since each constraint can be stored in O(log n) bits, this gives a kernel of bitsizeO(nk−1 log n).

Suppose that there exists R ∈ Γ with |R| = 2k − 1. It follows fromLemma 4.13 that R cone-defines k-OR. Since Γ is intractable, it now followsfrom Theorem 4.11 that CSP(Γ) has no generalized kernel of size O(nk−ε),unless NP ⊆ coNP/poly. J

4.4 Towards a classification of CSPs with linearsparsification

In this section we will introduce an algebraic property of constraint languages,namely whether a constraint language is balanced (introduced below). Wewill show that for Γ satisfying the property, CSP(Γ) has a kernel with O(n)constraints. This sparsification will be obtained using the method of representingconstraints by low-degree polynomials. The following definitions capture thekey notions for this universal-algebraic approach to sparsification via low-degreepolynomials. Relevant definitions relating to (partial) Boolean operations andCSPs can be found in Section 2.7.

Definition 4.15 A partial Boolean operation f : 0, 1k → 0, 1 is balanced ifthere exist integer values α1, . . . , αk, called the coefficients of f , such that

• ∑i∈[k] αi = 1,

• (x1, . . . , xk) is in the domain of f if and only if ∑i∈[k] αixi ∈ 0, 1, and

• f (x1, . . . , xk) = ∑i∈[k] αixi for all tuples in its domain.

Definition 4.16 We say that a Boolean relation is balanced if it is preserved byall balanced operations, and that a Boolean constraint language is balanced ifeach relation therein is balanced.

Definition 4.17 Define an alternating operation to be a balanced operationak : 0, 1k → 0, 1 such that k is odd and the coefficients alternate between+1 and −1, so that α1 = +1, α2 = −1, α3 = +1, . . ., αk = +1. For example, bythis definition we have

a3(x1, x2, x3) = x1 − x2 + x3

for all (x1, x2, x3) in the domain of a3, and the domain of a3 is (0, 0, 0), (0, 0, 1),(0, 1, 1), (1, 0, 0), (1, 1, 0), (1, 1, 1).

4


While alternating operations form a restricted type of balanced operations,the following proposition shows that being preserved by all balanced operationsis equivalent to being preserved by all alternating operations.

I Proposition 4.18 A Boolean relation R is balanced if and only if for all odd k ≥1, the relation R is preserved by the alternating operation of arity k.

Proof. It suffices to show that if a relation T is not balanced, then there existsan alternating operation that does not preserve T. Let f be a k-ary balancedoperation that does not preserve T. Then there exist tuples t1, . . . , tk in T suchthat α1t1 + · · ·+ αktk is not in T, where the sum of the αi is equal to 1 (andwhere we may assume that no αi is equal to 0). For each positive αi, replace αitiin the sum with ti + · · ·+ ti (αi times); likewise, for each negative αi, replaceαiti in the sum with −ti − · · · − ti (−αi times). Each tuple then has coefficient+1 or −1 in the sum; since the sum of coefficients is +1, by permuting the sum’sterms, the coefficients can be made to alternate between +1 and −1. J

As an example, consider the 2-OR relation given by (1, 1), (1, 0), (0, 1)and observe that it is not balanced, since (1, 0)− (1, 1) + (0, 1) = (0, 0) and(0, 0) /∈ 2- OR. The relation R := (0, 0, 1), (0, 1, 0), (1, 0, 0) (corresponding toMONOTONE EXACT 3-SAT), is an example of a relation that is balanced.

We will use the following straightforwardly verified fact tacitly, throughout(recall Definition 2.34).

Observation 4.19 Each balanced operation is idempotent and self-dual.The main result of this section is that when Γ is balanced, then CSP(Γ) has

a kernel with linearly many constraints. To prove this result, we will use twoadditional technical lemmas. When S is a matrix, we use si to refer to the i’throw of S. The proof of the following lemma was contributed by Emil Jerábek.

I Lemma 4.20 Let S be an m× n integer matrix. Let u ∈ Zn be a row vector.If u /∈ spanZ(s1, . . . , sm), then there exists a prime power q such that u /∈spanq(s1, . . . , sm). Furthermore, there is a polynomial-time algorithm thatcomputes a (possibly composite) integer q′ for which u /∈ spanq′(s1, . . . , sm).

Proof. Suppose u /∈ spanZ(s1, . . . , sm), thus u cannot be written as a linearcombination of the rows of S over Z; equivalently, the system yS = u has nosolutions for y over Z. We will show that there exists a prime power q, suchthat yS ≡q u has no solutions over Z/qZ and thus u /∈ spanq(s1, . . . , sm).

There exist an m×m matrix M and an n× n matrix N over Z, such that Mand N are invertible over Z and furthermore S′ := MSN is in Smith NormalForm (recall Definition 2.21). In particular, this implies that S′ is a diagonalmatrix. Furthermore, M, S, and N can be computed in polynomial time byTheorem 2.23. Define u′ := uN.

B Claim 4.21 If y′S′ = u′ is solvable for y′ over Z, then yS = u is solvable for yover Z.

4


Proof. Consider y′ such that y′S′ = u′. One can verify that y := y′M solvesyS = u, as

yS = y′MS = y′MSNN−1 = y′S′N−1 = u′N−1 = uNN−1 = u. C

B Claim 4.22 Let q ∈N. If yS ≡q u is solvable for y, then y′S′ ≡q u′ is solvablefor y′.

Proof. Let y be such that yS ≡q u. Define y′ := yM−1. We verify that y′S′ ≡q u′

as follows.

y′S′ = y′MSN = yM−1MSN = ySN ≡q uN = u′. C

Using these two claims, our proof proceeds as follows. From our startingassumption u /∈ spanZ(s1, . . . , sm), it follows by Claim 4.21 that y′S′ = u′

has no solution y′ over Z. Below, we prove that this implies there exists aprime power q such that y′S′ ≡q u′ is unsolvable. Furthermore we show how tofind a (possibly composite) integer q in polynomial time such that y′S′ ≡q u′

is unsolvable. By Claim 4.22 this will imply that yS ≡q u is unsolvable andcomplete the proof.

We inferred that y′S′ = u′ has no solutions over Z. Since all non-zeroelements of S′ are on the diagonal, this implies that either there exists i ∈ [n],such that u′i is not divisible by s′i,i, or s′i,i is zero while u′i 6= 0. We finish the proofby a case distinction.

• Suppose there exists i ∈ [n] such that s′i,i = 0, while u′i 6= 0. Choose a primepower q such that q - u′i. Observe that such q can be obtained in polynomial-time, for example by taking the smallest power of two that is larger than u′i.It is easy to see that thereby, u′i 6≡q 0. Since s′i,i ≡q 0 holds trivially in thiscase, the system y′S′ ≡q u′ has no solution.

• Otherwise, there exists i ∈ [n] such that s′i,i - u′i. Choose a prime power q suchthat q - u′i and q | s′i,i. Such a prime power can be chosen by letting q := p`

for a prime p that occurs ` ≥ 1 times in the prime factorization of s′i,i, butless often in the prime factorization of u′i. Thereby, u′i 6≡q 0, while s′i,i ≡q 0. Itagain follows that the system y′S′ ≡q u′ has no solutions.

It is not clear how to obtain q as described above in polynomial time, asit requires computing the prime decompositions of possibly large integers.However, observe that letting q := s′i,i means u′i 6≡q 0 while s′i,i ≡q 0, implyingthat for this choice of q the system y′S′ ≡q u′ again has no solutions. Sincethis q is trivial to obtain in polynomial time, that concludes the proof. J

Given that u /∈ spanZ(s1, . . . , sm), we can thus use Lemma 4.20 to obtainan integer q such that u /∈ spanq(s1, . . . , sm). The next lemma shows thatin this case the system S′x ≡q b, where b = (0, . . . , 0, c) and S′ is the matrixconsisting of rows s1, . . . , sm, u, has a solution x for some c 6≡q 0.

4


I Lemma 4.23 Let q > 1 be an integer. Let A be an m× n matrix over Z/qZ.Suppose am /∈ spanq(a1, . . . , am−1). Then there exists a constant c 6≡q 0 forwhich the system Ax ≡q b has a solution, where b := (0, . . . , 0, c)T is the vectorwith c on the last position and zeros in all other positions. Furthermore, x and ccan be computed in polynomial time.

Proof. Let A′ be the (m− 1)× n matrix consisting of the first m− 1 rows of A.Find the Smith decomposition (Definition 2.21) of A′ over Z, thus there existan (m− 1)× (m− 1) matrix M′ and an n× n matrix N, such that S′ := M′A′Nis in Smith Normal Form and M′ and N are invertible over Z. The only propertyof the Smith Normal Form we will rely on is that S′ is a diagonal matrix.

We show that similar properties hold over Z/qZ. Let (M′)−1, N−1 bethe inverses of M′ and N over Z. It is easy to verify that NN−1 = I ≡q Iand M′(M′)−1 = I ≡q I, such that M′ and N are still invertible over Z/qZ.Furthermore, S′ (mod q) remains a diagonal matrix.

Define M to be the following m×m matrix

M :=(

M′ 00 1

),

then M has an inverse over Z/qZ that is given by the following matrix

M−1 ≡q

((M′)−1 0

0 1

).

Define S := MAN and verify that

S := MAN ≡q

(S′

amN

), (4.3)

meaning that the first m− 1 rows of S are equal to the first m− 1 rows of S′,and the last row of S is given by the row vector amN.

The following two claims will be used to show that proving the lemmastatement for matrix S, will give the desired result for A.

B Claim 4.24 Let b := (0, . . . , 0, c) for some constant c. The system Sx′ ≡q bhas a solution, if and only if the system Ax ≡q b has a solution.

Proof. Let x be a solution for Ax ≡q b. Define x′ := N−1x. Then MANx′ ≡qMAx ≡q Mb. Observe that by the definitions of M and b, Mb ≡q b, whichconcludes this direction of the proof.

For the other direction, let x′ be a solution for MANx′ ≡q b. Definex := Nx′. Then M−1MANx′ ≡q M−1b and thus ANx′ ≡q M−1b and therebyAx ≡q M−1b. By the definition of M−1 and b, we again have M−1b ≡q b. C

4


B Claim 4.25 Let S be defined as above. Then sm ∈ spanq(s1, . . . , sm−1) ifand only if am ∈ spanq(a1, . . . , am−1).

Proof. Suppose sm ∈ spanq(s1, . . . , sm−1). This implies that there existα1, . . . , αm−1 such that ∑i∈[m−1] αisi ≡q sm ≡q amN. Thus, ∑i∈[m−1] αis′i ≡q

amN, and for α = (α1, . . . , αm−1) we therefore have αS′ ≡q amN, implying(αM′)A′N ≡q amN. Since N is invertible, it follows that

(αM′)A′ ≡q am

and thus am ∈ spanq(a1, . . . , am−1).For the other direction, suppose am ∈ spanq(a1, . . . , am−1). Thus, there

exists α ≡q (α1, . . . , αm−1) such that αA′ ≡q am. Let α′ := α(M′)−1. Then

α′S′ ≡q α′M′A′N ≡q αA′N ≡q amN ≡q sm,

and it follows from the definition of S in Equation (4.3) that thereby sm ∈spanq(s1, . . . , sm−1). C

It follows from Claims 4.24 and 4.25, that it suffices to show that Sx =(0, . . . , 0, c)T has a solution for some c 6≡q 0 if sm /∈ spanq(s1, . . . , sm−1).Observe that since S′ (the first m − 1 rows of S) is a diagonal matrix, theremust exist i ∈ [m− 1] for which there is no αi satisfying si,i · αi ≡q sm,i. Oth-erwise, it is easy to see that ∑i∈[m−1] αisi ≡q sm, contradicting that sm /∈spanq(s1, . . . , sm−1). We now do a case distinction.

Suppose there exists i ∈ [m− 1] such that si,i ≡q 0, while sm,i 6≡q 0. Letx = (0, . . . , 0, 1, 0, . . . , 0) be the vector with 1 in the i’th position and zeros in allother positions. It is easy to verify that Sx ≡q (0, . . . , 0, sm,i)

T and thereby thesystem Sx ≡q b has a solution for c = sm,i.

Otherwise, choose i such that si,i 6≡q 0 and there exists no integer αi satisfyingsi,i · αi ≡q sm,i. We consider the following two cases.

• Suppose gcd(si,i, q) | sm,i over the integers. Let d ∈ Z such that sm,i = d ·gcd(si,i, q) over the integers. It follows from Bézout’s identity (Theorem 2.15)that there exist integers a and b such that gcd(si,i, q) = a · si,i + b · q. Thereby,

sm,i ≡q d · (a · si,i + b · q) ≡q d · a · si,i

which is a contradiction with the assumption that no integer αi exists suchthat si,i · αi ≡q sm,i.

• Suppose gcd(si,i, q) - sm,i over the integers, let y := q/ gcd(si,i, q). Definex := (0, . . . , 0, y, 0, . . . , 0)T as the vector with y in position i. Then

Sx ≡q (0, . . . , 0, y · si,i, 0, . . . , 0, y · sm,i)T.

4


Then y · si,i = si,i · q/ gcd(si,i, q) = ±lcm(q, si,i) ≡q 0, recall that lcm standsfor least common multiple. It remains to show that y · sm,i 6≡q 0. Sup-pose for contradiction that y · sm,i ≡q 0, such that over the integers q ·sm,i/ gcd(si,i, q) = d · q for some d ∈ Z. But then sm,i/ gcd(si,i, q) = d ∈ Z

contradicting that gcd(si,i, q) - sm,i. It follows that y · sm,i 6≡q 0. Thus, forx defined as above and c := y · sm,i 6≡q 0 we obtain that Sx = (0, . . . , 0, c)T,concluding the proof.

Computing the Smith decomposition of a matrix can be done in polynomial time(Theorem 2.23) and computing the greatest common divisor of two integerscan be done in time polynomial in their binary encoding. Hereby, it is straight-forward to turn the above proof in a polynomial-time algorithm that computesboth x and c. J

In the context of capturing a Boolean relation R by degree-1 polynomials,the constructive proof of Lemma 4.23 effectively shows the following: givena ring Z/qZ over which a degree-1 polynomial exists that captures a certaintuple u /∈ R, one can constructively find the coefficients x of a polynomial thatcaptures u by following the steps in the proof. We will formalize this idea inTheorem 4.28, but first we prove the main sparsification result.

I Theorem 4.26 Let Γ be a balanced Boolean constraint language. Then CSP(Γ)has a kernel with O(n) constraints that are a subset of the original constraints.The kernel can be stored using O(n log n) bits.

Proof. We start by showing in the following claim that for all relations R ∈ Γwe can find linear polynomials over an appropriate ring that capture the tuplesthat are not in R. We will then use this representation to obtain our kernel.

B Claim 4.27 For all relations R in a balanced Boolean constraint language Γ,for all u /∈ R, there exists a linear polynomial pu over a ring Eu ∈ Z/quZ |qu is a prime power that captures u with respect to R.

Proof. Suppose for a contradiction that there exists R ∈ Γ and u /∈ R, suchthat no prime power q and polynomial p over Z/qZ exist that capture u withrespect to R. Let R = r1, . . . , r`. By the non-existence of p and q, the system

1 r1,1 r1,2 . . . r1,k1 r2,1 r2,2 . . . r2,k...

......

. . ....

1 r`,1 r`,2 . . . r`,k1 u1 u2 . . . uk

α0α1α2...

αk

≡q

00...0c

has no solution for any prime power q and c 6≡q 0. Otherwise, it is easy toverify that q is the desired prime power and p(x1, . . . , xk) := α0 + ∑k

i=1 αixi isthe desired polynomial.

4


The fact that no solution exists, implies that (1, u1, . . . , uk) is in the spanof the remaining rows of the matrix, by Lemma 4.23. But this implies thatfor any prime power q, there exist coefficients β1, . . . , β` over Z/qZ such thatu ≡q ∑ βiri. Furthermore, since the first column of the matrix is the all-onescolumn, we obtain that ∑ βi ≡q 1. By Lemma 4.20, it follows that there existinteger coefficients γ1, . . . , γ` such that ∑ γi = 1 and furthermore u = ∑ γiri.But it immediately follows that R ∈ Γ is not preserved by the balanced operationgiven by f (x1, . . . , x`) := ∑ γixi, which contradicts the assumption that Γ isbalanced. C

The theorem statement now follows directly from Theorem 4.6. J

The next theorem shows that given an arbitrary constraint language, it ispossible to decide in polynomial time whether it is balanced. Furthermore, forbalanced constraint languages there is a polynomial-time algorithm to obtainthe capturing polynomials. Here we assume that the relation R is encodedexplicitly as a list of tuples contained in the relation, together with a list oftuples not contained in the relation, making the total encoding size of a Booleank-ary relation at least 2k.

I Theorem 4.28 There is a polynomial-time algorithm that, given a k-aryBoolean relation R, decides whether R is balanced and for balanced R outputs apolynomial pu and integer qu for all u ∈ 0, 1k \ R such that pu captures u overZ/quZ.

Proof. Let R = r1, . . . , rm. Define si as (1, ri,1, . . . , ri,k) for i ∈ [m]. Relation Ris balanced if and only if there exists no u ∈ 0, 1k \ R such that ext(u) ∈spanZ(s1, . . . , sm), where ext(u) = (1, u1, . . . , uk). Thereby, R is balanced ifand only if for all u ∈ 0, 1k \ R, the following system has no solutions over Z.

(α1, . . . , αm

)

1 r1,1 r1,2 . . . r1,k1 r2,1 r2,2 . . . r2,k...

......

. . ....

1 rm,1 rm,2 . . . rm,k

=(1, u1, . . . , uk

)

The solvability of these systems over the integers can be tested in time that ispolynomial in the size of an explicit encoding of R that indicates for each tuplewhether or not it is contained in R. This computation can be done, for example,by first computing the Smith Normal Form of the matrix.

Suppose R is balanced. Let u ∈ 0, 1k \ R, we show how to find pu andqu. Let si be defined as above. Then ext(u) /∈ spanZ(s1, . . . , sm) since R isbalanced. Apply Lemma 4.20 to obtain qu ∈ N in polynomial time such thatext(u) /∈ spanqu(s1, . . . , sm). It follows from Lemma 4.23 that the following

4


system has a solution1 r1,1 r1,2 . . . r1,k1 r2,1 r2,2 . . . r2,k...

......

. . ....

1 rm,1 rm,2 . . . rm,k1 u1 u2 . . . uk

α0α1α2...

αk

≡qu

00...0c

for (α0, α1, . . . , αk)

T and some c 6≡qu 0 and that we can find it in polynomialtime. Let pu(x1, . . . , xk) := α0 + ∑k

i=1 αixi. It is straightforward to verify that pucaptures u with respect to R over Z/quZ. J

The kernelization result of Theorem 4.26 above is obtained by using the factthat when Γ is balanced, the constraints in CSP(Γ) can be replaced by linearpolynomials. We show in the next theorem that this approach fails when Γ isnot balanced.

I Theorem 4.29 Let R be a k-ary Boolean relation that is not balanced. Thenthere exists u ∈ 0, 1k \ R for which there exists no linear polynomial pu overany ring E that captures u with respect to R.

Proof. Suppose R is not balanced. By Proposition 4.18, this implies R is violatedby an alternating operation. Let f be an alternating operation that does notpreserve R, such that f (y1, . . . , ym) := ∑m

i=1(−1)i+1yi for some odd m, and forsome (not necessarily distinct) r1, . . . , rm ∈ R we have f (r1, . . . , rm) = u withu /∈ R (recall the definition of f (r1, . . . , rm) from Definition 2.35).

Suppose for a contradiction that there exists a linear polynomial pu that cap-tures u over a ring Eu. Let ri = (ri,1, . . . , ri,k) for i ∈ [m]. Since f (r1, . . . , rm) =u, we have the following equality over Z:

ui = r1,i − r2,i . . . + rm,i. (4.4)

Since rj,i ∈ 0, 1 for all i ∈ [k] and j ∈ [m], Equation (4.4) holds over anyring, so in particular over Eu.

Let pu(x1, . . . , xk) be given by pu(x1, . . . , xk) := β0 + β1 · x1 + β2 · x2 + . . . +βk · xk for ring elements β0, . . . , βk from Eu. By Definition 4.1, pu(ri,1, . . . , ri,k) =0 for all i ∈ [m]. Thereby the following equalities hold over Eu:

pu(u1, . . . , uk) = β0 +k

∑i=1

βi · ui

= β0 +k

∑i=1

βi · (r1,i − r2,i . . . + rm,i)

= β0 +k

∑i=1

βi · r1,i − βi · r2,i . . . + βi · rm,i

4

4.5 Symmetric CSPs 83

= (β0 +k

∑i=1

βi · r1,i)− (β0 +k

∑i=1

βi · r2,i) . . . + (β0 +k

∑i=1

βi · rm,i) (4.5)

= pu(r1,1, . . . , r1,k)− pu(r2,1, . . . , r2,k) . . . + pu(rm,1, . . . , rm,k)

= 0,

where the fourth equality follows from the fact that in line (4.5) all but oneof the terms β0 cancel, since the summation alternates between addition andsubtraction. This contradicts the fact that pu(u1, . . . , uk) 6= 0. Thereby, thereexists no linear polynomial that captures u with respect to R. J

4.5 Symmetric CSPsIn this section, we characterize the symmetric Boolean constraint languages Γ forwhich CSP(Γ) has a linear sparsification. For a Boolean tuple x = (x1, . . . , xk),let weight(x) denote the number of ones in the tuple, such that weight(x) :=∑i xi. We can now define symmetric relations.

Definition 4.30 We say a k-ary Boolean relation R is symmetric, if there ex-ists S ⊆ 0, 1, . . . , k such that a tuple x = (x1, . . . , xk) is in R if and only ifweight(x) ∈ S. We call S the set of satisfying weights for R. We will say that aconstraint language Γ is symmetric, if it only contains symmetric relations.

At the end of this section we will prove that for symmetric Boolean con-straint languages, CSP(Γ) has no sparsification with O(n2−ε) constraints if Γ isnot balanced, assuming NP * coNP/poly. Together with the results from theprevious section, this gives a dichotomy result for the sparsification of CSP(Γ)for symmetric constraint languages.

The following lemma shows that under certain conditions on the satisfyingweights, it follows that the relation cone-defines 2-OR, which we will use toobtain the kernelization lower bound.

I Lemma 4.31 Let R be a k-ary symmetric Boolean relation with satisfyingweights S ⊆ 0, 1, . . . , k. Let U := 0, 1, . . . , k \ S. If there exist a, b, c ∈ S andd ∈ U such that a− b + c = d, then R cone-defines 2-OR.

Proof. We first show the result when b ≤ a, b ≤ c, and b ≤ d. In this case, weuse the following tuple to express x1 ∨ x2.

(¬x1, . . . ,¬x1︸︷︷︸(a−b) copies

,¬x2, . . . ,¬x2︸︷︷︸(c−b) copies

, 1, . . . , 1︸︷︷︸b copies

, 0, . . . , 0︸︷︷︸(k−d) copies

).

For a = b or c = b, the tuple above does not give a valid cone definition.Observe however that in this case, c = d or a = d respectively, which cannothappen as a, c ∈ S while d /∈ S.

4


Let f : x1, x2 → 0, 1, then

(¬ f (x1), . . . ,¬ f (x1),¬ f (x2), . . . ,¬ f (x2), 1, . . . , 1, 0, . . . , 0)

has weight (a− b)(1− f (x1)) + (c− b)(1− f (x2)) + b. It is easy to verify thatfor f (x1) = f (x2) = 0, this implies the tuple has weight a + c− b = d /∈ S andthus the tuple is not in R. Otherwise, the weight is either a, b, or c. In thesecases the tuple is contained in R, as the weight is contained in S.

Note that the above case applies when b is the smallest of all four integers.We now consider the remaining cases. Suppose a ≤ b, a ≤ c, and a ≤ d (thecase where c is smallest is symmetric by swapping a and c). In this case, use thetuple

(¬x1, . . . ,¬x1︸︷︷︸(d−a) copies

, x2, . . . , x2︸︷︷︸(b−a) copies

, 1, . . . , 1︸︷︷︸a copies

, 0, . . . , 0︸︷︷︸(k−c) copies

).

Consider an assignment f satisfying x1 ∨ x2, verify that the weight of the abovetuple under this assignment lies in a, b, c, and thus the tuple is contained inR. Assigning 0 to both x1 and x2 gives weight d, such that the tuple is not in R.

Otherwise, we have d ≤ a, d ≤ b, and d ≤ c and use the tuple

( x1, . . . , x1︸︷︷︸(a−d) copies

, x2, . . . , x2︸︷︷︸(c−d) copies

, 1, . . . , 1︸︷︷︸d copies

, 0, . . . , 0︸︷︷︸(k−b) copies

).

It is again easy to verify that any assignment to x1 and x2 satisfies this tuple ifand only if it satisfies (x1 ∨ x2), using the fact that a− d + c = b ∈ S. J

We now give the main lemma that is needed to prove Theorem 4.34. Itshows that if a relation is symmetric and not balanced, it must cone-define 2-OR,by using the result from the lemma above.

I Lemma 4.32 Let R be a symmetric Boolean relation of arity k. If R is notbalanced, then R cone-defines 2-OR.

Proof. Let f be a balanced operation that does not preserve R. Since f has inte-ger coefficients, it follows that there exist (not necessarily distinct) r1, . . . , rm ∈R, such that r1 − r2 + r3 − r4 · · · + rm = u for some u ∈ 0, 1k \ R andodd m ≥ 3. Thereby, weight(r1)−weight(r2) + weight(r3)−weight(r4) · · ·+weight(rm) = weight(u). Let S be the set of satisfying weights for R and letU := 0, . . . , k \ S. Define si := weight(ri) for i ∈ [m], and t = weight(u),such that s1 − s2 + s3 − s4 . . . + sm = t, and furthermore si ∈ S for all i, andt ∈ U. We show that there exist a, b, c ∈ S and d ∈ U such that a− b + c = d,such that the result follows from Lemma 4.31. We do this by induction on thelength of the alternating sum.

If m = 3, we have that s1 − s2 + s3 = t and define a := s1, b := s2, c := s3,and d := t.

If m > 3, we will use the following claim.

4

4.5 Symmetric CSPs 85

B Claim 4.33 Let s1, . . . , sm ∈ S and t ∈ U such that s1 − s2 + s3 − s4 · · · +sm = t. There exist distinct i, j, ` ∈ [m] with i, j odd and ` even, such thatsi − s` + sj ∈ 0, . . . , k.

Proof. If there exist distinct i, j, ` ∈ [m] with i, j odd and ` even, such thatsi ≥ s` ≥ sj, then these i, j, ` satisfy the claim statement. Suppose these do notexist, we consider two options.

• Suppose si ≥ s` for all i, ` ∈ [m] with i odd and ` even. It is easy to see thatthereby, for any i, j, ` with i, j odd and ` even it holds that si − s` + sj ≥ 0.Furthermore, si − s` + sj ≤ s1 − s2 + s3 − s4 · · · + sm = t and t ≤ k sincet ∈ U. Thus, any distinct i, j, ` ∈ [m] with i, j odd and ` even satisfy thestatement.

• Otherwise, si ≤ s` for all i, ` ∈ [m] with i odd and ` even. It follows thatfor any i, j, ` with i, j odd and ` even si − s` + sj ≤ k, as si − s` ≤ 0 andsj ≤ k. Furthermore, si − s` + sj ≥ s1 − s2 + s3 − s4 · · ·+ sm = t and t ≥ 0by definition. Thus, any distinct i, j, ` ∈ [m] with i, j odd and ` even satisfythe statement. C

Use Claim 4.33 to find i, j, ` such that si − s` + sj ∈ 0, . . . , k. We considertwo options. If si − s` + sj ∈ U, then define d := si − s` + sj, a := si, b := s`,and c := sj and we are done. The other option is that si − s` + sj = s ∈ S.Replacing si − s` + sj by s in s1− s2 + s3− s4 · · ·+ sm gives a shorter alternatingsum with result t. We obtain a, b, c, and d by the induction hypothesis.

Thereby, we have obtained a, b, c ∈ S, d ∈ U such that a− b + c = d. It nowfollows from Lemma 4.31 that R cone-defines 2-OR. J

Using the lemma above, we can now prove the main result of this section.

I Theorem 4.34 Let Γ be a finite Boolean symmetric intractable constraintlanguage.

• If Γ is balanced, then CSP(Γ) parameterized by the number of variables n has akernel with O(n) constraints that can be stored in O(n log n) bits.

• If Γ is not balanced, then CSP(Γ) parameterized by the number of variables ndoes not have a generalized kernel of size O(n2−ε) for any ε > 0, unlessNP ⊆ coNP/poly.

Proof. If Γ is balanced, it follows from Theorem 4.26 that CSP(Γ) has a kernelwith O(n) constraints that can be stored in O(n log n) bits. Note that theassumption that Γ is symmetric is not needed in this case.

If the symmetric constraint language Γ is not balanced, then Γ containsa symmetric relation R that is not balanced. It follows from Lemma 4.32that R cone-defines the 2-OR relation. Thereby, we obtain from Theorem 4.11that CSP(Γ) has no generalized kernel of size O(n2−ε) for any ε > 0, unlessNP ⊆ coNP/poly. J

4


4.6 Full classification of arity-3 CSPsIn this section, we will give a full classification of the sparsifiability for constraintlanguages that consist only of low-arity relations. We assume all consideredconstraint languages to be Boolean. We start by providing some additionalproperties of cone-definability.

Observation 4.35 If a Boolean relation T of arity m is cone-definable from aBoolean relation U of arity n, then m ≤ n.Observation 4.36 Suppose a Boolean relation T is cone-definable from a Booleanrelation U, and that g is a partial operation that is idempotent and self-dual. If gpreserves U, then g preserves T.

The above observation follows quite straightforwardly from the definitions.For contradiction, suppose g does not preserve T. The proof strategy is to find awitness to this, and using the cone-definition of T from U to find a witness thatg also does not preserve U.

The next observation is that cone-definability is transitive.

Observation 4.37 Suppose that T1, T2, T3 are Boolean relations such that T2 iscone-definable from T1, and T3 is cone-definable from T2. Then T3 is cone-definablefrom T1.

The following two propositions are consequences of the above observations.We will tacitly use them in the sequel. Morally, they show that the propertiesof relations that we are interested in are invariant under cone-interdefinability.The first proposition is immediate from Observation 4.35.

I Proposition 4.38 If Boolean relations T, U are cone-interdefinable, then theyhave the same arity. J

The next proposition is follows directly from Observation 4.36.

I Proposition 4.39 Suppose that T and U are Boolean relations that are cone-interdefinable, and that g is a partial operation that is idempotent and self-dual.Then, g preserves T if and only if g preserves U. J

The next results will show that when the constraint language consists of onlylow-arity relations, if the constraint language is not balanced, it can cone-definethe 2-OR relation.

Observation 4.40 Each Boolean relation of arity 1 is balanced.I Theorem 4.41 A Boolean relation of arity 2 is balanced if and only if it is notcone-interdefinable with the 2-OR relation.

Proof. Let R ⊆ 0, 12 be a relation. We prove the two directions separately.(⇒) Proof by contraposition. Suppose that R is cone-interdefinable with

2-OR. Then in particular, R cone-defines the 2-OR relation. Let (y1, y2) be a tuplewitnessing cone-definability as in Definition 4.7. Since 2-OR is symmetric in itstwo arguments, we may assume without loss of generality that yi is either xi

4

4.6 Full classification of arity-3 CSPs 87

or ¬xi for i ∈ [2]. Define g : 0, 12 → 0, 12 by letting g(b1, b2) := (b1, b2)

where bi = bi if yi = xi and bi = 1− bi if yi = ¬xi for i ∈ [2]. By definitionof cone-definability we then have g(1, 0), g(0, 1), g(1, 1) ∈ R while g(0, 0) /∈ R.But g(1, 0)− g(1, 1) + g(0, 1) = g(0, 0), showing that R is not preserved by allalternating operations and therefore is not balanced.

(⇐) We again use contraposition. Suppose R is not balanced; we willprove R is cone-interdefinable with 2-OR. Let f : 0, 1k → 0, 1 be a bal-anced partial Boolean operation of minimum arity that does not preserve R.Let α1, . . . , αk ∈ Z be the coefficients of f , as in Definition 4.15. Let s1, . . . , sk ∈R such that f (s1, . . . , sk) = u ∈ 0, 12 \R witnesses that f does not preserve R.By Definition 4.15 we have u = ∑k

i=1 αisi and ∑ki=1 αi = 1. This shows that

if αi = 0 for some coordinate i, then that position does not influence the valueof f , implying the existence of a smaller-arity balanced relation that does notpreserve T. Hence our choice of f as a minimum-arity operation ensures that αiis nonzero for all i ∈ [k].B Claim 4.42 The tuples s1, . . . , sk are all distinct.

Proof. Suppose that si = sj for some distinct i, j ∈ [k], and assume withoutloss of generality that i = k− 1 and j = k. But then the balanced operation f ′

of arity k − 1 defined by the coefficients (α1, . . . , αk−2, αk−1 + αk) does notpreserve T since f ′(s1, . . . , sk−1) = f (s1, . . . , sk) = u /∈ R, contradictingthat f is a minimum-arity balanced operation that does not preserve R. C

B Claim 4.43 The arity k of operation f is 3.

Proof. Since s1, . . . , sk ∈ R ⊆ 0, 12 are all distinct, while u ∈ 0, 12 \ R, wehave k ≤ 3. We cannot have k = 1 since that would imply f (s1) = s1 ∈ R,contradicting that f (s1, . . . , sk) = f (s1) = u /∈ R. It remains to show that k 6=2. So assume for a contradiction that k = 2. Since s1 and s2 are distinct, thereis a position ` ∈ [2] such that s1,` 6= s2,`. Assume without loss of generalitythat s1,` = 1 while s2,` = 0. Since f (s1, . . . , sk) = f (s1, s2) = α1s1 + α2s2 =

u ∈ 0, 12 \ R, we find u` = α1s1,` + α2s2,` = α1 · 1+ α2 · 0 ∈ 0, 1. Since α1 isa nonzero integer, we must have α1 = 1. But since α1 + α2 = 1 by definition of abalanced operation, this implies α2 = 0, contradicting that f is a minimum-aritybalanced operation that does not preserve R. C

The previous two claims show that there are at least three distinct tuplesin R ⊆ 0, 12. Since u ∈ 0, 12 \ R it follows that |R| = 3. Hence R and2-OR are both Boolean relations of arity two that each have three tuples. Tocone-define one from the other, one may easily verify that it suffices to use thetuple (y1, y2), where yi = xi if ui = 0 and yi = ¬xi otherwise. J

In order to show the classification of the arity-3 relations with a linearsparsification in Theorem 4.46, we first present some additional lemmas and

4


definitions. Let U ⊆ 0, 1n be a relation. We say that w ∈ 0, 1n is a witnessfor U if w /∈ U, and there exists a balanced operation f : 0, 1k → 0, 1and tuples t1, . . . , tk ∈ U such that w = f (t1, . . . , tk). Observe that U is notbalanced if and only if there exists a witness for U.

I Lemma 4.44 Suppose that U ⊆ 0, 1n is a Boolean relation, and that thereexist an integer c and a natural number m > 1 such that, for each u ∈ U, it holdsthat

weight(u) ≡m c.

Then, if w is a witness for U, it holds that weight(w) ≡m c.

Proof. Since w is a witness for U, there exist tuples t1 = (t1,1, . . . , t1,n), . . . , tk =

(tk,1, . . . , tk,n) and a balanced operation f : 0, 1k → 0, 1 such that

f (t1, . . . , tk) = w.

Let α1, . . . , αk be the coefficients of f . From f (t1, . . . , tk) = w, we obtainthat α1weight(t1) + · · ·+ αkweight(tk) = weight(w). Since ∑i∈[k] αi = 1 bydefinition of a balanced operation, we have

α1weight(t1) + · · ·+ αkweight(tk) ≡m α1c + · · ·+ αkc =

∑i∈[k]

αi

c = c,

and the result follows. J

We will view Boolean tuples of arity n as maps f : [n]→ 0, 1, via the natu-ral correspondence where such a map f represents the tuple ( f (1), . . . , f (n)).We freely interchange between these two representations of tuples.

For S ⊆N, we say that f : S→ 0, 1 is a no-good of U ⊆ 0, 1n when:

• S ⊆ [n];

• each extension g : [n]→ 0, 1 of f is not an element of U; and

• there exists an extension h : [n]→ 0, 1 of f that is a witness for U.

We say that f : S → 0, 1 is a min-no-good if f is a no-good, but no properrestriction of f is a no-good. Observe that the following are equivalent, for arelation: the relation is not balanced; it has a witness; it has a no-good; it has amin-no-good.

When U ⊆ 0, 1n is a relation and S ⊆ [n], let s1 < · · · < sm denote theelements of S; then, we use U S to denote the relation (h(s1), . . . , h(sm)) | h ∈U.I Proposition 4.45 Let U ⊆ 0, 1n be a relation, let S ⊆ [n], and suppose thatf : S→ 0, 1 is a min-no-good of U. Then f is a min-no-good of U S.

4

4.6 Full classification of arity-3 CSPs 89

Proof. Observe that f is not in U S; since f has an extension that is a witnessfor U, it follows that f is a witness for U S. Thus, f is a no-good of U S. Inorder to obtain that f is a min-no-good of U S, it suffices to establish that, forany restriction f− : S− → 0, 1 of f , it holds that f− is a no-good of U if andonly if f− is a no-good of U S. This follows from what we have establishedconcerning f and the following fact: all extensions h : S→ 0, 1 of f− are notin U S if and only if all extensions h′ : [n]→ 0, 1 of f− are not in U. J

Using these tools we are finally in position to prove Theorem 4.46.

I Theorem 4.46 Suppose that U ⊆ 0, 13 is an arity-3 Boolean relation that isnot balanced. Then, the 2-OR relation is cone-definable from U.

Proof. Let f : S→ 0, 1 be a min-no-good of U.It cannot hold that |S| = 0, since then U would be empty and hence

preserved by all balanced operations. It also cannot hold that |S| = 1, sincethen f would be a min-no-good of U S (by Proposition 4.45), which is notpossible since U S would have arity 1 and hence would be preserved by allbalanced operations (by Observation 4.40).

For the remaining cases, by replacing U with a relation that is interdefinablewith it, we may assume that f : S→ 0, 1 maps each s ∈ S to 0.

Suppose that |S| = 2, and assume for the sake of notation that S =1, 2 (this can be obtained by replacing U with a relation that is interde-finable with it). By Proposition 4.45, f is a min-no-good of U S. By Theo-rem 4.41, we obtain that U S contains all tuples other than f , that is, wehave (0, 1), (1, 0), (1, 1) = U S. It follows that there exists a realization,where we define a realization to be a tuple (a1, a2, a3) ∈ 0, 13 such that(0, 1, a1), (1, 0, a2), (1, 1, a3) ∈ U. Let us refer to (0, 0, 1) and (1, 1, 0) as badtuples, and to all other arity 3 tuples as good tuples.

B Claim 4.47 If there is a realization that is a good tuple, then the 2-OR relationis cone-definable from U.

Proof. We show cone-definability via a tuple of the form (x1, x2, y) where y ∈0, 1, x1, x2,¬x1,¬x2. The right setting for y can be derived from the realizationthat forms a good tuple.

• choose y = 0 for (0, 0, 0);

• y = 1 for (1, 1, 1);

• y = x1 for (0, 1, 1);

• y = x2 for (1, 0, 1);

• y = ¬x1 for (1, 0, 0); and,

• y = ¬x2 for (0, 1, 0).

4


It is easy to verify that this choice of y gives the desired cone-definition. C

B Claim 4.48 There is a realization that is a good tuple.

Proof. Proof by contradiction. If there exists no realization that is a good tuple,every realization is a bad tuple; moreover, there is a unique realization, forif there were more than one, there would exist a realization that was a goodtuple. We may assume (up to interdefinability of U) that the unique realizationis (1, 1, 0). Then, U is the relation (0, 1, 1), (1, 0, 1), (1, 1, 0) containing exactlythe weight 2 tuples; applying Lemma 4.44 to U with a = 2 and m = 3, weobtain that for any witness w for U, it holds that weight(w) ≡3 2. This impliesthat f has no extension w′ that is a witness, since any such extension must haveweight(w′) equal to 0 or 1 as f maps both s ∈ S to 0; we have thus contradictedthat f is a no-good of U. C

Together, the two claims complete the case that |S| = 2.Suppose that |S| = 3. Since f is a min-no-good mapping all s ∈ S to 0, it

follows that (0, 0, 0) is a witness. Thereby, it follows that each of the weight 1tuples (1, 0, 0), (0, 1, 0), (0, 0, 1) is contained in U. We claim that U contains aweight 2 tuple; if not, then U would contain only weight 1 and weight 3 tuples,and by invoking Lemma 4.44 with a = 1 and m = 2, we would obtain thatweight((0, 0, 0)) ≡2 1, a contradiction. Assume for the sake of notation that Ucontains the weight 2 tuple (0, 1, 1). Then U cone-defines the 2-OR relation viathe tuple (0, x1, x2), since (0, 0, 0) /∈ R and (0, 1, 0), (0, 0, 1), (0, 1, 1) ∈ R. J

Combining the results in this section with the results in previous sections,allows us to give a full classification of the sparsifiability of constraint languagesthat only contain relations of arity at most three. Observe that any k-ary relationR such that R 6= ∅ and 0, 1k \ R 6= ∅ cone-defines the 1-OR relation. Sincewe assume that Γ is intractable in the next theorem, it follows that k is alwaysdefined and k ∈ 1, 2, 3.I Theorem 4.49 Let Γ be an intractable Boolean constraint language such thateach relation therein has arity ≤ 3. Let k ∈N be the largest value for which k-OR

can be cone-defined from a relation in Γ. Then CSP(Γ) has a kernel with O(nk)constraints that can be encoded in O(nk log k) bits, but for any ε > 0 there is nokernel of size O(nk−ε), unless NP ⊆ coNP/poly.

Proof. To show that there is a kernel with O(nk) constraints, we do a casedistinction on k.

(k = 1) If k = 1, there is no relation in Γ that cone-defines the 2-OR relation.It follows from Observation 4.40 and Theorems 4.41 and 4.46 that thereby, Γis balanced. It now follows from Theorem 4.26 that CSP(Γ) has a kernel withO(n) constraints that can be stored in O(n log n) bits.

4

4.7 Capturing polynomials versus compression via Maltsev embeddings 91

(k = 2) If k = 2, there is no relation R ∈ Γ with |R| = 23 − 1 = 7, asotherwise by Lemma 4.13 such a relation R would cone-define 3-OR whichis a contradiction. Thereby, it follows from Theorem 4.14 that CSP(Γ) hasa sparsification with O(n3−1) = O(n2) constraints that can be encoded inO(n2 log n) bits.

(k = 3) Given an instance (C, V), it is easy to obtain a kernel of with O(n3)constraints by simply removing duplicate constraints. This kernel can bestored in O(n3) bits, by storing for each relation R ∈ Γ and for each tuple(x1, x2, x3) ∈ V3 whether R(x1, x2, x3) ∈ C. Since |Γ| is constant and there areO(n3) such tuples, this results in using O(n3) bits.

It remains to prove the lower bound. By definition, there exists R ∈ Γ suchthat R cone-defines the k-OR relation. Thereby, the result follows immediatelyfrom Theorem 4.11. Thus, CSP(Γ) has no kernel of size O(nk−ε) for any ε > 0,unless NP ⊆ coNP/poly. J

4.7 Capturing polynomials versus compression viaMaltsev embeddings

In this section we compare our polynomial-based framework for linear sparsi-fication from Section 4.4 to the framework of Lagerkvist and Wahlström [70]based on Maltsev embeddings.

4.7.1 Maltsev embeddings and definitionsTo facilitate the discussion, we introduce some additional concepts. A ternaryoperation f : D3 → D over a domain D is a Maltsev operation if it satisfies theidentities f (x, x, y) = y and f (x, y, y) = x for all x, y ∈ D.

Definition 4.50 ([70, Definition 7]) A constraint language Γ over a domain Dadmits an embedding over the constraint language Γ′ over D′ ⊇ D if there existsa bijection h : Γ → Γ′ such that for every R ∈ Γ, if the arity of R is k then thearity of h(R) is k and h(R) ∩ Dk = R.

If Γ′ is preserved by a Maltsev operation, then Γ is said to admit a Maltsevembedding. Lagerkvist and Wahlström proved [70, Theorems 10–11] that if Γ is aconstraint language (over a possibly non-Boolean domain) that admits a Maltsevembedding over Γ′ with a finite domain, then CSP(Γ) has a kernel with O(n)constraints. Hence constraint languages admitting Maltsev embeddings overfinite domains admit linear sparsifications, just as balanced constraint languages.

In a quest to understand which CSPs can be sparsified through this route,they investigated which constraint languages admit Maltsev embeddings usinguniversal algebra. For this purpose, they defined a universal partial Maltsev

4


operation as a partial Boolean operation f such that each Boolean constraintlanguage Γ that admits a Maltsev embedding is preserved by f .

Definition 4.51 For a k-ary operation f : Dk → D on a domain D ⊇ 0, 1, thepartial Boolean operation f|B is the restriction of f to Boolean arguments thatresult in a Boolean value. Hence domain( f|B) = x ∈ 0, 1k | f (x) ∈ 0, 1,and for each k-tuple x in the domain of f|B we have f|B(x) = f (x).Definition 4.52 ([70, Definition 14]) The infinite domain D∞ is recursively de-fined as follows. It contains the elements 0 and 1 along with each triple (x, y, z)where x, y, z ∈ D∞ with x 6= y and y 6= z. The Maltsev operation u : D3

∞ → D∞is defined by setting u(x, x, y) = y, setting u(x, y, y) = x, and setting u(x, y, z) =(x, y, z) otherwise.

We define [u] as the set of all operations that can be defined as a termover the algebra (D∞, u). Hence an arity-k operation f ∈ [u] can be de-fined either as a projection, so that f (x1, . . . , xk) = xi for some i ∈ [k], or canbe recursively defined from operations f1, f2, f3 ∈ [u] via f (x1, . . . , xk) =u( f1(x1, . . . , xk), f2(x1, . . . , xk), f3(x1, . . . , xk)). We will use this recursive de-composition of operations in [u] later in our proofs. The universal partialMaltsev operations can be characterized precisely [70, Theorems 13–15, p.165]as the operations f|B for f ∈ [u].

Any Boolean constraint language that is preserved by all universal partialMaltsev operations, has [71, Theorem 28] a Maltsev embedding over D∞. How-ever, since only finite-domain Maltsev embeddings lead to linear sparsifications,this infinite-domain embedding does not directly have algorithmic applications.

In the remainder of this section, we explore relations between balancedBoolean constraint languages (which can be sparsified using capturing polyno-mials) and Boolean constraint languages admitting a Maltsev embedding.

4.7.2 Balanced constraint languages versus Maltsevembeddings

The next theorem shows that being balanced is at least as strong of a requirementas being preserved by all universal partial Maltsev operations. In particular,any balanced relation is preserved by all universal partial Maltsev operations.This implies that if the Maltsev approach does not apply to obtain a linearsparsification for a Boolean CSP, then the polynomial-based framework doesnot apply either.

I Theorem 4.53 Let Γ be a Boolean constraint language. If there exists auniversal partial Maltsev operation f such that Γ is not preserved by f , then Γ isnot balanced.

Proof. We introduce a function to associate an integer value to every elementof D∞, as follows. Let val : D∞ → N be given by val(0) := 0, val(1) := 1 and

4


val((d1, d2, d3)) := val(d1)− val(d2) + val(d3) for d1, d2, d3 ∈ D∞. We start byproving the following claim.

B Claim 4.54 Let f ∈ [u] have arity m. There is a sequence α1, . . . , αm ∈ Z

with ∑i∈[m] αi = 1, such that for all x1, . . . , xm ∈ 0, 1:

∑i∈[m]

αixi = val( f (x1, . . . , xm)).

Proof. We prove this by induction on the structure of f as given by a termover u.

(Base case) If f (x1, . . . , xm) := xj for j ∈ [m], we define αj := 1 and αi := 0for all i 6= j. By this definition, ∑i∈[m] αi = 1 and val( f (x1, . . . , xm)) = xj =∑i∈[m] αixi.

(Step) Let f (x1, . . . , xm) = u( f1(x1, . . . , xm), f2(x1, . . . , xm), f3(x1, . . . , xm)). Forb ∈ [3], choose coefficients αb,1, . . . , αb,m ∈ Z with ∑i∈[m] αb,i = 1 such that

∑i∈[m]

αb,ixi = val( fb(x1, . . . , xm))

for all x1, . . . , xm ∈ 0, 1. These coefficients exist by the induction hypothesis.For i ∈ [m], define αi := α1,i − α2,i + α3,i. By this definition,

∑i∈[m]

αi = ∑i∈[m]

α1,i − ∑i∈[m]

α2,i + ∑i∈[m]

α3,i = 1− 1 + 1 = 1,

as desired. Let x1, . . . , xm ∈ 0, 1 be given. To show that ∑i∈[m] αixi =

f (x1, . . . , xm) we distinguish three cases.

Suppose f1(x1, . . . , xm) = f2(x1, . . . , xm). Then f (x1, . . . , xm) = f3(x1, . . . , xm)by the definition of u, and thus

∑i∈[m]

αixi = ∑i∈[m]

α1,ixi − ∑i∈[m]

α2,ixi + ∑i∈[m]

α3,ixi

= val( f1(x1, . . . , xm))− val( f2(x1, . . . , xm)) + val( f3(x1, . . . , xm))

= val( f3(x1, . . . , xm)) = val( f (x1, . . . , xm)).

If f3(x1, . . . , xm) = f2(x1, . . . , xm), then a symmetric argument to the caseabove shows that indeed ∑i∈[m] αixi = f (x1, . . . , xm).

Otherwise, we know that f3(x1, . . . , xm) 6= f2(x1, . . . , xm) and f1(x1, . . . , xm) 6=f2(x1, . . . , xm). It follows that

f (x1, . . . , xm) = ( f1(x1, . . . , xm), f2(x1, . . . , xm), f3(x1, . . . , xm))

4


and we obtain

∑i∈[m]

αixi = ∑i∈[m]

α1,ixi − ∑i∈[m]

α2,ixi + ∑i∈[m]

α3,ixi

= val( f1(x1, . . . , xm))− val( f2(x1, . . . , xm)) + val( f3(x1, . . . , xm))

= val( ( f1(x1, . . . , xm), f2(x1, . . . , xm), f3(x1, . . . , xm)) )

= val( f (x1, . . . , xm)),

concluding the proof of this claim. C

Using the claim we prove Theorem 4.53. Let h : 0, 1m → 0, 1 be anm-ary universal partial Maltsev operation that does not preserve Γ. As describedin Section 4.7.1, there exists f ∈ [u] such that h = f|B. Let R ∈ Γ be a k-aryrelation such that R is not preserved by h. We show that there exists a balancedoperation g f that does not preserve R. By Claim 4.54 there exist coefficientsα1, . . . , αm ∈ Z such that

∑i∈[m]

αixi = val( f (x1, . . . , xm))

for all x1, . . . , xm ∈ 0, 1. Let g f be the balanced operation with coefficients αi.Since f|B does not preserve R, there are s1, . . . , sm ∈ R such that f|B(s1, . . . , sm)

is well-defined and f|B(s1, . . . , sm) /∈ R. Let si = (si,1, . . . , si,k). Then sincef|B(s1, . . . , sm) is well-defined, it evaluates to a 0/1-tuple and

val( f (s1,j, . . . , sm,j)) = f (s1,j, . . . , sm,j),

for all j ∈ [k]. Since ∑i∈[m] αixi = val( f (x1, . . . , xm)) for all x1, . . . , xm ∈ 0, 1,it follows that g f violates R. So g f is a balanced operation that does notpreserve Γ, concluding the proof. J

4.7.3 Balanced constraint languages versus finite-domainMaltsev embeddings

Theorem 4.53 implies [71, Theorem 28] that any balanced constraint languagehas a Maltsev embedding over the infinite domain D∞. We show in the nexttheorem that in fact, every balanced constraint language allows a Maltsevembedding over a finite domain. Thus, there are in fact two ways to obtain akernel with O(n) constraints for balanced constraint languages, one given byTheorem 4.26 and one via Maltsev embeddings [70, Theorems 10–11].

I Theorem 4.55 If Γ is a balanced Boolean constraint language, then Γ admitsa Maltsev embedding over a finite domain.

4


Proof. Since Γ is balanced, it follows from Claim 4.27 that for every R ∈Γ and u /∈ R there exists a linear polynomial pu,R and prime power qu,Rsuch that pu,R captures u with respect to R over Z/qu,RZ. Enumerate theobtained polynomials arbitrarily as p1, . . . , pm and their corresponding modulias q1, . . . , qm. Let D be the Cartesian product of the rings Z/qu,RZ we thusobtained, such that D = 0, . . . , q1 − 1 × 0, . . . , q2 − 1 × . . .× 0, . . . , qm −1. We equate the value (0, . . . , 0) ∈ D with the Boolean 0 and (1, . . . , 1) ∈ Dwith the Boolean value 1. Let ϕ : D3 → D be the Maltsev operation given byϕ(x, y, z) = w with wi = xi − yi + zi (mod qi) for all i ∈ [m], for x, y, z ∈ D.

We now define a constraint language Γ′ over D such that Γ′ is preservedby ϕ. For any k-ary relation R ∈ Γ, we show how to define a correspondingk-ary relation R′ ∈ Γ′. Initialize R′ as R. Then, recursively add any x ∈ Dk

to R′ for which there exist t1, t2, t3 ∈ R′ such that ϕ(t1, t2, t3) = x. Thiscompletes the definition of Γ′. It is easy to verify that Γ′ is preserved by ϕ. Tocomplete the Maltsev embedding, it remains to show that for all k-ary R ∈ Γ wehave R′ ∩ 0, 1k = R.

Suppose for contradiction that there exists R ∈ Γ such that R′ ∩ 0, 1k 6= R,which implies that there exists x ∈ 0, 1k such that x ∈ R′ but x /∈ R. From theconstruction of R′ it follows by an easy induction that there exist t1, . . . , t` ∈ Rwith ta = (ta,1, . . . , ta,k), such that for all entries i ∈ [k] of an assignment andall coordinates j ∈ [m] of a domain element, we have

xi ≡qj t1,i − t2,i + t3,i − t4,i . . . + t`,i.

Consider the index j ∈ [m] for which pj = px,R. Since px,R is a linearpolynomial, it can be written as px,R(y1, . . . , yk) = α0 + ∑k

i=1 αiyi for coeffi-cients α0, . . . , αk ∈ Z/qx,RZ. Interpreting this polynomial over the integers, wehave that pj(ta,1, . . . , ta,k) ≡qj 0 for all a ∈ [`], since pj captures x with respectto R and ta ∈ R. Then

pj(x1, . . . , xk) ≡qj α0 +k

∑i=1

αixi ≡qj α0 +k

∑i=1

αi(t1,i − t2,i . . . + t`,i)

≡qj α0 +k

∑i=1

`

∑a=1

(−1)a−1αi · ta,i

≡qj α0 +`

∑a=1

(−1)a−1k

∑i=1

αi · ta,i

≡qj

`

∑a=1

(−1)a−1

(α0 +

k

∑i=1

αi · ta,i

)

≡qj

`

∑a=1

(−1)a−1 pj(ta,1, . . . , ta,k) ≡qj 0.

4


Thereby, px,R(x1, . . . , xk) ≡qx,R 0, contradicting that px,R captures x with respectto R. This concludes the proof. J

Theorem 4.55 shows that balanced constraint languages have (finite-domain)Maltsev embeddings. In the next theorem, we prove a partial converse: con-straint languages that are not balanced, do not admit Maltsev embeddings ofa particular type over a (possibly infinite) domain. The particular type weconsider consists of embeddings for which the constraint language Γ′ into whichwe embed has a group structure on its domain, and the Maltsev operation thatpreserves Γ′ is the coset generating operation of the group. Let us give therelevant definitions.

Definition 4.56 Let (D, ·) be a group. The coset generating operation of thegroup is the Maltsev operation c : D3 → D defined by c(x, y, z) = x · y−1 · z.

I Theorem 4.57 Let Γ be a Boolean constraint language that is not balanced,and let (D, ·) be a group with coset generating operation c. Then there is noconstraint language Γ′ over domain D ⊇ 0, 1 that is preserved by c and forwhich Γ admits an embedding over Γ′, not even if the identity element of (D, ·) isallowed to differ from 0, 1.

Proof. Suppose for contradiction that there exists a group (D, ·) with cosetgenerating operation c and a constraint language Γ′ such that Γ′ is preservedby c and Γ admits an embedding over Γ′. Let R ∈ Γ be a relation that is notbalanced, suppose R has arity k. Since R is not balanced, take r1, . . . , rm ∈ Rwith r1 − r2 . . . + rm = u /∈ R. Let R ∈ Γ′ be the image of R under theconsidered Maltsev embedding. We start by proving the following claim.

B Claim 4.58 For all i ∈ [k], the following equation holds over the group (D, ·):

r1,i · r−12,i · . . . · rm,i = ui.

Proof. We show by induction that for all x1, . . . , xm ∈ 0, 1 with x1 − x2 . . . +xm = y for y ∈ 0, 1, it holds that x1 · x−1

2 · . . . · xm = y over D.

(Base case) If m = 1, we trivially obtain that x1 = y.

(Step) Suppose m > 1. We start by showing that there exists j ∈ [m− 1] suchthat xj = xj+1. Suppose not, then we are in one of the two cases below.

• x1 = x3 = . . . = xm = 0 and x2 = x4 = . . . = xm−1 = 1, or

• x1 = x3 = . . . = xm = 1 and x2 = x4 = . . . = xm−1 = 0.

In both cases it is easily verified that for this choice of variables, x1 − x2 . . . +xm /∈ 0, 1, which is a contradiction. Therefore, there exists j ∈ [m− 1] suchthat xj = xj+1. Suppose for ease of notation that j is even, the case when j isodd follows symmetrically. It is easy to verify that y = x1− x2 . . . + xj−1− xj +xj+1 − xj+2 . . . + xm = x1 − x2 . . . + xj−1 − xj+2 . . . + xm, and thus it follows

4


from the induction hypothesis that x1 · x−12 · . . . · xj−1 · x−1

j+2 · . . . · xm = y.Thereby

x1 · x−12 · . . . · xj−1 · x−1

j · xj+1 · x−1j+2 · . . . · xm =

x1 · x−12 · . . . · xj−1 · (x−1

j · xj) · x−1j+2 · . . . · xm =

x1 · x−12 · . . . · xj−1 · x−1

j+2 · . . . · xm−2 = y.

Since r1, . . . , rm were chosen such that r1 − r2 . . . + rm = u ∈ 0, 1k \ R, thestatement of the claim follows. C

Recall that R is preserved by c. We have the following claim.

B Claim 4.59 Let R be a k-ary relation and m ≥ 3 be odd. If R is pre-served by c, then it is preserved by the m-ary operation fm : Dm → D givenby fm(x1, . . . , xm) := x1 · x−1

2 · . . . · xm.

Proof. We show this by a simple induction. If m = 3, the statement is true sincein this case f and c are equivalent. Let m > 3 and let t1, . . . , tm ∈ R be given,we show f (t1, . . . , tm) ∈ R. Let t′ := c(t1, t2, t3), observe that t′ ∈ R since R ispreserved by c. Choose i ∈ [k] and observe that

fm(t1,i, . . . , tm,i) = t1,i · t−12,i · t3,i · t−1

4,i · . . . · tm,i

= c(t1,i, t2,i, t3,i) · t−14,i · . . . · tm,i

= t′ · t−14,i · . . . · tm,i.

Thereby, fm(t1, . . . , tm) = fm−2(t′, t4, . . . , tm). It follows from the inductionhypothesis that fm−2 is preserved by R and thus fm−2(t′, t4, . . . , tm) ∈ R. C

It follows from the fact that c preserves R and Claim 4.59, that fm preservesR. However, by Claim 4.58, it follows that fm(r1, . . . , rm) = u and thusu ∈ R, contradicting that we have given a valid Maltsev embedding of Γ, sinceu ∈ 0, 1k, but u /∈ R. J

We conclude the subsection with a discussion of the implications of The-orems 4.55 and 4.57. On the one hand, Theorem 4.55 shows that balancedconstraint languages admit Maltsev embeddings over finite domains. On theother hand, Theorem 4.57 shows that unbalanced constraint languages do notallow Maltsev embeddings over the coset generating operation of a group, noteven an infinite group. As a corollary to these results, we therefore obtainan infinite-domain to finite-domain transformation, for embeddings via cosetgenerating operations.

4


I Corollary 4.60 If Γ is a Boolean constraint language that has a Maltsev em-bedding over Γ′, where the domain D of Γ′ is a (possibly infinite) group and Γ′

is preserved by the coset generating operation of the group, then Γ has a Maltsevembedding over a finite domain. J

4.7.4 Preservation by balanced or universal partial Maltsevoperations

In this section we compare preservation by balanced operations to preserva-tion by universal partial Maltsev operations. Theorem 4.53 implies that everybalanced constraint language is preserved by all universal partial Maltsev op-erations. At this point, it is unknown whether the converse also holds. If theBoolean constraint language Γ is preserved by all universal partial Maltsevoperations, then is it also balanced?

We have not managed to resolve this question, but we present some insightsin this direction. Recall that ai is the alternating (partial) operation of arity i, forodd i ≥ 1. It is easy to verify that a3 is equivalent to u|B. We show the followingresult about a5.

I Theorem 4.61 Let Γ be a Boolean constraint language. If Γ is not preserved bya5, then there is a universal partial Maltsev operation that does not preserve Γ.

Proof. We start by considering the term f ∈ [u] defined as follows:

f (x1, . . . , x5) := u(x1, u(x2, x3, u(x1, x2, x3)), u(x5, x4, u(x3, x2, x1))).

The key of the proof is that the Boolean restriction f|B of f does not preserve Γ.We start by showing how f relates to the alternating operation of arity 5.

B Claim 4.62 For all x1, . . . , x5 ∈ 0, 1 such that a5(x1, . . . , x5) ∈ 0, 1, itholds that

f (x1, . . . , x5) = a5(x1, . . . , x5).

Proof. Suppose a5(x1, . . . , x5) ∈ 0, 1, we do a case distinction.

(u(x5,x4,u(x3,x2,x1)) ∈ 0, 1) Observe that in particular, this impliesthat u(x3, x2, x1) ∈ 0, 1, implying that x3 = x2 or x1 = x2. If x3 = x2,we obtain that u(x2, x3, u(x1, x2, x3)) = u(x3, x3, u(x1, x3, x3)) = x1, imply-ing that f (x1, . . . , x5) = u(x5, x4, u(x3, x2, x1)) and since u(a, b, c) = a −b + c if u(a, b, c) ∈ 0, 1, the result follows. Similarly, if x1 = x2, thenu(x2, x3, u(x1, x2, x3)) = u(x1, x3, u(x1, x1, x3)) = x1. Just like in the previouscase, this implies f (x1, . . . , x5) = u(x5, x4, u(x3, x2, x1)) and the result follows.

(u(x5,x4,u(x3,x2,x1)) /∈ 0, 1) We again consider two options, based onwhether u(x3, x2, x1) is in 0, 1 or not. Observe that if u(x3, x2, x1) ∈ 0, 1,the reason that u(x5, x4, u(x3, x2, x1)) /∈ 0, 1 is that x5 = u(x3, x2, x1) andx5 6= x4. But then by definition, x5 = x3− x2 + x1 and thus if x5 = 1, then x4 =

4


0 and we obtain x5 − x4 + x3 − x2 + x1 = 2 /∈ 0, 1, which is a contradiction.Similarly, if x5 = 0 we obtain that x5− x4 + x3− x2 + x1 = −1 /∈ 0, 1, whichis again a contradiction. We thereby conclude that u(x3, x2, x1) /∈ 0, 1.By definition, this implies x3 = x1 6= x2. It follows that x4 6= x5 to ensurethat a5(x1, . . . , x5) = x5 − x4 + x3 − x2 + x1 ∈ 0, 1. Furthermore, observethat x4 = x3 for this same reason. Thus, x1 = x3 = x4 6= x2 = x5. If followsthat a5(x1, . . . , x5) = x1. Furthermore, substituting this into the formula showsthat u(x5, x4, u(x3, x2, x1)) = u(x2, x1, u(x1, x2, x1)) = (x2, x1, (x1, x2, x1)) =u(x2, x3, u(x1, x2, x3)), implying that f (x1, . . . , x5) = x1, as desired. C

Now let Γ be a constraint language that is not preserved by a5. It follows fromClaim 4.62 that for any (x1, . . . , x5) ∈ domain(a5), it holds that a5(x1, . . . , x5) =f (x1, . . . , x5) and (x1, . . . , x5) ∈ domain( f|B). It follows that Γ is not preservedby f|B, which is a universal partial Maltsev operation [70, Theorem 15]. J

Theorem 4.61, together with the observation that a3 is equivalent to u|B,have the following consequence. If a constraint language Γ is unbalancedbecause some alternating operation of arity at most five does not preserve it,then Γ does not admit a Maltsev embedding, not even over an infinite domain.We leave it for future work to determine whether there exist Boolean constraintlanguages that admit finite-domain Maltsev embeddings, but are not balanced.Are there constraint languages for which the Maltsev framework yields linearkernelizations, but the polynomial-based framework does not?

4.7.5 Preservation by partial Maltsev operations versuscone-definability

There is a close relation between cone-definability of 2-OR and preserving theBoolean operation ϕ1, where ϕ1 is defined by ϕ1(x, x, y) = y, ϕ1(x, y, y) = x,and domain(ϕ1) = (0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0), (1, 1, 1).I Proposition 4.63 Let Γ be a Boolean constraint language. Then Γ is notpreserved by ϕ1 if and only if there exists R ∈ Γ such that R cone-defines 2-OR.

Proof. (⇒) Suppose R ∈ Γ is not preserved by ϕ1. We show that R cone-defines2-OR. Let m be the arity of R and let t1, t2, t3 ∈ R and u ∈ 0, 1m \ R such thatϕ1(t1, t2, t3) = u.

We cone-define the 2-OR relation on variables x1 and x2 via the tuple(y1, . . . , ym) as follows. Consider i ∈ [m], if t1,i = t2,i = t3,i = ui let y1have the constant value ui ∈ 0, 1. If (t1,i, t2,i, t3,i) = (1, 1, 0) let yi := x1, if(t1,i, t2,i, t3,i) = (0, 0, 1) let yi := ¬x1. Similarly, if (t1,i, t2,i, t3,i) = (0, 1, 1) letyi := x2, if (t1,i, t2,i, t3,i) = (1, 0, 0) let yi := ¬x2. Note that this covers all casesfor the value of (t1,i, t2,i, t3,i).

Let f : x1, x2 → 0, 1. We show that f (x1) ∨ f (x2) if and only if ( f (y1),. . . , f (ym)) ∈ R, with f as in Definition 4.7. We do a case distinction. If

4


f (x1) = 1 and f (x2) = 1, then one may verify that ( f (y1), . . . , f (ym)) = t2 ∈ R.If f (x1) = 0 and f (x2) = 1, then ( f (y1), . . . , f (ym)) = t3. If f (x2) = 0 andf (x1) = 1, then ( f (y1), . . . , f (ym)) = t1 ∈ R. Finally, if f (x1) = f (x2) = 0,then ( f (y1), . . . , f (ym)) = u /∈ R, as desired.

To complete the cone-definition, it remains to show that there are i, j ∈ [m]such that yi ∈ x1,¬x1 and yj ∈ x2,¬x2. Now observe that t1 6= t2 6= t3:if t1 = t2 then by definition of ϕ1 we have ϕ1(t1, t2, t3) = t3 ∈ R, and similarlyif t2 = t3 then ϕ1(t1, t2, t3) = t1 ∈ R. Since ϕ1(t1, t2, t3) /∈ R, this cannotbe. If t1 = t3 6= t2, then consider a coordinate k ∈ [m] for which t1,k =t3,k 6= t2,k; but then ϕ1 is not defined on this coordinate. Consequently, allthree tuples t1, t2, t3 are distinct. Now, since the argumentation above showsthat ( f (y1), . . . , f (ym)) can evaluate to each of t1, t2, t3, depending on thevalues assigned to x1 and x2, it follows that f depends on both x1 and x2 whichimplies the existence of the desired positions i, j ∈ [m].

(⇐) Suppose there exists R ∈ Γ of some arity m that cone-defines 2-OR. Let(y1, . . . , ym) be the tuple witnessing this, with yi ∈ 0, 1 ∪ x1,¬x1, x2,¬x2.Define f : x1, x2 → 0, 1 as f (x1) = 0 and f (x2) = 1. Let t1 := ( f (y1),. . . , f (ym)) ∈ R where f is the natural extension of f . Similarly, for f ′(x1) =

f ′(x2) = 1 let t2 := ( f ′(y1), . . . , f ′(ym)) and for f ′′(x1) = 1, f ′′(x2) = 0let t3 := ( ˆf ′′(y1), . . . , ˆf ′′(ym)). Finally, let u be the tuple witnessing that forf ′′′(x1) = f ′′′(x2) = 0, ( ˆf ′′′(y1), . . . , ˆf ′′′(ym)) = u /∈ R. We will show thatϕ1(t1, t2, t3) = u, thus showing that ϕ1 does not preserve R.

We show that ui = t1,i − t2,i + t3,i for each i ∈ [m]. Suppose yi = 0,then by the definition above t1,i = t2,i = t3,i = ui = 0 as f (yi) = 0 for anyf : x1, x2 → 0, 1. Thus, ui = t1,i − t2,i + t3,i in this case. Similarly, if yi = 1we obtain ui = 1 = t1,i − t2,i + t3,i.

Suppose yi = x1. Then by the definition above, t1,i = 0 and t2,i = t3,i = 1.Furthermore, ui = 0 = 1− 1 + 0 as desired. For yi = ¬x1, we have t1,i = 1and t2,i = t3,i = 0 and ui = 1, as desired. It is straightforward to verify thatalso when yi = x2 and yi = ¬x2 we obtain ui = t1,i − t2,i + t3,i, concluding theproof. J

In their work, [71, Theorem 32] Lagerkvist and Wahlström show the follow-ing. For every integer d ≥ 3 and for every finite set P of partial polymorphismsfor which the set of Boolean relations preserved by P can pp-define all Booleanrelations, there is a polynomial-parameter transformation [16,18] from d-CNF-SAT instances with n variables, to equivalent instances of CSP(Γ) on O(nc)variables. Here c ∈ N depends only on P and Γ is a finite Boolean constraintlanguage whose relations are preserved by all operations in P. AssumingNP * coNP/poly, Theorem 2.9 states that the problem d-CNF-SAT has no kernelof bitsize O(nd−ε) for any ε > 0. Via the cited transformation, this impliesCSP(Γ) has no sparsification of bitsize O(nd/c−ε). Hence knowing that theconstraint language is preserved by any finite set of partial polymorphisms

4

4.8 Conclusion 101

does not guarantee any polynomial compressibility. Note that any constraintlanguage that can cone-define k-OR for some k ≥ 2 can also cone-define 2-OR,while Proposition 4.63 shows that being able to cone-define 2-OR is equivalentto being violated by the partial operation ϕ1. A constraint language preservedby the single partial polymorphism ϕ1 therefore does not cone-define k-OR forany k ≥ 2. Using the transformation and incompressibility results mentionedabove, we find (assuming NP * coNP/poly) that for any d′ ∈ R there is a finiteBoolean constraint language Γd′ that does not cone-define 2-OR or larger, butfor which CSP(Γd′) does not have a sparsification of bitsize O(nd′). A gen-eral characterization of optimal sparsification size by the arity of the largestcone-definable OR is therefore impossible.

4.8 ConclusionThe ultimate goal of this line of research is to fully classify the sparsifiabilityof CSP(Γ), depending on Γ. In particular, we would like to classify those Γ forwhich O(n) sparsifiability is possible. In this chapter, we have shown that Γ be-ing balanced is a sufficient condition to obtain a linear sparsification; it is tempt-ing to conjecture that this condition is also necessary. The simplest example of aBoolean constraint language for which we currently do not understand whetheror not it has a linear sparsification, consists of a single Boolean relation R∗ ofarity nine. Relation R∗ has five satisfying assignments s1, . . . , s5 ∈ 0, 19:

R∗ =

s1 = (1, 0, 0, 1, 1, 1, 0, 0, 1),s2 = (0, 0, 0, 0, 1, 0, 0, 1, 1),s3 = (0, 1, 0, 1, 1, 0, 1, 1, 0),s4 = (0, 0, 0, 1, 0, 1, 1, 0, 0),s5 = (0, 0, 1, 0, 0, 1, 1, 1, 1)

Relation R∗ is not balanced, since

s1 − s2 + s3 − s4 + s5 = (1, 1, 1, 1, 1, 1, 1, 1, 1) /∈ R∗.

By Theorem 4.61, it follows that R∗ is violated by some universal partial Maltsevoperation. Hence neither our polynomial framework for compression, northe Maltsev-based approach yields a linear sparsification for CSP(R∗). Onthe other hand, no superlinear lower bound is currently known. Resolvingthe sparsification complexity of CSP(R∗) is the first obstacle in a generalclassification of linearly-compressible Boolean CSPs.

Note that the matrix consisting of the five satisfying assignments of R∗ hasa very succinct description: there is one column for each vector x ∈ 0, 15

for which x1 − x2 + x3 − x4 + x5 = 1, except for the all-ones column. Thepresence of these columns ensures that ϕ1 preserves R∗, for the simple reasonthat ϕ1(si, sj , sk) for i, j, k ∈ [5] is only defined when i = j or j = k, in which

4


case the output tuple equals one of the input tuples. In other cases, there isan index ` for which si,` − sj,` + sk,` ∈ −1, 2, making the output undefined.Hence, using Proposition 4.63, relation R∗ does not cone-define 2-OR.

Observe that CSP(R∗) is NP-complete by Schaefer’s dichotomy theorem(Theorem 2.37): R∗ does not have the constantly-1 operation as a polymor-phism, nor the constantly-0 operation; the tuples s1, s2 show that R∗ is notpreserved by the binary AND operation, nor by the binary OR operation; andthe tuples s1, s2, s3 show that R∗ is not preserved by the ternary majority orminority operations.

Resolving the sparsification complexity of CSP(R∗), and subsequentlyobtaining a complete characterization of the Boolean CSPs that admit a lin-ear compression, form the main open problems that remain. Other directionsinclude the investigation of constraint languages of larger arity and the charac-terization of the CSPs that admit sparsifications with a quadratic, or even largerpolynomial number, of constraints.

Acknowledgements We would like to thank Emil Jerábek for the proof ofLemma 4.20. We thank Andrei Bulatov for insightful discussions, and thankMagnus Wahlström for sharing his ideas and an initial version of the proof ofTheorem 4.61.

5

Chapter 5Sparsification Lower Bounds for

H-Coloring

In this chapter, we will provide a sparsification lower bound for the H-COLORING

problem, showing that for a large class of graphs H, it does not have a non-trivial sparsification unless NP ⊆ coNP/poly. This adds to the growing list ofproblems for which the existence of non-trivial sparsification algorithms hasbeen ruled out under the assumption that NP * coNP/poly, which includesVERTEX COVER [33], DOMINATING SET [59], FEEDBACK ARC SET [59], andTREEWIDTH [56]. To the best of our knowledge, to date there is no non-trivialsparsification algorithm for any NP-hard problem that is defined on generalgraphs. This chapter adds another problem to this growing list of graph problemsnot admitting a non-trivial sparsification.

A second motivation for studying H-COLORING, is its relation to constraintsatisfaction problems, as described in detail in Section 2.8. In short, any NP-hard H-COLORING problem translates into a CSP with a non-Boolean domainin which constraints have arity two. There have been a number of non-trivialadvances in the sparsification for CSPs in over the Boolean domain, as can beseen from Chapters 3 and 4 in this thesis, and from recent work by Lagerkvistand Wahlström [70]. A natural step in that line of research is to consider CSPsover larger domain sizes, of which H-COLORING problems are an example.

Finally, the lower bound for H-COLORING provided in this chapter general-izes earlier results on sparsification of q-COLORING problems [60].

5

104 Sparsification Lower Bounds for H-Coloring

Related work It is well-known that H-COLORING is NP-hard if H is non-bipartite and contains no self-loops [49], and polynomial-time solvable other-wise. The classical complexity of H-COLORING has also been investigated whenrestricted to planar [76], minor-closed [36], and bounded-degree [43,84] inputgraphs G.

Overview We start by showing that when H corresponds to Cα for some oddvalue for α, the H-COLORING problem has no non-trivial sparsification unlessNP ⊆ coNP/poly, in Section 5.1. This will be shown by giving a degree-2 cross-composition from CLIQUE to Cα-COLORING. Recall that Cα is a the cycle on αvertices.

In Section 5.2 we provide some additional preliminaries. In the remain-der of this chapter (Section 5.3), we transfer the lower bound thus shown forCα-COLORING to other choices for the graph H, using linear-parameter transfor-mations. In particular, we will show the lower bound holds in case H contains atriangle, and in case H has odd-girth α > 3 and no edge in H is contained intwo different Cα-subgraphs.

To obtain a linear-parameter transformation, we cannot easily re-use thetechniques that were used in the NP-completeness proofs for H-COLORING

when H is nonbipartite. In particular, the constructions used by Hell andNešetril introduce gadgets for every edge in the graph, which is not allowed ina linear-parameter transformation. For the same reasons, alternative proofs forthe H-COLORING dichotomy cannot be used in our setting. Instead, we showthat if H satisfies the relevant preconditions, then there is a subset B of thevertices of H, such that H[B] is homomorphically equivalent to an odd cycleand furthermore, there exists a reduction from Cα-COLORING to H-COLORING

that attaches a constant-size gadget to every vertex, forcing it to be coloredwith a color in B. Hence an H-coloring satisfying the constraints enforced bythese gadgets only uses the vertices of a cycle in H to color G, and is thereforea Cα-coloring

5.1 Sparsification lower bound for Cα-ColoringIn this section, we will prove a sparsification lower bound for Cα-COLORING.We show that Cα-COLORING on n vertices has no (generalized) kernel of sizeO(n2−ε) for any ε > 0 and odd α ≥ 3, unless NP ⊆ coNP/poly. We will dothis by giving a degree-2 cross-composition from CLIQUE to Cα-COLORING, as inDefinition 2.13. An input to the decision problem CLIQUE consists of a graph Gand integer k, and asks whether G has a clique on k vertices.

For ease of presentation, we will first prove the lower bound for Cα-LIST

COLORING instead of Cα-COLORING. The input to Cα-LIST COLORING is a graph Gtogether with a function L that assigns each vertex v of G a list L(v) ⊆ V(Cα).

5

5.1 Sparsification lower bound for Cα-Coloring 105

γ1 γ2

α − 3 vertices

α ≥ 5 α = 5

γ1 γ2

α = 3

γ1 γ2

Figure 5.1 The gadget ineq-gadget for Lemma 5.1, depicted for α ≥ 5 (left), α = 5(middle) and α = 3 (right).

The question is whether there is a homomorphism from G to Cα that maps eachvertex v to a color on its list L(v). We will show later that this construction willalso imply the sparsification bound for Cα-COLORING.

Before presenting the cross-composition, we need the following additionallemmas.

I Lemma 5.1 For all odd α ≥ 3, there exists a Cα-COLORING instance with O(α)vertices called ineq-gadget that has distinguished vertices γ1, γ2, such that anymapping f : γ1, γ2 → V(Cα) can be extended to a Cα-coloring of the ineq-gadgetif and only if f (γ1) 6= f (γ2).

Proof. A gadget with the desired properties was also developed by Maurer etal. [79], we will use a simplified version of their gadget.

Define the ineq-gadget as a path on α− 1 vertices. Let the first vertex of thepath be γ1 and let the last vertex be γ2. The gadget is depicted in Figure 5.1.Observe that for α = 3, the gadget simply consists of the vertices γ1 and γ2,connected by an edge.

Since the gadget is a path of length α − 2, the colors used on this pathby any Cα-coloring form a walk in Cα of length α − 2. Suppose a mappingf : γ1, γ2 → V(Cα) can be extended to a coloring f ′ of the whole gadget. Ifnow, f (γ1) = f (γ2), the existence of f ′ means that there is a closed walk oflength α− 2 in Cα, implying that there is an odd cycle in Cα of length strictlysmaller than α, which is a contradiction. Thus, f (γ1) 6= f (γ2).

In the other direction, given a Cα-coloring f : γ1, γ2 for which f (γ1) 6=f (γ2) we can extend it to a Cα-coloring of the entire gadget by choosing alength-(α− 2) walk in Cα from f (γ1) to f (γ2). Color the vertices of the pathusing the vertices from this walk, in the same order. J

In our construction below, we will use the phrase connect u to v with anineq-gadget to mean the following: create a new ineq-gadget and identify uwith γ1 and v with γ2. The ineq-gadget can be used to make sure two verticesof the graph are colored with two distinct colors.

We will now define so-called blocking-gadgetsα, that allow us to forbid acertain coloring on a set of vertices in a Cα-LIST COLORING instance. Thegadget will be based on a gadget for classic graph coloring created by Jaffke andJansen [55]. The following lemma is a rephrased version of Lemma 14 in [55].

5


I Lemma 5.2 There exists a polynomial-time algorithm that, given a tuple c =(c1, . . . , cm) ∈ [q]m, outputs a q-LIST COLORING instance called blocking-gadget(c)that is a path of length O(m) and contains distinguished vertices (π1, . . . , πm).A mapping f : πi | i ∈ [m] → [q] can be extended to a q-list coloring ofblocking-gadget(c) if and only if f (πi) = ci for some i ∈ [m]. J

Using this gadget for q-LIST COLORING, we show how to construct a gadget forCα-LIST COLORING in the following lemma.

I Lemma 5.3 There is a polynomial-time algorithm that, given c = (c1, . . . , cm) ∈[V(Cα)]m, outputs a Cα-LIST COLORING instance with O(α · m) vertices calledblocking-gadgetα(c) that contains distinguished vertices (π1, . . . , πm).

A mapping f : πi | i ∈ [m] → [V(Cα)] can be extended to a Cα-coloring ofblocking-gadgetα(c) if and only if f (πi) = ci for some i ∈ [m].

Proof. Enumerate the vertices in Cα such that V(Cα) = 1, . . . , α arbitrarilyand rename the colors in c accordingly. To obtain blocking-gadgetα(c) startfrom blocking-gadget(c), with the same lists L(v) for each vertex.

Replace every edge in blocking-gadget(c) by an ineq-gadget as follows. Ifu, v is an edge in blocking-gadget(c), remove it and add a new ineq-gadgetwith distinguished vertices γ1 and γ2. Identify vertex γ1 with u and γ2 with v.This concludes the construction of blocking-gadgetα(c).

Since blocking-gadget(c) is a path, it only has O(m) edges and vertices.Thereby, blocking-gadgetα(c) has at most O(αm) vertices.

One can now verify that by the properties of the ineq-gadget given inLemma 5.1 and the properties of a blocking-gadget given in Lemma 5.2, itfollows that a coloring f of π1, . . . , πm can only be extended to a Cα-coloring ofblocking-gadgetα(c) if there exists i ∈ [m] such that f (πi) = ci. J

We will use the above lemma when we want to forbid one particular col-oring c1, . . . , cm ∈ V(Cα) from appearing on a particular sequence of verticesv1, . . . , vm of the graph under construction, without adding further restrictions.This can now be achieved by adding a blocking-gadgetα((c1, . . . , cm)). Thenfor each i ∈ [m], connect πi to vi with an ineq-gadget. This ensures that forany proper coloring f , it holds that f (vi) 6= f (πi) for all i ∈ [m]. Thereby,if f (vi) = ci for all i (which we want to forbid), then f (πi) 6= ci for all iand thus this coloring cannot be extended to properly color all vertices ofblocking-gadgetα(c). If however f (vi) 6= ci for some i, vertex πi can be col-ored with ci and this coloring can be extended to color blocking-gadgetα(c), asdesired.

Using the gadgets introduced above, we can now prove the sparsificationlower bound for Cα-COLORING.

I Theorem 5.4 Let α ≥ 3 be an odd integer. Cα-COLORING parameterized by thenumber of vertices n admits no generalized kernel of size O(n2−ε) for any ε > 0,unless NP ⊆ coNP/poly.

5


Proof. We will start by giving a degree-2 cross-composition from CLIQUE toCα-LIST COLORING. We define a polynomial equivalence relation R (Def. 2.12)on instances of CLIQUE. Let any two instances that ask for a clique that islarger than their respective number of vertices be equivalent; these are alwaysno-instances. Let two instances of CLIQUE be equivalent under R, when theinput graphs have same number of vertices and the problems ask for a cliqueof the same size. It is easy to verify that R is indeed a polynomial equivalencerelation.

By duplicating one of the inputs multiple times if needed, we can assumethe number of inputs to the cross-composition is a square. Therefore, assumewe are given t instances of CLIQUE, such that t′ :=

√t is integer and such

that each instance has n vertices and asks for a clique of size k. Enumeratethe given instances as Xi,j for i, j ∈ [t′] and let Gi,j denote the correspondinggraph. Name the vertices in each instance arbitrarily as x1, . . . , xn. We nowcreate an instance G′ of the Cα-LIST COLORING problem. To do this, we usethe colorset L, L′, R, R′, C ⊆ V(Cα). For α ≥ 5 the colors are chosen suchthat (L, L′, C, R′, R) forms a (simple, but not necessarily induced) path in Cα.If α = 3, simply pick three distinct colors L, R, and C and let L′ := R andR′ := L. Observe that the pairs (L, C), (C, C), and (C, R) have a commonneighbor in L′, R′, but that (L, R) does not (this pair has C as a commonneighbor but that is irrelevant to the construction). Refer to Figure 5.2 for asketch of G′. We construct a Cα-LIST COLORING instance consisting of a graph G′

and list L(v) ⊆ V(Cα) for each v ∈ V(G′).

1. For each j ∈ [t′], ` ∈ [n], and m ∈ [k] create vertices pj`,m and qj

`,m. Let

L(qj`,m) := L, C and L(pj

`,m) := R, C. Let Qj contain all created vertices

pj`,m and qj

`,m. Let Q :=⋃

j∈[t′ ] Qj.

2. Let c := (L, C). For each j ∈ [t′], ` ∈ [n], and m ∈ [k], create a newblocking-gadgetα(c) and let π1, π2 be its distinguished vertices. Connect π1

to qj`,m with an ineq-gadget and connect π2 to pj

`,m via an ineq-gadget. This

ensures that any coloring that assigns L to qj`,m, must assign color R to pj

`,m.

3. For each f ∈ ([k]2 ), each e = (e1, e2) ∈ [n]2, and each i ∈ [t′], create vertices

rie, f and si

e, f . Let Si := ri(e1,e2), f , si

(e1,e2), f | f ∈ ([k]2 ), (e1, e2) ∈ [n]2. Note

that Si contains (k2) vertices for each possible edge e that could exist in an

n-vertex graph (including self-loops). Let S :=⋃

i∈[t′ ] Si. For every vertex inS, let its list be L′, R′.

The goal of the construction is to ensure that the Cα-LIST COLORING instance(G′,L) acts as the logical OR of the CLIQUE instances Xi,j, so that G′ has a Cα-coloring respecting the lists if and only if some input graph Gi,j has a clique

5


1, 2 1, 3 2, 3f =

e = (1, 1)

(1, 2)

(1, 3)

(1, 4)

(2, 2)

(2, 3)

(2, 4)

(2, 1)

(3, 2)

(3, 3)

(3, 4)

(3, 1)

(4, 2)

(4, 3)

(4, 4)

(4, 1)

S1 S2 S3

Q1 Q2

a1 a2 a3

b1 b2 b3

q13,1

1, 2 1, 3 2, 3f =

e = (1, 1)

(1, 2)

(1, 3)

(1, 4)

(2, 2)

(2, 3)

(2, 4)

(2, 1)

(3, 2)

(3, 3)

(3, 4)

(3, 1)

ColorsL

′,R′

Colors L, C

Colors L, C

p14,2 p1

4,3

p11,3

q13,2

s2(3,4),2,3

s1(4,4),1,2

1, 2 1, 3 2, 3f =

e = (1, 1)

(1, 2)

(1, 3)

(1, 4)

(2, 2)

(2, 3)

(2, 4)

(2, 1)

(3, 2)

(3, 1)

(4, 3)

(4, 4)

· · ·

· · ·

· · ·

Q3

L,CR

,C

r1(4,4),1,2

Figure 5.2 A sketch of G′ for t = 9, k = 3, and n = 4. The only edges drawn are the edgescreated in Step 4, for e = (3, 4), i = 2, and j = 1, in case the edge x3, x4 /∈ E(G2,1).

of size k. The part of G′ constructed so far allows colorings of G′ to encodethe vertex set of a k-clique through its behavior on a set Qj. To find a properlist coloring of G′ entails highlighting vertices from one of the sets Qj thatcorrespond to a clique in instance Xi,j for some i ∈ [t′]. The highlighting property

will be enforced by ensuring at least one vertex in each set qj`,m | ` ∈ [n]

receives color L. The index of the vertex that is colored L encodes a vertex in theclique to which the coloring corresponds. The vertices in Si are used to verifythat the selected vertices form a clique in Gi,j. The next steps add additionalvertices and edges, in order to achieve these properties.

4. For each i, j ∈ [t′], consider instance Xi,j. For all f ∈ ([k]2 ) and e = (e1, e2) ∈[n]2, connect qj

e1, f1to ri

e, f , and connect pje2, f2

to sie, f , whenever xe1 , xe2 /∈

5


E(Gi,j). Here f1 < f2 are such that f = f1, f2.

The above step uses vertices rie, f and si

e, f to ensure that when xe1 , xe2 is notan edge in some instance, we cannot select both vertices in a clique for thisinstance.

5. Add vertices ai with list L, C for all i ∈ [t′], and let A := ai | i ∈ [t′].

6. Similarly, add vertices bj with list L, C for all j ∈ [t′] and let B := bj | j ∈[t′].

7. Let c := (L, . . . , L) be the tuple consisting of t′ copies of color L. Add ablocking-gadgetα(c) and for all i ∈ [t′] connect the distinguished vertex πiof blocking-gadgetα(c) to ai. Add another blocking-gadgetα(c) and for allj ∈ [t′] connect the distinguished vertex πj of this new gadget to bj.

The steps above ensure that there exist vertices ai ∈ A and bj ∈ B that receivecolor C. This will indicate that Xi,j is selected. Now we put further constraintson the coloring of Qi and Sj when they correspond to a selected instance.

8. Add blocking-gadgetsα and ineq-gadgets to ensure that rie, f and si

e, f receivethe same color whenever ai has color C, as follows. For all i ∈ [t′], f ∈([k]2 ), and e ∈ [n]2, for all (c1, c2, c3) ∈ L, L′, C, R′, R3 such that c1 = Cand c2 6= c3, add a new blocking-gadgetα(c). Connect ai to π1 with anineq-gadget, connect ri

e, f to π2 with an ineq-gadget and connect sie, f to π3

with an ineq-gadget.

9. Let c := (C, . . . , C) be the tuple consisting of n + 1 copies of color C. Foreach j ∈ [t′], and m ∈ [k], add a new blocking-gadgetα(c). Connect πn+1

to bj with an ineq-gadget and for each ` ∈ [n] connect qj`,m to π` with an

ineq-gadget.

This concludes the construction of G′. We will start by proving a number ofclaims about the construction, starting with the “selection property” of the setsA and B.

B Claim 5.5 Let c be a valid Cα-list coloring of G′. Then there exist i, j ∈ [t′] suchthat c(ai) = c(bj) = C.

Proof. Since c is a proper coloring, it properly colors the blocking-gadgetsαcreated for sets A (respectively, B) in Step 7 of the construction. By Lemma 5.3,there exists a vertex πi in the first gadget and a vertex πj in the second gadgetsuch that c(πi) = c(πj) = L. As a result of the ineq-gadgets added in this step,c(πi) 6= c(ai) and c(πj) 6= c(bj). Since L(ai) = L(bj) = L, C, it follows thatc(bj) = c(ai) = C. C

5


We will now show that in any proper Cα-LIST COLORING, there exists a jsuch that k vertices in Qj will be selected, meaning they receive color L.

B Claim 5.6 Let c be a valid Cα-list coloring of G′. There exists j ∈ [t′] suchthat for all m ∈ [k], the set qj

`,m | ` ∈ [n] contains at least one vertex q withc(q) = L.

Proof. By Claim 5.5, there exists j ∈ [t′] such that c(bj) = C. We show that thischoice for j has the desired property. Suppose for a contradiction that for somem ∈ [k], for all q ∈ qj

`,m | ` ∈ [n] it holds that c(q) = C. Let c := (C, . . . , C)and consider the blocking-gadgetα(c) added in Step 9 for which for all ` ∈ [n]vertex π` was connected to qj

`,m with an ineq-gadget and bj was connectedto πn+1 with an ineq-gadget. It follows from Lemma 5.1 that for this gadgetc(πi) 6= C for all i. Using Lemma 5.3, this contradicts that c is a proper coloringof G′.

Thereby, each set qj`,m | ` ∈ [n] contains at least one vertex q with

c(q) 6= C. Since L(q) = L, C, it follows that c(q) = L. C

We continue by proving a claim about the coloring of the set Si, when i issuch that ai received color C.

B Claim 5.7 Let c be a valid Cα-list coloring of G′. There exists i ∈ [t′] such thatfor all e ∈ [n]2 and f ∈ ([k]2 ) it holds that c(ri

e, f ) = c(sie, f ).

Proof. Choose i ∈ [t′] such that c(ai) = C, such a value for i exists by Claim 5.5.We show that this choice for i has the desired property.

Suppose for contradiction that there exist e ∈ [n]2, f ∈ ([k]2 ) such thatc(ri

e, f ) 6= c(sie, f ). Let c := (C, c(ri

e, f ), c(sie, f )). Verify that in Step 8, a blocking-

gadgetα(c) was added and its distinguished vertices π1, π2, and π3 were con-nected with ineq-gadgets to ai, ri

e, f , and sie, f , respectively. By Lemma 5.1, it

follows that c(π1) 6= C, c(π2) 6= c(rie, f ) and c(π3) 6= c(si

e, f ). But by Lemma 5.3this contradicts that c properly colors this blocking-gadgetα, which concludesthe proof. C

B Claim 5.8 Let c be a valid Cα-list coloring of G′. Let j ∈ [t′], m ∈ [k], and` ∈ [n]. Then c(qj

`,m) = L implies that c(pj`,m) = R.

Proof. Suppose for contradiction that c(qj`,m) = L but c(pj

`,m) 6= R. Since

L(pj`,m) = R, C this implies that c(pj

`,m) = C. Let c := (L, C). In Step 2, ablocking-gadgetα(c) was added and its distinguished vertex π1 was connectedto qj

`,m with an ineq-gadget, and similarly π2 was connected to pj`,m with an

ineq-gadget. By Lemma 5.1 it follows that c(π1) 6= c(qj`,m) = L and c(π2) 6=

5


c(pj`,m) = C. But by Lemma 5.3 it follows that this coloring cannot be extended

to color the blocking-gadgetα, which contradicts that c is a proper coloring ofG′. C

When G′ is Cα-list-colorable, we need to prove that one of the input instanceshas a clique of size k. The following claim will be used to show that we neverselect two vertices into this clique if there is no edge between them.

B Claim 5.9 Let c be a valid Cα-list coloring of G′, such that c(rie, f ) = c(si

e, f )

for some e = (e1, e2) ∈ [n]2, f = f1, f2 ∈ ([k]2 ) with f1 < f2, and i ∈ [t′].

If xe1 , xe2 /∈ E(Gi,j) for some j ∈ [t′], then c(qje1, f1

) 6= L or c(qje2, f2

) 6= L.

Proof. Let c be a valid coloring. Suppose for contradiction that there existi, j ∈ [t′], e = (e1, e2) ∈ [n]2, and f = f1, f2 ∈ ([k]2 ) such that c(ri

e, f ) = c(sie, f )

and xe1 , xe2 /∈ E(Gi,j), and suppose c(qje1, f1

) = c(qje2, f2

) = L. It follows from

Claim 5.8 that c(pje2, f2

) = R.

Since xe1 , xe2 /∈ E(Gi,j), it holds that rie, f , qj

e1, f1, si

e, f , pje2, f2 ∈ E(G′) by

Step 4 of the construction. Since c(rie, f ) = c(si

e, f ) and L(rie, f ) = L(si

e, f ) =

L′, R′ there should exist a single color in L′, R′ to properly color bothvertices. This is however impossible, since ri

e, f is connected to a vertex with

color L, while sie, f is already connected to a vertex of color R. Coloring c(ri

e, f ) =

c(sie, f ) = R′ would not properly color the edge ri

e, f , qje1, f1 since R′ and L are

not neighbors in Cα, while setting c(rie, f ) = c(si

e, f ) = L′ would not properly

color edge sie, f , pj

e2, f2 since L′ and R are not neighbors in Cα (note that in the

case of α = 3, indeed L′ = R and R′ = L). This contradicts that c is a properCα-coloring of G′. C

The structural properties derived so far, lead to the proof of the follow-ing claim. Combined with Claim 5.11, it will show that the composed in-stance (G′,L) acts as the logical OR of the CLIQUE inputs.

B Claim 5.10 If some input graph Gi∗ ,j∗ has a clique of size k, then G′ is Cα-listcolorable.

Proof. Take such i∗ and j∗, and let x1, . . . , xn be the vertices of Gi∗ ,j∗ . PickI = (i1, . . . , ik) ⊆ [n] such that xi | i ∈ I is a clique of size k in Gi∗ ,j∗ , therebyall indices in I are distinct. We show how to define a proper Cα-list coloring cof G′.

Let c(ai∗) := c(bj∗) := C. For all i ∈ [t′] with i 6= i∗, let c(ai) := L and forall j ∈ [t′] with j 6= j∗ let c(bj) := L.

5


For all m ∈ [k], let c(qj∗

im ,m) := L and let c(pj∗

im ,m) := R. Color all remainingvertices in Q with color C.

For all e = (e1, e2) ∈ [n]2 and f = f1, f2 ∈ ([k]2 ) with f1 < f2, let c(ri∗e, f ) :=

c(si∗e, f ) := L′ when vertex ri∗

e, f is connected to vertex qj∗

e1, f1and c(qj∗

e1, f1) = L

(note that the color of this vertex is already defined). Otherwise, let c(ri∗e, f ) :=

c(si∗e, f ) := R′. For all i ∈ [t′] such that i 6= i∗, let c(ri

e, f ) := L′ and define

c(sie, f ) := R′.

Before showing how to extend c to properly color the blocking-gadgetsα andineq-gadgets, we first show that the coloring defined so far is proper. It is easyto verify that the defined coloring respects the lists of the vertices. The onlyedges to consider are those between a vertex in S and a vertex in Q. There aretwo types of edges in the graph.

• Edges of the form rie, f , qj

`,m for e = (e1, e2) and f = f1, f2 with f1 < f2.For this edge to exist, it must hold that ` = e1 and m = f1. If j 6= j∗ thisedge is properly colored since c(qj

`,m) = C and c(rie, f ) ∈ L′, R′. If i 6= i∗,

this edge is properly colored since c(qj`,m) ∈ L, C and c(ri

e, f ) = L′. So

suppose i = i∗ and j = j∗. By the definition of the coloring, if c(qj`,m) = L

and this edge exists, it follows that c(rie, f ) was defined as L′ and the edge is

properly colored. Otherwise, c(qj`,m) = C (or the edge does not exist), and

since c(rie, f ) ∈ R

′, L′ the edge is again properly colored.

• Edges of the form sie, f , pj

`,m for e = (e1, e2) and f = f1, f2 with f1 < f2.For this edge to exist, it must hold that ` = e2 and m = f2. Suppose j 6= j∗,then since c(si

e, f ) ∈ L′, R′ and c(pje2, f2

) = C this edge is properly colored.

Similarly, if i 6= i∗ then c(sie, f ) = R′. Since c(pj

e2, f2) ∈ R, C the edge is

properly colored.

So suppose i = i∗ and j = j∗. If c(pj∗

e2, f2) = C or c(si∗

e, f ) = R′ it is easy to verify

that this coloring is always proper. So suppose c(pj∗

e2, f2) = R and c(si∗

e, f ) = L′.

The choice of c(si∗e, f ) = L′ implies that ri∗

e, f is connected to qj∗

e1, f1and that

c(qj∗

e1, f1) = L. The fact that c(pj∗

e2, f2) = R implies that we defined c(qj∗

e2, f2) = L.

Note that by definition, f1 6= f2. But c(qj∗

e2, f2) = c(qj∗

e1, f1) = L implies that

e1, e2 ∈ I and e1 6= e2 since f1 6= f2. However, the existence of the edgesri∗

e, f , qj∗

e1, f1 and si∗

e, f , pj∗

e2, f2 in G′ implies that xe1 , xe2 /∈ E(Gi∗ ,j∗), which

contradicts that xi | i ∈ I is a clique in Gi∗ ,j∗ . Thus we conclude that the

5


edge sie, f , pj

`,m is properly colored.

It remains to extend the given coloring to properly color all gadgets. Wewill use that whenever vertices π1, . . . , πm of some blocking-gadgetα(c) areconnected only to vertices v1, . . . , vm with ineq-gadgets, it follows that thecoloring of v1, . . . , vm can be extended to color blocking-gadgetα(c) and theineq-gadgets connecting them whenever there exists i ∈ [m] such that c(vi) 6= ci.This follows immediately from Lemmas 5.1 and 5.3. We will show how to colorthe gadgets created in each step of the construction.

Consider a blocking-gadgetα(c) added in Step 2, note that in this casec = (c1, c2) = (L, C). For every j ∈ [t′], ` ∈ [n], and m ∈ [k] we defined thateither c(qj

`,m) = C 6= c1 or c(qj`,m) = L and c(pj

`,m) = R 6= c2. In both cases, itis easy to see that this coloring can be extended to the blocking-gadgetα and alladded ineq-gadgets.

It is easy to see that c can be extended to color the two blocking-gadgetsαand the ineq-gadgets added in Step 7, since we defined c(ai∗) = C 6= L andc(bj∗) = C 6= L.

Consider a blocking-gadgetα((c1, c2, c3)) added in Step 8, let it be added forsome i ∈ [t′], f ∈ ([k]2 ) and e ∈ [n]2. Note that c1 = C and c2 6= c3. If i 6= i∗,we defined ai = L 6= c1 and we can extend the coloring to this gadget and theconnecting ineq-gadgets. If i = i∗, we defined c(ri∗

e, f ) = c(si∗e, f ) and this coloring

can again be extended to all gadgets.Consider a blocking-gadgetα(c) added in Step 9. Note that c = (C, . . . , C).

If j 6= j∗ we defined c(bj) = L 6= C and the coloring can be extended to thegadgets. If j = j∗, let this gadget be added for some m ∈ [k]. Then we definedc(qj∗

im ,m) = L 6= C and again c can be extended to the gadgets.Thereby, we have shown that c can be extended to a valid Cα-list coloring of

G′, and thus G′ is Cα-list colorable. C

B Claim 5.11 If G′ has a proper Cα-list coloring, then there exist i∗, j∗ ∈ [t′] suchthat Gi∗ ,j∗ has a clique of size k.

Proof. Let c be a proper Cα-list coloring of G′. Pick j∗ such that for all m ∈ [k],the set qj∗

`,m | ` ∈ [n] contains at least one vertex of color L, using Claim 5.6.

Choose i∗ such that for all e ∈ [n]2, f ∈ ([k]2 ) it holds that c(ri∗e, f ) = c(si∗

e, f ), byClaim 5.7.

For each m ∈ [k], choose exactly one index im such that c(qj∗

im ,m) = L. Notethat such an im exists by the choice of j∗. We define a clique Z in Gi∗ ,j∗ asfollows. Remember V(Gi∗ ,j∗) = x1, . . . , xn.

Z := xim | m ∈ [k].

5


We start by verifying that |Z| ≥ k. It is sufficient to show that all im are distinct,which we prove by showing that there are no ` ∈ [n] and m, m′ ∈ [k] such thatm 6= m′ and c(qj∗

`,m) = c(qj∗

`,m′) = L. This follows immediately from the choiceof i∗ and Claim 5.9, since x`, x` /∈ E(Gi∗ ,j∗) because the input graphs have noself-loops. It follows that |Z| = k. It remains to verify that Z is a clique in Gi∗ ,j∗ .

Suppose for contradiction that there exist distinct vertices x`, x`′ ∈ Z suchthat x`, x`′ /∈ E(Gi∗ ,j∗). Since x`, x`′ ∈ Z, it follows that there exist m, m′ ∈ [k]

with m 6= m′ such that c(qj∗

`,m) = c(qj∗

`′ ,m′) = L.

However, it follows from the choice of i∗ and Claim 5.9 that c(qj∗

`,m) 6= L or

c(qj∗

`′ ,m′) 6= L. This is a contradiction with the definition of Z. It follows that Zis a clique of size k in Gi∗ ,j∗ and thus Xi∗ ,j∗ is a yes-instance. C

It is easy to see that the construction of G′ can be done in polynomial time.It follows from Claims 5.10 and 5.11 that G′ is Cα-list colorable if and only ifthere exist i, j ∈ [t′] such that Xi,j is a yes-instance. It remains to bound the sizeof G′.B Claim 5.12 The number of vertices of G′ is bounded by O(

√t · n2k2α).

Proof. We bound the number of vertices of G′. The step of the construction inwhich the vertices were added to G′ is indicated.

|V(G′)| ≤ nk√

t · 2︸︷︷︸Step 1

+ nk√

t · O(α)︸︷︷︸Step 2

+ n2k2√

t · 2︸︷︷︸Step 3

+ 2 ·√

t︸︷︷︸Step 5,6

+ 2 · O(α√

t)︸︷︷︸Step 7

+ n2k2√

t · O(α)︸︷︷︸Step 8

+ k√

t · O(nα)︸︷︷︸Step 9

= O(√

t · n2k2α). C

It follows that we have given a degree-2 cross-composition from CLIQUE toCα-LIST COLORING and the sparsification lower bound for Cα-LIST COLORING

follows from Theorem 2.14. To show the bound for Cα-COLORING, we modify G′

to a Cα-COLORING instance, such that we obtain a degree-2 cross-compositionfrom CLIQUE to Cα-COLORING.

Start from Cα-LIST COLORING instance G′. We show how to construct anequivalent Cα-COLORING instance G′′. Add a cycle on α vertices to G′. We nowdo a case distinction on the value of α.

If α > 3, choose distinguished vertices L, L′, C, R′, R from the newly addedcycle such that (L, L′, C, R′, R) is a (not necessarily induced) path in this cycle.For any v ∈ V(G′), for any c /∈ L(v), connect c (the vertex from the newlyadded cycle) to v with an ineq-gadget.

If α = 3, let the three vertices of the newly added cycle be L, C, R. Forany v ∈ V(G′), if color C /∈ L(v) connect v to C. If both L′ and R are notcontained in L(v), connect v to vertex R. Finally, if R′ and L are not contained

5

5.2 Additional preliminaries 115

(1, 1) (1, 2) (1, 3) (1, 4) (1, 5)

(2, 1)

(3, 1)

(4, 1)

(5, 1)(5, 2) (5, 3) (5, 4)

Figure 5.3 This figure depicts C25 (see Definition 5.13), where C5 is the five-cycle with

vertex set 1, 2, 3, 4, 5.

in L(v), connect v to L. As such, for C3-coloring, the we have that colors L′ andR coincide, and that colors R′ and L coincide.

It is easy to see that G′′ is Cα-colorable if and only if G′ is Cα-list colorable.Furthermore, |V(G′′)| ≤ O(α) · |V(G′)|+ α = O(

√t · n2k2α2).

We have given a degree-2 cross-composition from CLIQUE to Cα-COLORING.It now follows from Theorem 2.14 that for odd α ≥ 3, Cα-COLORING parame-terized by the number of vertices n does not have a generalized kernel of sizeO(n2−ε) for any ε > 0, unless NP ⊆ coNP/poly. J

5.2 Additional preliminariesIn the remainder of this chapter, we will heavily rely on the information given inSection 2.8. In this section, we provide some additional preliminaries. We willregularly use the graph Cα, which is defined as the cycle on α vertices. Whendoing so, we will label the vertices of Cα with the numbers 1 up to α, such thatV(Cα) := 1, . . . , α and E(Cα) := α, 1 ∪ i, i + 1 | i ∈ [α− 1].

We continue by two additional definitions for graphs.

Definition 5.13 (Gk) If G is a graph and k ∈ N+, then Gk is the graph onvertex set V(G)k where vertices (x1, . . . , xk) and (y1, . . . , yk) are adjacent ifxi, yi ∈ E(G) for all i ∈ [k]. Refer to Figure 5.3 for an example.

For k ≥ 1, graph Gk is G-colorable by the mapping f : (x1, . . . , xk) 7→ x1.Hence Gk ↔ G (recall Definition 2.45).

For an equivalence relation R on a set D, we denote by [u]R the equivalenceclass of u ∈ D. We omit the subscript if it is clear from the context.

Definition 5.14 (G/R) Let G be a graph (potentially with self-loops) and let Rbe an equivalence relation on the vertices of G. Then G/R is the graph on vertexset [v]R | v ∈ V(G) and edge set [u], [v] | ∃u′ ∈ [u], ∃v′ ∈ [v] : u′, v′ ∈

5


E(G).Observe that even when G is assumed to be simple, G/R may contain

self-loops.We give a special name for pp-definable relations (Def. 2.40) of arity 1,

which will be of particular interest.

Definition 5.15 Let Γ be a constraint language over domain D. A definablesubdomain of Γ is a subset S of D that is pp-definable over Γ. It is strict if S ( D.

The following corollary is immediate from Theorem 2.42.

I Corollary 5.16 Let Γ be a constraint language over domain D, and let B bea non-empty subset of D. If B is not a definable subdomain of Γ, then thereis a polymorphism of Γ that is not a polymorphism of B; that is, there exists apolymorphism f : Dn → D of Γ such that f (b1, . . . , bn) /∈ B for some b1, . . . , bn ∈B. J

When presenting pp-definitions of relations, for notational conveniencewe often take the equivalent viewpoint via positive-primitive formulas in-volving the relations of Γ and the equality relation (cf. [21, Definition 1]).Hence R ⊆ Dk is pp-definable over Γ if and only if there is a Boolean for-mula ψ(x1, . . . , xk) with free variables x1, . . . , xk, built using existential quanti-fiers, conjunctions, the binary equality relation, and applications of relationsof Γ, such that (x1, . . . , xk) ∈ R if and only if ψ(x1, . . . , xk) holds.

The following two propositions are known and relatively straightforward toverify; we provide brief explanations for the sake of completeness. Recall thatG∗ is defined as the constraint language that contains the relation F = (u, v) |u, v ∈ E(G) and, for each v ∈ V(G) , the arity-1 relation (v) (see alsoDefinition 2.47).

I Proposition 5.17 Let G be a graph, B ⊆ V(G) a definable subdomain of G∗,and G′ := G[B]. If B′ ⊆ B is a definable subdomain of (G′)∗, then B′ is a definablesubdomain of G∗. J

Proposition 5.17 can be argued in the following way: consider a pp-formulaψ′B′(x) defining B′ over (G′)∗. Let V be the set of all variables occurring inψ′B′(x) (including x). We can define B′ over G∗ by the formula

ψB′(x) := ψ′B′(x) ∧∧

v∈VψB(v),

where ψB(v) is the formula defining B over G∗. It is easy to verify that ψB′(x) isa pp-formula defining B′ over G∗.

I Proposition 5.18 Let G be a graph, let R be an equivalence relation on V(G)that is pp-definable in G∗, and let G′ := G/R. If B′ ⊆ V(G′) is a definablesubdomain of (G′)∗, then the pre-image B′′ := v ∈ V(G) | [v] ∈ B′ is adefinable subdomain of G∗. J

5

5.3 Sparsification lower bounds for H-Coloring 117

The key ingredient for the proof of Proposition 5.18 is that for u, v ∈ V(G)one can pp-define the condition [u], [v] ∈ E(G′) over G∗ via ∃u′, ∃v′ : uRu′ ∧vRv′ ∧ EG(u′, v′). Similarly, for u, v ∈ V(G) one can pp-define [u] = [v] by∃u′, ∃v′ : u′Ru ∧ v′Rv ∧ u′ = v′. As such, given a pp-definition of B′ over G′,one may obtain a pp-definition of B′′ over G by applying the above substitutionsto the formula.

5.3 Sparsification lower bounds for H-ColoringIn this section we will prove that several classes of H-COLORING problems donot have a non-trivial sparsification unless NP ⊆ coNP/poly.

We will start this section by showing that when H is a core (recall Defini-tion 2.48) and has a definable subdomain B such that H[B] is homomorphicallyequivalent to Cα, we can transfer the lower bounds for Cα-COLORING that wehave proven in the previous section, to H-COLORING.

Note that for a graph H that is a core, the only unary operations on V(H)that preserve the edge relation of H are bijections. Otherwise, such an operationwould provide a homomorphism from H to a proper induced subgraph of H,contradicting that H is a core. Hence Theorem 2.44 implies the following.

I Lemma 5.19 Let H be a graph that is a core. Then there is a linear-parametertransformation from CSP(H∗) parameterized by the number of variables, toCSP(H) (corresponding to H-COLORING) parameterized by the number of vari-ables. J

The transformation of Lemma 5.19 has a simple graph-theoretic interpre-tation: an instance of CSP(H∗), which asks whether a given graph G can beH-colored while respecting fixed colors pre-assigned to a subset of the verticesof G, is equivalent to the instance of H-COLORING that is obtained from the dis-joint union G′ of G and H by identifying all vertices precolored with x ∈ V(H)with the copy of x in G′. Hence the number of variables (vertices) increases byat most an additive term |V(H)|.

When Γ is a constraint language over domain D and B ⊆ D, we use Γ|B todenote the constraint language R|B | R ∈ Γ, where, for a relation R of arity k,we define R|B as R ∩ Bk.

I Theorem 5.20 Let Γ be a constraint language over domain D, and let B be adefinable subdomain of Γ. Then there is a linear-parameter transformation fromCSP(Γ|B) to CSP(Γ).

Proof. Fix a set of constraints I over Γ, using variable set V = v, x1, . . . , xm,such that ∃x1, . . . , xm : I witnesses the pp-definability (Def. 2.40) of B over Γ.In other words, I is chosen such that for all d ∈ D, it holds that d ∈ B if andonly if the mapping sending v to d can be extended to a satisfying assignmentof I .

5


Given an instance (C, v1, . . . , vn) of CSP(Γ|B), the transformation pro-duces the instance described as follows. For each i ∈ [n], let I(vi, x1

i , . . . , xmi )

denote the set of constraints in I but with the variables renamed accordingly.Then, the produced instance has variable set

⋃i∈[n]vi, x1

i , . . . , xmi , and its con-

straints are the constraints in C along with all of the constraints in the setsI(vi, x1

i , . . . , xmi ), over i ∈ [n]. Observe that the new instance’s variable set has

size n(m + 1). The transformation is correct, as a satisfying assignment forthe original instance of CSP(Γ|B) can be extended to a satisfying assignmentfor the produced instance due to choice of the instance (I(v, x1, . . . , xm), V)of CSP(Γ); in the other direction, a satisfying assignment for the producedinstance must map each variable vi to a value in B, due to the same choice,and hence restricting such an assignment to v1, . . . , vn yields a satisfyingassignment for the original instance of CSP(Γ|B). J

Using the results obtained above, we can now show how the lower boundfor Cα-COLORING transfers to H-COLORING, under the right conditions on H.

I Lemma 5.21 Let H be a graph that is a core such that H∗ has a definablesubdomain B ⊆ V(H) satisfying H[B] ↔ Cα for some odd α ≥ 3. Then H-COLORING parameterized by the number of vertices n does not admit a generalizedkernel of size O(n2−ε) for any ε > 0, unless NP ⊆ coNP/poly.

Proof. By the lower bound for Cα-COLORING of Theorem 5.4, together withthe known fact that linear-parameter transformations transfer generalized ker-nelization lower bounds (Theorem 2.8), it suffices to give a linear-parametertransformation from Cα-COLORING to H-COLORING, both parameterized by thenumber of vertices. We build it by applying several sub-transformations insequence, using the relation to CSPs.

Recall (Definition 2.39) that H∗ is the constraint language over domain V(H)consisting of the edge relation of H and unary singleton relations forcing a valueto a constant. Hence (H∗)|B (cf. Theorem 5.20) is the constraint language overdomain B consisting of the edge relation of H[B] and constants for b ∈ B, so itis equivalent to (H[B])∗. Since H[B] ↔ Cα, there is a trivial linear-parametertransformation from Cα-COLORING to CSP((H[B])∗). Then Theorem 5.20 givesa linear-parameter transformation from CSP((H[B])∗) to CSP(H∗). Since H isa core, by Lemma 5.19 there is a linear-parameter transformation from CSP(H∗)parameterized by the number of variables, to CSP(H) parameterized by thenumber of variables (i.e., H-COLORING). Since linear-parameter transformationscompose, this yields a linear-parameter transformation from Cα-COLORING

parameterized by the number of variables to H-COLORING, which concludes theproof. J

Using Lemma 5.21, the proof of the sparsification lower bound we givefor H-COLORING (Theorem 5.44) essentially reduces to showing that H hasa definable subdomain that induces a subgraph homomorphically equivalent

5


to an odd cycle. In Section 5.3.1, we show how this can be done for graphsin which no edge is contained in two distinct minimum-length odd cycles. InSection 5.3.2, we obtain such definable subdomains for all graphs that contain atriangle. This leads to a proof of the main result of this chapter (Theorem 5.44)in Section 5.3.3.

5.3.1 Defining a subdomain homomorphically equivalent toan odd cycle

We say a nonbipartite graph G with odd girth α has the no-overlap property ifthere is no edge in G that is contained in two distinct Cα subgraphs of G. Herewe consider two subgraphs distinct if their edge-sets are distinct. The main resultof this subsection will be that if a graph G satisfies the no-overlap property, thenG∗ has a definable subdomain B such that G[B] is homomorphically equivalentto Cα. This will allow us to prove the sparsification lower bound for H-COLORING

when H satisfies the no-overlap property, at the end of this chapter.The result is obtained by a careful analysis of the structure of graph powers of

odd cycles, resulting in a lemma about projectivity. It generalizes Bulatov’s [21]results for homomorphisms of powers of triangles into graphs that have no edgein two triangles, to statements about homomorphisms of powers of Cα intographs of odd girth α with the no-overlap property.

We use Ckα as shorthand for (Cα)k (recall Definition 5.13). We will start by

proving that for all k ≥ 1 and α ≥ 3, adding any edge to Ckα will reduce the

odd girth of Ckα, or introduce an edge that lies on two distinct Cα-subgraphs. To

prove this, we start by showing a number of relevant properties of the graphsCα and Ck

α.Given a walk X = (x0, . . . , xk), the walk Y is a subwalk of X there exist

i0 < i1 < . . . < i` ∈ 0, . . . , k such that Y = (xi0 , xi1 , . . . , xi`). First of all, weobserve for any graph G, every odd closed walk in G, contains an odd closedsubwalk that is an odd cycle. Thus, to show that G has an cycle of odd length atmost m, it is sufficient to show that there is closed walk of odd length at most m.

Observation 5.22 Let G be a simple graph. If G contains a closed walk X oflength α for some odd α ≥ 3, then there exists a closed subwalk Y of X, such thatY is an odd cycle in G of length at most α.

For a cycle of odd length, there are always two edge-disjoint paths betweenany two points on the cycle. It is easy to observe that one of these paths musthave an even length. From this, we have the following observation.

Observation 5.23 Let α ≥ 3 be odd and let x, y ∈ V(Cα), not necessarily distinct.There exists a walk from x to y in Cα of length exactly α− 1.

The following lemma is the key ingredient in the proof of Lemma 5.25.

I Lemma 5.24 Let α ≥ 3 be odd and k ≥ 1. Let x, y ∈ V(Ckα), not necessarily

distinct, with x, y /∈ E(Ckα). There are two distinct walks of length α− 1 from x

5


y

x

y

x

x′x′′ x′′

A B

x′

Figure 5.4 Example of two distinct walks of length α− 1 from x to y with k = 1 andα = 7.

to y in Ckα.

Proof. We will prove the result by induction on k. For k = 1, it is easy to verifythat there is a simple path P from x to y of even length m, with m < α− 1.Observe that m = 0 is allowed. We use this path P to define the two desiredwalks A and B of length α− 1. Let x′, x′′ ∈ V(Cα) be the two neighbors of x inCα, implying x′ 6= x′′. Define

A = (x, x′, x, x′, . . . , x, x′︸︷︷︸α−1−m vertices

, P)

and defineB = (x, x′′, x, x′′, . . . , x, x′′︸︷︷︸

α−1−m vertices

, P).

See Figure 5.4 for an example of the paths defined above.It can be verified that A and B are two walks from x to y of length α− 1.

They are distinct since P has length at most α− 3, and thus the second vertexvisited by A is x′ and the second vertex visited by B is x′′, and x′ 6= x′′. Thisconcludes the base case.

Let k > 1 and suppose the statement holds for all smaller values of k. Fori ∈ [k], let x(i) := (x1, . . . , xi−1, xi+1, . . . , xk) be the vector x with coordinatei removed. Define y(i) accordingly. Choose i ∈ [k] such that x(i), y(i) /∈E(Ck−1

α ), which exists since x, y /∈ E(Ckα). For ease of notation, we from

now on assume i = k. By the induction hypothesis, there are two distinctwalks from x(k) to y(k) in Ck−1

α of length α− 1, let these be A′ and B′. LetA′ = (a1

′, . . . , a′α) and let B′ = (b1′, . . . , b′α), where x(k) = a1

′ = b1′ and

y(k) = a′α = b′α. Furthermore, let S = (xk = s1, s2, . . . , sα−1, sα = yk) bea walk from xk to yk in Cα of length exactly α − 1. Such a walk exists byObservation 5.23.

We define walks A := (a1, . . . , aα) and B := (b1, . . . , bα) in Ckα as follows;

see Figure 5.5 for a sketch. Let ai := (a′i,1, . . . , a′i,k−1, si), and define bi :=(bi,1, . . . , bi,k−1, si) for i ∈ [α]. Since A′ and B′ are distinct walks, it is easy to

5


xk = s1 = s3

yk = s7

s2 = s4

Ss5

s6

x = (a1, s1)

y = (a7, s7)

(b6, s6)(a6, s6)(b5, s5)

(b4, s4)

(b3, s3)(b2, s2)(a2, s2)

(a3, s3)

(a4, s4)

(a5, s5)

Figure 5.5 (left) Walk of even length from xk to yk in Cα. (right) Two distinct closedwalks of length α = 7 containing the edge x, y.

verify that A and B are distinct walks. Since A′ and B′ are walks in Ck−1α from

x(k) to y(k) and S is a walk in Cα from xk to yk, it follows that A and B areindeed walks from x to y in Ck

α of length α− 1. J

We can now show that adding an edge to Ckα will decrease the odd girth of

Ckα, or introduce an edge that lies in two distinct Cα subgraphs (such that the

no-overlap property no longer holds).

I Lemma 5.25 Let α ≥ 3 be odd and k ≥ 1. Let x, y ∈ V(Ckα) be two distinct

vertices, such that x, y /∈ E(Ckα). Let G be the graph obtained by adding edge

x, y to Ckα. Then

• The odd girth of G is smaller than α, or

• G contains an edge that is contained in at least two cycles of length α.

Proof. It follows directly from Lemma 5.24 that the edge x, y lies on twodistinct closed walks A and B of length α in G. If one of A and B is not a cycle,it follows from Observation 5.22 that the odd girth of G is smaller than α. Ifboth A and B are cycles, then it immediately follows that G contains an edgethat lies on two distinct cycles of length α. J

Using this result, we will prove that if G has the no-overlap property, then ithas a definable subdomain isomorphic to Ck

α. We need the following additionallemmas and definitions, following Bulatov’s notation [21].

Definition 5.26 (ker) For a function f : S→ T, define

ker( f ) := (s, s′) | s, s′ ∈ S ∧ f (s) = f (s′).

Hence ker( f ) is the equivalence relation pairing up all elements with thesame image under f , represented as a set of pairs.

Definition 5.27 (πI) For a domain D, integer k ≥ 1, and index set I =i1, . . . , i|I| ⊆ [k] with i1 < . . . < i|I|, we use πI to denote the functionthat projects any tuple x = (x1, . . . , xk) ∈ Dk onto index set I, so that πI(x) :=(xi1 , . . . , xi|I|).

5


Observe that by this definition, ker(πI) = (x, y) | xi = yi for all i ∈ I.I Proposition 5.28 Let α ≥ 3 be odd. For all vertices b0, bα ∈ V(Cα) there existwalks

(b0, b1, . . . , bα−1, bα) and (b0 = b′0, b′1, . . . , b′α−1, b′α = bα)

of length α in Cα such that b1 6= b′1, and if b0 6= bα then bα−1 = b′α−1.

Proof. Assume without loss of generality that b0 = 1. In case that b0 = bα, thewalks (1, 2, 3, . . . , α, 1) and (1, α, α− 1, . . . , 2, 1) suffice.

Assume b0 6= bα. Let P = (p1, . . . , pm) be a path of odd length from b0 tobα. Observe that such a path exists and has length at most α− 2, as b0 6= bα.Let b1 and b′1 be the neighbors of b0. Consider the walks given by (b0, b′1, P)and (b0, b1, P), pad both paths with repetitions of (pm−1, pm) as needed suchthat they have length exactly α. It is easy to see that these walks satisfy therequirements. J

For x = (x1, . . . , xk−1) ∈ V(Ck−1α ) and c ∈ V(Cα), let (x, c) denote the tuple

formed by (x1, . . . , xk−1, c); note that such a tuple represents a vertex in Ckα. We

sometimes omit the brackets for readability.

I Lemma 5.29 Let G be a nonbipartite graph with odd girth α that has theno-overlap property. Let ϕ : V(Ck

α)→ V(G) be a homomorphism for some k ≥ 2.If there exist x, y ∈ V(Ck−1

α ) and c, d ∈ V(Cα) such that ϕ(x, c) = ϕ(y, d) andc 6= d, then

ker(π[k]\k) ⊆ ker(ϕ),

that is, ϕ(x′) = ϕ(y′) for all x′, y′ ∈ V(Ckα) that agree on the first k− 1 coordi-

nates.

Proof. Let ϕ : V(Ckα)→ V(G) be a homomorphism. We start with the following

claim.

B Claim 5.30 Let ((x0, a0), . . . , (xα, aα)) be a walk in Ckα such that a0 6= aα,

ϕ(x0, a0) = ϕ(xα, aα), and xi ∈ V(Ck−1α ) for all 0 ≤ i ≤ α. Let b1, b′1 be the

neighbors of a0 in Cα. Then

ϕ(x1, b1) = ϕ(x1, b′1).

Proof. Use Proposition 5.28 to obtain walks (a0 = b0, b1, . . . , bα = aα) and(a0 = b′0, b′1, . . . , b′α = aα) in Cα such that bα−1 = b′α−1. Consider the followingwalks in Ck

α:

W := ((x0, b0), . . . , (xα, bα)), and W ′ = ((x0, b′0), . . . , (xα, b′α)).

The images of W and W ′ under ϕ form walks of length α in G. Let these imagesbe Y and Y′ respectively. Observe that Y and Y′ are closed walks in G, by theassumption that ϕ(x0, a0) = ϕ(xα, aα). Since the odd girth of G is α, it followsusing Observation 5.22 that Y and Y′ must be cycles in G.

5


Since bα−1 = b′α−1, the last edge of walks W and W ′ is the same. Thereby,the same holds for Y and Y′, such that Y and Y′ are cycles of length α inG that share an edge. It follows from the fact that G has the no-overlapproperty, that thereby Y = Y′. Hence ϕ(xi, bi) = ϕ(xi, b′i) for all 0 ≤ i ≤ α,implying ϕ(x1, b1) = ϕ(x1, b′1). C

We will prove that there exists a (k− 1)-tuple y0 ∈ V(Ck−1α ) such that for

all a, a′ ∈ V(Cα) it holds that ϕ(y0, a) = ϕ(y0, a′). Towards achieving this, weprove the following claim.

B Claim 5.31 Suppose that x0 ∈ V(Ck−1α ), that b, b′ ∈ V(Cα) are distinct vertices

that have a common neighbor a in Cα, and that ϕ(x0, b) = ϕ(x0, b′). Supposefurther that x1 ∈ V(Ck−1

α ) is such that there exists a walk (x0, x1, . . . , x0) of lengthα in Ck−1

α . Let a and a′ be the two neighbors of b; then

ϕ(x1, a) = ϕ(x1, a′).

Proof. From Proposition 5.28, there is a walk W = (b, c, . . . , b′) of length α fromb to b′ in Cα. Consider the walk U = ((x0, b), (x1, c), . . . , (x0, b′)) in Ck

α obtainedfrom combining the walk in the hypothesis and the walk W. By Claim 5.30applied to U, we obtain the conclusion. C

From the hypothesis of the lemma and from Proposition 5.28, there is alength-α walk from a vertex (x, c) ∈ Ck

α to a vertex (y, d) ∈ Ckα where both

vertices are mapped to the same value under ϕ and such that c 6= d. We canapply Claim 5.30 to this walk to obtain y0 ∈ Ck−1

α , and vertices b0, b′0 ∈ Cα

sharing a neighbor (in Cα) such that ϕ(y0, b0) = ϕ(y0, b′0). We assume upto symmetry that b0 = α and b′0 = 2, so we have ϕ(y0, α) = ϕ(y0, 2). Letz0 be such that there exists a walk W0 = (y0, z0, . . . , y0) of length α in Ck−1

α .By applying Claim 5.31, we may obtain that ϕ(z0, 1) = ϕ(z0, 3). By applyingthis same claim to the vertices 3, 1 ∈ Cα and the length α walk (z0, y0, . . . , z0)obtained by rotating W0, we obtain that ϕ(y0, 2) = ϕ(y0, 4). Repeatedly arguingin this zipper-like fashion, we obtain that ϕ(y0, α) = ϕ(y0, 2) = ϕ(y0, 4) = · · · .Since 2 generates the integers modulo any odd number, we obtain that for allvalues b, b′ it holds that ϕ(y0, b) = ϕ(y0, b′).

We use this to argue that in fact for an arbitrary x0 ∈ V(Ck−1α ) it holds that

ϕ(x0, a) = ϕ(x0, a′) for all a, a′ ∈ V(Cα). To show this, we will show that ifthe statement holds for x0 and x1 is a neighbor of x0, then the statement holdsfor x1.

B Claim 5.32 Let x0 ∈ V(Ck−1α ) and let x1 be a neighbor of x0 in Ck−1

α . Ifϕ(x0, b) = ϕ(x0, b′) for all b, b′ ∈ V(Cα), then for all b, b′ ∈ V(Cα):

ϕ(x1, b) = ϕ(x1, b′).

Proof. It is sufficient to prove that ϕ(x1, b) = ϕ(x1, b′) for all b, b′ that havea common neighbor in Cα. After all, by using this repeatedly we then obtain

5


ϕ(x1, 1) = ϕ(x1, 3) = . . . = (x1, α) = ϕ(x1, 2) = . . . and obtain the result forall b, b′ ∈ V(Cα).

Let such b, b′ be given with common neighbor a and let x0 and x1 be given.Let a′ be the other neighbor of b. It is easy to verify that there exists a walk(x0, x1, x2, . . . , xα−1, x0) in Ck−1

α for some x2, . . . , xα−1 and furthermore thereexists a walk (a, a1, a2, . . . , aα−1, a′) in Cα. Consider the walk

W = ((x0, a), (x1, a1), (x2, a2), . . . , (xα−1, aα−1), (x0, a′)).

Given that b and b′ are the neighbors of a, we obtain from Claim 5.30 thatϕ(x1, b) = ϕ(x1, b′), as desired. C

The fact that ϕ(x0, b) = ϕ(x0, b′) for arbitrary x0, b, b′ now follows from theexistence of y0 such that ϕ(y0, b) = ϕ(y0, b′) for all b, b′ ∈ Cα, the fact thatCk−1

α is connected, and Claim 5.32.To see that this indeed implies that ker(π[k]\k) ⊆ ker(ϕ), suppose (x′, y′) ∈

ker(π[k]\k). Let x′′ = π[k]\k(x′) and similarly let y′′ = π[k]\k(x′), such thatx′′ = y′′. It now follows from the claim above that ϕ(x′) = ϕ(x′′, xk) =ϕ(y′′, xk) = ϕ(y′′, yk) = ϕ(y′) and thus (x′, y′) ∈ ker(ϕ), as desired. J

Definition 5.33 Let H be a graph and k ≥ 1. A homomorphism ϕ from Hk

to a graph G is projective, if there exists a nonempty index set I ⊆ [k] suchthat ϕ(x) = ϕ(y) if and only if πI(x) = πI(y).

Hence in a projective homomorphism, the image of x ∈ V(Hk) is determineduniquely by the coordinates in I, and distinct values for those coordinates leadto distinct images. The lemma above allows us to prove the following.

I Lemma 5.34 Let G be a nonbipartite graph with odd girth α that has theno-overlap property. Let k > 0 and let ϕ : V(Ck

α) → V(G) be a homomorphismfrom Ck

α to G. Then ϕ is projective.

Proof. By symmetry of the graph Ckα, Lemma 5.29 implies that for any coor-

dinate i ∈ [k], if there are vertices x, y ∈ V(Ckα) that disagree on coordinate i

but have the same image under ϕ, then any x′, y′ ∈ V(Ckα) that agree on all

coordinates except i satisfy ϕ(x′) = ϕ(y′).Let I′ ⊆ [k] consist of those indices i for which there exist x, y ∈ V(Ck

α)that differ on coordinate i but have the same image under ϕ. We can showthat any two vectors that differ only in coordinates of I′ have the same imageunder ϕ. Suppose x, y ∈ V(Ck

α) only differ on coordinates in I′. Then thereexist x1, . . . , xm ∈ V(Ck

α) such that x = x1, y = xm, and for all j ∈ [m− 1] wehave that xj and xj+1 only differ on a single coordinate, and this coordinateis in I′. We then obtain that ϕ(xj) = ϕ(xj+1) for all j by the reasoning above,and thus ϕ(x) = ϕ(y).

By definition of I′, for any index i ∈ [k] \ I′, vectors that differ on coordinate imap to distinct images. Hence for I := [k] \ I′ we have ϕ(x) = ϕ(y) if and onlyif πI(x) = πI(y), which proves that ϕ is projective. J

5


Using this result, we can prove the following lemma.

I Lemma 5.35 Let G be a nonbipartite graph with odd girth α that has theno-overlap property, and let k > 0 be an integer. Let ϕ : V(Ck

α) → V(G) be ahomomorphism from Ck

α to G. Let B := ϕ(v) | v ∈ V(Ckα) ⊆ V(G). Then the

induced subgraph G[B] is isomorphic to Cmα for some m ∈ [k].

Proof. Note first that if Cmα is a spanning subgraph of G[B], then Cm

α must infact be isomorphic to G[B]: by Lemma 5.25, if Cm

α is a spanning strict subgraphof G[B], then G[B] can be obtained from Cm

α by adding one or more edges andtherefore G has odd girth less than α, or the no-overlap property is violated.Hence to complete the proof, it suffices to show that Cm

α is a spanning subgraphof G[B] for some m.

By Lemma 5.34 there is a nonempty I ⊆ [k] such that ϕ(x) = ϕ(y) if andonly if πI(x) = πI(y). Let m := |I|. Hence the image B := ϕ(x) | x ∈ V(Ck

α)consists of exactly αm points. For y ∈ V(Cm

α ), let g(y) ∈ V(Ckα) denote the

vector that agrees with y on the coordinates indexed by I, and has value 1 atall other coordinates. By the projectivity of ϕ, the function ϕ′(y) := ϕ(g(y))is a bijection from V(Cm

α ) to B. To show that Cmα is isomorphic to a spanning

subgraph of G[B], it remains to show that for all edges x, y ∈ E(Cmα ) we

have ϕ′(x), ϕ′(y) ∈ E(G). Consider such a pair x, y ∈ E(Cmα ). There

exist x′, y′ ∈ V(Ckα) such that x′, y′ ∈ E(Ck

α) while πI(x′) = x and πI(y′) = y.For example, such a pair can be obtained by padding x with the value 1 andpadding y with the value 2 on all positions not included in I. Projectivityof ϕ ensures ϕ(x′) = ϕ′(x) and ϕ(y′) = ϕ′(y). Since ϕ is a homomorphismand x′, y′ is an edge of Ck

α, it follows that ϕ(x′), ϕ(y′) is an edge of G.Hence Cm

α is isomorphic to a spanning subgraph of G[B], which completes theproof. J

Lemma 5.35 is the key ingredient in the proof of the main result of thissubsection, given below. The statement of the next lemma is a generalizationof [21, Claim 4] to general values of α; our proof strategy is similar now thatwe have obtained the necessary lemmata.

I Lemma 5.36 Let G be a graph with odd girth α that has the no-overlap property.Then G∗ has a definable subdomain B such that G[B]↔ Cα.

Proof. We will construct a sequence of subsets B1, B2, . . . of V(G) such thatBi ⊆ Bi+1 for all i, and furthermore each G[Bi] is isomorphic to Cki

α for someki ≥ 1. Since Cα ↔ Ck

α for all k ≥ 1 (projecting onto any fixed coordinate givesa valid Cα-coloring), this yields the desired result. Let B1 be the vertex set ofan arbitrary cycle of length α in G, and start from i = 1. If Bi is a definablesubdomain of G∗, we are done.

Now suppose Bi has been constructed, but Bi is not a definable subdomainof G∗. We show how to construct Bi+1. Since Bi is not a definable subdomainof G∗, it follows from Corollary 5.16 that there exists a polymorphism f of G∗

5


of some arity n, and vertices u1, . . . , un ∈ Bi, such that f (u1, . . . , un) /∈ Bi. Notethat, since f is a polymorphism of G∗, which includes singleton relations forcinga variable to a constant value v ∈ V(G) for all v ∈ V(G), it must hold that f isidempotent (i.e., f (v, . . . , v) = v).

We can consider f as a homomorphism from Gn to G, mapping vertex(v1, . . . , vn) ∈ V(Gn) to f (v1, . . . , vn) ∈ V(G). Verify that f is indeed ahomomorphism: if (u1, . . . , un), (v1, . . . , vn) ∈ E(Gn), then it follows thatui, vi ∈ E(G) for all i ∈ [n] by the definition of Gn. Since f is a polymorphismfor G, it thus follows that f (u1, . . . , un), f (v1, . . . , vn) ∈ E(G), as desired.

Let f ′ := f |Bi be defined as f restricted to Bi, such that f ′ is a homomorphism

from G[Bi]n to G. Since G[Bi] is isomorphic to Cki

α , it follows that f ′ is ahomomorphism from G[Bi]

n = Cn·kiα to G.

Define Bi+1 as the image of f ′ in G. Since f ′ is idempotent, we obtain thatBi is a subset of Bi+1. Furthermore it is easy to see from the definition thatBi ( Bi+1. It follows from Lemma 5.35 that G[Bi+1] is isomorphic to Cm

α forsome m.

Since G is finite and Bi is a strict subset of Bi+1 for every i, there existsan i such that Bi is a definable subdomain of G∗. Since by definition, G[Bi] isisomorphic to Cm

α for some m, this concludes the proof. J

5.3.2 Obtaining a definable subdomain when an edge is intwo α-cycles

The results of Section 5.3.1 can only be applied if G has the no-overlap property,so that no edge of G is contained in two distinct minimum-length odd cycles.In this section we develop constructions that can be used to define strict sub-domains that induce nonbipartite graphs, if G does not have the no-overlapproperty. We start by defining a relevant binary relation on the vertex set of agraph, using E(u, v) as the edge predicate of the graph.

Definition 5.37 (Rα) Let Rα be defined as follows:

Rα(u, v) =

∃x1, . . . , xα, y1, . . . , yα :( ∧

i∈[α−1]

E(xi, xi+1)

)∧( ∧

i∈[α−1]

E(yi, yi+1)

)∧ E(xα, x1) ∧ E(yα, y1) ∧ (y1 = x1) ∧ (yα = xα) ∧ (x2 = u) ∧ (y2 = v).

Intuitively, Rα relates vertices u and v that lie on closed walks of length αsharing a common edge x1, xα, when u and v lie at distance one from vertex x1of this shared edge along the closed walk. Note that such vertices must receivethe same color in any Cα-coloring of G. Using this relation, we can generalizeBulatov’s [21] analysis of chains of rhombuses to graphs of odd girth largerthan 3. See Figure 5.6 for an illustration. Note that Rα is symmetric and that

5


x1

xα

Cα

y2 = vu = x2

Cα

xα−1 yα−1

x1

x3

u v

Figure 5.6 A general depiction of uRαv (left) and a depiction of the uR3v (right).

any vertex that lies on a cycle of length α is related to itself. The followingproperty of Rα, Rn

α, and R+α (recall Definition 2.31) will be essential in a number

of proofs in this section.

I Lemma 5.38 Let G have odd girth α, such that each vertex of G is contained ina cycle of length α. For any n ≥ 0, the relations Rα, Rn

α, and R+α are pp-definable

over G∗.

Proof. The definition of Rα straightforwardly translates into a pp-definition byusing the constant relations of G∗ to enforce x2 = u and y2 = v, and identifyingvariables that the formula requires to be equal.

Using this definition, we can define Rnα using the same process on the

expression

Rnα(u, v) := ∃x1, . . . , xn+1 :

∧i∈[n]

xiRαxi+1

∧ (x1 = u) ∧ (xn+1 = v).

Here applications of Rα can obviously be replaced by the pp-definition of Rα toobtain a pp-definition for Rn

α. Now observe that Rα is reflexive since eachvertex of G lies on a cycle of length α. Hence for n := |V(G)| we havethat Rn

α = R+α , showing that the transitive closure can also be pp-defined

by a finite expression. J

Before presenting further lemmata, let us discuss the overall approach andthe relevance of the relation Rα. To obtain a lower bound via Lemma 5.21, wewant to show that for any (core) graph H with a triangle, H∗ has a definablesubdomain B such that H[B]↔ Cα for some odd α ≥ 3. If H has the no-overlapproperty, then we can obtain such a subdomain using Lemma 5.36. If H hasan edge contained in two distinct triangles, then the private vertices of theseoverlapping triangles are related under R3. Hence in the quotient graph H′ :=H/R+

3, these two private vertices are identified into the same equivalence

class, effectively reducing the number of overlapping triangles. Repeating thisprocess, we eventually arrive at a graph with the no-overlap property to whichLemma 5.36 can be applied. The challenge that then arises is the following:

5


[x]

G

G/R G/R[N([x])]

a

b

c

[a][a]

u

v

[u]

[v]

xG a

b

c

u

v

x

BB

B′

Figure 5.7 Illustration of the potential difficulties in lifting definable subdomains fromquotient graphs, attacked in Lemma 5.40. Here R = R+

3 . Graph G on the top hasoverlapping triangles through the edge u, v. In the quotient graph G/R (bottom-left),the definable subdomain B′ consisting of all neighbors of vertex [x] induces a graph(homomorphically) equivalent to C3, highlighted on the bottom-right. However, thepre-image B of B′ in graph G (highlighted in blue on the top-right) induces a bipartitesubgraph of G, not equivalent to C3. Using the pp-definability of the set B we candefine a subdomain B (highlighted in red on the top-left) of G∗ for which G[B] ↔ C3via B(y) := ∃x1, ∃x2 : x1 ∈ B ∧ x2 ∈ B ∧ EG(c, x1) ∧ EG(x1, x2) ∧ EG(x2, y).

if B′ is a definable subdomain of (H′)∗ such that H′[B′]↔ Cα′ where α′ is theodd girth of H′, then we want to lift B′ to a definable subdomain B of H∗ suchthat H[B]↔ Cα′ . This can fail in two significant ways:

• It is possible that while H′[B′] has a triangle, the vertices B that form thepre-image of B′ induce a bipartite subgraph of H. This behavior is illustratedin Figure 5.7. In this case, we show that the subdomain B′ can be exploitedin a novel way to provide a definable subdomain of H equivalent to an oddcycle.

• The quotient graph H′ may have strictly smaller odd girth than the odd girth αof H, which happens when H′ has a self-loop while H does not. Since Cα′

for odd α′ < α does not admit a homomorphism into a graph of odd girth α,there exists no induced subgraph of H homomorphically equivalent to Cα′ .This issue is resolved by showing that, rather than lifting B′ to a definablesubdomain of B, we directly define a strict nonbipartite subdomain of Husing the fact that the odd girth decreases when taking the quotient graph.Therefore we get closer to the no-overlap case in a different manner.

Note that, by using induction, it is sufficient if our constructions yield a strictsubdomain that induces a graph of the same odd girth, rather than one thatinduces a graph equivalent to an odd cycle. We therefore start by recalling

5


a simple condition under which a nonbipartite subdomain can be defined(cf. [21, §3.1]).

I Lemma 5.39 Let G have odd girth α > 1. If some v ∈ V(G) is not on anycycle of length α, then G∗ has a strict definable subdomain B such that G[B] hasodd girth α.

Proof. Let B be the set of vertices that lie on at least one cycle of length α.Clearly, if G has a vertex v that does not lie on a cycle of length α, it holds thatB ( V(G). Furthermore from this definition it follows G[B] has odd girth α.It remains to show that B is indeed a definable subdomain of G∗. Define B asfollows

B(x) := ∃x0, . . . , xα :

∧i∈[α]

E(xi−1, xi)

∧ (x0 = xα = x).

Observe that B(x) holds exactly if x lies on a closed walk of length α in G. SinceG has odd girth α, it follows from Observation 5.22 that thereby x lies on acycle of length α. J

Using the property that every vertex lies on a minimum-length odd cycle,we now deal with the first type of lifting issue described above.

I Lemma 5.40 Let G be a graph of odd girth 3, such that every vertex of G iscontained in a triangle. Let G′ := G/R+

3. Assume G′ has odd girth 3. If (G′)∗ has

a strict definable subdomain B′ such that G′[B′] has odd girth 3, then G∗ has astrict definable subdomain B such that G[B] has odd girth 3.

Proof. Let B′′ ⊆ V(G) denote the pre-image of B′. Proposition 5.18 showsthat B′′ is a definable subdomain of G∗, since R+

3 is pp-definable by Lemma 5.38.Since B′ ( V(G′) we also have B′′ ( V(G). If furthermore G[B′′] containsa triangle, the lemma statement follows. For the remainder of the proof, wetherefore assume G[B′′] does not contain a triangle.

Choose n1, n2, n3 satisfying the following, such that n1 + n2 + n3 is mini-mized:

1. n1 ≥ n2 ≥ 0, and n1 ≥ n3 ≥ 0.

2. There exist v1, v′1, v2, v′2, v3, v′3 ∈ B′′ such that viRni3 v′i for all i ∈ [3], edge

v′i, vi+1 ∈ E(G) for i ∈ [2], and v′3, v1 ∈ E(G).

To see that such values exist, consider a triangle b′1, b′2, b′3 in G′[B′]. By Defini-tion 5.14, the fact that G′[B′] contains an edge b′i , b′j for (i, j) ∈ (1, 2), (2, 3),(3, 1) implies that in G there is an edge between a vertex v′i ∈ [b′i ] and avertex vj ∈ [b′j]. For the vertices defined in this way, appropriate values of ni

5


n1 = 3 n2 = 1 n3 = 2

v′1v2 v′2 v3 v′3v1 sC

x′2 x3 x′3n3 + n1 − 1 = 4

x2

v′1 ∈ B

Figure 5.8 This figure describes the situation in the proof of Lemma 5.40, it depicts asubgraph of G. Vertices in B′′ are marked in white. The triangle C that is part of B isindicated in bold green. Be aware that not all depicted vertices and edges are necessarilydistinct.

exist since R+3 is the transitive closure of R3: any two elements vi, v′i ∈ [b′i ]

related under R+3 are related by a finite number of applications of R3.

Now take vertices v1, v′1, v2, v′2, v3, v′3 witnessing the minimal value of n1 +n2 + n3 in the statement above. Observe that n1 > 0, as otherwise the abovestatement shows that vertices v1, v2, v3 ∈ B′′ form a triangle in G, contradictingour assumption on G[B′′].

We now give a definable subdomain B of G∗ that contains a triangle.Since B′′ is a definable subdomain of G∗, we can write statements like x ∈ B′′

in this definition. We can also use R3 and powers of this relation in our pp-expression, by Lemma 5.38.

B(y) := ∃x2, x3, x′1, x′2, x′3 ∈ B′′ : x′1 = v′1 ∧ EG(x′1, x2) ∧ x2Rn23 x′2 ∧

EG(x′2, x3) ∧ x3Rn1+n3−13 x′3 ∧ EG(x′3, y).

Observe that, by the above definition, a vertex y ∈ B may or may not lie in B′′

but all vertices that are quantified over do lie in B′′. Refer to Figure 5.8 for asketch of the situation. It remains to show that G[B] contains a triangle andthat B ( V(G). Let s ∈ V(G) such that v1R3s and sRn1−1

3 v′1. Note that such avertex exists by the assumption that v1Rn1

3 v′1.We start by showing that G[B] has a triangle containing v1. Since v1R3s,

there exist vertices a1, a2 such that the edge a1, a2 lies on a triangle with sand on a triangle with v1. By letting xi := vi and x′i := v′i for i ∈ [3], it is easyto verify that v1 ∈ B. Furthermore, letting x′1 := v′1, x2 := x′2 := v2, x3 := v′1and x′3 := s shows that any vertex at distance one from s is in B, implying a1and a2 are in B. Thereby, G[B] contains a triangle.

It remains to show that B ( V(G). To this end we will prove that v′1 /∈ B. As-sume for a contradiction that v′1 ∈ B. Thus, there exist witnesses x2, x3, x′1, x′2, x′3for this containment. Let x1 := x′1 = v′1. Observe that hereby, x1R0

3x′1, x2Rn23 x′2,

5


Cα

u

xα−1xα−2

x3x4

x1

Cα

v

yα−1

W2

W1

yα−2

y3y4

xα

Figure 5.9 Depiction of the situation in the proof of Lemma 5.41. The figure showsvertices u and v such that uRαv and a sketch of the proof that all vertices in the figureother than u and v can be reached by a walk of length exactly α− 2 from u. Note thatnot all vertices are necessarily distinct, even if they are shown as such.

x3Rn1+n3−13 x′3, and x′1, x2, x′2, x3, x′3, x1 ∈ E(G). As such, substituting the

values n′1 = 0, n′2 := n2, n′3 := n1 + n3− 1 for n1, n2, n3 satisfies statement 2 andn′1 + n′2 + n′3 < n1 + n2 + n3. Renaming the variables to ensure that n′1 ≥ n′2and n′1 ≥ n′3 (note that this can be done; the definition is entirely symmetric)ensures that also statement 1 is satisfied, which contradicts the minimality ofour choice for n1, n2 and n3. J

The next lemma gives a relevant property of the set of vertices that witnessesuRαv in a graph G, that will be useful when proving the inclusion of thesevertices in a definable subdomain.

I Lemma 5.41 Let G be a graph with odd girth α, let u, v ∈ V(G) such thatuRαv (recall Definition 5.37) and let S = x1, . . . , xα, y1, . . . , yα with x2 = uand y2 = v be vertices witnessing that uRαv. If x ∈ S \ u, v, then there is awalk in G of length α− 2 from u to x.

Proof. For a sketch of the situation and the proof, refer to Figure 5.9.Consider any other vertex xi for i ∈ [α] \ 2. Consider the walks P :=

(u, x3, x4, . . . , xi−1, xi) and P′ := (u, x1, xα, xα−1, . . . , xi+1, xi), which both havelength at most α− 2. Since the combined length of these walks is α, one ofthese two walks is odd and has length less than α. Thereby, there is a walk oflength α− 2 from u to xi.

It remains to show that there is a walk of length α− 2 from u to yi for alli 6= 2. For i = 1 and i = α the result was shown above. We do a case distinctionon the value of i. If 1 < i < α is odd, consider the walk

W1 := (x2, x1, y2, y3, . . . , yi−1, yi),

observe that it has length exactly i < α. Since i is odd and α is odd, it followsthat there is a walk of length exactly α− 2 from u = x2 to yi by padding W1 asneeded.

5


If 1 < i < α is even, implying i ≥ 4 as i 6= 2, consider the walk

W2 := (x2, x1, xα, yα−1, . . . , yi+1, yi),

which has length 2 + α− i ≤ α− 2. Observe that since α is odd, while 2 andi are even, the walk has odd length and thus there is a walk of length exactlyα− 2 from u to yi. J

The next lemma will be used to solve the second type of lifting issue wedescribed, by giving the subdomain construction used when taking the quotientgraph reduces the odd girth. Observe that the lemma statement supports graphsof any odd girth α ≥ 3, meaning that the first type of lifting issue truly is theobstacle to proving a more general lower bound for H-COLORING.

I Lemma 5.42 Let G be a graph of odd girth α > 1 such that each vertex of Glies on a cycle of length α. If G/R+

αhas smaller odd girth than G, then there is a

strict definable subdomain B of G∗ such that G[B] contains a cycle of length α.

Proof. Suppose G/R+α

has smaller odd girth than G. Take minimal n1, . . . , nα−2 ∈N with n1 ≥ ni for all i such that there exist v1, v′1, . . . , vα−2, v′α−2 with viR

niα v′i

for all i ∈ [α− 2] and v′i, vi+1 ∈ E(G) for all i ∈ [α− 3] and v′α−2, v1 ∈E(G). Note that some of the ni may be zero. We say a choice for n1, . . . , nα−2 isminimal simply if their sum is minimized along all possible valid choices.

Similarly as in the proof of Lemma 5.40, such values always exist. Theyare obtained by considering a closed walk of length α − 2 in G/R+

α, which

exists since G/R+α

has odd girth at most α− 2. We now show how to define asubdomain for G, using a case distinction on n1.

(n1 is zero) Since n1 ≥ ni for all i ∈ [α− 2] it follows that all ni are zero. Onemay verify that in this case, there is a closed walk of length α− 2 in the graph,contradicting that the odd girth is α by Observation 5.22.

(n1 > 0 is odd) Refer to Figure 5.10 for an illustration of this case. Let n′1 :=

bn1/2c and pick a vertex u such that uRn′1α v′1 and v1Rn′1+1

α u. Note that such avertex exists by definition since n′1 + (n′1 + 1) = n1. Using Lemma 5.38, defineB as

B(y) := ∃x2, . . . , xα−2, x′1, . . . , x′α−2 : uRn′1α x′1 ∧

∧1≤i≤α−3

EG(x′i , xi+1) ∧∧2≤i≤α−2

xiRniα x′i ∧ EG(x′α−2, y).

We show that G[B] has a cycle of length α, and that some vertex of V(G) is notincluded in B. We start by showing that v′1 /∈ B. Suppose towards a contradic-tion that B(v′1). Then by definition, there exist vertices x2, . . . , xα−2, x′1, . . . , x′α−2

5


uv1 v′1

n1 = 3

sC

n2 = 2 n3 = 1

v2 v3 v′3v′2

x′1 x2

x′2

x3 x′3

v′1 ∈ B

Figure 5.10 Illustration of the proof of Lemma 5.42, where α = 5, n1 = 3, n2 = 2, andn3 = 1. The gray part depicts the situation when v′1 ∈ B, which leads to a contradiction.Be aware that not all depicted vertices and edges are necessarily distinct.

witnessing B(v′1). As such, uRn′1α x′1 and by symmetry of the relation it follows

that x′1Rn′1α u. Furthermore u was defined such that uRn′1

α v′1. Thereby, x′1R2n′1α v′1

with 2n′1 = n1 − 1. Defining x1 := v′1 for convenience, it follows that xiRniα x′i

for all i ≥ 2, x1Rn1−1α x′1, x′i , xi+1 ∈ E(G) for all i, and x′α−2, x1 ∈ E(G)

(as x1 = v′1 and v′1 ∈ B). But this contradicts the minimality of the choice forn1, . . . , nα−2, since it follows that the sequence n1− 1, n2, . . . , nα−2 (potentiallyre-ordered to put the largest value at the front) satisfies all requirements whileit has a strictly smaller sum.

It remains to show that G[B] has a cycle of length α. Let s be a vertex such

that v1Rαs and sRn′1α u (note that if n1 = 1 then s = v′1). Thereby, there exist

vertices c1, . . . , cα and c′1, . . . , c′α witnessing v1Rαs , such that v1 = c2. Let C bethe closed walk given by (c1, . . . , cα, c1) and observe that since the odd girth ofG is α, C is a cycle in G by Observation 5.22. We will show that V(C) ⊆ B.

One may verify that B includes all vertices that can be reached via a walk oflength α− 2 from s by choosing x′1 = s and xi = x′i for all i ≥ 2 in the definitionof B, using that Rα is reflexive. It follows from Lemma 5.41 that all verticesof C except v1 are in B. It remains to show v1 ∈ B, which is straightforwardfrom the definition of B and the choice of n1, . . . , nα−2. Thus, G[B] containsall α vertices of cycle C.

(n1 > 0 is even) Note that n1 > 1. Refer to Figure 5.11 for a sketch of the

situation. Let n′1 := n1/2. Pick v, v′ such that v1Rn′1α v, vRαv′, and v′Rn′1−1

α v′1.Since v′Rαv, there are vertices a1, . . . , aα and a′1, . . . , a′α with v = a2 and v′ = a′2witnessing this. We define u := a1 = a′1 and u′ := aα = a′α, such that u, u′is the edge in which the cycles through v and v′ overlap. Given u and u′, we

5


vv1 sC

x′1 x2

x′2

x3 x′3

v′1 ∈ B

v′1

u

u′

n2 = 2 n3 = 1

v2 v3 v′3v′2

n1 = 4

x′

v′

Figure 5.11 Illustration of the proof of Lemma 5.42 for α = 5, where n1 = 4, n2 = 2,and n3 = 1. Depicted vertices need not always be distinct. Indicated in gray is thesituation when v′1 ∈ B, which leads to a contradiction.

now define B as

B(y) := ∃x′, x2, . . . , xα−2, x′1, . . . , x′α−2 : x′Rn′1−1α x′1 ∧∧

1≤i≤α−3

EG(x′i , xi+1) ∧∧

2≤i≤α−2

xiRniα x′i ∧ EG(x′α−2, y) ∧

(∃y1, . . . , yα : y1 = u ∧ yα = u′ ∧ y2 = x′ ∧ EG(yα, y1)∧∧1≤i<α

EG(yi, yi+1)).

We start by showing that v′1 /∈ B, such that indeed B ( V(G). Suppose forcontradiction that v′1 ∈ B and let x′, x2, . . . , xα−2, x′1, . . . , x′α−2 and y1, . . . , yα be

the vertices witnessing this. By definition, v′Rn′1−1α v′1. Furthermore, x′Rn′1−1

α x′1by definition. We start by showing x′Rαv′ to obtain that x′1R2n′1−1

α v′1. Bydefinition, we have that y1 = u, yα = u′ and y2 = x′, furthermore a′1 = u,a′α = u′ and a′2 = v′. It follows from the definitions that hereby x′Rαv′ and

thus x′1R2n′1−1α v′1 with 2n′1 − 1 = n1 − 1. Similar to the argument given in the

case for n1 is odd, we observe that introducing x1 := v′1 gives x1, . . . , xα−2and x′1, . . . , x′α−2 satisfying xiR

niα x′i for all i ≥ 2, x′i , xi+1 ∈ E(G) for all i,

and x′α−2, x1 ∈ E(G). This contradicts the minimality of the choice forn1, . . . , nα−2, since the sequence n1− 1, n2, . . . , nα−2 (possibly after reordering)has a smaller total sum and satisfies all requirements.

It remains to show that G[B] has a cycle of length α. The proof will followfrom similar arguments as in the case where n was odd. Observe that B(v1)follows from substituting x′ with v′, xi with vi, and x′i with v′i. Choose s such

that v1Rαs and sRn′1−1α v. Thus, there exist vertices c1, . . . , cα and c′1, . . . , c′α

witnessing v1Rαs, such that v1 = c2. Let C be the closed walk given by(c1, . . . , cα, c1) and observe that since the odd girth of G is α, C is a cycle in G

5


by Observation 5.22. We will show that V(C) ⊆ B. Letting x′ := v, x′1 := s,and x′i = xi for i = 2, . . . , α− 2 shows that B contains all vertices reachable bya walk of length α− 2 from vertex s. It now follows from Lemma 5.41 that allvertices in C \ v1 are in B and thus V(C) ⊆ B, concluding the proof. J

Using the two results above, we can not prove that when G contains atriangle, then it has a definable subdomain B such that G[B] is homomorphicallyequivalent to a triangle. This is the last ingredient we will need to prove themain lower bound result of this chapter.

I Lemma 5.43 Let G be a graph with odd girth 3. Then G∗ has a definablesubdomain B ⊆ V(G) such that G[B] is homomorphically equivalent to C3, thatis, G[B]↔ C3.

Proof. Proof by induction on |V(G)|. If G has the no-overlap property, then byapplying Lemma 5.36 we obtain a definable subdomain B such that G[B]↔ C3.

In the remainder, we assume G has an edge that is contained in two distincttriangles. We first deal with an easy case. If there is a vertex of G that is notcontained in any triangle, then Lemma 5.39 yields a strict definable subdomain Bof G∗ such that G′ := G[B] contains a triangle. Since B ( V(G), we may applyinduction to obtain a definable subdomain B′ of (G′)∗ such that G′[B′] ↔C3. By Proposition 5.17 the set B′ is also a definable subdomain of G∗, andsince G[B′] = G′[B′]↔ C3, this concludes the proof for this case.

It remains to deal with graphs in which some edge belongs to distincttriangles, in which every vertex lies on a triangle. Note that if some edge u, vis in two triangles, say u, v, x1 and u, v, x2 for distinct x1, x2 ∈ V(G),then x1R3x2. Additionally, since every vertex lies on a triangle we have vR3vfor all v ∈ V(G). It follows that R3 is reflexive on G and therefore that R+

3 is anequivalence relation on V(G) that has strictly fewer than |V(G)| equivalenceclasses. Consider G/R+

3. Since any closed walk in G projects to a closed walk

in G/R+3

, the odd girth of G/R+3

is at most that of G, so 3. If G/R+3

has a self-

loop, which occurs when some equivalence class of R+3 is not an independent

set, then G/R+3

has odd girth 1. We distinguish these two cases.

(G/R+3

has odd girth 1) By Lemma 5.42, G∗ has a definable strict subdomain B

such that G′ := G[B] contains a triangle, and therefore has odd girth 3. Byinduction, there is a definable subdomain B′ of (G′)∗ such that G′[B′] ↔ C3.As before, Proposition 5.17 ensures B′ is also a definable subdomain of G∗,and G[B′] = G′[B′]↔ C3.

(G/R+3

has odd girth 3) Let G′ := G/R+3

, so that |V(G′)| < |V(G)|. By induc-

tion on G′, we find a definable subdomain B′ of (G′)∗ such that G′[B′]↔ C3.We distinguish two cases, depending on whether B′ is a strict subdomainof (G′)∗ or not.

5


1. If B′ is a strict subdomain of (G′)∗, then Lemma 5.40 guarantees that G∗ hasa strict definable subdomain B such that G[B] has a triangle and thereforehas odd girth 3. As before, we may now apply induction to find a definablesubdomain B of (G[B])∗ such that G[B] ↔ C3. By Proposition 5.17 weknow B is also a definable subdomain of G, which concludes the proof.

2. If B′ is not strict, then B′ = V(G′) and hence G′ ↔ C3. Since the quotientgraph G′ has odd girth 3, it does not have a self-loop, and therefore allequivalence classes of R+

3 in G are independent sets. It follows that G → G′

by mapping each vertex of G to the vertex representing its equivalence classin G′, and therefore G → G′ → C3. Since G has a triangle by assumption,we also have C3 → G, so that G ↔ C3. Hence the desired subdomainis B := V(G), which is trivially definable.

This concludes the proof of Lemma 5.43. J

5.3.3 Sparsification lower bound for H-ColoringThe constructions for definable subdomains given by Lemmata 5.36 and 5.43now allow us to prove the H-COLORING lower bound when H has the no-overlapproperty or contains a triangle, via Lemma 5.21. The main step of the proofis to show that graphs with a triangle have cores with a triangle, and thatnonbipartite graphs with the no-overlap property have nonbipartite cores withthe no-overlap property.

I Theorem 5.44 Let H be a simple nonbipartite graph whose shortest odd cyclehas length α. If α = 3, or no edge of H is contained in two distinct Cα-subgraphs,then H-COLORING parameterized by the number of vertices n does not admit ageneralized kernel of size O(n2−ε) for any ε > 0, unless NP ⊆ coNP/poly.

Proof. Let H be a graph satisfying the preconditions. Let H′ be the core of H,that is, the unique (up to isomorphism) minimal induced subgraph H′ of H forwhich H → H′, so that in fact H ↔ H′. We claim that H′ is nonbipartite andthat the odd girth of H′ equals α. If C = (x0, x1, . . . , xα) with x0 = xα is a cycleof length α in H, then the image of C under any homomorphism is a closedwalk of length α. Since H → H′, the graph H′ has a closed walk of length α,and hence an odd cycle of length at most α by Observation 5.22. Since H′ is asubgraph of H, this cycle cannot be shorter than α, which proves that H′ hasodd girth α.

We show that there is a definable subdomain B of (H′)∗ such that H′[B]↔Cα. If H has the no-overlap property, then no edge is in two distinct minimum-length odd cycles. Since the odd girth of H′ equals that of H, and H′ is aninduced subgraph of H, it follows that H′ also has the no-overlap property, sothat the existence of such a subdomain follows from Lemma 5.36. If H has oddgirth 3, then H′ also has odd girth 3 as shown above, and Lemma 5.43 impliesthe existence of the desired subdomain.

5

5.4 Conclusion 137

It follows from Lemma 5.21, that the definable subdomain B of (H′)∗

shows that H′-COLORING does not have a generalized kernel of size O(n2−ε)for any ε > 0 unless NP ⊆ coNP/poly. Now, since H′ ↔ H it follows thata graph G has an H-coloring if and only if it has an H′-coloring, so that thedecision problems H′-COLORING and H-COLORING are equivalent. Hence thesame lower bound applies to the latter problem. J

5.4 ConclusionFor a large class of nonbipartite graphs H, we proved that H-COLORING isunlikely to admit a non-trivial polynomial-time sparsification algorithm. Ourresults imply that for the more general H-LIST COLORING problem, where theinput graph G is given along with a list L(v) ⊆ V(H) for each v ∈ V(G) and thequestion is whether G has a homomorphism to H mapping each vertex to a coloron its list, there is no simple nonbipartite graph for which H-LIST COLORING

has a non-trivial polynomial-time sparsification (unless NP ⊆ coNP/poly). Thisfollows from the fact that, by using the lists to force all vertices of G to bemapped to the vertex set of an induced odd cycle in H, one obtains a linear-parameter transformation from Cα-COLORING to H-LIST-COLORING. Hence thesparsification lower bound for the former problem, given in Theorem 5.4, alsoapplies to the latter.

Our main open problem is to determine whether our sparsification lowerbound for the cases of H-COLORING described by Theorem 5.44, can be gen-eralized to all nonbipartite graphs H. One route to such a proof would be toshow that if H is a simple nonbipartite core graph, then H∗ has a definablesubdomain that induces a subgraph homomorphically equivalent to an oddcycle; this would imply the lower bound via Lemma 5.21.

5

6

Chapter 6An Optimal Kernel for H-Coloring

In this chapter, we will study the kernelization complexity of the q-COLORING

problem and its generalization H-COLORING. Since 3-COLORING is alreadyNP-hard, studying the problem parameterized by the number of colors does notyield interesting results from a parameterized perspective. As such, coloringproblems are often studied under parameters that capture the complexity ofthe input graph. For example, Fiala et al. [38] compared the parameterizedcomplexity of several coloring problems when parameterized by vertex cover,to the complexity when parameterized by treewidth. Jansen and Kratsch [57]studied the q-COLORING problem by a hierarchy of different parameters.

In this earlier work [57], Jansen and Kratsch provided a kernel for q-COLORING parameterized by the size of a vertex cover with O(kq) vertices thatcan be encoded in O(kq) bits. Furthermore they showed that for q ≥ 4, a kernelof bitsize O(kq−1−ε) does not exist, unless NP ⊆ coNP/poly. In Chapter 5, itwas shown that also for q = 3 a kernel of bitsize O(kq−1−ε) is unlikely.

Unfortunately, there is a gap of a factor k between these bounds, and itremained unclear whether the upper or the lower bound had to be strengthened.In this chapter, we close this gap by improving the kernel. We show that q-COLORING has a kernel of bitsize O(kq−1 log k) when parameterized by vertexcover, completely settling the kernelization size for the q-COLORING problem.

We further generalize the kernelization result by using the parameter twin-cover, which is smaller or equal to the size of a vertex cover. We obtain a kernelfor the H-COLORING problem parameterized by the size of a twin-cover of size

6

140 An Optimal Kernel for H-Coloring

O(k∆(H) log k). Since Kq-COLORING corresponds to the normal q-COLORING

problem, this indeed generalizes the size-O(kq−1 log k) kernel for q-COLORING.

Related work Ganian introduced twin-cover as a new parameter [44] andgave relations to existing parameters. For example, a minimum twin-cover isnot larger than a minimum vertex cover, but twin-cover is incomparable totreewidth. The paper also gives an FPT algorithm for PRECOLORING EXTENSION

parameterized by the size of a twin-cover, and studies a number of otherproblems using this parameter.

Overview We start by providing the general idea towards obtaining a kernel ofsize O(k2 log k) for the 3-COLORING problem parameterized by vertex cover inSection 6.1. We then show how to use the ideas presented for the 3-COLORING

kernel, to obtain the kernel for H-COLORING parameterized by twin-cover inSection 6.2.

6.1 Kernel for 3-Coloring parameterized by VertexCover

To get a better feeling for the general idea behind the kernel for H-COLORING

presented in Section 6.2, we start by showing the kernel for 3-COLORING

parameterized by the size of a vertex cover. First, we will quickly recap theexisting kernel for 3-COLORING given by Jansen and Kratsch [57], which hassize O(k3). Then, we will show how using the results from Chapter 3 allows usto improve the size of this kernel to O(k2 log k).

6.1.1 Existing kernel of cubic sizeLet us start by looking at the kernel by Jansen and Kratsch [57]. While theirkernel works for general q-COLORING, we focus on the case of 3-COLORING here.

As an input for the kernelization algorithm, we assume we are given a graphG with vertex cover S of size k. We can assume this without loss of generality,since it is easy to obtain a 2-approximate vertex cover if no vertex cover is given.Let us start by some general observations. First of all, once the vertices in S arecolored, it is easy to verify if this coloring can be extended to the entire graph.This is the case since the vertices in V(G) \ S form an independent set. As such,by coloring S, their entire neighborhood is colored. All that needs to be verified,is that in each neighborhood there is at least one color that remains unused.

By looking at the problem this way, we see that each vertex v ∈ G \ S puts aconstraint on the coloring of the vertices in S: the vertices in the neighborhoodof v should not use all three available colors. The reverse also holds; anycoloring of S satisfying all such constraints can be extended to color the entire

6

6.1 Kernel for 3-Coloring parameterized by Vertex Cover 141

S Sa

b

c

d

G G′

Figure 6.1 The result of applying the simple O(k3)-size kernel for 3-COLORING to thegraph G depicted on the left, with the corresponding colorings. The fact that in this caseG′ is larger than G is caused by the choice of this (too small) example, observe that wedid in fact make some progress. In particular, vertex b has been removed. Furthermore,vertex c is made redundant by the vertices added due to the existence of vertex d.

graph. In fact, when trying to extend the coloring of S to a vertex v in V(G) \ S,one may verify that it suffices to check for each size-3 subset of N(v) that itdoes not use all three available colors. After all, if all three colors are used byN(v), then there is a size-3 subset of N(v) that witnesses this.

Kernel Given a 3-COLORING instance G with vertex cover S, we may obtain akernelized instance G′ as follows. Initialize G′ as G[S]. For every size-3 subsetS′ of S, check if there exists a vertex v ∈ V(G) \ S such that S′ ⊆ N(v). If so,add a new vertex v′ to G′ and connect v′ to all vertices in S′. Refer to Figure 6.1for an illustration of this procedure. We will not argue the correctness here infull detail, it can be found in Lemma 1 in [57]. The general idea is to prove thatwhen G′ is 3-colorable, we can color G by taking the same 3-coloring on S andextending this coloring to the remainder of the graph G.

Size The graph G′ has at most |S|+ |V(G′) \ S| ≤ k + (k3) = O(k3) vertices.

The nice thing about this kernel is that it can even be stored in O(k3) bits,by storing the graph structure of G′[S] in O(k2) bits, together with a Booleanvector of size O(k3) that indicates for every size-3 subset of S, whether thesevertices have a common neighbor in V(G′) \ S.

6.1.2 Improved kernelTo improve the existing kernel, we use some similar ideas. Again, consider agraph G with vertex cover S of size k. For each size-3 subset S′ ⊆ S that hasa common neighbor in V(G) \ S, we want to express that S′ does not use allthree colors. This can be seen as a constraint. In Section 6.1.1, we modeled thisconstraint by adding a vertex that was connected to all vertices in S′. Doing thisfor all size-3 subsets of S, resulted in a kernel with O(k3) vertices. To improve

6


the size of the kernel, we will first try to reduce the number of constraintsfrom O(k3) to O(k2), and then remove vertices and edges that corresponded toredundant constraints.

To reduce the number of constraints, we will use the method for sparsifyingconstraint satisfaction problems where the constraints are given as equalities oflow-degree polynomials, that we described in Chapter 3. The first step is thusto model the relevant constraints as an instance of 2-POLYNOMIAL ROOT CSP.To do this, we start by introducing three Boolean variables for each vertex in S,such that we have variable set

yv,c | v ∈ S, c ∈ [3].

Let y be a vector containing all these variables in arbitrary order. A coloring ofthe vertices corresponds to an assignment to the variables by letting yv,c = 1if vertex v has color c, and yv,c = 0 otherwise. To formalize this, we will saythat y is given a choice assignment if for all v ∈ S, there is exactly one c ∈ [3]such that yv,c = 1, meaning that each vertex in S is assigned exactly one color.

For distinct vertices S′ = u, v, w ⊆ S, consider the following polynomialover the integers modulo two:

pS′(y) := yu,1 · yv,1 + yu,1 · yw,1 + yv,1 · yw,1+

yu,2 · yv,2 + yu,2 · yw,2 + yv,2 · yw,2+

yu,3 · yv,3 + yu,3 · yw,3 + yv,3 · yw,3.

We now observe the following. If y is given a choice assignment, correspondingto a coloring where u, v, and w receive distinct colors, then pS′(y) ≡2 0. Ifhowever y is given a choice assignment, corresponding to a coloring where u, v,and w use at most two distinct colors, then pS′(y) ≡2 1. A generalization of thisresult will be formally proven in Lemma 6.10. Informally, the idea is as follows.

• If u, v, and w each have a different color, then none of the terms of thepolynomial evaluate to 1. After all, each term corresponds to two verticesreceiving the same color. As such, we trivially obtain pS′(y) ≡2 0.

• If u, v, and w do not receive distinct colors, we consider two options. If all ofthem have the same color, it is easily verified that there will be exactly threeterms of pS′(y) that evaluate to one. For example, if u, v, and w all receivecolor 1, we see that yu,1 · yv,1 = yu,1 · yw,1 = yv,1 · yw,1 = 1. Thereby, weobtain pS′(y) = 3 ≡2 1. In the remaining case, exactly two vertices receivethe same color, and the remaining vertex receives a different color. It is easyto verify that hereby exactly one term of the polynomial evaluates to 1, andagain pS′(y) ≡2 1.

Kernel Given a graph G with vertex cover S, the kernel is now obtained asfollows. We create a 2-POLYNOMIAL ROOT CSP instance on the variable set

6

6.1 Kernel for 3-Coloring parameterized by Vertex Cover 143

yv,c | v ∈ S, c ∈ [3]. For each vertex v ∈ V(G) \ S, for each S′ ⊆ N(v) of sizethree, we add the constraint pv,S′(y) ≡2 1, or equivalently pv,S′(y)− 1 ≡2 0,to obtain a set L of polynomial equalities. The additional subscript v is onlyused to keep track of which polynomial was added for which vertex. UsingTheorem 3.1, we obtain L′ ⊆ L with |L′| ≤ (3k)2 + 1, such that any assignmentsatisfying all equations in L′, satisfies all equations in L.

We now obtain G′ from G by removing every vertex in V(G) \ S for whichall corresponding equalities are in L \ L′, implying they were redundant. Fur-thermore, we remove an edge u, v with v ∈ V(G) \ S if there is no equalitypv,S′(y)− 1 ≡2 0 in L′ with u ∈ S′.

Correctness We will informally argue the correctness of the above procedure.Since G′ is a subgraph of G, it is easy to verify that G′ is 3-colorable if G is 3-colorable. The opposite direction is more interesting. Suppose G′ is 3-colorable,we show that it is possible to 3-color G. For every vertex in S, use the samecoloring as in G′. Observe that for every vertex in V(G) \ S, its neighborhood isnow completely colored. We show that none of these neighborhoods uses allthree available colors, to conclude the proof. Suppose towards a contradictionthat there is a vertex v ∈ V(G) \ S to which the coloring of S cannot be extended.As such, there are neighbors v1, v2, v3 ∈ N(v) such that vi has color i for i ∈ [3].Let S′ := v1, v2, v3. Hereby, pv,S′(y) ≡2 0, as we have seen before. Observethat pv,S′(y)− 1 ≡2 0 is one of the equations in the set L. We show that allequalities in L′ are satisfied by the coloring of S. By choice of L′, this impliesthat all equalities in L are satisfied, which is a contradiction with the fact thatthis equality is not.

Let pv,S′(y)− 1 ≡2 0 be an arbitrary equality in L′. Since this equality exists,v is a vertex of G′ and v is connected to all three vertices in S′. Since we startedfrom a valid 3-coloring of G′, this equality must be satisfied by the coloring,as otherwise the neighborhood of v uses all three available colors. This wouldcontradict that there is a proper 3-coloring of G′, assigning the consideredcoloring to the vertices in S.

We conclude that we can extend the coloring of S to color all vertices in G.

Size The graph G′[S] has k vertices by definition, and thereby has at most k2

edges. Additionally, G′ contains at most one vertex and at most three edges forevery equality in L′, leading to a total of O(k2) edges and vertices. By using anadjacency-list representation, this kernel can be stored in O(k2 log k) bits.

Extensions Extending the idea described in this section to obtain a kernel forq-COLORING parameterized by vertex cover of size O(kq−1 log k) is in principlestraightforward, the only necessary ingredient is to find a degree-(q− 1) poly-nomial expressing the constraint that “these q vertices do not all receive distinctcolors”. We will see how this constraint can be formulated as a polynomial in

6


Lemma 6.10. To further extend the results to H-COLORING, some additionalwork is needed as not only the number of colors that is used in the neighborhoodof a vertex matters, but also which colors specifically. Due to these additionalconstraints on the coloring, the kernel for H-COLORING will use two types ofpolynomial equalities, as we will see in Section 6.2.3.

While in this section we assumed that a vertex cover of the input graph isgiven, we show in the next section that this assumption is not needed. Theoverall strategy is as follows. Instead of only adding the necessary polynomialequalities corresponding to vertices not in the vertex cover to L, we simply con-sider the relevant polynomial equalities for all vertices in the graph, meaning wealso introduce variables for all vertices in the input graph. Then, we iterativelycheck if one of the edges of the graph is redundant, by verifying whether theconstraints in the graph obtained by removing this edge, imply the original setof constraints. If this is the case, we update the graph accordingly and recom-pute the set of constraints. Of course, now we can no longer straightforwardlyapply our results from Chapter 3 to obtain a size bound on our kernel, as wehave O(n) variables instead of O(k) variables. However, with some additionaleffort we can show that as long as the set of constraints is still large, the abovereduction must be applicable. In this way, we obtain the same size bound.

Furthermore, we will extend the ideas described in this section to obtain akernel parameterized by twin-cover, instead of vertex cover, but this does notadd too many technical complications.

6.2 Kernel for H-Coloring parameterized byTwin-Cover

In this section, we give a kernel for H-COLORING parameterized by the size ofa twin-cover. We start by giving some additional preliminaries and showinghow to partition the graph into vertex sets that are twins in Section 6.2.1. Weintroduce some of the polynomial equalities that we use and their propertiesin Section 6.2.2, and use them in Section 6.2.3 to define the set of equalitiesthat is constructed for a given input graph. In Section 6.2.4 we define the threereduction rules our kernel will use and prove that they are safe. Finally, inSection 6.2.5 we give the kernel.

6.2.1 Twin-CoverTo define the parameter twin-cover, we first need to define the notion of twins.We say vertices u and v ∈ V(G) are (true) twins whenever NG[u] = NG[v].Note that this relation is transitive.

Definition 6.1 We say X ⊆ V(G) is a twin-cover [44] of G, if for every edgeu, v ∈ E(G), vertex u ∈ X, or v ∈ X, or u and v are twins.

6

6.2 Kernel for H-Coloring parameterized by Twin-Cover 145

We will regularly use the following two observations about H-colorings of agraph. Both follow straightforwardly from the definition of H-coloring.

Observation 6.2 Let S ⊆ V(G) such that G[S] is a clique and let f be an H-coloring of G. Define X := f (v) | v ∈ S. Then H[X] is a clique in H and allvertices in S receive a different color, so that |S| = |X|.

Observation 6.3 Let v ∈ V(G) and let f be an H-coloring of G. Then the numberof colors used to color NG(v) is bounded by ∆(H).

Computing a minimum TWIN-COVER is NP-hard, since VERTEX COVER is NP-hard on graphs where no two vertices are twins7. We will therefore constructthe kernel for H-COLORING without knowing a twin-cover of the input graph.In order to do this, we decompose the graph into vertex sets consisting of twins.Recall that throughout this chapter, twins are vertices with the same closedneighborhood.

Definition 6.4 A partial twin decomposition of a graph G is a partition Π =P1, . . . , Pm of V(G), such that any two vertices in the same partite set aretwins. Partition Π is a twin decomposition if furthermore any two vertices indifferent partite sets are not twins.

To be able to use the twin decomposition for the kernelization procedure,we show how it can efficiently be computed.

I Lemma 6.5 A twin decomposition of a graph G can be computed inO(|V(G)|+|E(G)|) time.

Proof. This is for example stated in [86, Exercise 2.17] for the case of findingfalse twins, which are vertices such that NG(u) = NG(v). Finding (true) twinsis similar. An example solution uses the adjacency-list representation, and addseach vertex to its own adjacency list. Then we efficiently sort the adjacencylists by bucket sort. We partition these sorted lists into sets of duplicates by arecursive process, which splits the sets into groups based on the first elementof each adjacency list, which is removed from the recursively partitioned lists.The running time is linear since the sum of the lengths of all adjacency lists isO(|E(G)|), while the work done in each iteration is proportional to the decreasein total volume of the lists. J

The next lemma shows how the twin decomposition and a minimal twin-cover may intersect.

I Lemma 6.6 Let G be a graph with twin decomposition Π and a minimaltwin-cover S. Then for any partite set P ∈ Π it holds that either P ⊆ S orP ∩ S = ∅.

7In a graph where no two vertices are twins, vertex covers and twin-covers are the same.

6


Proof. Let P ∈ Π. Suppose P ∩ S 6= ∅ and P \ S 6= ∅. Let u ∈ S ∩ P andv ∈ P \ S. We show that S \ u is a twin-cover of G, which contradicts theassumption that S is minimal.

Let u, w be any edge in G. If w = v, then since u, v ∈ P, this is anedge between twins. If w 6= v, then since u and v are twins, it follows thatv, w ∈ E(G). Thereby, either w ∈ S and thus edge u, w is covered by w,or w and v are twins. In this case, by transitivity of being twins, u and w arealso twins. This proves that S \ u is indeed a twin-cover of G, which is acontradiction. J

6.2.2 Modeling constraints as polynomial equalitiesAs explained in Section 6.1.2, the kernelization is based on a connection toconstraint satisfaction problems. To find the kernel, we represent the constraintsthat a vertex set puts on the coloring of its neighborhood, as polynomial equali-ties. We then use this representation to find redundant vertices and edges in thegraph. Recall that a monomial of degree d is the product of d variables, with theunique monomial of degree zero being the constant 1. For example, x1 · x3 · x3is a monomial of degree three. A monomial is multilinear if each variable occursat most once (refer to Section 2.6 for more details).

To find redundant vertices and edges, we will need to find which polynomialequalities are redundant. To be able to do this, we relate polynomials andpolynomial equalities to the vectors that we will use to represent them.

Definition 6.7 (vect) Let p(x1, . . . , xn) be a multivariate polynomial in (a subsetof) the variables x1, . . . , xn, evaluated over the integers modulo 2, of degree atmost d for some fixed d. Hence p is a weighted sum of monomials of degree atmost d over x1, . . . , xn. For some fixed ordering of the monomials of degree dover x1, . . . , xn, let vect(p) denote the vector containing the coefficients of thecorresponding monomials in p. We may use this notation as well for polyno-mial equalities where the right-hand side is zero; for a polynomial equalityp(x1, . . . , xn) ≡2 0 over the integers modulo 2, such that p has degree at most d,we let vect(p(x1, . . . , xn) ≡2 0) be defined as vect(p).

Let P be a set of multivariate polynomials in variables x1, . . . , xn, we usevect(P) to denote vect(p) | p ∈ P. Similarly, if L is a set of polynomialequalities in variables x1, . . . , xn for which the right-hand sides are zero, we letvect(L) be defined as vect(p) | (p(x1, . . . , xn) ≡2 0) ∈ L.

The following lemma follows from the definition above.

I Lemma 6.8 Let P be a set of polynomials of degree at most d over a tuple ofvariables y, and let q be a polynomial of degree at most d over y. If vect(q) ∈span2(vect(P)), then any assignment to y that satisfies p(y) ≡2 0 for all p ∈ P,satisfies q(y) ≡2 0.

Proof. Choose αp ∈ 0, 1 for all p ∈ P such that vect(q) ≡2 ∑p∈P αp vect(p).

6


Consider an assignment to the variables y with p(y) ≡2 0 for all p ∈ P. Let y′

be the vector containing the evaluation of the monomials of degree at most dover y, for the values assigned to y. List them in the same order in which thecoefficients for these monomials are listed in vect(·). Since a polynomial is aweighted sum of monomials, the value of a polynomial p of degree most d in yfor the assigned values, equals the inner product of vect(p) and y′. So:

q(y) = vect(q) · y′ ≡2 ∑p∈P

αp vect(p) · y′ ≡2 ∑p∈P

αp · p(y) ≡2 0. J

To utilize polynomials over Boolean variables to represent solutions of graphH-coloring problems, we represent the color of a vertex v in a graph G by|V(H)| Boolean variables, indicating whether v has the corresponding color. Wenow define a partial choice assignment, which reflects that any vertex receivesat most one color.

Definition 6.9 Let yi,k ∈ 0, 1 for i ∈ [n], k ∈ [q] be a set of Boolean variablesand let y be the vector containing all these variables. We say y is given a partialchoice assignment if for all i ∈ [n]:

q

∑k=1

yi,k ≤ 1.

Note that a partial choice assignment sets at most n variables to true. Bythis definition, a partial choice assignment can be seen as a partial coloring inthe following way: yi,k = 1 means vertex i has color k. Note that the coloring ofsome vertices may remain undefined.

The following lemma gives a polynomial that can be used to express theconstraint that out of exactly q neighbors of a given vertex u, there are at leasttwo that have the same color. By combining multiple such constraints, we canensure that at most q− 1 different colors are used in the neighborhood of vertexu, leaving one color free for u itself in the q-coloring problem. When evaluatingthe polynomial for y that is given a partial choice assignment, the polynomialhas the following two essential properties. (1) It equals 1 modulo 2 when the qvertices all receive a distinct color, and (2) it equals 0 modulo 2 whenever twovertices have the same color, or when two vertices have no color defined8.

I Lemma 6.10 Let q > 0 be an integer and let yi,k for i ∈ [q], k ∈ [q] be Booleanvariables. Then there exists a polynomial p of degree q− 1 over the integers modulo2, such that whenever the variables in y are given a partial choice assignment, itholds that p(y) ≡2 1 if and only if

8One may note that the polynomial described in Section 6.1.2 does not satisfy this. It evaluatedto 0 when all vertices receive distinct colors, or when two colors where not defined. It evaluatedto 1 whenever two vertices receive the same color. The fact that the role of 1 and 0 is swapped iseasy to resolve, but the fact that it evaluates to 0 both for desirable and undesirable partial choiceassignments, is problematic in our new setting.

6


• there exist no i, j, k ∈ [q] with i 6= j such that yi,k = yj,k = 1, and

• for all k ∈ [q− 1] there exists i ∈ [q] such that yi,k = 1.

We have seen an example of a polynomial used for the case where q = 3, inSection 6.1.2. However, the ideas behind this particular polynomial appeardifficult to extend to larger values of q. Furthermore, while it has all desirableproperties with respect to choice assignments, when considering partial choiceassignments, problems arise. Considering partial choice assignments is howevernecessary to deal with H-COLORING, instead of q-COLORING.

Therefore, the proof of Lemma 6.10 will use an entirely different polynomial,that is based on different ideas. Before proving Lemma 6.10, let us thereforegive the polynomial p corresponding to q = 3 that we will use in this proof.

p(y) := ∑i1 6=i2∈[3]

2

∏k=1

yik ,k

= y1,1 · y2,2 + y1,1 · y3,2 + y2,1 · y1,2 + y2,1 · y3,2 + y3,1 · y1,2 + y3,1 · y2,2.

One may verify for this example that letting y1,1 = y2,2 = y3,3 = 1 and all othervariables be zero, gives p(y) = 1 ≡2 1, as desired. Setting y1,1 = y2,2 = y3,2 = 1and all other variables to zero, gives p(y) = 2 ≡2 0, which explains why themodulus is used. Letting y1,1 = y2,3 = y3,3 = 1 and all other variables zero,results in p(y) = 0 ≡2 0. Furthermore, letting y1,1 = y2,2 = 1 and all othervariables be zero, also results in p(y) ≡2 1. We now proceed with the generalconstruction.

Proof of Lemma 6.10. Define the multivariate polynomial p as

p(y) := ∑i1,...,iq−1∈[q]

distinct

q−1

∏k=1

yik ,k.

We prove that p has the desired properties. It is easy to see that the degree ofp is q− 1. It remains to prove the claim on the values of p(y) for partial choiceassignments. So let y be given a partial choice assignment, and for each i ∈ [q]let xi := k exactly when yi,k = 1. Let xi := 0 when there is no such yi,k.

We now show that p(y) ≡2 1 if there exist no i, j, k ∈ [q] with i 6= jsuch that yi,k = yj,k = 1, and for all k ∈ [q − 1] there exists i ∈ [q] suchthat yi,k = 1. In terms of the values for xi, this implies that they are alldistinct, and that [q− 1] ⊆ x1, . . . , xq. Thereby, we have x1, . . . , xq = [q] orx1, . . . , xq = [q− 1] ∪ 0.

For k ∈ [q − 1], let jk be the unique index such that xjk = k, implyingthat yjk ,k = 1. Note that this is well defined, since all values from [q− 1] are

used exactly once. Then, ∏q−1k=1 yjk ,k = 1. For any other choice of distinct indices

6


i1, . . . , iq−1 ∈ [q], there exists m ∈ [q− 1] such that im 6= jm. This implies that

yim ,m = 0 and thereby ∏q−1k=1 yik ,k = 0. Thus, p(y) = 1 ≡2 1.

Now suppose there exist i, j ∈ [q], such that xi = xj 6= 0, or there existsk ∈ [q− 1] such that yi,k = 0 for all i ∈ [q]. We show that p(y) ≡2 0 by a casedistinction.

• Suppose there exists k ∈ [q− 1] such that ∑qi=1 yi,k = 0, or equivalently there

is no i ∈ [q] such that xi = k. Thereby, for any choice of i1, . . . , iq−1 ∈ [q], we

have that ∏q−1`=1 yi`,` = 0, since yik ,k = 0. Thus, p(y) ≡2 0.

• Suppose there exists no k ∈ [q− 1] such that ∑qi=1 yi,k = 0. Thereby, for each

k ∈ [q− 1] there exists i ∈ [q] such that xi = k. It follows from our earlierassumption that there must exist i, j, k ∈ [q] with i 6= j such that xi = xj = k,which implies k < q. For all c ∈ [q− 1] with c 6= k, let ic be the unique indexsuch that xic = c and thus yic ,c = 1. Then

yi,k ·q−1

∏c=1c 6=k

yic ,c = yj,k ·q−1

∏c=1c 6=k

yic ,c = 1.

However, ∏q−1c=1 yic ,c = 0 for any other choice of i1, . . . , iq−1. Thereby, p(y) =

2 ≡2 0. J

There is another way to look at the proof of Lemma 6.10, which maygive some intuition. Consider the q × q matrix A given by ai,j = yi,j for alli ∈ [q], j ∈ [q − 1], and ai,q = 1 for all i ∈ [q]. Then p(y) is equal to thedeterminant of the matrix A over the integers modulo 2. It follows from thisthat p(y) ≡2 1 if and only if the columns of A are linearly independent. Onemay verify that if y is given a partial choice assignment, the columns of A arelinearly independent if and only if the two conditions of the lemma statementhold for y.

6.2.3 Construction of polynomial equalitiesWe continue to define the polynomial equalities that will be constructed for asubset P of the vertices of G. These are necessary constraints on the coloringof NG(P), such this coloring can be extended to H-color P. In this construction,P will be a partite set of the twin decomposition of G, and hence a clique.

Let G be a graph with P ⊆ V(G). We create variables cv,i for each v ∈V(G) and i ∈ V(H), denoting whether vertex v has color i. Let C be theset containing all constructed variables. Let L(P, G) be the set of polynomialequalities produced by the following procedure, which results in two types ofconstraints. The first type will ensure that the neighborhood of P uses at most∆(H) colors. The second type of constraints will ensure that the coloring of the

6


neighborhood of P can indeed be extended to color P: we ensure that thereis a clique in H of size at least |P|, such that the colors used for NG(P) are inthe neighborhood of this clique in H. The reason for having these two types ofconstraints is to obtain a good bound on the degree of the relevant polynomials.

Consider each set S ⊆ NG(P) with |S| = ∆(H) + 1 and each set X ⊆ V(H)with |X| = |S|. Pick an arbitrary ordering on X such that X = x1, . . . , x|S|and use Lemma 6.10 to find a polynomial pP,S,X such that pP,S,X(C) ≡2 1 if andonly if the following two statements hold:

1. there exist no u 6= v ∈ S, k ∈ X such that cu,k = cv,k = 1, and

2. for all k ∈ [|S| − 1] there exists u ∈ S such that cu,xk = 1.

When considering a partial choice assignment that corresponds to a mappingfrom V(G) to V(H), the above two statements together imply that S is coloredwith |S| = ∆(H) + 1 distinct colors (which is what we want to avoid). Moreprecisely, when pP,S,X(C) ≡2 1, the coloring of S would use colors x1, . . . , x|S|−1and one other color. Add the following constraint to L(P, G):

pP,S,X(C) ≡2 0.

For the second type of constraints, consider each set S ⊆ NG(P) of ` ∈[∆(H)] elements. Pick an arbitrary ordering of S, such that S = s1, . . . , s`.Consider each sequence x = (x1, . . . , x`) ∈ V(H)` (of not necessarily distinctelements). Let Y :=

⋂ì=1 NH(xi) be the common neighborhood of all vertices

from x in H. If H[Y] does not contain a clique of size at least |P| (i.e., ifω(H[Y]) < |P|), add the following polynomial equality to L(P, G):

qP,S,x(C) :=`

∏i=1

csi ,xi ≡2 0.

The computation of ω(H[Y]) for some Y ⊆ V(H) can be done in constant time,since H is considered constant. This concludes the construction of L(P, G).Note that the constraints L(P, G) are defined solely in terms of the variablesthat describe the coloring of the open neighborhood of P.

Next, we define a complete list of equations for G.

Definition 6.11 (LΠ(G)) Let G be a graph and let Π be a partition of its vertexset. Let LΠ(G) be defined as follows.

LΠ(G) :=⋃

P∈ΠL(P, G).

Since the polynomials for the first type of constraints have degree at most|S| − 1 = ∆(H) by Lemma 6.10, while the polynomials for the second type ofconstraints are the product of ` ≤ ∆(H) variables, we observe the following.

6


Observation 6.12 Let G be a graph with Π a partition of its vertex set. Thepolynomials in LΠ(G) have degree at most ∆(H).

We now prove two useful properties for this choice of constraints.

I Lemma 6.13 Let G be a graph with some partial twin decomposition Π. Letf : V(G) → V(H) be a mapping. If f is an H-coloring of G, then the Booleanassignment to C := cv,i | v ∈ V(G), i ∈ V(H) given by cv,i = 1 ⇔ f (v) = isatisfies all constraints in LΠ(G).

Proof. Let f be given and the value of any cv,i ∈ C be defined by cv,i = 1 ⇔f (v) = i. We show that this assignment satisfies all constraints in LΠ(G), byshowing that it satisfies both types of constraints in L(P, G) for all P ∈ Π.Consider some P ∈ Π. Since it consists of twins, it is a clique in G. As H has noself-loops, the vertices in P all receive distinct colors by Observation 6.2, andthe colors used on P form a clique in H. The fact that P consists of twins alsoimplies that u, v ∈ E(G) for all u ∈ P, v ∈ NG(P). Thereby, any color used inP is not used in the coloring of NG(P).

Consider a constraint pP,S,X(C) ≡2 0 ∈ L(P, G) for S ⊆ NG(P) of size|∆(H)|+ 1 and X ⊆ V(H) of the same size. By Observation 6.3, the verticesin S use at most ∆(H) = |S| − 1 colors. Hence at least one color d ∈ V(H)appears twice on a vertex of S. If d ∈ X then some color of X is used twice on S,violating the first condition of Lemma 6.10. If d /∈ X then at least two verticesu, v ∈ S do not receive a color from X and hence ∑i∈X cu,i = ∑i∈X cv,i = 0.Since |S| = |X| = q, there are at most q− 2 vertices w ∈ S for which thereexists i ∈ X with cw,i = 1. As such, there exist distinct colors d1, d2 ∈ X suchthat ∑s∈S cs,d1 = ∑s∈S cs,d2 = 0, so that the lowest-indexed of d1 and d2 violatesthe second condition of Lemma 6.10. It then follows from Lemma 6.10 thatpP,S,X(C) ≡2 0 as required.

Consider a constraint qP,S,x(C) ≡2 0 ∈ L(P, G) for S ⊆ N(P) and x =

(x1, . . . , x|S|) ∈ V(H)|S|. Suppose this constraint is not satisfied. Then the

coloring of S is given by x and furthermore, H[Y] where Y :=⋂|S|

i=1 NH(xi)does not contain a clique on |P| vertices. But any H-coloring (for H withoutself-loops) colors any clique in G with an equally-sized clique in H, and thecolors used on the clique P must be adjacent in H to all the colors used on theneighbors S of P in G. Since H[Y] contains no clique on |P| vertices, f cannotbe an H-coloring of G. It follows that for any H-coloring, all constraints aresatisfied. J

Let S ⊆ V(G) and let f : S→ V(H) be an H-coloring of G[S]. We say that fcan be extended to color G, if there exists f ′ : V(G)→ V(H) such that f ′ is anH-coloring of G and furthermore f ′(v) = f (v) for all v ∈ S. The next lemmashows that an H-coloring of a part of the graph can be extended to color theentire graph, if it satisfies certain constraints.

6


I Lemma 6.14 Let G be a graph with P′ ⊆ V(G). Let f : V(G) \ P′ → V(H)be an H-coloring of G− P′, such that the Boolean assignment to C := cv,i | v ∈V(G), i ∈ V(H) given by cv,i = 1⇔ (v /∈ P′ ∧ f (v) = i) satisfies all constraintsin L(P′, G). Then f can be extended to H-color G.

Proof. Let f be given and C be defined by cv,i = 1⇔ (v /∈ P′ ∧ f (v) = i). Westart by showing that NG(P′) uses at most ∆(H) different colors. Suppose not,then there is a set S ⊆ NG(P′) of size ∆(H) + 1 using ∆(H) + 1 distinct colors.Let X be the set of colors used in S. It follows from Lemma 6.10, that regardlessof which ordering of X was chosen when constructing pP′ ,S,X(C), we havepP′ ,S,X(C) ≡2 1. By definition, L(P′, G) contains the equation pP′ ,S,X(C) ≡2 0.This contradicts the assumption that all constraints in L(P′, G) are satisfied.

Let X be the set of colors used by NG(P′), suppose |X| = `. We haveshown above that ` ≤ ∆(H). Let S be a size-` subset of NG(P′) such that forevery color x ∈ X, there exists exactly one s ∈ S such that f (s) = x. LetY :=

⋂x∈X NH(x). Suppose towards a contradiction that H[Y] contains no

clique of size |P′|. As such, for some ordering of S as S = s1, . . . , s` and forx = (x1, . . . , x`) such that f (si) = xi for all i, the constraint qP′ ,S,x(C) ≡2 0 wasadded to L(P′, G). However, by definition, qP′ ,S,x(C) ≡2 1, contradicting thefact that all constraints in L(P′, G) are satisfied. Thereby, H[Y] contains a cliqueK of size |P′|, where Y :=

⋂|X|i=1 NH(xi). To extend f to color P′, assign each

vertex in P′ a distinct color from K.It remains to verify that f is indeed a valid H-coloring of G. Any edge

between two vertices in V(G) \ P′ remains properly colored. Any edge in P′

is properly colored, because its endpoints have a different color and K is aclique in H. Any edge between P′ and V(G) \ P′ is properly colored, becauseall vertices in K are common neighbors of the vertices in X, and K ∩ X = ∅. J

6.2.4 Reduction rulesWe now present the three reduction rules that will be used to obtain the kernel,and prove that they are safe. The first checks whether the graph is trivially notH-colorable, the second removes sets of edges from the graph, and the thirdremoves sets of vertices from the graph.

Reduction rule 6.1 Let G be a graph with twin decomposition Π. If there existsP ∈ Π with |P| > ω(H), return a trivial no-instance.

It is easy to see that Reduction rule 6.1 preserves the answer to the problem,since in this case G cannot have an H-coloring by Observation 6.2.

Reduction rule 6.2 Let G be a graph with twin decomposition Π. Let P′ 6=P′′ ∈ Π such that EG(P′, P′′) 6= ∅. If

vect(

L(P′, G))⊆ span2

(vect

(LΠ(G \ EG(P′, P′′))

)),

6


remove all edges in EG(P′, P′′) from graph G.

Reduction rule 6.2 is the key rule for our kernelization. It simplifies thegraph by removing all edges between two distinct sets of twins P′ and P′′,if the constraints L(P′, G) are implied by the constraints generated by theremaining graph G \ EG(P′, P′′). Observe that it is essential for the effectivenessof Reduction rule 6.2 that Π is a twin decomposition, since applying the ruleto partial twin decompositions that are not twin decompositions may increasethe size of an optimal twin-cover in the considered graph. The following lemmaproves that the reduction rule is safe.

I Lemma 6.15 If G′ is obtained from G by applying Reduction rule 6.2, then Gis H-colorable if and only if G′ is H-colorable.

Proof. Let G′ be G \ EG(P′, P′′). Clearly, if G is H-colorable, then G′ is alsoH-colorable, since G′ is a subgraph of G.

In the other direction, let f ′ be an H-coloring of G′. It follows fromLemma 6.13 and the fact that Π is a partial twin decomposition of G \ EG(P′, P′′)that the derived setting of the Boolean variables C satisfies the constraints inLΠ(G \ EG(P′, P′′)). Since vect(L(P′, G)) ⊆ span2(vect(LΠ(G \ EG(P′, P′′))))it follows from Lemma 6.8 that this setting of C also satisfies all constraintsin L(P′, G). Let f be defined as f ′ restricted to the vertices in G′ − P′. Notethat G′ − P′ equals G − P′ by definition. It is easy to see that f is indeed anH-coloring of G− P′ since G− P′ is a subgraph of G′ and f ′ is an H-coloringof G′. Furthermore, f satisfies the constraints in L(P′, G) since it colors therelevant vertices the same as f ′. It now follows from Lemma 6.14 that we canextend f to color all vertices in G. Thereby, G is H-colorable. J

The final rule effectively removes isolated cliques from the graph, when Hhas a sufficiently large clique to allow them to be colored properly.

Reduction rule 6.3 Let G be a graph with twin decomposition Π. If there existsP′ ∈ Π with NG(P′) = ∅ and |P′| ≤ ω(H), then remove P′ from G.

I Lemma 6.16 If G′ is obtained from G by applying Reduction rule 6.3, then Gis H-colorable if and only if G′ is H-colorable.

Proof. Let P′ be such that G′ = G− P′. Clearly, if G is H-colorable, G′ remainsH-colorable. In the other direction, suppose G′ is H-colorable. We show howto extend this coloring to G. Since we assumed that |P′| ≤ ω(H), graph H hasa clique X of size |P′|. Use the colors of X to assign a different color to eachvertex in P′. Since NG(P′) = ∅, this gives an H-coloring of G. J

I Lemma 6.17 Let G′ be the graph resulting from applying Reduction rule 6.2or 6.3 to a graph G. Then the size of a minimum twin-cover in G′ is at most thesize of a minimum twin-cover in G.

6


Proof. Let P ⊆ V(G). It is easy to see that if S is a twin-cover of G, then it isalso a twin-cover of G′ = G− P. Thereby, the statement holds for Reductionrule 6.3.

Let Π be the twin decomposition of G and let P′ 6= P′′ ∈ Π. Let F :=EG(P′, P′′) be the set of edges between P′ and P′′. Let S be a twin-cover ofG. We show that S is a twin-cover of G′ = G \ F. Clearly, any edges thatwere previously covered, are still covered. We show that all vertices thatwere twins in G, are also twins in G \ F to conclude the proof. Let u, v betwins in G, and let P ∈ Π such that u, v ∈ P. If P 6= P′ and P 6= P′′, it isobvious that u and v remain twins in G \ F. Suppose u, v lie in P′ or in P′′;without loss of generality, suppose u, v ∈ P′. Note that the edge u, v belongsto E(G) \ F. Then NG\F[u] = NG[u] \ P′′. Since u and v are twins in G, wehave NG[u] \ P′′ = NG[v] \ P′′ = NG\F[v]. Thereby, u and v are twins in G \ F.It follows that the lemma statement also holds for Reduction rule 6.2. J

6.2.5 Analysis of the kernelizationHaving described all reduction rules, we can now formally prove the kerneliza-tion result for H-COLORING.

I Theorem 6.18 For any fixed non-bipartite graph H (without self-loops), H-COLORING parameterized by the size k of a twin-cover has a kernel with O(k∆(H))

vertices and edges, which can be encoded in O(k∆(H) log k) bits. Furthermore, thekernelized instance is a subgraph of the original input graph.

Proof. Let G be a graph. Apply Reduction rules 6.1, 6.2, and 6.3 exhaustively.Let the resulting graph be G′. We show that this is a correct kernelization.

B Claim 6.19 Reduction rules 6.1–6.3 can exhaustively be applied in time|V(G)|O(∆(H)).

Proof. We can compute a twin decomposition of G in linear time by Lemma 6.5.Computing ω(H) can be done in O(1) time for fixed H. Hence Reductionrule 6.1 can be applied in polynomial time.

The set LΠ(G) contains at most m := 2|V(G)| · |V(G)|∆(H)+1 · |V(H)|∆(H)+1

polynomial equalities (the number of ways to pick S, X, and P as for thedefinition of pP,S,X and qP,S,X), over |V(G)| · |V(H)| variables. All polynomialswe employ are multilinear. This can be verified directly from their construction,and explained by noting that squaring a number does not change it, whenworking modulo 2. By Lemma 2.27, we therefore only have to consider (|V(G)| ·|V(H)|)∆(H) + 1 coefficients for the polynomials. Constructing the requiredpolynomial equalities can be done in time |V(G)|O(∆(H)) for fixed H. Wecan test if one vector lies in the span of a set of other vectors by comparingthe ranks of matrices of dimensions at most m× ((|V(G)| · |V(H)|)∆(H) + 1).Thereby, Reduction rule 6.2 can be applied in polynomial time. Note that the

6


twin decomposition has to be recomputed after each application of Rule 6.2.Reduction rule 6.3 can trivially be applied in polynomial time. Since |Π| ≤|V(G)|, checking for all P ∈ Π whether any of the reduction rules can beapplied takes polynomial time.

Each rule can be applied at most |V(G)|2 times, as it always removes at leastone edge or vertex. The claim follows. C

Let G′ be the result of applying Reduction rules 6.1, 6.2, and 6.3 exhaustively.We use the following claim to prove a bound on the size of G′.

B Claim 6.20 The resulting graph G′ has O(|V(H)|∆(H) · ∆(H)2 · k∆(H)

)ver-

tices and O(|V(H)|∆(H) · ∆(H)3 · k∆(H)

)edges.

Proof. When Reduction rule 6.1 has been applied at any point, G′ trivially hasconstant size. Otherwise, since G has a twin-cover of size k, it follows fromLemma 6.17 that G′ has a twin-cover of size at most k. Let Y be a minimumtwin-cover of G′, such that |Y| ≤ k. Let Π be the twin decomposition of G′.By Lemma 6.6, every P in Π is either fully contained in Y, or disjoint from Y.Let Π′ := P ∈ Π | P ∩ Y = ∅. Define Ltc :=

⋃P∈Π′ L(P, G′), and note that

NG′(P) ⊆ Y for all P ∈ Π′. This implies the polynomial equalities in Ltc onlyinvolve the variables controlling the coloring of Y. By Observation 6.12, thepolynomials in Ltc have degree at most ∆(H) and they use at most |V(H)|variables for each of the k vertices in Y. Define

α := (k · |V(H)|)∆(H) + 1.

Let Vtc := vect(Ltc) be the vectors of coefficients corresponding to thepolynomials in Ltc. Compute a basis V′tc ⊆ Vtc of Vtc over the integers modulo 2.Let L′tc ⊆ Ltc contain all polynomial equalities whose corresponding vector iscontained in V′tc. Since all employed polynomials are multilinear, it followsthat the vectors in Vtc only have nonzero entries in positions corresponding tomultilinear monomials over |Y||V(H)| distinct variables, of which there areat most α by Lemma 2.27. As the size of the basis V′tc equals the rank of thematrix containing the (row)vectors Vtc, which is upper-bounded by the numberof columns that contain a nonzero entry, it follows that |L′tc| = |V′tc| ≤ α.

We define a set of meta-edges F ⊆ (Π′ × (Π \Π′)) based on the constraintsin L′tc. For each constraint Z in L′tc, do the following.

• Suppose Z = (pP′ ,S,X(C) ≡2 0) for some P′ ∈ Π′, S ⊆ NG(P′), andX ⊆ V(H). Since P′ is a partite set of twins that is disjoint from Y, wehave NG(P′) ⊆ Y since Y is a twin-cover. So each v ∈ S belongs to a partiteset Pv of twins with Pv ∈ Π \Π′. For each v ∈ S, add (P′, Pv) to F.

• Otherwise, Z = (qP′ ,S,x(C) ≡2 0) for some P′ ∈ Π′, S ⊆ NG(P′), andsequence x = (x1, . . . , x|S|) ∈ V(H)|S|. Similarly as above, for each v ∈ Stake Pv ∈ Π \Π′ such that v ∈ Pv and add (P′, Pv) to F.

6


The above procedure adds at most ∆(H) + 1 meta-edges for each constraint inL′tc. Thereby,

|F| ≤ α(∆(H) + 1). (6.1)

We now argue that for any (P′, P′′) /∈ F with P′ ∈ Π′ and P′′ ∈ Π \Π′, thefollowing holds:

L′tc ⊆ LΠ(G′ \ EG′(P′, P′′)). (6.2)

To see this, consider a constraint in L′tc. It is of one of two possible types, andit was added to Ltc =

⋃P∈Π′ L(P, G′) ⊇ L′tc because it satisfied the criteria

described in Section 6.2.3. Effectively, the constraint was created because someset P ∈ Π′ contains a certain vertex set S of size at most ∆(H) + 1 in its openneighborhood in G′. But by our choice of meta-edges F, the set P still has Sin its neighborhood in G′ \ EG′(P′, P′′), so that all constraints of L′tc are alsocontained in LΠ(G′ \ EG′(P′, P′′)).

Using this, we show that for all P′ ∈ Π′ and P′′ ∈ Π \Π′:

EG′(P′, P′′) 6= ∅⇒ (P′, P′′) ∈ F. (6.3)

Suppose there exist P′ ∈ Π′, P′′ ∈ Π \ Π′ such that EG′(P′, P′′) 6= ∅ but(P′, P′′) /∈ F. It follows from Equation 6.2 that L′tc ⊆ LΠ(G′ \ EG′(P′, P′′)).Thereby,

span2(vect(LΠ(G′ \ EG′(P′, P′′)))) ⊇ span2(vect(L′tc))

= span2(V′tc) ⊇ Vtc = vect(Ltc) ⊇ vect(L(P′, G′)).

Thereby, Reduction rule 6.2 could be applied to G′, which is a contradiction. Itfollows that P′ ∈ Π′ and P′′ ∈ Π \Π′ can only be connected in G′ if there is acorresponding meta-edge in F. We can now use Equations 6.1 and 6.3 to boundthe number of vertices and edges in G′.

First of all, for all P′ ∈ Π′ there must exist some P′′ ∈ Π \Π′ such that(P′, P′′) ∈ F, otherwise it follows from Equation 6.3 that NG′(P′) = ∅ and P′

would have been removed by Reduction rule 6.3. Thereby |Π′| ≤ |F|. Since|P| ≤ ω(H) ≤ ∆(H) + 1 for all P ∈ Π by Reduction rule 6.1, the number ofvertices of G′ can be bounded as follows.

|V(G′)| ≤ |F| · (∆(H) + 1) + |Y| ≤((k|V(H)|)∆(H) + 1

)· (∆(H) + 1)2 + k

= O(|V(H)|∆(H) · ∆(H)2 · k∆(H)

).

If edge u, v ∈ G′ with u ∈ Y and v /∈ Y, then there exist (P′, P′′) ∈ F suchthat u ∈ P′, v ∈ P′′. Since |P| ≤ ∆(H) + 1 for any P ∈ Π, there are at most|F| · (∆(H) + 1)2 such edges. Furthermore, there are at most (|Y|2 ) ≤ k2 edgesbetween vertices in Y, and at most |F| · (∆(H) + 1)2 edges between vertices in

6

6.3 Conclusion 157

V(G) \Y. Thereby, the total number of edges can be bounded by:

|E(G′)| ≤ |F| · (∆(H) + 1)2 + |Y|2 + |F| · (∆(H) + 1)2

≤ 2α(∆(H) + 1)3 + k2

= 2((k · |V(H)|)∆(H) + 1

)· (∆(H) + 1)3 + k2

(note that ∆(H) ≥ 2 for non-bipartite H)

= O(|V(H)|∆(H) · ∆(H)3 · k∆(H)

).

This concludes the proof of Claim 6.20. C

It follows from the correctness of Reduction rules 6.1, 6.2, and 6.3 that G′ isH-colorable if and only if G is H-colorable. It follows from Claims 6.19 and 6.20that we have given a kernel for H-coloring with O(k∆(H)) vertices and edgesfor constant-size H that can be computed in polynomial time. By encoding thegraph using adjacency lists, it can be encoded in O(k∆(H) · log k) bits. J

The following corollary shows that Theorem 6.18 generalizes the resultobtained for q-COLORING parameterized by vertex cover in the extended abstractof this work.

I Corollary 6.21 For any constant q ≥ 3, q-COLORING parameterized by thesize of a twin-cover has a kernel with O(kq−1) vertices, which can be encodedin O(kq−1 log k) bits. Furthermore, the resulting instance is a subgraph of theoriginal input graph.

Proof. Since q-COLORING is equivalent to Kq-COLORING, and ∆(Kq) = q − 1and Kq has q vertices, the result now follows directly from Theorem 6.18. J

6.3 ConclusionWe have given a kernel for H-COLORING parameterized by twin-cover withO(k∆(H)) vertices and bitsize O(k∆(H) log k). This kernel can be obtainedwithout using information about (an approximation of) the minimum twin-cover of the input graph. It follows that q-COLORING parameterized by vertexcover has a kernel of bitsize O(kq−1 log k), improving on the previously knownkernel by almost a factor k. Furthermore, this kernel is optimal up-to ko(1)

factors, as Jansen and Kratsch showed for q ≥ 4, there is no kernel of sizeO(kq−1−ε) for the problem, unless NP ⊆ coNP/poly [57]. For q = 3, the boundfollows from Theorem 5.44 and the fact that the size of a minimum vertex coveris at most the number of vertices.

It is easy to see that the kernel lower bounds also hold for q-LIST COLORING,where every vertex v in the graph has a list L(v) ⊆ [q] of allowed colors.

6


Furthermore, we can apply our kernel by first reducing an instance of q-LIST

COLORING to an instance of q-COLORING using q additional vertices, and addingthese to the twin-cover of the graph. This only changes the size of the obtainedkernel by a constant factor. We cannot apply the same method to extend thekernel to the general H-LIST COLORING problem, since the gadget to simulatethe list constraints only works if H is a clique.

In this chapter we have seen a first example where finding redundant ver-tices and edges is done using appropriate polynomial equalities. It would beinteresting to see if this technique can be applied to obtain smaller kernelsfor other graph problems as well [17]. To apply this idea, one needs to firstidentify which constraints should be modeled. When the constraints are found,they need to be written as equalities of low-degree polynomials over a suitablychosen field. This requires the clever construction of polynomials that have asufficiently low degree, in order to obtain a good bound on the kernel size.

Acknowledgements We thank Tim Hartmann for suggesting the use of twin-cover as a parameter.

7

Chapter 7Concluding Remarks

In this thesis we studied kernelization, and more specifically sparsification, andobtained tight bounds for several graph and logic problems. As the title ofthe thesis suggests, one of the main sparsification techniques was based onrepresenting constraints by low-degree polynomials. We will next discuss theuse of this technique and related techniques in the area of kernelization. Wethen consider the work on polynomial-time sparsification done in this thesis,and the broader area of sparsification research. We conclude this thesis bydiscussing potential directions for further research and by listing a few specificopen problems.

7.1 Linear-algebraic techniques in kernelizationUsing techniques based on linear algebra is relatively new in the area of kernel-ization. Earlier kernels for (for example) graph problems were more often basedon sets of reduction rules, based on high- or low-degree rules (as in the well-known O(k2)-vertex kernel for VERTEX COVER), crown structures [1,28,37,74],and packing arguments [19,80]. More recently, a number of other techniquesare appearing.

One such technique is the matroid-based method that can be used to obtainrandomized kernels [40, Chapter 10]. A matroid is a mathematical structure

7

160 Concluding Remarks

that can be defined as a pair (E, I) where E is a set and I is a family of sub-sets of E representing the independent sets. The set I is required to satisfyvarious additional conditions. In a linear matroid, there is a matrix represent-ing the matroid such that there is a one-to-one correspondence between Eand the columns of the matrix, and the independent sets in I correspond tosets of linearly-independent columns. To obtain kernels using matroids, theidea is to reduce the problem to a (linear) matroid and then obtain a (small)representation of this matroid.

While matroids were already well-studied in various settings (being intro-duced in the 1930’s [89]), they were only studied in parameterized complexitymuch more recently (cf. [78]).

Matroid-based methods were used in the area of kernelization to obtain afirst (randomized) polynomial kernel for ODD CYCLE TRANSVERSAL [69]. Aninput to the ODD CYCLE TRANSVERSAL problem is a graph G and an integer k,and the goal is to decide whether G can be made bipartite by removing atmost k vertices. Whether or not the problem admits a polynomial kernel whenparameterized by solution size, was a long-standing open question that has nowbeen positively resolved using a matroid-based method. These methods sinceturned out to be a helpful tool in more settings [66,67].

In this thesis, we developed another algebraic tool for sparsification, whichapplies when we can model a problem using low-degree polynomial inequalities.We then observed that when there are many such equalities over a small numberof variables, some of them can be shown to be redundant. In particular, givena set of degree-d polynomial equalities over at most n variables, we provideda method to reduce the number of equalities to O(nd), without changing theanswer. We further applied this method to obtain an improved kernel for theq-COLORING problem, when parameterized by the size of a vertex cover inthe input graph. The method has also been shown to be useful to obtain apolynomial kernel for another graph coloring problem [17].

The method of sparsifying instances by representing them as low-degreepolynomial equalities as such has been useful in various contexts, and adds tothe available methods to obtain kernelization algorithms. Just like the matroid-based method, it is a data-reduction step based on linear-algebraic dependence.

Recently, the INTERVAL VERTEX DELETION problem was shown to have apolynomial kernel [4]. Given a graph G and integer k, the problem asks whetherit is possible to remove at most k vertices from G, such that the resulting graphis an interval graph. Part of this kernelization algorithm relies on representing(part of) the structure of the graph by vectors, such that a basis can be computed.The underlying observation that, when given a large number of short vectors,one can find a relatively small basis is the same as the observation we use whenfinding redundant polynomial equalities. The challenge in both cases lies inactually finding a suitable representation of (part of) the problem.

7

7.2 Sparsification 161

To conclude, the use of linear-algebraic techniques in kernelization researchis increasing and this thesis contributes another useful technique.

7.2 SparsificationThe idea of making a graph or logic problem less dense by reducing the numberof edges or clauses compared to the number of vertices or variables, has beenwidely studied in several settings.

A celebrated result considering sparsification for logic problems is the well-known sparsification lemma by Impagliazzo et al. [53], which shows how ad-CNF formula can be rewritten in subexponential time as a subexponential-sizedisjunction of sparse d-CNF formulas. The lemma has important applications inthe study of subexponential-time algorithms [52,90].

Another research direction is the efficient sparsification of graphs, such thatthe sparsified graph is close to the original graph by some metric. This generallyinvolves reducing the original (possibly unweighted) graph to a weighted graph.The goal is now that the new (weighted) graph has similar properties to theoriginal one, and can for example be used to obtain approximation results.A well-known example is the sparsification that approximately preserves allcuts in the graph, and for which the reduced instance has only O(n log n)edges [10]. Later work improved running times in case the input graph isweighted, and helped to better understand which sparsification proceduresmaintain the desired cut weights [42]. This type of graph sparsification has beenfurther generalized to preserve even more properties of the original graph [85].

This way of sparsifying graphs can usually be done efficiently; the one givenabove for preserving cuts even runs in linear time. However, the guarantee isnot as strong as for the type of sparsification we considered in this thesis, assolutions to the original problem are only approximately preserved. The boundon the sparsification size depends on the quality of the approximation one wantsto achieve.

Polynomial-time sparsification where the sparsified instance is required tohave exactly the same yes/no-answer as the original one, was not widely studiedbefore this thesis. One of the earliest results is by Dell and Van Melkebeek [33],showing that d-CNF-SAT and VERTEX COVER do not allow for a non-trivial spar-sification. This was followed-up by several impossibility results regarding thesparsification of several classic graph problems [59]. Surprisingly, unlike d-CNF-SAT, the d-NAE-SAT problem did turn out to have a non-trivial sparsification [59].

In this thesis, we further studied this polynomial-time sparsification andshowed that H-COLORING is unlikely to have a non-trivial sparsification for alarge number of possible choices for H. This adds another problem to the list ofgraph problems that do not admit non-trivial sparsification [56,59].

7

162 Concluding Remarks

Finally, we studied the sparsification of a large number of logic problems,using the framework of constraint satisfaction problems. We gave a sufficientcondition under which CSP(Γ) has a sparsification with O(nd) constraints, andshowed that this unified known kernelization results for d-NAE-SAT, d-EXACT

SAT, and d-CNF-SAT. Furthermore, we fully classified which Boolean CSPs admitnon-trivial sparsification.

7.3 Future workIn this thesis we studied the sparsification of constraint satisfaction problems.While we obtained many provably tight sparsification results, numerous openquestions remain. The ultimate goal of this line of research would be to fullyclassify the sparsifiability of (Boolean) CSP(Γ), depending on Γ.

Open problem Given a finite (Boolean) constraint language Γ, what is the small-est possible kernel for CSP(Γ) parameterized by the number of variables n?

A first step in this direction could be to further study the sparsifiability ofsymmetric Boolean CSPs. In this thesis, we have seen a full classification ofthose symmetric Boolean CSPs that have a sparsification with a linear numberof constraints. We also already know for which Boolean CSPs non-trivial sparsi-fication is impossible. Studying the symmetric CSPs whose sparsifiability falls inbetween these two extremes, could be a next step in obtaining a fine-grainedview on the complexity of CSP(Γ) for different Γ. A natural first question wouldbe to classify which symmetric Boolean CSPs allow for a sparsification withquadratically many constraints. Further generalizing this, we would like toknow which CSP(Γ) have a sparsification with O(nd) constraints for any givenvalue for d.

Open problem Classify all symmetric Boolean CSPs having a kernel with O(nd)constraints, but no generalized kernel of size O(nd−ε) for any ε > 0, unlessNP ⊆ coNP/poly.

We fully classified all Boolean constraint languages for which CSP(Γ) hasa non-trivial sparsification. However, this result strongly depends on Γ beingBoolean. Suppose d is the arity of the largest-arity relation in Γ and CSP(Γ) isNP-hard. If Γ is Boolean, verifying whether CSP(Γ) has a non-trivial sparsifica-tion is as simple as checking whether there is a relation in Γ that contains exactly2d − 1 tuples: if not, then non-trivial sparsification is possible (Theorem 4.14).This simple idea is not effective when Γ it not guaranteed to be Boolean. Inthis case, a relation with exactly 2d − 1 tuples can sometimes be used to give alower bound (as in the case where Γ is Boolean), but not always: Consider thearity-2 relation R = (0, 0), (1, 2), (2, 1) over D = 0, 1, 2. Then (x1, x2) ∈ Rif and only if x1 + x2 ≡3 0, allowing for a sparsification with only linearly manyconstraints (assuming the same holds for the remaining relations in Γ). We thusask the following question.

7

7.3 Future work 163

Open problem For which (non-Boolean) finite constraint languages Γ where eachrelation in Γ has arity at most d, does CSP(Γ) admit a kernel with O(nd−δ)constraints for some δ > 0?

Finally, we restate the open question that remains from Chapter 5. Whileit has been partially resolved, in particular in the cases where H contains atriangle, and where H has odd girth α and no edge lies in two Cα-subgraphs,other cases remain open.

Open problem Prove or disprove that for all simple nonbipartite graphs H, theH-COLORING problem does not have a generalized kernel of size O(n2−ε) for anyε > 0, unless NP ⊆ coNP/poly.

If it turns out that the H-COLORING problem does not allow for a non-trivialsparsification in all NP-hard cases, this raises the question whether there arenatural graph problems, defined on general graphs, that do have a sparsificationwith a subquadratic number of edges.

7

Bibliography

[1] Faisal N. Abu-Khzam, Michael R. Fellows, Michael A. Langston, andW. Henry Suters. Crown structures for vertex cover kernelization. Theoryof Computing Systems, 41(3):411–430, 2007. doi:10.1007/s00224-007-1328-0. (Cited on page 159.)

[2] Tobias Achterberg. SCIP: solving constraint integer programs. Mathemati-cal Programming Computation, 1(1):1–41, 2009. doi:10.1007/s12532-008-0001-1. (Cited on page 2.)

[3] Tobias Achterberg and Roland Wunderling. Mixed Integer Programming:Analyzing 12 Years of Progress, pages 449–481. Springer Berlin Heidelberg,2013. doi:10.1007/978-3-642-38189-8_18. (Cited on page 2.)

[4] Akanksha Agrawal, Pranabendu Misra, Saket Saurabh, and Meirav Zehavi.Interval vertex deletion admits a polynomial kernel. In Proceedings of the13th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2019,pages 1711–1730, 2019. doi:10.1137/1.9781611975482.103. (Cited onpage 160.)

[5] Erling D. Andersen and Knud D. Andersen. Presolving in linear pro-gramming. Mathematical Programming, 71(2):221–245, 1995. doi:10.1007/BF01586000. (Cited on page 2.)

[6] Nicolas Barnier and Pascal Brisset. Graph coloring for air traffic flowmanagement. Annals of Operations Research, 130(1):163–178, 2004.doi:10.1023/B:ANOR.0000032574.01332.98. (Cited on page 8.)

[7] David A. Mix Barrington. Some problems involving Razborov-Smolenskypolynomials. In Boolean Function Complexity, pages 109–128. CambridgeUniversity Press, 1992. doi:10.1017/CBO9780511526633.010. (Cited onpages 37 and 59.)

http://dx.doi.org/10.1007/s00224-007-1328-0

http://dx.doi.org/10.1007/s00224-007-1328-0

http://dx.doi.org/10.1007/s12532-008-0001-1

http://dx.doi.org/10.1007/s12532-008-0001-1

http://dx.doi.org/10.1007/978-3-642-38189-8_18

http://dx.doi.org/10.1137/1.9781611975482.103

http://dx.doi.org/10.1007/BF01586000

http://dx.doi.org/10.1007/BF01586000

http://dx.doi.org/10.1023/B:ANOR.0000032574.01332.98

http://dx.doi.org/10.1017/CBO9780511526633.010

166 Bibliography

[8] David A. Mix Barrington, Richard Beigel, and Steven Rudich. RepresentingBoolean functions as polynomials modulo composite numbers. Com-putational Complexity, 4(4):367–382, 1994. doi:10.1007/BF01263424.(Cited on page 60.)

[9] Richard Beigel. The polynomial method in circuit complexity. In Proceed-ings of the 8th Annual Structure in Complexity Theory Conference (CCC1993), pages 82–95, 1993. doi:10.1109/SCT.1993.336538. (Cited onpage 61.)

[10] András A. Benczúr and David R. Karger. Approximating s-t minimumcuts in O(n2) time. In Proceedings of the 28th Annual ACM Symposiumon the Theory of Computing (STOC 1996), pages 47–55, 1996. doi:10.1145/237814.237827. (Cited on page 161.)

[11] Abhishek Bhowmick and Shachar Lovett. Nonclassical polynomials as abarrier to polynomial lower bounds. In Proceedings of the 30th Conferenceon Computational Complexity (CCC 2015), volume 33, pages 72–87, 2015.doi:10.4230/LIPIcs.CCC.2015.72. (Cited on pages 37, 60, and 62.)

[12] Robert E. Bixby and Edward Rothberg. Progress in computational mixedinteger programming - A look back from the other side of the tippingpoint. Annals of Operations Research, 149(1):37–41, 2007. doi:10.1007/s10479-006-0091-y. (Cited on page 2.)

[13] Hans L. Bodlaender. Kernelization: New upper and lower bound tech-niques. In Proceedings of the 4th International Workshop on Parameterizedand Exact Computation, (IWPEC 2009), volume 5917, pages 17–37, 2009.doi:10.1007/978-3-642-11269-0_2. (Cited on page 16.)

[14] Hans L. Bodlaender, Rodney G. Downey, Michael R. Fellows, andDanny Hermelin. On problems without polynomial kernels. Journalof Computer and System Sciences, 75(8):423–434, 2009. doi:10.1016/j.jcss.2009.04.001. (Cited on page 5.)

[15] Hans L. Bodlaender, Fedor V. Fomin, Daniel Lokshtanov, Eelko Penninkx,Saket Saurabh, and Dimitrios M. Thilikos. (Meta) Kernelization. Journalof the ACM, 63(5):44:1–44:69, 2016. doi:10.1145/2973749. (Cited onpage 4.)

[16] Hans L. Bodlaender, Bart M. P. Jansen, and Stefan Kratsch. Kernelizationlower bounds by cross-composition. SIAM Journal on Discrete Mathematics,28(1):277–305, 2014. doi:10.1137/120880240. (Cited on pages 5, 19,21, 22, and 100.)

[17] Hans L. Bodlaender, Sudeshna Kolay, and Astrid Pieterse. Parameterizedcomplexity of conflict-free graph coloring. In Proceedings of the 16th

http://dx.doi.org/10.1007/BF01263424

http://dx.doi.org/10.1109/SCT.1993.336538

http://dx.doi.org/10.1145/237814.237827

http://dx.doi.org/10.1145/237814.237827

http://dx.doi.org/10.4230/LIPIcs.CCC.2015.72

http://dx.doi.org/10.1007/s10479-006-0091-y

http://dx.doi.org/10.1007/s10479-006-0091-y

http://dx.doi.org/10.1007/978-3-642-11269-0_2

http://dx.doi.org/10.1016/j.jcss.2009.04.001


http://dx.doi.org/10.1145/2973749

http://dx.doi.org/10.1137/120880240

Bibliography 167

Algorithms and Data Structures Symposium (WADS 2019), 2019. To appear.URL: https://arxiv.org/abs/1905.00305. (Cited on pages 11, 158,and 160.)

[18] Hans L. Bodlaender, Stéphan Thomassé, and Anders Yeo. Kernel boundsfor disjoint cycles and disjoint paths. Theoretical Computer Science,412(35):4570–4578, 2011. doi:10.1016/j.tcs.2011.04.039. (Citedon pages 4, 5, 19, and 100.)

[19] Hans L. Bodlaender and Thomas C. van Dijk. A cubic kernel for feedbackvertex set and loop cutset. Theory of Computing Systems, 46(3):566–597,2010. doi:10.1007/s00224-009-9234-2. (Cited on page 159.)

[20] V. G. Bodnarcuk, L. A. Kalužnin, V. N. Kotov, and B. A. Romov. Galoistheory for post algebras, part I and II. Cybernetics, 5:243–252, 531–539,1969. doi:10.1007/BF01070906. (Cited on page 32.)

[21] Andrei A. Bulatov. H-coloring dichotomy revisited. Theoretical ComputerScience, 349(1):31–39, 2005. doi:10.1016/j.tcs.2005.09.028. (Citedon pages 116, 119, 121, 125, 126, and 129.)

[22] Andrei A. Bulatov. A dichotomy theorem for nonuniform CSPs. In Pro-ceedings of the 58th Annual IEEE Symposium on Foundations of ComputerScience (FOCS 2017), pages 319–330, 2017. doi:10.1109/FOCS.2017.37.(Cited on page 9.)

[23] Andrei A. Bulatov, Peter Jeavons, and Andrei A. Krokhin. Classifyingthe complexity of constraints using finite algebras. SIAM Journal onComputing, 34(3):720–742, 2005. doi:10.1137/S0097539700376676.(Cited on page 33.)

[24] Rafael G. Cano, Kevin Buchin, Thom Castermans, Astrid Pieterse, WillemSonke, and Bettina Speckmann. Mosaic drawings and cartograms. Com-puter Graphics Forum, 34(3):361–370, 2015. doi:10.1111/cgf.12648.(Cited on page 11.)

[25] Hubie Chen. A rendezvous of logic, complexity, and algebra. ACM Com-puting Surveys, 42(1), 2009. doi:10.1145/1189056.1189076. (Cited onpages 31 and 32.)

[26] Hubie Chen, Bart M. P. Jansen, and Astrid Pieterse. Sparsification lowerbounds for H-coloring. Submitted. (Cited on page 11.)

[27] Hubie Chen, Bart M. P. Jansen, and Astrid Pieterse. Best-case and worst-case sparsifiability of Boolean CSPs. In Proceedings of the 13th InternationalSymposium on Parameterized and Exact Computation (IPEC 2018), vol-ume 115, pages 15:1–15:13, 2019. doi:10.4230/LIPIcs.IPEC.2018.15.(Cited on page 10.)

https://arxiv.org/abs/1905.00305

http://dx.doi.org/10.1016/j.tcs.2011.04.039

http://dx.doi.org/10.1007/s00224-009-9234-2

http://dx.doi.org/10.1007/BF01070906


http://dx.doi.org/10.1109/FOCS.2017.37

http://dx.doi.org/10.1137/S0097539700376676

http://dx.doi.org/10.1111/cgf.12648

http://dx.doi.org/10.1145/1189056.1189076


168 Bibliography

[28] Zhi-Zhong Chen, Michael R. Fellows, Bin Fu, Haitao Jiang, Yang Liu,Lusheng Wang, and Binhai Zhu. A linear kernel for co-path/cycle packing.In Proceedings of the 6th International Conference on Algorithmic Applica-tions in Management (AAIM 2010), volume 6124, pages 90–102, 2010.doi:10.1007/978-3-642-14355-7_10. (Cited on page 159.)

[29] Marek Cygan, Fedor V. Fomin, Lukasz Kowalik, Daniel Lokshtanov, DánielMarx, Marcin Pilipczuk, Michał Pilipczuk, and Saket Saurabh. Parameter-ized Algorithms. Springer Publishing Company, Incorporated, 1st edition,2015. (Cited on page 4.)

[30] Holger Dell, Thore Husfeldt, Bart M. P. Jansen, Petteri Kaski, Christian Ko-musiewicz, and Frances A. Rosamond. The first parameterized algorithmsand computational experiments challenge. In Proceedings of the 11th Inter-national Symposium on Parameterized and Exact Computation (IPEC 2016),volume 63, pages 30:1–30:9, 2017. doi:10.4230/LIPIcs.IPEC.2016.30.(Cited on page 2.)

[31] Holger Dell, Christian Komusiewicz, Nimrod Talmon, and Mathias Weller.The PACE 2017 parameterized algorithms and computational experimentschallenge: The second iteration. In Proceedings of the 12th InternationalSymposium on Parameterized and Exact Computation (IPEC 2017), vol-ume 89, pages 30:1–30:12, 2018. doi:10.4230/LIPIcs.IPEC.2017.30.(Cited on page 2.)

[32] Holger Dell and Dániel Marx. Kernelization of packing problems. In Pro-ceedings of the 23rd Annual ACM-SIAM Symposium on Discrete Algorithms(SODA 2012), pages 68–81, 2012. doi:10.1137/1.9781611973099.6.(Cited on page 22.)

[33] Holger Dell and Dieter van Melkebeek. Satisfiability allows no nontrivialsparsification unless the polynomial-time hierarchy collapses. Journalof the ACM, 61(4):23:1–23:27, 2014. doi:10.1145/2629620. (Cited onpages 5, 7, 19, 20, 59, 64, 70, 103, and 161.)

[34] Michael Dom, Daniel Lokshtanov, and Saket Saurabh. Kernelizationlower bounds through colors and IDs. ACM Transactions on Algorithms,11(2):13:1–13:20, 2014. doi:10.1145/2650261. (Cited on page 46.)

[35] Andrew Drucker. New limits to classical and quantum instance com-pression. SIAM Journal on Computing, 44(5):1443–1479, 2015. doi:10.1137/130927115. (Cited on pages 24, 70, and 71.)

[36] Louis Esperet, Mickaël Montassier, Pascal Ochem, and Alexandre Pinlou.A complexity dichotomy for the coloring of sparse graphs. Journal ofGraph Theory, 73(1):85–102, 2013. doi:10.1002/jgt.21659. (Cited onpage 104.)

http://dx.doi.org/10.1007/978-3-642-14355-7_10



http://dx.doi.org/10.1137/1.9781611973099.6

http://dx.doi.org/10.1145/2629620

http://dx.doi.org/10.1145/2650261

http://dx.doi.org/10.1137/130927115

http://dx.doi.org/10.1137/130927115

http://dx.doi.org/10.1002/jgt.21659

Bibliography 169

[37] Michael R. Fellows. Blow-ups, win/win’s, and crown rules: Some newdirections in FPT. In Proceedings of the 29th International Workshopon Graph-Theoretic Concepts in Computer Science (WG 2003), volume2880, pages 1–12, 2003. doi:10.1007/978-3-540-39890-5_1. (Citedon page 159.)

[38] Jirí Fiala, Petr A. Golovach, and Jan Kratochvíl. Parameterized complexityof coloring problems: Treewidth versus vertex cover. Theoretical ComputerScience, 412(23):2513 – 2523, 2011. doi:10.1016/j.tcs.2010.10.043.(Cited on page 139.)

[39] Fedor V. Fomin, Daniel Lokshtanov, Neeldhara Misra, and Saket Saurabh.Planar F-deletion: Approximation, kernelization and optimal FPT algo-rithms. In Proceedings of the 53rd Annual IEEE Symposium on Founda-tions of Computer Science (FOCS 2012), pages 470–479, 2012. doi:10.1109/FOCS.2012.62. (Cited on page 4.)

[40] Fedor V. Fomin, Daniel Lokshtanov, Saket Saurabh, and Meirav Zehavi.Kernelization: Theory of Parameterized Preprocessing. Cambridge UniversityPress, 2019. (Cited on pages 4 and 159.)

[41] Lance Fortnow and Rahul Santhanam. Infeasibility of instance compres-sion and succinct PCPs for NP. Journal of Computer and System Sci-ences, 77(1):91–106, 2011. doi:10.1016/j.jcss.2010.06.007. (Citedon pages 5 and 59.)

[42] Wai Shing Fung, Ramesh Hariharan, Nicholas J. A. Harvey, and DebmalyaPanigrahi. A general framework for graph sparsification. In Proceed-ings of the 43rd Annual ACM Symposium on Theory of Computing (STOC2011), pages 71–80, 2011. doi:10.1145/1993636.1993647. (Cited onpage 161.)

[43] Anna Galluccio, Pavol Hell, and Jarošlav Nešetril. The complexity ofH-colouring of bounded degree graphs. Discrete Mathematics, 222(1-3):101–109, 2000. doi:10.1016/S0012-365X(00)00009-1. (Cited onpage 104.)

[44] Robert Ganian. Improving vertex cover as a graph parameter. DiscreteMathematics & Theoretical Computer Science, 17(2):77–100, 2015. URL:http://dmtcs.episciences.org/2136. (Cited on pages 140 and 144.)

[45] Michael R. Garey, David S. Johnson, and H. C. So. An application of graphcoloring to printed circuit testing (working paper). In Proceedings of the16th Annual Symposium on Foundations of Computer Science (FOCS 1975),pages 178–183, 1975. doi:10.1109/SFCS.1975.3. (Cited on page 8.)

http://dx.doi.org/10.1007/978-3-540-39890-5_1





http://dx.doi.org/10.1145/1993636.1993647

http://dx.doi.org/10.1016/S0012-365X(00)00009-1

http://dmtcs.episciences.org/2136

http://dx.doi.org/10.1109/SFCS.1975.3

170 Bibliography

[46] Michael R. Garey, David S. Johnson, and Larry J. Stockmeyer. Somesimplified NP-complete graph problems. Theoretical Compututer Science,1(3):237–267, 1976. doi:10.1016/0304-3975(76)90059-1. (Cited onpage 6.)

[47] David Geiger. Closed systems of functions and predicates. Pacific Journalof Mathematics, 27:95–100, 1968. (Cited on page 32.)

[48] Mark S. Gockenbach. Finite-Dimensional Linear Algebra. Discrete Mathe-matics and Its Applications. Taylor & Francis, 2011. (Cited on page 25.)

[49] Pavol Hell and Jaroslav Nešetril. On the complexity of H-coloring. Journalof Combinatorial Theory, Series B, 48(1):92–110, 1990. doi:10.1016/0095-8956(90)90132-J. (Cited on page 104.)

[50] Leslie Hogben. Handbook of Linear Algebra, Second Edition. Chapman andHall/CRC, 2014. (Cited on page 39.)

[51] John A. Howell. Spans in the module (Zm)s. Linear and MultilinearAlgebra, 19(1):67–77, 1986. doi:10.1080/03081088608817705. (Citedon page 26.)

[52] Russell Impagliazzo and Ramamohan Paturi. On the complexity of k-SAT. Journal of Computer and System Sciences, 62(2):367–375, 2001.doi:10.1006/jcss.2000.1727. (Cited on page 161.)

[53] Russell Impagliazzo, Ramamohan Paturi, and Francis Zane. Which prob-lems have strongly exponential complexity? Journal of Computer andSystem Sciences, 63(4):512–530, 2001. doi:10.1006/jcss.2001.1774.(Cited on page 161.)

[54] Yoichi Iwata. Linear-time kernelization for feedback vertex set. In Pro-ceedings of the 44th International Colloquium on Automata, Languages,and Programming (ICALP 2017), volume 80, pages 68:1–68:14, 2017.doi:10.4230/LIPIcs.ICALP.2017.68. (Cited on page 2.)

[55] Lars Jaffke and Bart M. P. Jansen. Fine-grained parameterized complexityanalysis of graph coloring problems. In Proceedings of the 10th Inter-national Conference on Algorithms and Complexity (CIAC 2017), volume10236, pages 345–356, 2017. doi:10.1007/978-3-319-57586-5_29.(Cited on page 105.)

[56] Bart M. P. Jansen. On sparsification for computing treewidth. Algorithmica,71(3):605–635, 2015. doi:10.1007/s00453-014-9924-2. (Cited onpages 7, 103, and 161.)

http://dx.doi.org/10.1016/0304-3975(76)90059-1

http://dx.doi.org/10.1016/0095-8956(90)90132-J

http://dx.doi.org/10.1016/0095-8956(90)90132-J

http://dx.doi.org/10.1080/03081088608817705

http://dx.doi.org/10.1006/jcss.2000.1727

http://dx.doi.org/10.1006/jcss.2001.1774

http://dx.doi.org/10.4230/LIPIcs.ICALP.2017.68

http://dx.doi.org/10.1007/978-3-319-57586-5_29

http://dx.doi.org/10.1007/s00453-014-9924-2

Bibliography 171

[57] Bart M. P. Jansen and Stefan Kratsch. Data reduction for graph coloringproblems. Information and Computation, 231:70–88, 2013. doi:10.1016/j.ic.2013.08.005. (Cited on pages 139, 140, 141, and 157.)

[58] Bart M. P. Jansen and Astrid Pieterse. Optimal sparsification for somebinary CSPs using low-degree polynomials. In Proceedings of the 41stInternational Symposium on Mathematical Foundations of Computer Sci-ence (MFCS 2016), volume 58, pages 71:1–71:14, 2016. doi:10.4230/LIPIcs.MFCS.2016.71. (Cited on page 11.)

[59] Bart M. P. Jansen and Astrid Pieterse. Sparsification upper and lowerbounds for graph problems and not-all-equal SAT. Algorithmica, 79(1):3–28, 2017. doi:10.1007/s00453-016-0189-9. (Cited on pages 7, 11, 41,103, and 161.)

[60] Bart M. P. Jansen and Astrid Pieterse. Optimal data reduction forgraph coloring using low-degree polynomials. In Proceedings of the12th International Symposium on Parameterized and Exact Computa-tion (IPEC 2017), volume 89, pages 22:1–22:12, 2018. doi:10.4230/LIPIcs.IPEC.2017.22. (Cited on page 103.)

[61] Bart M. P. Jansen and Astrid Pieterse. Polynomial kernels for hittingforbidden minors under structural parameterizations. In Proceedings of the26th Annual European Symposium on Algorithms (ESA 2018), volume 112,pages 48:1–48:15, 2018. doi:10.4230/LIPIcs.ESA.2018.48. (Cited onpage 11.)

[62] Bart M. P. Jansen and Astrid Pieterse. Optimal data reduction for graphcoloring using low-degree polynomials. Algorithmica, 2019. Online First.doi:10.1007/s00453-019-00578-5. (Cited on page 11.)

[63] Tommy R. Jensen and Bjarne Toft. Graph Coloring Problems, volume 39.John Wiley & Sons, 2011. (Cited on page 8.)

[64] Ravindran Kannan and Achim Bachem. Polynomial algorithms for comput-ing the Smith and Hermite normal forms of an integer matrix. SIAM Jour-nal on Computing, 8(4):499–507, 1979. doi:10.1137/0208040. (Citedon page 26.)

[65] Richard M. Karp. Reducibility among combinatorial problems. In Complex-ity of Computer Computations, pages 85–103. 1972. doi:10.1007/978-1-4684-2001-2_9. (Cited on pages 20 and 46.)

[66] Stefan Kratsch. A randomized polynomial kernelization for vertexcover with a smaller parameter. SIAM Journal on Discrete Mathematics,32(3):1806–1839, 2018. doi:10.1137/16M1104585. (Cited on page 160.)

http://dx.doi.org/10.1016/j.ic.2013.08.005

http://dx.doi.org/10.1016/j.ic.2013.08.005



http://dx.doi.org/10.1007/s00453-016-0189-9



http://dx.doi.org/10.4230/LIPIcs.ESA.2018.48

http://dx.doi.org/10.1007/s00453-019-00578-5

http://dx.doi.org/10.1137/0208040

http://dx.doi.org/10.1007/978-1-4684-2001-2_9

http://dx.doi.org/10.1007/978-1-4684-2001-2_9

http://dx.doi.org/10.1137/16M1104585

172 Bibliography

[67] Stefan Kratsch and Magnus Wahlström. Representative sets and irrel-evant vertices: New tools for kernelization. In Proceedings of the 53rdAnnual IEEE Symposium on Foundations of Computer Science (FOCS 2012),pages 450–459, 2012. doi:10.1109/FOCS.2012.46. (Cited on pages 4and 160.)

[68] Stefan Kratsch and Magnus Wahlström. Two edge modification problemswithout polynomial kernels. Discrete Optimization, 10(3):193–199, 2013.doi:10.1016/j.disopt.2013.02.001. (Cited on page 5.)

[69] Stefan Kratsch and Magnus Wahlström. Compression via matroids: Arandomized polynomial kernel for odd cycle transversal. ACM Transactionson Algorithms, 10(4):20:1–20:15, 2014. doi:10.1145/2635810. (Citedon page 160.)

[70] Victor Lagerkvist and Magnus Wahlström. Kernelization of constraintsatisfaction problems: A study through universal algebra. In Proceedingsof the 23rd International Conference on Principles and Practice of ConstraintProgramming (CP 2017), volume 10416, pages 157–171, 2017. doi:10.1007/978-3-319-66158-2_11. (Cited on pages 64, 70, 71, 91, 92, 94,99, and 103.)

[71] Victor Lagerkvist and Magnus Wahlström. Kernelization of constraintsatisfaction problems: A study through universal algebra. CoRR,abs/1706.05941, 2017. arXiv:1706.05941. (Cited on pages 92, 94,and 100.)

[72] Gregory T. Lee. Abstract Algebra An Introductory Course. Springer, Cham,2018. doi:10.1007/978-3-319-77649-1. (Cited on page 24.)

[73] R. M. R. Lewis. A Guide to Graph Colouring. Springer InternationalPublishing Switzerland, 2016. doi:10.1007/978-3-319-25730-3. (Citedon page 8.)

[74] Daniel Lokshtanov and Christian Sloper. Fixed parameter set splitting,linear kernel and improved running time. In Proceedings of the 1st work-shop on Algorithms and Complexity in Durham (ACiD 2005), pages 105–113, 2005. URL: http://bora.uib.no/handle/1956/1115. (Cited onpage 159.)

[75] Lásló Lovász. Chromatic number of hypergraphs and linear algebra. InStudia Scientiarum Mathematicarum Hungarica 11, pages 113–114, 1976.URL: http://real-j.mtak.hu/5461/. (Cited on page 41.)

[76] Gary MacGillivray and Mark H. Siggers. On the complexity of H-colouringplanar graphs. Discrete Mathematics, 309(18):5729–5738, 2009. doi:10.1016/j.disc.2008.05.016. (Cited on page 104.)


http://dx.doi.org/10.1016/j.disopt.2013.02.001

http://dx.doi.org/10.1145/2635810

http://dx.doi.org/10.1007/978-3-319-66158-2_11

http://dx.doi.org/10.1007/978-3-319-66158-2_11

http://arxiv.org/abs/1706.05941

http://dx.doi.org/10.1007/978-3-319-77649-1

http://dx.doi.org/10.1007/978-3-319-25730-3

http://bora.uib.no/handle/1956/1115

http://real-j.mtak.hu/5461/

http://dx.doi.org/10.1016/j.disc.2008.05.016

http://dx.doi.org/10.1016/j.disc.2008.05.016

Bibliography 173

[77] Dániel Marx. Graph colouring problems and their applications in schedul-ing. Periodica Polytechnica Electrical Engineering (Archives), 48(1-2):11–16.URL: https://pp.bme.hu/ee/article/view/926. (Cited on page 8.)

[78] Dániel Marx. A parameterized view on matroid optimization problems.Theoretical Computer Science, 410(44):4471–4479, 2009. doi:10.1016/j.tcs.2009.07.027. (Cited on page 160.)

[79] Hermann A. Maurer, Ivan Hal Sudborough, and Emo Welzl. On thecomplexity of the general coloring problem. Information and Control,51(2):128–145, 1981. doi:10.1016/S0019-9958(81)90226-6. (Citedon page 105.)

[80] Hannes Moser. A problem kernelization for graph packing. In Proceedingsof the 35th Conference on Current Trends in Theory and Practice of ComputerScience (SOFSEM 2009), pages 401–412, 2009. doi:10.1007/978-3-540-95891-8_37. (Cited on page 159.)

[81] Astrid Pieterse. Sparsification upper and lower bounds for graph problemsand not-all-equal SAT. Master’s thesis, Eindhoven University of Tech-nology, 2015. URL: https://research.tue.nl/en/studentTheses/sparsification-upper-and-lower-bounds-for-graph-problems-and-not-. (Cited on page 23.)

[82] Astrid Pieterse and Gerhard J. Woeginger. The subset sum game revisited.In Proceedings of the 5th International Conference on Algorithmic DecisionTheory (ADT 2017), volume 10576, pages 228–240, 2017. doi:10.1007/978-3-319-67504-6_16. (Cited on page 11.)

[83] Thomas J. Schaefer. The complexity of satisfiability problems. In Proceed-ings of the 10th ACM Symposium on Theory of Computing (STOC 1978),pages 216–226, 1978. doi:10.1145/800133.804350. (Cited on pages 9and 31.)

[84] Mark H. Siggers. Dichotomy for bounded degree H-colouring. Dis-crete Applied Mathematics, 157(2):201–210, 2009. doi:10.1016/j.dam.2008.02.003. (Cited on page 104.)

[85] Daniel A. Spielman and Nikhil Srivastava. Graph sparsification by effectiveresistances. SIAM Journal on Computing, 40(6):1913–1926, 2011. doi:10.1137/080734029. (Cited on page 161.)

[86] Jeremy P. Spinrad. Efficient Graph Representations. American MathematicalSociety, 2003. (Cited on page 145.)

[87] Arne Storjohann and Thom Mulders. Fast Algorithms for Linear Al-gebra Modulo N, pages 139–150. Springer Berlin Heidelberg, 1998.doi:10.1007/3-540-68530-8_12. (Cited on pages 26 and 27.)

https://pp.bme.hu/ee/article/view/926



http://dx.doi.org/10.1016/S0019-9958(81)90226-6

http://dx.doi.org/10.1007/978-3-540-95891-8_37

http://dx.doi.org/10.1007/978-3-540-95891-8_37

https://research.tue.nl/en/studentTheses/sparsification-upper-and-lower-bounds-for-graph-problems-and-not-



http://dx.doi.org/10.1007/978-3-319-67504-6_16

http://dx.doi.org/10.1007/978-3-319-67504-6_16

http://dx.doi.org/10.1145/800133.804350

http://dx.doi.org/10.1016/j.dam.2008.02.003

http://dx.doi.org/10.1016/j.dam.2008.02.003

http://dx.doi.org/10.1137/080734029

http://dx.doi.org/10.1137/080734029

http://dx.doi.org/10.1007/3-540-68530-8_12

174 Bibliography

[88] Gábor Tardos and David A. Mix Barrington. A lower bound on the mod 6degree of the OR function. Computational Complexity, 7(2):99–108, 1998.doi:10.1007/PL00001597. (Cited on page 62.)

[89] Hassler Whitney. On the abstract properties of linear dependence. Ameri-can Journal of Mathematics, 57(3):509–533, 1935. doi:10.2307/2371182.(Cited on page 160.)

[90] Gerhard J. Woeginger. Exact algorithms for NP-hard problems: A survey.In Proceedings of the 5th Aussois Combinatorial Optimization Workshop- Eureka, You Shrink!, Papers Dedicated to Jack Edmonds, volume 2570,pages 185–208, 2001. doi:10.1007/3-540-36478-1\_17. (Cited onpage 161.)

[91] Chee-Keng Yap. Some consequences of non-uniform conditions on uniformclasses. Theoretical Computer Science, 26:287–300, 1983. doi:10.1016/0304-3975(83)90020-8. (Cited on page 18.)

[92] Dmitriy Zhuk. A proof of CSP dichotomy conjecture. In Proceedings of the58th Annual IEEE Symposium on Foundations of Computer Science (FOCS2017), pages 331–342, 2017. doi:10.1109/FOCS.2017.38. (Cited onpage 9.)

http://dx.doi.org/10.1007/PL00001597

http://dx.doi.org/10.2307/2371182

http://dx.doi.org/10.1007/3-540-36478-1_17

http://dx.doi.org/10.1016/0304-3975(83)90020-8

http://dx.doi.org/10.1016/0304-3975(83)90020-8


Summary

Tight Parameterized Preprocessing Bounds:Sparsification via Low-Degree PolynomialsMany of the problems we are interested in solving algorithmically are knownto be NP-hard. As such, we do not expect to find efficient (polynomial-time)algorithms to solve these problems. However, we are still interested in findingeffective ways to (at least) be able to solve certain input instances for suchproblems. One of the ways to do this, is to first try and preprocess the inputinstance, without changing the solution. Then, a more expensive algorithm toactually solve the problem can be applied to the (now hopefully easier) prepro-cessed instance. In this thesis, we will study the power of such preprocessingalgorithms. In particular, we will be interested in polynomial-time preprocessing,where the preprocessing algorithm is required to run in polynomial time.

As the problems we consider are NP-hard, we cannot hope that such apreprocessing algorithm will always be guaranteed to decrease the actual sizeof the input instance. After all, such a preprocessing algorithm could then beused repeatedly to obtain a polynomial-time algorithm that solves the problem,which is unlikely to exist. For this reason, we measure the effectiveness of ourpreprocessing algorithms by some additional parameter, other than the inputsize, that measures the complexity of the input. As such, input instances areoften denoted as (x, k) where k is the additional parameter.

The type of preprocessing algorithms we study in this thesis are calledkernelization algorithms (or kernels). A kernelization algorithm is a polynomial-time algorithm that, given an instance (x, k), outputs an instance (x′, k′) of thesame problem such that both |x′| and k′ are bounded by f (k) for some functionthat may only depend on k. This function is also called the size of the kernel.Naturally, smaller kernels are better. As such, we are interested in what thesmallest kernel is that can be obtained for a certain parameterized problem.

In this thesis we obtain several tight upper and lower bound results for suchkernelization algorithms. First of all, we study the (very general) CONSTRAINT

176 Summary

SATISFACTION PROBLEM (CSP). Broadly speaking, an input instance for this prob-lem consists of a number of constraints over a set of variables, and the questionis whether we can assign values to the variables, such that all constraints aresatisfied. We will be particularly interested in Boolean CSPs, where the questionis to assign 0 or 1 to each variable such that all constraints are satisfied.

This thesis investigates how the type of constraints that are used, influencesthe kernelizability of the problem. We give a general method, based on rep-resenting constraints by low-degree polynomials, to obtain kernels for CSPs.Furthermore, we show that this method is best-possible in some cases: thereare CSPs for which no better kernelization is possible (under commonly usedcomplexity-theoretic assumptions).

As part of our investigation into the kernelization of CSPs, we use universalalgebra to describe CSPs for which any n-variable input can be reduced to aninput that only has O(n) constraints. In fact, we show that for these CSPs thereis always a subset of the original constraints of size at most O(n) that directlyimplies all other constraints. Using this method, we fully classify the symmetricBoolean CSPs that have a kernel with O(n) constraints.

Furthermore, when considering CSPs over the Boolean domain in whichevery constraint considers at most d variables, we obtain a dichotomy result.When the constraints can express a size-d logical OR in a certain sense, thebest-possible kernel has size Θ(nd) (assuming NP * coNP/poly) where n is thenumber of variables. In all other cases, there is a kernel with at most O(nd−1)constraints and size O(nd−1 log n).

In addition to the above, we completely classify the kernelizability of BooleanCSPs when every constraint concerns at most three variables. Under standardcomplexity-theoretic assumptions, it turns out that there is a fairly simpleclassification into four cases. Either the CSP is in P, or the best-possible kernelhas O(n) constraints, or the best-possible kernel has O(n2) constraints, or nobetter kernel than the trivial one with O(n3) constraints exists.

The second problem we study is the H-COLORING problem, which is definedfor any fixed graph H. Given a graph G, the problem asks whether there existsa mapping f : V(G) → V(H), such that for any edge u, v ∈ E(G), we have f (u), f (v) ∈ E(H). Since this is a graph problem, and any n-vertex graphcan be stored in O(n2) bits, it is easy to see that H-COLORING parameterized bythe number of vertices has a kernel of size O(n2). In this thesis we show thatunder some additional restrictions on H, there is no kernel of size O(n2−ε) forany ε > 0, unless NP ⊆ coNP/poly.

Finally, we study the q-COLORING problem on structurally simple graphs.Given a graph G, the question is whether it is possible to color its vertices with atmost q colors, such that any two vertices connected by an edge receive differentcolors. We obtain a kernel of size O(kq−1 log k) for the q-COLORING problemwhen parameterized by the size of a vertex cover. This gives a kernel whose size

Summary 177

is optimal up to polylogarithmic factors under complexity-theoretic assumptions.It improves upon the previously best-known kernel that had size O(kq). Toobtain this kernel, we apply the methods developed for the kernelization of CSPsin this different setting. We further generalize this result to the H-COLORING

problem, parameterized by twin-cover, which is a smaller parameter than vertexcover. We obtain a kernel of size O(k∆(H) log k), where ∆(H) is the maximumdegree of any vertex in H.

Curriculum Vitae

Astrid Pieterse was born on April 19, 1993 in Weert, The Netherlands. Shecompleted her secondary education at Philips van Horne s.g. (Weert, TheNetherlands) in 2010. She received a bachelor in Computer Science and Engi-neering (Cum Laude) and a bachelor in Industrial and Applied Mathematics(Cum Laude) from Eindhoven University of Technology in 2013. In 2015 sheobtained a master’s degree in Computer Science and Engineering at EindhovenUniversity of Technology, which was awarded Cum Laude. She has been a PhDstudent at Eindhoven University of Technology since September 2015, underthe supervision of dr. Bart M. P. Jansen. During her PhD she was awarded thebest paper award at the MFCS conference in 2016, and the excellent studentpaper award at IPEC 2017.

1

1

Documents

Tight parameterized preprocessing bounds · Tight Parameterized Preprocessing Bounds Sparsiﬁcation via Low-Degree Polynomials PROEFSCHRIFT ter verkrijging van de graad van doctor