Backdoor sets for the class CNF(2) of CNF-formulas in which every variable has at most two occurrences are studied in terms of parameterized complexity. The question whether there exists a CNF(2)-backdoor set of size k is hard for the class , for both weak and strong backdoors, and in both cases it becomes fixed-parameter tractable when restricted to inputs in d-CNF for a fixed d.
Besides that, it is shown that the problem of finding weak backdoor sets is -complete, for certain tractable cases. These are the first completeness results in lower levels of the -hierarchy for any backdoor set problems.
Despite the theoretical hardness of the SAT problem, being the canonical NP-complete problem  and conjectured to not be solvable in sub-exponential time , state-of-the-art SAT solvers have become very efficient, and routinely solve instances arising from applications with hundreds of thousands of variables and millions of clauses. Even though there are lots of known tractable cases of SAT, i.e., classes of formulas that can be solved in polynomial time, the efficiently solvable instances arising in practice usually do not belong to these tractable classes, and thus the existence of these classes does not suffice to explain this apparent discrepancy between theory and practice.
A possibly better attempt at explaining the discrepancy is that the large, efficiently solvable instances are in some way close to a tractable case. One possible such notion of closeness is that there is a small subset of variables, such that after giving values to these variables the residual formula is in the tractable class. This concept was introduced by Crama et al.  and Williams et al. , the latter work coined the name backdoor set for such a set of variables.
There are several kinds of backdoor sets for each tractable case considered in the literature. A strong backdoor set for the class C is a set of variables, such that for every setting of these variables the residual formula is in C. A weak backdoor set for C is a set of variables, such that for some setting of these variables the residual formula is in C and satisfiable. There is also the auxiliary notion of a deletion backdoor set to be defined below.
A strong backdoor set of size k allows to solve a formula of size m in time . This runtime bound depends on the two parameters (size of the formula m and backdoor set size k) in essentially different ways, and the proper framework to analyze such complexity bounds is fixed-parameter tractability and parameterized complexity as introduced by Downey and Fellows .
Besides the class of fixed-parameter tractable problems, the theory of parameterized complexity provides a hierarchy of classes of increasingly hard problems that are probably fixed-parameter intractable, the -hierarchy containing in particular the classes for , and a notion of reduction that allows to define hardness and completeness for these classes.
The complexity bound mentioned above thus means that given a formula together with a strong backdoor set of size k, testing for satisfiability is fixed-parameter tractable w.r.t. the parameter k. Thus, if the problem of finding a backdoor set of size k was fixed-parameter tractable w.r.t. the parameter k as well, then the size of a smallest backdoor set would be a parameter with respect to which SAT is fixed-parameter tractable. Thus, starting with the work of Nishimura et al. , the parameterized complexity of finding backdoor sets was determined for various tractable cases, most of the results in that direction are collected in the recent survey by Gaspers and Szeider .
A less well-known tractable case is the class of -formulas in which every variable has at most two occurrences. The satisfiability problem for these formulas can be solved in linear time , and it has been shown to be complete for deterministic logarithmic space .
In this work, we determine the parameterized complexity of finding backdoor sets w.r.t. the class . We show that the problem is hard for the class , for both weak and strong backdoor sets, and that in both cases it becomes fixed-parameter tractable when restricted to inputs in for a fixed d.
For those tractable cases where the problem of finding backdoor sets is -hard, including , the smallest parameterized complexity class known to contain the problem is , which lies higher up in the -hierarchy, beyond the classes . For the tractable cases of 0-valid and 1-valid formulas, we show that the weak backdoor set problem is complete for . To the best of our knowledge, these are the first completeness results in the -hierarchy for any weak backdoor set problems.
In order prove this latter result, we study a related artificial problem of finding so-called very weak backdoor sets, whose definition differs from that of weak backdoor sets by weakening the condition that the residual formula has to be satisfiable. We show that the problem of finding very weak backdoor sets for these classes is in , by utilizing the characterization of the -hierarchy in terms of first-order logic definability. The method also allows us to put the very weak backdoor set problem for other tractable cases into the class , including the class .
The paper is structured as follows: in Section 2 we review the necessary background on the problem SAT and its tractable cases, in particular the class , on parameterized complexity and on backdoor sets. Section 3 treats some general properties of -backdoor sets. In Section 4 we show the results about weak -backdoor sets, and in Section 5 those about strong -backdoor sets. Finally Section 6 presents the mentioned upper bounds in the -hierarchy.
We briefly review basic notions about the propositional satisfiability problem, mainly to fix the notation.
A literal is a variable x or a negated variable . A clause is a disjunction of literals . The width of C is d, the number of literals in C. We identify a clause with the set of literals occurring in it, even though for clarity we still write it as a disjunction.
A formula in conjunctive normal form () is a conjunction of clauses, it is usually identified with the set of clauses . A formula F in is in if every clause C in F is of width .
A restriction α is a partial assignment from the set of variables V to the truth values . A restriction α is extended to literals by setting . We occasionally identify a restriction α with the set of literals it sets, i.e., .
For a clause C over the variables V, we define if for some literal , and otherwise is the disjunction of those literals for which does not hold, i.e., which are left unset by α. Here the empty disjunction is identified with the constant 0.
For a -formula F over V, we define if for some , and otherwise is the conjunction of the clauses for those clauses for which is not the case. Here the empty conjunction is identified with the constant 1.
In other words, the formula is obtained by deleting clauses satisfied by α from F, and deleting literals falsified by α from the remaining clauses in F.
For , we denote by the restriction setting the variable x to 𝜖. This notation is also extended to literals by letting denote .
If , then we say that α satisfies F and write . The satisfiability problem SAT is the decision problem:
Formula F in .
Is there a restriction α with ?
A clause C is tautological, if both x and occur in C for some variable x. Since tautological clauses are satisfied by all restrictions of their variables, they are irrelevant for the satisfiability of a formula they appear in. Therefore, SAT and other related problems are often restricted to formulas that do not contain any tautological clauses. Except when noted, all our results hold for problems restricted in this way.
2.1.Tractable Cases of SAT
Despite its hardness, many easy special cases of the problem SAT have been identified. A tractable case of SAT, sometimes also called an “island of tractability”, is a class of -formulas such that
membership can be decided in polynomial time,
the satisfiability problem for formulas can be decided in polynomial time.
The class of Horn formulas, i.e., formulas in which every clause contains at most one positive literal.
The class of formulas in which every clause contains at most one negative literal.
The class of formulas in which every clause is of width at most 2.
The class 1-Val of formulas where every clause contains at least one positive literal.
The class 0-Val of formulas where every clause contains at least one negative literal.
All the above tractable cases are defined by properties of the individual clauses in the formula, and by the classification result of Schaefer , these are the only maximal such classes of formulas.
A different tractable class which is not defined by properties of clauses is the class of cluster formulas [7,10]. Two clauses C and clash if they contain complementary literals, i.e. if and for some literal a. A formula F is a hitting formula if any two different clauses in F clash. The class of cluster formulas is the class of -formulas that are variable-disjoint unions of hitting formulas.
Another less well-known tractable class is the class of formulas with at most two occurrences of every variable – in a way a dual to . It is known that satisfiability of formulas in can be tested in linear time  and in logarithmic space – in fact it is complete for the class of problems solvable in deterministic logarithmic space .
We briefly review the basic concepts of parameterized complexity as used in this work, mostly following the textbook of Flum and Grohe .
A parameterized problem is a decision problem P together with a parameterization, i.e., a polynomial time computable mapping that associates with every instance x of P a parameter . We denote by the size of an instance, the number of bits required to represent x.
A parameterized problem P with parameter k is fixed-parameter tractable if there is an algorithm that solves P in time for some computable function f and . The class of parameterized problems that are fixed-parameter tractable is called .
An -reduction between parameterized problems P and with parametrizations k and , resp., is a function r mapping instances of P to instances of such that
r is computable in -time for some computable function f and ,
is a positive instance of iff x is a positive instance of P,
there is a computable function g s.t. for every instance x of P we have .
For a first-order formula with a free relation variable X of arity d, the parameterized problem is the following:
Structure A for the language of φ, .
Is there a relation of size with ?
The parameterized problem Weighted Circuit SAT is:
Boolean circuit C, .
Is there a satisfying assignment with ?
Family of subsets , .
Is there a set of size with for every i?
Let F be a -formula in the variables V, and let C be a tractable case of SAT. Let be a set of variables. The formula is the formula obtained by deleting from the clauses in F all occurrences of literals x and for .
A strong C-backdoor set for F is a subset such that for every assignment , the formula is in C.
A weak C-backdoor set for F is a subset such that there is an assignment such that the formula is in C and satisfiable.
A deletion C-backdoor set for F is a subset such that the formula is in C.
The parameterized problem Strong C-Backdoor Set is:
-formula F, and .
Is there a strong C-backdoor set for F of size k?
The parameterized complexity of these problems for various tractable classes of formulas has been determined in the literature. For the classes , , , and , the problem Strong C-Backdoor Set is fixed parameter tractable. Since these classes are defined by properties of individual clauses, they are closed under subsets and unions, so the problem is the same as Deletion C-Backdoor Set, which is easily seen to be fixed parameter tractable in these cases.
The problem Weak C-Backdoor Set for all of these classes, on the other hand, is -hard, but fixed parameter tractable when the input is restricted to formulas in for a fixed . For the class of cluster formulas, both problems Strong -Backdoor Set and Weak -Backdoor Set are -hard, but are in when restricted to inputs in . These results, together with many more results for other tractable cases, can be found in a recent survey by Gaspers and Szeider .
Since the class is obviously not closed under unions, deletion and strong backdoor sets do not necessarily coincide for this class. This is actually not the case, as the following example shows. Consider the following set of clauses:
For the base class , it matters whether formulas are represented as multisets, with multiple occurrences of the same clause allowed, or as sets, since the number of occurrences of variables is counted differently in both cases.
If formulas are represented as multisets, then the smallest deletion -backdoor set is exactly the set of variables with more than two occurrences, hence we trivially have:
For formulas represented as multisets, the problem Deletion -Backdoor Set can be solved in linear time.
On the other hand, for formulas represented as sets, clauses can become equal – and hence identified – if literals are deleted or set to 0. To show that this actually makes a difference note the following proposition.
For formulas represented as sets, the problem Deletion -Backdoor Set is -hard.
Proof:We reduce the well-known -hard problem Vertex Cover to Deletion -Backdoor Set. For a graph , define a formula as follows: F has variables for v, and for every edge a subformula of three clauses:
On the other hand, if is not a vertex cover, and let edge be uncovered, then the subformula remains unchanged after deleting the variables in , thus and occur at least three times each, thus is not a deletion -backdoor set. Thus F has a deletion -backdoor set of size k iff G has a vertex cover of size k. □
We will focus on the case of formulas represented as sets in the remainder of the paper.
We now show the hardness of finding weak -backdoor sets.
Weak -Backdoor Set is -hard.
Proof:We reduce Hitting Set to Weak -Backdoor Set. Let an instance of Hitting Set be given, with . We construct a formula in the variables for plus and for , such that F has a weak -backdoor set of size k iff S has a hitting set of size k.
The formula F has for every set the subformula consisting of the three clauses
Let be a hitting set for S. We show that is a weak -backdoor set for of the same size. Let α be the assignment with for every . Then since H hits every set , it follows that for every i, and hence is in and satisfiable for every i. Thus is in and satisfiable.
For the other direction, let B be a weak -backdoor set for F, and let α be an assignment to the variables in B such that is in and satisfiable. We first show that without loss of generality, B contains only variables for . If B contains a variable , then the assignment α restricted to the variables in will still leave every subformula for in and satisfiable. Thus if we let for an arbitrary , and let and coincide with α for all variables in , then is in and satisfiable, thus is a weak -backdoor set with .
Now we define , and show that is a hitting set for S of size . Assume for contradiction that . Then the formula is left unchanged in , and therefore the variable has three occurrences in . Hence B can not be a weak -backdoor set. □
In the survey  of Gaspers and Szeider, there is a generic construction to show the -hardness of the problem Weak C-Backdoor Set for a class C. Our reduction is based on that construction, but it is simplified, and it has the property that the formula only depends on S and is independent from the parameter k. In other words, it is a polynomial time reduction between the underlying classical problems that does not increase the parameter. The same simplification can also be made to other applications of the generic construction in , e.g. for and .
On the other hand, the problem of finding weak -backdoor sets becomes fixed-parameter tractable when restricted to inputs in :
The problem Weak -Backdoor Set for -formulas is fixed-parameter tractable.
Proof:We will devise a bounded search tree algorithm that, given a formula F and parameter k, will search for a restriction α of size such that is in and satisfiable. We will call such a restriction a backdoor restriction, its domain is a weak -backdoor set. Obviously, a backdoor restriction exists if and only if a weak -backdoor set exists.
A set of three clauses in F that share a common variable x will be called an obstruction.
Let an obstruction in F be given. Then every backdoor restriction α for F must set some variable that occurs in .
This holds because otherwise we have for and therefore still contains the obstruction, and is thus not in .
Once we have chosen a literal to be set, the following obvious proposition shows the correctness of the recurrence that the search tree algorithm is based on.
F has a backdoor restriction of size k that contains the assignment iff has a backdoor restriction of size .
These two proposition show the correctness of the following algorithm, that finds a backdoor restriction of size k in a -formula F if one exists.
Build a search tree of depth k, where at each node v at depth d we keep a partial assignment of size . At the root we have . A node v is closed if is in .
To extend the tree from a node v of depth that is not closed and labeled with α, find an obstruction in with the common variable x.
Now for each literal a based on a variable occurring in , add a child to v with the assignment . These are at most 14 children since the clauses together contain at most 6 distinct variables besides x.
Now, for every closed leaf v labeled α, test whether is satisfiable. If so, α is a backdoor restriction. If there is no closed leaf, or for no closed leaf v the residual formula is satisfiable, then F does not have a backdoor restriction of size k, and hence no weak -backdoor set of size k.
Since every inner node has at most 14 children, the size of the search tree is , and therefore the runtime is . □
The algorithm generalizes in the obvious way to formulas in for every fixed , where the branching degree of the search tree is , and hence its size is , which yields a runtime of .
Next, we show the hardness of finding strong -backdoor sets. Unfortunately, the proof of this result only works when the input formulas are allowed to contain tautological clauses. The complexity of the problem restricted to formulas without tautological clauses remains open.
Strong -Backdoor Set is -hard.
Proof:We reduce Hitting Set to Strong -Backdoor Set. The reduction is similar to that used in the proof of Theorem 3.
Let an instance of Hitting Set be given, with . We construct a formula in the variables for plus for , such that F has a strong -backdoor set of size k iff S has a hitting set of size k.
The formula F has three clauses for every set , viz.
Let be a hitting set for S. We show that is a strong -backdoor set for of the same size.
Let α be an assignment to the variables in . If for some , then α satisfies , and hence for every i. If, on the other hand for every , then since H is a hitting set, α satisfies and hence for very i. Thus in either case .
Now let B be strong -backdoor set for F. As in the proof of Theorem 3, we first show that without loss of generality, B contains only variables for . If B contains a variable , then as before we can exchange it for a variable for an arbitrary .
Now as above, we define , and show that is a hitting set for S of size . Assume for contradiction that , and let α be the assignment with for every . Then α does not satisfy any of the clauses associated with , and thus the variable occurs three times in . Hence B cannot have been a strong -backdoor set. □
The comment after the proof of Theorem 3 applies here as well: the reduction from Hitting Set to Strong -Backdoor Set given is actually a polynomial time reduction that does not change the parameter.
As in the case of weak -backdoor sets, the problem of finding strong -backdoor sets becomes fixed-parameter tractable when restricted to inputs in .
Strong -Backdoor Set for -formulas is fixed-parameter tractable.
Proof:Let F be a -formula, and X a subset of the variables of F. The following two propositions show the correctness of the bounded search tree algorithm to be presented below.
Let be a restriction. If is an obstruction in , then every strong -backdoor set for F extending X contains a variable that occurs in .
Otherwise, if is a set that is disjoint from the variables of , then for any restriction that extends α the formula still contains the obstruction , and is therefore not in .
Once we have chosen a variable to be included in the backdoor set, the following proposition, which holds obviously, shows the correctness of the recurrence that the search tree algorithm is based on.
F has a strong -backdoor set of size k that contains the variable x iff there is a set B of size that is a strong -backdoor set for and for .
We build a search tree of depth k, where at each node of depth d a set of variables of size is kept. Together with we keep a set of closed assignments, with the property that for every . A node v is closed if .
To extend the tree from a node of depth labeled that is not closed, pick an assignment such that , and an obstruction, i.e., three clauses in that share a common variable x.
For each variable y occurring in , add a child to v labeled with . For each of these children, add a set of assignments. To determine the set , we first put both extensions and of every closed assignment into . Then for every open assignment we consider the extension , and test whether is in . If that is the case, we add to . We perform the same for the assignment .
If a node is closed, i.e., , then X is a strong backdoor set. If on the other hand, no closed node has been found up to depth k, then by the two Propositions 9 and 10 above there is no strong backdoor set of size k.
Since the three clauses contain at most 7 variables, size of the search tree is bounded by . At each node we need to perform at most computation steps, so the runtime is bounded by . □
As in the case of the algorithm for weak backdoor sets, this algorithm generalizes in the obvious way to formulas in for every fixed , with a larger search tree size and thus a larger exponential dependence on the parameter k.
For most weak backdoor set problems that are not known to be in , there is no exact characterization of their complexity. With the exception of a few cases which are -complete, for most of the other cases it is only known that they are -hard and in . We will now show the – to the best of our knowledge – first -completeness results for weak backdoor set problems in the following theorem.
For the classes and , the problem Weak C-Backdoor Set is -complete.
Since finding weak backdoor sets for these classes is known to be -hard , we only need to show that they are in .
Since all formulas in and are satisfiable, weak backdoor sets into these classes are the same as the very weak backdoor sets defined next. This is an admittedly artificial notion of backdoor set where we weaken the requirement that the residual formula is satisfiable to the condition that it is not already false.
We define a very weak C-backdoor set for F to be a subset such that there is an assignment such that the formula is in C and non-trivial, i.e., .
The parameterized problem Very Weak C-Backdoor Set is:
-formula F, and .
Is there a very weak C-backdoor set for F of size k?
For the classes and , this problem is equivalent to Weak C-Backdoor Set, thus to prove Theorem 11 it suffices to show it is in . Since our technique readily generalizes to other cases, we show that the problem of finding very weak C-backdoor sets is in , for various tractable cases C. For the other cases besides and , this might turn out to be useful in the future.
We show that these problems are in the class by making use of the logical characterization of this class.
For each of the tractable classes
|a is a literal|
|c is a clause|
|literal a occurs in clause c|
|a and b are complementary literals|
|a is a positive literal|
|a is a negative literal|
For each of the classes C in the statement of the theorem, we will define a -formula in this language with a free set variable A expressing that A is a backdoor restriction into the class C, i.e. for a restriction α iff . We also define a -formula expressing that the formula F after restriction by A is nontrivial. Thus the problem Very Weak C-Backdoor Set is equivalent to the problem , where . Since both formulas and are -formulas, this shows that these problems are in .
We start by expressing that literal a is false under A. By uniqueness of the complementary literal, this can be expressed by the following formula:
We then can express the condition that the formula F restricted by A is non-trivial by the -formula
With the aid of these formulas, we can write down the formula as:
either one of them is satisfied by A,
or two of them are equal under A,
or every variable that is not set by A does not occur in one of them.
We can use the same technique to obtain upper bounds on the complexity of finding very weak backdoor sets for other tractable cases:
The following -formula states that F restricted by A is a -formula.
either two of the literals are equal,
or at least one of them is set by A,
or at least one of them does not occur in c.
The following -formula states that F restricted by A is a Horn formula.
A result of Nishimura et al.  characterizes the class of cluster formulas by excluded configurations: A formula is a cluster formula if it does not contain any of the following obstructions:
1. two clauses C and that overlap, i.e., have a variable in common, but do not clash,
2. three clauses , and such that and clash, and and clash, but and do not clash.
We define -formulas stating that two clauses overlap or clash as:
Finally, we define the -formula expressing that the formula F restricted by A is 1-valid. The formula is defined analogously.
We define the -formula
We list some problems left open by this work.
Settle the parameterized complexity of the problem Strong -Backdoor Set in the restricted case when formulas do not contain tautological clauses, i.e., show the problem remains -hard in this case.
Determine the precise parameterized complexity of Weak C-Backdoor Set for tractable cases C other that 1-Val and 0-Val, possibly using the logical approach used in this paper.
Is the problem Strong C-Backdoor Set in for the tractable cases and , for which we know it is -hard?
I thank Stefan Szeider, Sebastian Ordyniak and Ulrich Schöpp for useful discussions about the contents of the paper, and an anonymous referee whose comments helped to improve the presentation of the paper. The research leading to the results in this paper was initiated at Dagstuhl Seminar 12471 “SAT Interactions”.
S.A. Cook, The complexity of theorem-proving procedures, in: Proceedings of the 3rd Annual ACM Symposium on Theory of Computing, 1971, pp. 151–158.
Y. Crama, O. Ekin and P.L. Hammer, Variable and term removal from Boolean formulae, Discrete Applied Mathematics 75(3) (1997), 217–230. doi:10.1016/S0166-218X(96)00028-5.
R.G. Downey and M.R. Fellows, Fixed-parameter tractability and completeness, Congressus Numerantium 87 (1992), 161–187.
J. Flum and M. Grohe, Parameterized Complexity Theory, Texts in Theoretical Computer Science, Springer, 2006.
S. Gaspers and S. Szeider, Backdoors to satisfaction, in: The Multivariate Algorithmic Revolution and Beyond, H.L. Bodlaender, R. Downey, F.V. Fomin and D. Marx, eds, Lecture Notes in Computer Science, Vol. 7370, 2012, pp. 287–317. doi:10.1007/978-3-642-30891-8_15.
R. Impagliazzo, R. Paturi and F. Zane, Which problems have strongly exponential complexity?, Journal of Computer and System Sciences 63(4) (2001), 512–530. doi:10.1006/jcss.2001.1774.
K. Iwama, CNF-satisfiability test by counting and polynomial average time, SIAM Journal on Computing 18(2) (1989), 385–391. doi:10.1137/0218026.
J. Johannsen, Satisfiability problems complete for deterministic logarithmic space, in: 21st International Symposium on Theoretical Aspects of Computer Science (STACS 2004), V. Diekert and M. Habib, eds, Lecture Notes in Computer Science, Vol. 2996, 2004, pp. 317–325.
H. Kleine Büning and T. Lettmann, Propositional Logic: Deduction and Algorithms, Cambridge University Press, 1999.
N. Nishimura, P. Ragde and S. Szeider, Solving #SAT using vertex covers, Acta Informatica 44(7–8) (2007), 509–523. doi:10.1007/s00236-007-0056-x.
N. Nishimura, P. Ragde and S. Szeider, Detecting backdoor sets with respect to Horn and binary clauses, in: Proceedings of the 7th International Conference on Theory and Applications of Satisfiability Testing, 2004, pp. 96–103.
T.J. Schaefer, The complexity of satisfiability problems, in: Proceedings of the 10th ACM Symposium on Theory of Computing, 1978, pp. 216–226.
R. Williams, C. Gomes and B. Selman, Backdoors to typical case complexity, in: Proceedings of the 18th International Joint Conference on Artificial Intelligence, G. Gottlob and T. Walsh, eds, 2003, pp. 1173–1178.