In constraint programming and related paradigms, a modeller specifies their problem in a modelling language for a solver to search and return its solution(s). Using high-level modelling languages such as ESSENCE, a modeller may express their problems in terms of abstract structures. These are structures not natively supported by the solvers, and so they have to be transformed into or represented as other structures before solving. For example, nested sets are abstract structures, and they can be represented as matrices in constraint solvers. Many problems contain symmetries, and one very common and highly successful technique used in constraint programming is to “break” symmetries, to avoid searching for symmetric solutions. This can speed up the solving process by many orders of magnitude. Most of these symmetry-breaking techniques involve imposing an ordering on the variables of the problem and picking a particular member under the symmetries, usually the smallest. Unfortunately, applying this technique to abstract variables produces a very large number of complex constraints that perform poorly in practice. In this paper, we demonstrate a new incomplete method of breaking the symmetries of abstract structures by better exploiting their representations. We apply the method to break the symmetries arising from indistinguishable objects, a commonly occurring type of symmetry, and show that our method is faster than the previous methods proposed in (Akgün et al. 2025).
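As a toy illustration of the lex-leader idea described above (our own sketch, not the paper's representation-based method): with indistinguishable objects, permuting the objects maps any assignment to a symmetric one, so one keeps only the smallest member of each symmetry class, i.e., assignments whose value tuple is sorted.

```python
from itertools import product

# Three indistinguishable objects, each taking a value in {0, 1, 2}.
# Permuting objects yields symmetric assignments; keeping only sorted
# (non-decreasing) tuples selects the lex-smallest member of each class.
all_assignments = list(product(range(3), repeat=3))
canonical = [a for a in all_assignments if list(a) == sorted(a)]
```

Here 27 raw assignments collapse to 10 canonical representatives (one per multiset). For abstract variables, the analogous ordering constraints blow up in size, which is the problem the paper's method targets.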
Symmetry breaking is a crucial technique in modern combinatorial solving, but it is difficult to be sure it is implemented correctly. The most successful approach to deal with bugs is to make solvers certifying, so that they output not just a solution, but also a mathematical proof of correctness in a standard format, which can then be checked by a formally verified checker. This requires justifying symmetry reasoning within the proof, but developing efficient methods for this has remained a long-standing open challenge. A fully general approach was recently proposed, but it relies on encoding lexicographic orders with big integers, which quickly becomes infeasible for large symmetries. In this work, we develop a method for instead encoding orders with auxiliary variables. We show that this leads to orders-of-magnitude speed-ups in both theory and practice by running experiments on proof logging and checking for SAT symmetry breaking using the state-of-the-art satsuma symmetry breaker and the VeriPB proof checking toolchain.
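The contrast between big-integer and auxiliary-variable encodings of lexicographic orders can be sketched as follows (our own illustration, not the paper's proof-logging machinery): introduce "prefixes equal so far" variables e_i and post, for each position, the clause e_{i-1} → (x_i ≤ y_i), instead of comparing n-bit numbers.

```python
from itertools import product

def lex_leq_holds(x, y):
    """Evaluate the auxiliary-variable encoding of x <=_lex y.

    e[i] means "the first i bits of x and y are equal"; the encoding
    posts, for each position i, the clause  e[i-1] -> (x_i <= y_i),
    plus the defining constraints of e.  Here we evaluate that encoding
    directly instead of handing it to a SAT solver.
    """
    equal_so_far = True
    for xi, yi in zip(x, y):
        if equal_so_far and xi > yi:
            return False  # clause e[i-1] -> (x_i <= y_i) is violated
        equal_so_far = equal_so_far and (xi == yi)
    return True

# The encoding agrees with the usual lexicographic comparison on all
# inputs, while needing only O(n) auxiliary variables rather than
# arithmetic over n-bit integers.
assert all(lex_leq_holds(x, y) == (x <= y)
           for x, y in product(product((0, 1), repeat=4), repeat=2))
```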
The Minimum Consistent Subset (MCS) problem arises naturally in the context of supervised clustering and instance selection. In supervised clustering, one aims to infer a meaningful partitioning of data using a small labeled subset. However, the sheer volume of training data in modern applications poses a significant computational challenge. The MCS problem formalizes this goal: given a labeled dataset X in a metric space, the task is to compute a smallest subset S of X such that every point in X shares its label with at least one of its nearest neighbors in S. Recently, the MCS problem has been extended to graph metrics, where distances are defined by shortest paths. Prior work has shown that MCS remains NP-hard even on simple graph classes like trees, and presented a fixed-parameter tractable (FPT) algorithm parameterized by the number of colors for MCS on trees. This raises the challenge of identifying graph classes that admit algorithms efficient in both input size (n) and the number of colors (c). In this work, we study the Minimum Consistent Subset problem on graphs, focusing on two well-established measures: the vertex cover number (vc) and the neighborhood diversity (nd). Specifically, we design efficient algorithms for graphs exhibiting small vc or small nd, which frequently arise in real-world domains characterized by local sparsity or repetitive structure. These parameters are particularly relevant because they capture structural properties that often correlate with the tractability of otherwise hard problems. Graphs with small vertex cover sizes are "almost independent sets", representing sparse interactions, while graphs with small neighborhood diversity exhibit a high degree of symmetry and regularity. Importantly, small neighborhood diversity can occur even in dense graphs, a property frequently observed in domains such as social networks with modular communities or knowledge graphs with repeated relational patterns.
Thus, algorithms designed to work efficiently for graphs with small neighborhood diversity are capable of efficiently solving MCS in complex settings where small vertex covers may not exist. We show that MCS is FPT when parameterized by the vertex cover number and by neighborhood diversity. In each case, we present an algorithm whose running time is polynomial in n and c, and the non-polynomial part depends solely on the chosen parameter. Notably, our algorithms remain efficient for arbitrarily many colors, as their complexity is polynomially dependent on the number of colors.
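The MCS definition above can be made concrete with a brute-force sketch on a tiny metric (our own illustration; it is exponential in n, precisely the blow-up the paper's parameterized algorithms avoid):

```python
from itertools import combinations

def is_consistent(points, labels, dist, subset):
    """Every point must share its label with at least one of its
    nearest neighbors in the subset."""
    for p in points:
        d = min(dist(p, s) for s in subset)
        if not any(labels[s] == labels[p]
                   for s in subset if dist(p, s) == d):
            return False
    return True

def minimum_consistent_subset(points, labels, dist):
    """Smallest consistent subset by exhaustive search."""
    for k in range(1, len(points) + 1):
        for subset in combinations(points, k):
            if is_consistent(points, labels, dist, subset):
                return set(subset)

# Four points on a line, two per label:
points = [0, 1, 2, 3]
labels = {0: 'a', 1: 'a', 2: 'b', 3: 'b'}
```

Since both labels occur, no single point can be consistent, and a subset of size 2 (one representative per label) suffices.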
We study a general framework of optimization with the aim to compute fair solutions in settings with a set of agents whose valuations are combined using an aggregation function. The strength of our framework lies (1) in its generality and (2) in the fact that we leverage the power of ex-ante fairness, a concept that has recently gained much attention in the scope of fair allocation and fairness in AI in general. More precisely, in our setting there are n set functions f₁, …, fₙ (e.g., the valuation functions of n agents) that are combined using an aggregation function g (e.g., the minimum, Nash social welfare, p-norm). The power of ex-ante fairness is obtained by allowing as a feasible solution not simply a finite set S, but instead a probability distribution over feasible sets. The goal in our setting is then to find a distribution p that maximizes the value resulting from aggregating (using g) the n expected values of the functions f₁, …, fₙ obtained when sampling a set S according to the distribution p. We stress that this is different from maximizing the expected value of g (ex-post fairness) and typically allows for much fairer solutions. We give three different greedy algorithms for three different settings of this framework and prove that they achieve constant approximation guarantees under certain realistic assumptions. For some of the settings, we show that these approximation guarantees are tight. Specific scenarios that can be modelled using our framework include fair information diffusion in social networks, fair submodular matching problems, and ex-ante versions of item assignment problems.
Maximum satisfiability (MaxSAT) is a viable approach to solving NP-hard combinatorial optimization problems through propositional encodings. Understanding how problem structure and encodings impact the behaviour of different MaxSAT solving algorithms is an important challenge. In this work, we identify MaxSAT instances in which the constraints entail an ordering of the objective variables as an interesting instance class from the perspectives of problem structure and MaxSAT solving. From the problem structure perspective, we show that a non-negligible percentage of instances in commonly used MaxSAT benchmark sets have ordered objectives and further identify various examples of such problem domains to which MaxSAT solvers have been successfully applied. From the algorithmic perspective, we argue that MaxSAT instances with ordered objectives, provided an ordering, can be solved (at least) as efficiently with a very simplistic algorithmic approach as with modern core-based MaxSAT solving algorithms. We show empirically that state-of-the-art MaxSAT solvers suffer from overheads and are outperformed by the simplistic approach on real-world optimization problems with ordered objectives.
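The "simplistic algorithmic approach" for ordered objectives can be sketched abstractly (our own illustration, not the paper's implementation): when the constraints entail b₁ ≥ b₂ ≥ … ≥ bₙ over the objective bits, the objective value is determined by a single prefix length, so an optimum can be found by binary search with one decision-oracle call per step.

```python
def minimise_ordered(n, feasible):
    """Binary search for the optimum of an ordered objective.

    feasible(k) stands in for a decision-oracle (e.g., SAT) call asking
    whether some model sets at most k objective bits to true; it is
    assumed monotone, which the entailed ordering guarantees.
    """
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi) // 2
        if feasible(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo
```

This makes only O(log n) oracle calls, which is the source of the overhead advantage over core-based algorithms observed in the paper.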
Solving the model counting problem #SAT, asking for the number of satisfying assignments of a propositional formula, has been explored intensively and has gathered its own community. While most existing solvers are based on knowledge compilation, another promising approach is through contraction in tensor hypernetworks. We perform a theoretical proof-complexity analysis of this approach. For this, we design two new tensor-based proof systems that we show to tightly correspond to tensor-based #SAT solving. We determine the simulation order of #SAT proof systems and prove exponential separations between the systems. This sheds light on the relative performance of different #SAT solving approaches.
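For readers unfamiliar with the tensor view of #SAT: assigning each clause a 0/1 "clause tensor" and contracting the whole network amounts to summing, over all assignments, the product of clause indicators. The fully expanded (exponential) form of that contraction is easy to write down (our own illustration; real tensor-network solvers contract cleverly along a good ordering instead):

```python
from itertools import product

def count_models(n_vars, clauses):
    """#SAT by brute-force expansion of the tensor-network contraction.

    Clauses are tuples of nonzero signed integers: literal i means
    variable i is true, -i means it is false.
    """
    total = 0
    for assignment in product((False, True), repeat=n_vars):
        total += all(any(assignment[abs(l) - 1] == (l > 0) for l in clause)
                     for clause in clauses)
    return total
```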
Several proof systems for model counting have been introduced in recent years, mainly in an attempt to model #SAT solving and to allow proof logging of solvers. We reexamine these different approaches and show that: (i) with moderate adaptations, the conceptually quite different proof models of the dynamic system MICE and the static system of annotated Decision-DNNFs are equivalent and (ii) they tightly characterise state-of-the-art #SAT solving. Thus, these proof systems provide a precise and robust proof-theoretic underpinning of current model counting. We also propose new strengthenings of these proof systems that might lead to stronger model counters.
In the field of Explainable Constraint Solving, it is common to explain to a user why a problem is unsatisfiable. A recently proposed method for this is to compute a sequence of explanation steps. Such a step-wise explanation shows individual reasoning steps involving constraints from the original specification, that in the end explain a conflict. However, computing a step-wise explanation is computationally expensive, limiting the scope of problems for which it can be used. We investigate how we can use proofs generated by a constraint solver as a starting point for computing step-wise explanations, instead of computing them step-by-step. More specifically, we define a framework of abstract proofs, in which both proofs and step-wise explanations can be represented. We then propose several methods for converting a proof to a step-wise explanation sequence, with special attention to trimming and simplification techniques to keep the sequence and its individual steps small. Our results show our method significantly speeds up the generation of step-wise explanation sequences, while the resulting step-wise explanation has a quality similar to the current state-of-the-art.
Learning finite automata from positive examples has recently gained attention as a powerful approach for understanding, explaining, analyzing, and verifying black-box systems. The motivation for focusing solely on positive examples arises from the practical limitation that we can only observe what a system is capable of (positive examples) but not what it cannot do (negative examples). Unlike the classical problem of passive DFA learning with both positive and negative examples, which has been known to be NP-complete since the 1970s, the topic of learning DFAs exclusively from positive examples remains poorly understood. This paper introduces a novel perspective on this problem by leveraging the concept of counting the number of accepted words up to a carefully determined length. Our contributions are twofold. First, we prove that computing the minimal number of words up to this length accepted by DFAs of a given size that accept all positive examples is NP-complete, establishing that learning from positive examples alone is computationally demanding. Second, we propose a new learning algorithm with a better asymptotic runtime than the best-known bound for existing algorithms. While our experimental evaluation reveals that this algorithm underperforms state-of-the-art methods, it demonstrates significant potential as a preprocessing step to enhance existing approaches.
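The central quantity above, the number of words up to a given length accepted by a DFA, is itself cheap to compute by dynamic programming over (state, word length) pairs (a standard textbook technique, shown here as our own sketch):

```python
def count_accepted(start, accepting, delta, length):
    """Count words of length <= `length` accepted by a DFA.

    delta maps each state to a dict from input symbol to successor
    state.  counts[q] holds the number of words of the current length
    that reach state q; summing over accepting states at each length
    gives the total.
    """
    counts = {start: 1}
    total = int(start in accepting)
    for _ in range(length):
        nxt = {}
        for q, c in counts.items():
            for r in delta[q].values():
                nxt[r] = nxt.get(r, 0) + c
        counts = nxt
        total += sum(c for q, c in counts.items() if q in accepting)
    return total

# DFA over {a, b} accepting words with an even number of a's:
delta = {0: {'a': 1, 'b': 0}, 1: {'a': 0, 'b': 1}}
```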
The Aperiodic Tiling Complements Problem (ATCP) involves finding the full set of (normalized) aperiodic complements of a given pattern. This has become a classic problem in music theory, with some recent attempts to model it using Integer Linear Programming (ILP) and Boolean Satisfiability (SAT) frameworks. In this paper, we develop and compare different models of ATCP encoded with Constraint Programming (CP). The most effective approach proceeds in two phases: a first one that merges (joins) several subsets of linear constraints into tables of large arity, and a second one that advantageously exploits the generated tables to discard periodic tiling complements. Our experimental results show that our approach significantly outperforms the state-of-the-art, solving every instance of a classical benchmark (standard Vuza rhythms for canons with periods up to 900) in a time between 5 seconds and 2 minutes (except the largest instance, which is solved in 18 minutes).
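The two underlying checks, whether two rhythms tile Z_n and whether a complement is periodic (and hence to be discarded), are simple to state (our own sketch on a deliberately tiny, non-Vuza example; real ATCP instances start at period 72):

```python
def tiles(A, B, n):
    """A and B tile Z_n when every residue arises as a+b (mod n)
    exactly once."""
    sums = sorted((a + b) % n for a in A for b in B)
    return sums == list(range(n))

def is_periodic(B, n):
    """B is periodic in Z_n if some nonzero shift maps it onto
    itself."""
    S = set(B)
    return any({(b + k) % n for b in B} == S for k in range(1, n))
```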
In their AAAI 2024 paper, Horiyama et al. studied the problem of generating graph instances that possess a unique minimum vertex cover under specific conditions. Their approach involved pre-assigning certain vertices to be part of the solution or excluding them from it. Notably, for the Vertex Cover problem, pre-assigning a vertex is equivalent to removing it from the graph. Horiyama et al. focused on maintaining the size of the minimum vertex cover after these modifications. In this work, we extend their study by relaxing this constraint: our goal is to ensure a unique minimum vertex cover, even if the removal of a vertex does not decrease the size of said cover (a problem we denote MU-VC). Surprisingly, our relaxation introduces significant theoretical challenges. We observe that the problem is Σ₂ᵖ-complete, and remains so even for planar graphs of maximum degree 5. Nevertheless, we provide a linear time algorithm for trees, which is then further leveraged to show that MU-VC is fixed-parameter tractable (FPT) when parameterized by the combination of treewidth and maximum degree. Finally, we show that MU-VC is in XP when parameterized by clique-width, while it is FPT if we add the size of the solution as part of the parameter.
Step-wise explanations can explain logic puzzles and other satisfaction problems by showing how to derive decisions step by step. Each step consists of a set of constraints that derive an assignment to one or more decision variables. However, many candidate explanation steps exist, with different sets of constraints and different decisions they derive. To identify the most comprehensible one, a user-defined objective function is required to quantify the quality of each step. However, defining a good objective function is challenging. Here, interactive preference elicitation methods from the wider machine learning community can offer a way to learn user preferences from pairwise comparisons. We investigate the feasibility of this approach for step-wise explanations and address several limitations that distinguish it from elicitation for standard combinatorial problems. First, because the explanation quality is measured using multiple sub-objectives that can vary a lot in scale, we propose two dynamic normalization techniques to rescale these features and stabilize the learning process. We also observed that many generated comparisons involve similar explanations. For this reason, we introduce MACHOP (Multi-Armed CHOice Perceptron), a novel query generation strategy that integrates non-domination constraints with upper confidence bound-based diversification. We evaluate the elicitation techniques on Sudokus and Logic-Grid puzzles using artificial users, and validate them with a real-user evaluation. In both settings, MACHOP consistently produces higher-quality explanations than the standard approach.
Dependency Quantified Boolean Formulas (DQBF) generalize QBF by explicitly specifying which universal variables each existential variable depends on, instead of relying on a linear quantifier order. The satisfiability problem of DQBF is NEXP-complete, and many hard problems can be succinctly encoded as DQBF. Recent work has revealed a strong analogy between DQBF and SAT: k-DQBF (with k existential variables) is a succinct form of k-SAT, and satisfiability is NEXP-complete for 3-DQBF but PSPACE-complete for 2-DQBF, mirroring the complexity gap between 3-SAT (NP-complete) and 2-SAT (NL-complete). Motivated by this analogy, we study the model counting problem for DQBF, denoted #DQBF. Our main theoretical result is that #2-DQBF is #EXP-complete, where #EXP is the exponential-time analogue of #P. This parallels Valiant's classical theorem stating that #2-SAT is #P-complete. As a direct application, we show that first-order model counting (FOMC) remains #EXP-complete even when restricted to a PSPACE-decidable fragment of first-order logic and domain size two. Building on recent successes in reducing 2-DQBF satisfiability to symbolic model checking, we develop a dedicated 2-DQBF model counter. Using a diverse set of crafted instances, we experimentally evaluated it against a baseline that expands 2-DQBF formulas into propositional formulas and applies propositional model counting. While the baseline worked well when each existential variable depends on few variables, our implementation scaled significantly better to larger dependency sets.
Standardized microplates are used to conduct large-scale biomedical research. The design of microplate layouts plays an essential role in handling so-called plate effects, i.e., systematic variations across the geometry of a microplate. An effective layout allows us to detect and negate plate effects. The randomized placement of controls and compounds produces layouts of limited effectiveness, so specific approaches are needed. A previously developed system, PLAID, proposed a constraint satisfaction model to construct effective plate layouts. However, PLAID does not scale well with microplate dimensions. To improve on PLAID, we propose Constraint Optimization of MicroPlate Designs (COMPD), which offers greater flexibility and produces higher-quality layouts.
The implicit hitting set (IHS) approach offers a general framework for solving computationally hard combinatorial optimization problems declaratively. IHS iterates between a decision oracle used for extracting sources of inconsistency and an optimizer for computing so-called hitting sets (HSs) over the accumulated sources of inconsistency. While the decision oracle is language-specific, the optimizer is usually instantiated through integer programming. We explore alternative algorithmic techniques for hitting set optimization based on different ways of employing pseudo-Boolean (PB) reasoning as well as stochastic local search. We extensively evaluate the practical feasibility of the alternatives in particular in the context of pseudo-Boolean (0-1 IP) optimization as one of the most recent instantiations of IHS. Highlighting a trade-off between efficiency and reliability, we find that, while a commercial IP solver remains the most effective way to instantiate HS computations, it can cause correctness issues due to numerical instability; in fact, we show that exact HS computations instantiated via PB reasoning can be made competitive with a numerically exact IP solver. Furthermore, the use of PB reasoning as a basis for HS computations allows for obtaining certificates for the correctness of IHS computations, generally applicable to any IHS instantiation in which reasoning in the declarative language at hand can be captured in the PB-based proof format we employ.
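The IHS loop described above can be sketched in a few lines (our own illustration: a brute-force optimizer stands in for the IP/PB solver, and an abstract callback stands in for the language-specific decision oracle):

```python
from itertools import combinations

def min_hitting_set(cores, universe):
    """Exact minimum hitting set by brute force, a stand-in for the
    IP/PB optimizer of a real IHS solver."""
    for k in range(len(universe) + 1):
        for cand in combinations(universe, k):
            s = set(cand)
            if all(s & core for core in cores):
                return s

def ihs(universe, extract_core):
    """Implicit hitting set loop: alternate hitting-set optimization
    with a decision oracle reporting a new unhit source of
    inconsistency (a core), until the hitting set hits them all."""
    cores = []
    while True:
        hs = min_hitting_set(cores, universe)
        core = extract_core(hs)  # None: hs yields a feasible solution
        if core is None:
            return hs
        cores.append(core)
```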
Given a Boolean relational specification between inputs and outputs, the problem of functional synthesis is to construct a function that maps each assignment of the input to an assignment of the output such that each tuple of input and output assignments meets the specification. The past decade has witnessed significant improvement in the scalability of functional synthesis tools, allowing them to handle problems with tens of thousands of variables. A common ingredient in these approaches is their reliance on SAT solvers, thereby exploiting the breakthrough advances in SAT solving over the past three decades. While the recent techniques have been shown to perform well in practice, there is little theoretical understanding of the limitations and power of these approaches. The primary contribution of this work is to initiate a systematic theoretical investigation into the power of functional synthesis approaches that rely on NP oracles. We first show that even when small Skolem functions exist, naive bit-by-bit learning approaches fail due to the relational nature of specifications. We establish fundamental limitations of interpolation-based approaches proving that even when small Skolem functions exist, resolution-based interpolation must produce exponential-size circuits. We prove that access to an NP oracle is inherently necessary for efficient synthesis. Our main technical result shows that it is possible to use NP oracles to synthesize small Skolem functions in time polynomial in the size of the specification and the size of the smallest sufficient set of witnesses, establishing positive results for a broad class of relational specifications.
List coloring extends graph coloring by assigning each vertex a list of allowed colors. A graph is k-choosable if it can be properly colored for any choice of lists with k colors each. Deciding k-choosability is Π₂ᵖ-complete, bipartite graphs have unbounded list chromatic number, and planar graphs (famously 4-colorable) are all 5-choosable but not all 4-choosable. To search for graphs of given choosability, we extend SAT Modulo Symmetries (SMS) with custom propagators for list coloring pruning techniques and propose a quantified Boolean formula (QBF) encoding for choosability. We employ a hybrid approach: pen-and-paper reasoning to optimize our formulas followed by automated case distinction by QBF solvers and SMS. Our methods yield two significant results: (1) a 27-vertex planar graph that is 4-choosable yet cannot be proven so using the combinatorial Nullstellensatz widely applied in previous work (we show this is a smallest graph with that property), and (2) the smallest graph exhibiting a gap between chromatic and list chromatic numbers for chromatic number 3.
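The ∀∃ structure behind the Π₂ᵖ-completeness (for all list assignments, there exists a proper coloring) can be made concrete by brute force on tiny graphs (our own sketch; a rigorous check would need a sufficiently large palette, and the paper uses SMS propagators and QBF solvers instead):

```python
from itertools import combinations, product

def list_colorable(edges, lists):
    """Is there a proper coloring drawing each vertex's color from its
    own list?"""
    verts = sorted(lists)
    return any(all(col[u] != col[v] for u, v in edges)
               for choice in product(*(lists[v] for v in verts))
               for col in [dict(zip(verts, choice))])

def is_k_choosable(n, edges, k, palette):
    """Check k-choosability over all size-k list assignments from a
    fixed palette (doubly exponential; tiny graphs only)."""
    return all(list_colorable(edges, {v: ls[v] for v in range(n)})
               for ls in product(combinations(palette, k), repeat=n))
```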
In recent years, constraint solvers have increasingly been used to solve various open combinatorial problems, e.g., from Ramsey theory or the synthesis of combinatorial designs. A similar approach can be applied to some problems related to binary linear codes, which form one of the largest families of error-correcting codes used both in coding theory and in various practical applications. Thanks to the simple algebraic structure of such codes, it is possible to study them using a wide range of methods. Note that even codes with the same basic parameters (length n, dimension k, minimum code distance d) can show different error correction performance, i.e., the ability to correct errors which appear in a noisy channel. In the paper, we formulate the problem of finding binary linear codes with good error correction performance as a constraint optimization problem and explore the effectiveness of modern constraint solvers on it, including SAT, MaxSAT, and CP solvers. Using the respective solvers and parallel computing, for several values of n, k, d we found codes that are significantly better than previously known ones in terms of their practical performance.
The region connection calculus (RCC) and Allen's interval algebra (IA) are two well-known NP-hard spatial-temporal qualitative reasoning problems. They are solvable in 2^(O(n log n)) time, where n is the number of variables, and IA is additionally known to be solvable in o(n)^n time. However, no improvement over exhaustive search is known for RCC, and whether either problem is also solvable in single-exponential time 2^(O(n)) remains unknown. We investigate multiple avenues towards reaching such bounds. First, we show that branching is insufficient since there are too many non-redundant constraints. Concretely, we classify the maximum number of non-redundant constraints in RCC and IA. Algorithmically, we make two significant contributions based on dynamic programming (DP). The first algorithm runs in 4^n time and is applicable to a non-trivial, NP-hard fragment of IA, which includes the well-known interval graph sandwich problem of (Golumbic and Shamir 1993). For the richer RCC problem with 8 basic relations we use a more sophisticated approach which asymptotically matches the o(n)^n bound for IA.
Neural solvers for Vehicle Routing Problems (VRPs) have shown great advantages across various problem types. However, they also face critical challenges in generalizing from small-scale training to large-scale problems and in identifying the most salient topological information for decision-making. To address these gaps, we introduce ScaleNet, a novel hierarchical framework that integrates a U-Net architecture into a unified, multi-task VRP solver. ScaleNet explicitly captures multi-scale structural patterns by processing a nested hierarchy of input graph instances. This enriched, coarse-to-fine representation is extracted by the encoder and fed directly into the decoder, empowering the decoder with superior topological awareness for routing decisions while simultaneously reducing computational overhead in the encoder. We conducted extensive experiments on 16 VRP variants with instances ranging from 50 to 5,000 nodes. The experimental results show that ScaleNet demonstrates significant performance gains over state-of-the-art baselines across in-distribution, zero-shot, and real-world settings.
Quantified formulas with Uninterpreted Functions (UFs) over non-linear real arithmetic pose fundamental challenges for Satisfiability Modulo Theories (SMT) solving. Traditional quantifier instantiation methods struggle because they lack semantic understanding of UF constraints, forcing them to search through unbounded solution spaces with limited guidance. We present AquaForte, a framework that leverages Large Language Models to provide semantic guidance for UF instantiation by generating instantiated candidates for function definitions that satisfy the constraints, thereby significantly reducing the search space and complexity for solvers. Our approach preprocesses formulas through constraint separation, uses structured prompts to extract mathematical reasoning from LLMs, and integrates the results with traditional SMT algorithms through adaptive instantiation. AquaForte maintains soundness through systematic validation: LLM-guided instantiations yielding SAT solve the original problem, while UNSAT results generate exclusion clauses for iterative refinement. Completeness is preserved by fallback to traditional solvers augmented with learned constraints. Experimental evaluation on SMT-COMP benchmarks demonstrates that AquaForte solves numerous instances where state-of-the-art solvers like Z3 and CVC5 timeout, with particular effectiveness on satisfiable formulas. Our work shows that LLMs can provide valuable mathematical intuition for symbolic reasoning, establishing a new paradigm for SMT constraint solving.
Drug discovery is a very time-consuming and costly endeavour due to its huge design space and to the lengthy and failure-fraught process of bringing a product to market. Automating the generation of candidate molecules exhibiting some of the desired properties can help. Among the standard formats to encode molecules, SMILES is a widespread string representation. We propose a constraint programming model showcasing the grammar constraint to express the design space of organic molecules using the SMILES notation. We show how some common physicochemical properties --- such as molecular weight and lipophilicity --- and structural features can be expressed as constraints in the model. We also contribute a weighted counting algorithm for the grammar constraint, allowing us to use a belief propagation heuristic to guide the generation. Our experiments indicate that such a heuristic is key to driving the search towards desired molecules.
To efficiently solve exact discrete optimization problems, branch and bound algorithms require tight bounds. In constraint programming, for optimization, soft arc consistencies typically derive much stronger bounds than those offered by domain or bound consistencies applied to a cost variable. The reason is that soft local consistencies exchange marginal cost information between variables whereas domain consistencies rely only on shrinking domains, which is less informative. However, CP solvers equipped with soft arc consistencies have so far offered limited support for efficient processing of global constraints. In this work, we show how we can efficiently enforce soft local consistency over the AllDifferent constraint, relying on algorithms for the Linear Assignment Problem (LAP). We implement this propagator in toulbar2, the state-of-the-art weighted CP solver exploiting soft local consistencies for bounding. We show that, equipped with this new propagator, toulbar2 outperforms state-of-the-art domain consistency-based CP as well as integer programming solvers for the Quadratic Assignment Problem and shows better performance for miniCOP instances of the 2024 XCSP competition with AllDifferent constraints.
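The connection between AllDifferent and the Linear Assignment Problem can be illustrated in miniature (our own sketch: the LAP optimum over unary costs is a valid lower bound on the cost contribution of an AllDifferent constraint; a real propagator would use the Hungarian algorithm, not brute force):

```python
from itertools import permutations

def alldiff_lower_bound(costs):
    """Minimum cost of assigning n variables to pairwise distinct
    values, given costs[i][j] = unary cost of variable i taking
    value j.  This LAP optimum bounds the AllDifferent contribution."""
    n = len(costs)
    return min(sum(costs[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

# Unary costs of assigning variable i to value j:
costs = [[4, 1, 3],
         [2, 0, 5],
         [3, 2, 2]]
```

Note how the bound (5 here) exceeds the sum of the per-variable minima (1 + 0 + 2 = 3): distinctness forces costlier choices, which is exactly the extra pruning information soft local consistencies propagate.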
Machine learning has tremendously benefited from graphics processing units (GPUs) to accelerate training and inference by several orders of magnitude. However, this success has not been replicated in general and exact combinatorial optimization. Our key contribution is to propose a general-purpose discrete constraint programming solver fully implemented on GPU. It is based on integer interval bound propagation and backtracking search. The two main ingredients are (1) a ternary constraint network optimized for GPU architectures, and (2) an on-demand subproblem generation strategy. Our constraint solving algorithm is significantly simpler than those found in optimized CPU constraint solvers, yet is competitive with sequential solvers in the MiniZinc 2024 challenge.
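Interval bound propagation on a ternary constraint can be sketched for the single constraint x + y = z (our own sequential illustration of the principle; the solver itself runs such updates massively in parallel on GPU):

```python
def propagate_sum(x, y, z):
    """Fixpoint of interval bound propagation for x + y = z.

    Intervals are (lo, hi) pairs; each interval is shrunk from the
    other two until nothing changes.
    """
    while True:
        nx = (max(x[0], z[0] - y[1]), min(x[1], z[1] - y[0]))
        ny = (max(y[0], z[0] - x[1]), min(y[1], z[1] - x[0]))
        nz = (max(z[0], x[0] + y[0]), min(z[1], x[1] + y[1]))
        if (nx, ny, nz) == (x, y, z):
            return x, y, z
        x, y, z = nx, ny, nz
```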
Over the past few decades, combinatorial solvers have seen remarkable performance improvements, enabling their practical use in real-world applications. In some of these applications, ensuring the correctness of the solver's output is critical. However, the complexity of modern solvers makes them susceptible to bugs in their source code. In the domain of satisfiability checking (SAT), this issue has been addressed through proof logging, where the solver generates a formal proof of the correctness of its answer. For more expressive problems like MaxSAT, the optimization variant of SAT, proof logging had not seen a comparable breakthrough until recently. In this paper, we show how to achieve proof logging for state-of-the-art techniques in Branch-and-Bound MaxSAT solving. This includes certifying look-ahead methods used in such algorithms as well as advanced clausal encodings of pseudo-Boolean constraints based on so-called Multi-Valued Decision Diagrams (MDDs). We implement these ideas in MaxCDCL, the dominant branch-and-bound solver, and experimentally demonstrate that proof logging is feasible with limited overhead, while proof checking remains a challenge.