| Total: 17
This paper presents NgramMarkov, a variant of the Markov constraints. It is dedicated to text generation in constraint programming (CP). It involves a set of n-grams (i.e., sequence of n words) associated with probabilities given by a large language model (LLM). It limits the product of the probabilities of the n-gram of a sentence. The propagator of this constraint can be seen as an extension of the ElementaryMarkov constraint propagator, incorporating the LLM distribution instead of the maximum likelihood estimation of n-grams. It uses a gliding threshold, i.e., it rejects n-grams whose local probabilities are too low, to guarantee balanced solutions. It can also be combined with a "look-ahead" approach to remove n-grams that are very unlikely to lead to acceptable sentences for a fixed-length horizon. This idea is based on the MDDMarkovProcess constraint propagator, but without explicitly using an MDD (Multi-Valued Decision Diagram). The experimental results show that the generated text is valued in a similar way to the LLM perplexity function. Using this new constraint dramatically reduces the number of candidate sentences produced, improves computation times, and allows larger corpora or smaller n-grams to be used. A real-world problem has been solved for the first time using 4-grams instead of 5-grams.
This paper addresses the challenge of solution counting for Quantified Boolean Formulas (QBFs), a task distinct from the well-established model counting problem for SAT (\#SAT). Unlike SAT, where models are straightforward assignments to Boolean variables, QBF solution counting involves tree models that capture dependencies among variables within different quantifier blocks. We present a comprehensive top-down tree model counter capable of handling diverse satisfiable QBF formulas. Emphasizing the critical role of the branching heuristic, which must consider variables in the correct order according to quantification blocks, we further demonstrate the importance of addressing connected components, free variables, and caching. Experimental results indicate that our proposed approach for counting tree models of QBF formulas is highly efficient in practice, surpassing existing state-of-the-art methods designed for this specific purpose.
Knowledge compilation has proven effective in (weighted) model counting, uniquely supporting incrementality and checkability. For incrementality, compiling an input formula once suffices to answer multiple queries, thus reducing the total solving effort. For checkability, the compiled formula is amenable to producing machine-checkable proofs for verification, thus strengthening the solver’s reliability. In this work, we extend knowledge compilation from model counting to stochastic Boolean satisfiability (SSAT) solving by generalizing the dec-DNNF representation to accommodate the SSAT quantifier structure and integrate it into SharpSSAT, a state-of-the-art SSAT solver. We further study proof generation from the compiled representation and extend CPOG, a certified model-counting toolchain, to generate proofs for certifying the results of SharpSSAT. Experimental results show the benefits of the proposed knowledge compilation approach for SSAT in sharing computation efforts for multiple queries and producing checkable dec-DNNF logs with negligible overhead.
We introduce and study various models for satisfying electrical energy demands of prosumers in a microgrid, while optimizing their costs. Each prosumer has individual demands of electrical energy, which can vary day-by-day, and which they can satisfy by either generating electrical energy through a self-operated mini power plant like a solar panel, through buying from an external energy provider, such as the main grid or by trading with other prosumers. Our models take into account two key aspects motivated by real-life scenarios: first, we consider a daily volatility of prices for buying and selling the energy, and second, the possibility to store the self-generated energy in a battery of finite capacity to be either self-consumed or sold to other prosumers in the future. We provide a thorough complexity analysis, as well as efficient algorithms, so that prosumers can minimize their overall cost over the entire time horizon. As a byproduct, we also solve a new, generalized version of the KNAPSACK problem which may be of independent interest. We complement our theoretical findings by extensive experimental evaluations on realistic data sets.
Vertex Subset Problems (VSPs) are a class of combinatorial optimization problems on graphs where the goal is to find a subset of vertices satisfying a predefined condition. Two prominent approaches for solving VSPs are dynamic programming over tree-like structures, such as tree-decompositions or clique-decompositions, and linear programming. In this work, we establish a sharp connection between both approaches by showing that if a vertex-subset problem Pi admits a solution-preserving dynamic programming algorithm that produces tables of size at most alpha(k,n) when processing a tree decomposition of width at most k of an n-vertex graph G, then the polytope defined as the convex-hull of solutions of Pi in G has extension complexity at most O(alpha(k,n)*n). Additionally, this upper bound is optimal under the exponential time hypothesis (ETH). At the one hand, our results imply that ETH-optimal solution-preserving dynamic programming algorithms for combinatorial problems yield optimal-size parameterized extended formulations for the solution polytopes associated with instances of these problems. At the other hand, unconditional lower bounds obtained in the realm of the theory of extended formulations yield unconditional lower bounds on the table complexity of solution-preserving dynamic programming algorithms.
The quantified Boolean formula (QBF) problem is an important decision problem generally viewed as the archetype for PSPACE-completeness. Many problems of central interest in AI are in general not included in NP, e.g., planning, model checking, and non-monotonic reasoning, and for such problems QBF has successfully been used as a modelling tool. However, solvers for QBF are not as advanced as state of the art SAT solvers, which has prevented QBF from becoming a universal modelling language for PSPACE-complete problems. A theoretical explanation is that QBF (as well as many other PSPACE-complete problems) lacks natural parameters guaranteeing fixed-parameter tractability (FPT). In this paper we tackle this problem and consider a simple but overlooked parameter: the number of existentially quantified variables. This natural parameter is virtually unexplored in the literature which one might find surprising given the general scarcity of FPT algorithms for QBF. Via this parameterization we then develop a novel FPT algorithm applicable to QBF instances in conjunctive normal form (CNF) of bounded clause length. We complement this by a W[1]-hardness result for QBF in CNF of unbounded clause length as well as sharper lower bounds for the bounded arity case under the (strong) exponential-time hypothesis.
One of the fundamental results in quantum foundations is the Kochen–Specker (KS) theorem, which states that any theory whose predictions agree with quantum mechanics must be contextual, i.e., a quantum observation cannot be understood as revealing a pre-existing value. The theorem hinges on the existence of a mathematical object called a KS vector system. While many KS vector systems are known, the problem of finding the minimum KS vector system in three dimensions (3D) has remained stubbornly open for over 55 years. To address the minimum KS problem, we present a new verifiable proof-producing method based on a combination of a Boolean satisfiability (SAT) solver and a computer algebra system (CAS) that uses an isomorph-free orderly generation technique that is very effective in pruning away large parts of the search space. Our method shows that a KS system in 3D must contain at least 24 vectors. We show that our sequential and parallel Cube-and-Conquer (CnC) SAT+CAS methods are significantly faster than SAT-only, CAS-only, and a prior CAS-based method of Uijlen and Westerbaan. Further, while our parallel pipeline is somewhat slower than the parallel CnC version of the recently introduced Satisfiability Modulo Theories (SMS) method, this is in part due to the overhead of proof generation. Finally, we provide the first computer-verifiable proof certificate of a lower bound to the KS problem with a size of 40.3 TiB in order 23.
Modern SMT solvers, such as Z3, offer user-controllable strategies that allow solver users the ability to tailor solving strategies for their unique set of instances, thus dramatically enhancing the solver performance for their specific use cases. However, this approach of strategy customization presents a significant challenge: handcrafting an optimized strategy for a class of SMT instances remains a complex and demanding task for both solver developers and users alike. In this paper, we address this problem of automated SMT strategy synthesis via a novel Monte Carlo Tree Search (MCTS) based method. Our method treats strategy synthesis as a sequential decision-making process, whose search tree corresponds to the strategy space, and employs MCTS to navigate this vast search space. The key innovations that enable our method to identify effective strategies, while keeping costs low, are the ideas of layered and staged MCTS search. These novel heuristics allow for a deeper and more efficient exploration of the strategy space, enabling us to synthesize more effective strategies than the default ones in state-of-the-art (SOTA) SMT solvers. We implement our method, dubbed Z3alpha, as part of the Z3 SMT solver. Through extensive evaluations across six important SMT logics, Z3alpha demonstrates superior performance compared to the SOTA synthesis tool FastSMT, the default Z3 solver, and the CVC5 solver on most benchmarks. Remarkably, on a challenging QF_BV benchmark set, Z3alpha solves 42.7% more instances than the default strategy in Z3.
Most active constraint acquisition systems suffer from two weaknesses. They require the explicit generation of the set of potential constraints (the bias), whose size can be prohibitive for practical use of these systems, and the answers to queries contain little information. In this paper, we introduce ACQNOGOODS, an active learning schema that does not require the construction of a bias. We then propose LLMACQ, an active learning system that incorporates a Large Language Model component in the ACQNOGOODS schema. LLMACQ interprets the user’s answers given in natural language, leading to more informative communication. As our experiments show, the non requirement of a bias in extension combined to the higher level communication with the user allow LLMACQ to learn constraints of any arity and to dramatically decrease the number of queries.
Optimization of DR-submodular functions has experienced a notable surge in significance in recent times, marking a pivotal development within the domain of non-convex optimization. Motivated by real-world scenarios, some recent works have delved into the maximization of non-monotone DR-submodular functions over general (not necessarily down-closed) convex set constraints. Up to this point, these works have all used the minimum L-infinity norm of any feasible solution as a parameter. Unfortunately, a recent hardness result due to Mualem and Feldman shows that this approach cannot yield a smooth interpolation between down-closed and non-down-closed constraints. In this work, we suggest novel offline and online algorithms that provably provide such an interpolation based on a natural decomposition of the convex body constraint into two distinct convex bodies: a down-closed convex body and a general convex body. We also empirically demonstrate the superiority of our proposed algorithms across three offline and two online applications.
We study the following parameterization of the MaxSAT problem: Given a CNF formula F with m clauses, decide whether at least m/2 + μ clauses in F could be satisfied, where μ is the excess of the number of satisfied clauses over the trivial lower bound m/2 and is taken as the parameter. This perspective is known as the "above guarantee" parameterization. Since its introduction by Mahajan and Raman [1999], the analysis of parameterization above guarantee has become a highly active and fruitful line of research. In this paper, we develop a new algorithm with runtime O*(2.1479^μ), significantly improving the previous best upper bound O*(5.4064^μ) for this important problem. Here, the O* notation omits polynomial factors.
Metric Differential Privacy (mDP) extends the concept of Differential Privacy (DP) to serve as a new paradigm of data perturbation. It is designed to protect secret data represented in general metric space, such as text data encoded as word embeddings or geo-location data on the road network or grid maps. To derive an optimal data perturbation mechanism under mDP, a widely used method is linear programming (LP), which, however, might suffer from a polynomial explosion of decision variables, rendering it impractical in large-scale mDP. In this paper, our objective is to develop a new computation framework to enhance the scalability of the LP-based mDP. Considering the connections established by the mDP constraints among the secret records, we partition the original secret dataset into various subsets. Building upon the partition, we reformulate the LP problem for mDP and solve it via Benders Decomposition, which is composed of two stages: (1) a master program to manage the perturbation calculation across subsets, and (2) a set of subproblems, each managing the perturbation derivation within a subset. Our experimental results on multiple datasets, including geo-location data in the road network/grid maps, text data, and synthetic data, underscore our proposed mechanism’s superior scalability and efficiency.
Knowing that a function is convex ensures that any local minimum is also a global minimum. Here, we implement an approach to certify the convexity of twice-differentiable functions by certifying that their second-order derivative is positive semidefinite. Both the computation of the second-order derivative and the certification of positive semidefiniteness are done symbolically. Previous implementations of this approach assume that the function to be minimized takes scalar or vector inputs, meaning that the second-order derivative is at most a matrix. However, the input of many machine learning problems is naturally given in the form of matrices or higher order tensors, in which case the second-order derivative becomes a tensor of at least fourth order. The familiar linear algebra notations and known rules for determining whether a matrix is positive semidefinite are not sufficient to deal with these higher order expressions. Here, we present a formal language for tensor expressions that allows us to generalize semidefiniteness to higher-order tensors and thereby certify the convexity of a broader set of functions.
This work proposes an efficient parallel algorithm for non-monotone submodular maximization under a knapsack constraint problem over the ground set of size n. Our algorithm improves the best approximation factor of the existing parallel one from 8 to 7 with O(log n) adaptive complexity. The key idea of our approach is to create an alternate threshold algorithmic framework. This new strategy alternately constructs two disjoint candidate solutions within a constant number of sequence rounds. Then, the algorithm boosts solution quality without sacrificing the adaptive complexity. Extensive experimental studies on three applications, Revenue Maximization, Image Summarization, and Maximum Weighted Cut, show that our algorithm not only significantly increases solution quality but also requires comparative adaptivity to state-of-the-art algorithms.
The vehicle routing problem with two-dimensional loading constraints (2L-CVRP) and the last-in-first-out (LIFO) rule presents significant practical and algorithmic challenges. While numerous heuristic approaches have been proposed to address its complexity, stemming from two NP-hard problems: the vehicle routing problem (VRP) and the two-dimensional bin packing problem (2D-BPP), less attention has been paid to developing exact algorithms. Bridging this gap, this article presents an exact algorithm that integrates advanced machine learning techniques, specifically a novel combination of attention and recurrence mechanisms. This integration accelerates the state-of-the-art exact algorithm by a median of 29.79% across various problem instances. Moreover, the proposed algorithm successfully resolves an open instance in the standard test-bed, demonstrating significant improvements brought about by the incorporation of machine learning models. Code is available at https://github.com/xyfffff/NCG-for-2L-CVRP.
Numerous combinatorial optimization problems can be reduced to the optimal path problem over directed acyclic graphs (DAGs). The constrained version of the optimal path problem requires the solution to satisfy a given logical constraint. BDD-constrained search (BCS) is an efficient algorithm for the constrained optimal path problem over DAGs. This algorithm considers edges as variables and constraints as Boolean functions and maintains constraints via binary decision diagrams (BDDs), a compact form of Boolean functions. However, BCS involves redundant operations during the search process. To reduce these redundant operations, we use vertices instead of edges as variables and hence represent constraints as multi-valued functions. Due to the multi-valued representation of constraints, we propose a novel algorithm, namely MDD-constrained search (MCS), by using multi-valued decision diagrams (MDDs) instead of BDDs, an efficient representation of multi-valued functions. In addition, we improve MCS via domain reduction in multi-valued functions. Experimental results prove that our proposed algorithm outperforms BCS.
MaxSAT is an optimization version of the famous NP-complete Satisfiability problem (SAT). Algorithms for MaxSAT mainly include complete solvers and local search incomplete solvers. In many complete solvers, once a better solution is found, a Soft conflict Pseudo Boolean (SPB) constraint will be generated to enforce the algorithm to find better solutions. In many local search algorithms, clause weighting is a key technique for effectively guiding the search directions. In this paper, we propose to transfer the SPB constraint into the clause weighting system of the local search method, leading the algorithm to better solutions. We further propose an adaptive clause weighting strategy that breaks the tradition of using constant values to adjust clause weights. Based on the above methods, we propose a new local search algorithm called SPB-MaxSAT that provides new perspectives for clause weighting on MaxSAT local search solvers. Extensive experiments demonstrate the excellent performance of the proposed methods.