Structured Neighbourhood Search (SNS) is a framework for constraint-based local search for problems expressed in the Essence abstract constraint specification language. The local search explores a structured neighbourhood, where each state in the neighbourhood preserves a high-level structural feature of the problem. SNS derives highly structured, problem-specific neighbourhoods automatically and directly from the features of the Essence specification of the problem. Hence, neighbourhoods can represent important structural features of the problem, such as partitions of sets, even if that structure is obscured in the low-level input format required by a constraint solver. SNS expresses each neighbourhood as a constrained optimisation problem, which is solved with a constraint solver. We have implemented SNS, together with automatic generation of neighbourhoods for high-level structures, and report high-quality results for several optimisation problems.
For both historical and practical reasons, the Boolean satisfiability problem (SAT) has become one of central importance in computer science. One type of instance arises when the clauses are chosen uniformly at random: random SAT. Here, a major problem, recently solved for sufficiently large clause length, is the satisfiability threshold conjecture. The value of this threshold is known exactly only for clause length $2$, and there has been a lot of research concerning its value for arbitrary fixed clause length. In this paper, we study the satisfiability threshold for random industrial SAT. There is as yet no generally accepted model of industrial SAT, and we confine ourselves to one of the more common features of industrial SAT: the set of variables consists of a number of disjoint communities, and clauses tend to consist of variables from the same community. Our main result is that the threshold of random community-structured SAT tends to be smaller than its counterpart for random SAT. Moreover, under some conditions, this threshold even vanishes.
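To make the community-structured setting concrete, the sketch below generates instances under one plausible such model (our own illustrative choice of parameters and clause distribution, not necessarily the exact model analysed above): the variables are split into disjoint communities, and each clause is drawn from a single community with high probability.

```python
import random

def community_ksat(n_vars, n_clauses, k, n_communities, p_intra, seed=0):
    """Generate one plausible community-structured random k-SAT instance.

    Variables 1..n_vars are split into equal-sized communities. With
    probability p_intra a clause draws all its variables from a single
    random community; otherwise it draws them uniformly from all variables.
    (Illustrative model only, not the exact model studied in the paper.)
    """
    rng = random.Random(seed)
    all_vars = list(range(1, n_vars + 1))
    size = n_vars // n_communities
    communities = [all_vars[i * size:(i + 1) * size] for i in range(n_communities)]
    clauses = []
    for _ in range(n_clauses):
        pool = rng.choice(communities) if rng.random() < p_intra else all_vars
        chosen = rng.sample(pool, k)
        clauses.append([v if rng.random() < 0.5 else -v for v in chosen])
    return clauses

# Example: 100 variables, 5 communities, clauses mostly intra-community.
print(community_ksat(100, 10, 3, 5, p_intra=0.9)[:3])
```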
We study formalisms for temporal and spatial reasoning in the modern context of Constraint Satisfaction Problems (CSPs). We show how questions on the complexity of their subclasses can be settled using existing results via primitive positive (pp) interpretations and pp-homotopy. We demonstrate the methodology by giving a full complexity classification of all constraint languages that are first-order definable in Allen's Interval Algebra and contain the basic relations (s) and (f). In the case of the Rectangle Algebra, we answer in the affirmative the long-standing open question of whether ORD-Horn is a maximally tractable subset among the (disjunctive, binary) relations. We then generalise our results for the Rectangle Algebra to the r-dimensional Block Algebra.
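For reference, the two basic Interval Algebra relations named in the classification, (s) "starts" and (f) "finishes", can be stated directly on interval endpoints; the small check below only illustrates their standard definitions.

```python
from collections import namedtuple

Interval = namedtuple("Interval", "start end")  # assumes start < end

def starts(a, b):
    """Allen relation s: a starts b (same start, a ends strictly earlier)."""
    return a.start == b.start and a.end < b.end

def finishes(a, b):
    """Allen relation f: a finishes b (same end, a starts strictly later)."""
    return a.end == b.end and a.start > b.start

a, b = Interval(0, 3), Interval(0, 7)
print(starts(a, b), finishes(a, b))   # True False
```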
In many settings, just finding a good clustering is insufficient and an explanation of the clustering is required. If the features used to perform the clustering are interpretable, then methods such as conceptual clustering can be used. However, in many applications this is not the case, particularly for image, graph and other complex data. Here we explore the setting where a set of interpretable discrete tags for each instance is available. We formulate the descriptive clustering problem as a bi-objective optimization that simultaneously finds compact clusters using the features and describes them using the tags. We present our formulation in a declarative platform and show that it can be integrated into a standard iterative algorithm to find all Pareto optimal solutions to the two objectives. Preliminary results on real data sets of images and electronic health care records demonstrate the utility of our approach and show that it outperforms single-objective and multi-view clustering baselines.
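The bi-objective search itself can be pictured with a naive reference sketch: repeatedly minimise the first objective under a tightening bound on the second, keeping one representative per non-dominated objective vector. The declarative formulation above delegates each such step to a solver; the toy version below simply scans an explicit candidate list.

```python
def pareto_front(candidates, f1, f2):
    """Enumerate Pareto-optimal candidates for two minimised objectives by
    repeatedly minimising f1 under a tightening bound on f2.
    Naive reference sketch over an explicit candidate list only."""
    front, bound = [], float("inf")
    feasible = list(candidates)
    while True:
        feasible = [c for c in feasible if f2(c) < bound]
        if not feasible:
            return front
        best1 = min(f1(c) for c in feasible)
        # break ties on f1 by the best f2, so each stored point is non-dominated
        best = min((c for c in feasible if f1(c) == best1), key=f2)
        front.append(best)
        bound = f2(best)

# Toy example: points as (compactness, description_cost) pairs.
pts = [(1, 9), (2, 7), (3, 8), (4, 4), (6, 3), (7, 5)]
print(pareto_front(pts, f1=lambda p: p[0], f2=lambda p: p[1]))
# [(1, 9), (2, 7), (4, 4), (6, 3)]
```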
In this work we present two general techniques to deal with multi-stage optimization problems under uncertainty, featuring off-line and on-line decisions. The methods are applicable when: 1) the uncertainty is exogenous; 2) there exists a heuristic for the on-line phase that can be modeled as a parametric convex optimization problem. The first technique replaces the on-line heuristic with an anticipatory solver, obtained through a systematic procedure. The second technique makes the off-line solver aware of the on-line heuristic and capable of controlling its parameters so as to steer its behavior. We instantiate our approaches on two case studies: an energy management system with uncertain renewable generation and load demand, and a vehicle routing problem with uncertain travel times. We show how both techniques achieve high solution quality w.r.t. an oracle operating under perfect information, while offering different trade-offs in terms of computation time.
The problem of integrating heterogeneous data sources into an ontology is highly relevant in the database field. Several techniques exist to approach the problem, but side constraints on the data cannot be easily implemented, and thus the results may be inconsistent. In this paper we improve previous work by Taheriyan et al. [2016a] by using Machine Learning (ML) to take into account inconsistencies in the data (unmatchable attributes) and by encoding the problem as a variation of the Steiner Tree problem, for which we use work by De Uña et al. [2016] in Constraint Programming (CP). Combining ML and CP achieves state-of-the-art precision, recall and speed, and provides a more flexible framework for variations of the problem.
Recently, a number of algorithmic results have appeared which show the tractability of Integer Linear Programming (ILP) instances under strong restrictions on variable domains and/or coefficients (AAAI 2016, AAAI 2017, IJCAI 2017). In this paper, we target ILPs where neither the variable domains nor the coefficients are restricted by a fixed constant or parameter; instead, we only require that our instances can be encoded in unary. We provide new algorithms and lower bounds for such ILPs by exploiting the structure of their variable interactions, represented as a graph. Our first set of results focuses on solving ILP instances through the use of a graph parameter called clique-width, which can be seen as an extension of treewidth that also captures well-structured dense graphs. In particular, we obtain a polynomial-time algorithm for instances of bounded clique-width whose domain and coefficients are polynomially bounded by the input size, and we complement this positive result with a number of algorithmic lower bounds. Afterwards, we turn our attention to ILPs with acyclic variable interactions. In this setting, we obtain a complexity map for the problem with respect to the graph representation used and restrictions on the encoding.
The last 20 years have seen dramatic improvements in the performance of algorithms for Boolean satisfiability---so-called SAT solvers---and today conflict-driven clause learning (CDCL) solvers are routinely used in a wide range of application areas. One serious shortcoming of CDCL, however, is that the underlying method of reasoning is quite weak. A tantalizing solution is to instead use stronger pseudo-Boolean (PB) reasoning, but so far the promise of exponential gains in performance has failed to materialize: the increased theoretical strength seems hard to harness algorithmically, and in many applications CDCL-based methods are still superior. We propose a modified approach to pseudo-Boolean solving based on division instead of the saturation rule used in [Chai and Kuehlmann '05] and other PB solvers. In addition to resulting in a stronger conflict analysis, this also improves performance by keeping integer coefficient sizes down, and yields a very competitive solver, as shown by the results of the Pseudo-Boolean Competitions 2015 and 2016.
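To make the contrast with saturation concrete, here is a small generic cutting-planes example of our own (not taken from the paper). Starting from the pseudo-Boolean constraint $3x_1 + 3x_2 + 2x_3 \ge 5$ over 0/1 literals, saturation (which only caps coefficients at the degree $5$) changes nothing, whereas division by $3$ with rounding up derives

```latex
\left\lceil \tfrac{3}{3} \right\rceil x_1
  + \left\lceil \tfrac{3}{3} \right\rceil x_2
  + \left\lceil \tfrac{2}{3} \right\rceil x_3
  \;\ge\; \left\lceil \tfrac{5}{3} \right\rceil ,
\quad\text{i.e.}\quad
x_1 + x_2 + x_3 \;\ge\; 2 ,
```

a valid cardinality constraint with strictly smaller coefficients.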
Over the last decades, Boolean satisfiability (SAT) solvers based on conflict-driven clause learning (CDCL) have developed to the point where they can handle formulas with millions of variables. Yet a deeper understanding of how these solvers can be so successful has remained elusive. In this work we shed light on CDCL performance by using theoretical benchmarks, which have the attractive features of being a) scalable, b) extremal with respect to different proof search parameters, and c) theoretically easy in the sense of having short proofs in the resolution proof system underlying CDCL. This allows for a systematic study of solver heuristics and how efficiently they search for proofs. We report results from extensive experiments on a wide range of benchmarks. Our findings include several examples where theory predicts and explains CDCL behaviour, but also raise a number of intriguing questions for further study.
The enumeration of all Maximal Satisfiable Subsets (MSSes) or all Minimal Correction Subsets (MCSes) of an unsatisfiable CNF Boolean formula is a useful and sometimes necessary step for solving a variety of important problems in AI. Although the number of different MCSes of a CNF Boolean formula is exponential in the worst case, it remains low in many practical situations, which makes attempting a full enumeration feasible in these cases. In this paper, a technique is introduced that boosts the currently most efficient practical approaches to enumerating MCSes. It implements a model rotation paradigm that allows the set of MCSes to be computed in a heuristically efficient way.
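For readers unfamiliar with the terminology, a correction subset is a set of clauses whose removal restores satisfiability, and an MCS is one that is minimal for inclusion. The exponential brute-force sketch below only serves as an executable reference definition of what is being enumerated; it is unrelated to the boosting technique of the paper.

```python
from itertools import combinations, product

def satisfiable(clauses, variables):
    """Brute-force SAT check for tiny CNF formulas (clauses = lists of ints)."""
    for bits in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, bits))
        if all(any(model[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return not clauses

def enumerate_mcses(clauses):
    """Enumerate all Minimal Correction Subsets of an unsatisfiable formula.
    A subset C of clause indices is an MCS iff removing C restores
    satisfiability and no proper subset of C does. Exponential reference
    implementation only."""
    variables = sorted({abs(l) for c in clauses for l in c})
    idx = range(len(clauses))
    mcses = []
    for size in range(1, len(clauses) + 1):
        for cand in combinations(idx, size):
            if any(set(m) <= set(cand) for m in mcses):
                continue  # a subset is already a correction set: not minimal
            rest = [clauses[i] for i in idx if i not in cand]
            if satisfiable(rest, variables):
                mcses.append(cand)
    return mcses

# Clauses: x, not x, y, (x or not y); clause indices 0 and 1 clash.
print(enumerate_mcses([[1], [-1], [2], [1, -2]]))
# [(1,), (0, 2), (0, 3)]
```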
The maximum clique and minimum vertex cover problems are among Karp's 21 NP-complete problems, and have numerous applications: in combinatorial auctions, for computing phylogenetic trees, to predict the structure of proteins, to analyse social networks, and so forth. Currently, the best complete methods are branch & bound algorithms and rely largely on graph colouring to compute a bound. We introduce a new approach based on SAT and on the "Conflict-Driven Clause Learning" (CDCL) algorithm. We propose an efficient implementation of Babel's bound and pruning rule, as well as a novel dominance rule. Moreover, we show how to compute concise explanations for this inference. Our experimental results show that this approach is competitive and often outperforms the state of the art for finding cliques of maximum weight.
Recent work on Conditional Simple Temporal Networks (CSTNs) has focused on checking the dynamic consistency (DC) property assuming that execution strategies can react instantaneously to observations. Three alternative semantics---IR-DC, 0-DC, and π-DC---have been presented. The most practical DC-checking algorithm for CSTNs has only been analyzed with respect to the IR-DC semantics, while the 0-DC semantics was shown to have a serious flaw that the π-DC semantics fixed. Whether the IR-DC semantics had the same flaw and, if so, what the consequences would be for the DC-checking algorithm remained open questions. This paper (1) shows that the IR-DC semantics is also flawed; (2) shows that one of the constraint-propagation rules from the IR-DC-checking algorithm is not sound with respect to the IR-DC semantics; (3) presents a simpler algorithm, called the π-DC-checking algorithm; (4) proves that it is sound and complete with respect to the π-DC semantics; and (5) empirically evaluates the new algorithm.
We present and evaluate DMC, a distributed model counter for propositional CNF formulae based on the state-of-the-art sequential model counter D4. DMC can take advantage of a (possibly large) number of sequential model counters running on (possibly heterogeneous) computing units spread over a network of computers. To ensure an efficient workload distribution, the model counting task is shared among the model counters following a policy close to work stealing. The number and sizes of the messages exchanged by the jobs are kept small. The results show that DMC is a much more efficient counter than D4, with the distribution of the computation yielding large improvements on some benchmarks. DMC also appears to be a serious challenger to the parallel model counter CountAntom and to the distributed model counter dCountAntom.
Stochastic Boolean satisfiability (SSAT) is an expressive language for formulating decision problems with randomness. Solving SSAT formulas has the same PSPACE-complete computational complexity as solving quantified Boolean formulas (QBFs). Despite its broad applications and profound theoretical value, SSAT has received relatively little attention compared to QBF. In this paper, we focus on exist-random quantified SSAT formulas, also known as E-MAJSAT, which is a special fragment of SSAT commonly applied in probabilistic conformant planning, maximum a posteriori hypothesis, and maximum expected utility. Based on clause selection, a recently proposed QBF technique, we develop an algorithm to solve E-MAJSAT. Moreover, our method can provide an approximate solution to E-MAJSAT with a lower bound when an exact answer is too expensive to compute. Experiments show that the proposed algorithm achieves significant performance gains and memory savings over state-of-the-art SSAT solvers on a number of benchmark formulas, and provides useful lower bounds in cases where prior methods fail to compute exact answers.
Partial MaxSAT (PMS) generalizes SAT and MaxSAT by introducing hard clauses and soft clauses. PMS and Weighted PMS (WPMS) have many important real-world applications. Local search is one popular method for solving (W)PMS. Recent studies on specialized local search for (W)PMS have led to significant improvements, but such specialized algorithms are complicated by concepts tailored to hard and soft clauses. In this work, we propose a dynamic local search algorithm that exploits the structure of (W)PMS through a carefully designed clause-weighting scheme. Our solver SATLike adopts a local search framework for SAT and does not need any specialized concept for (W)PMS. Experiments on PMS and WPMS benchmarks from the MaxSAT Evaluations (MSE) 2016 and 2017 show that SATLike significantly outperforms state-of-the-art local search solvers. SATLike also significantly narrows the gap between the performance of local search solvers and complete solvers on industrial benchmarks, and performs better than the complete solvers on the MSE 2017 benchmarks.
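As a rough illustration of the kind of framework SATLike builds on, the sketch below shows a generic dynamic clause-weighting local search step for plain SAT. It is our own illustration, not SATLike's actual weighting scheme, and it ignores the hard/soft distinction and soft-clause weights of (W)PMS.

```python
import random

def weighted_local_search(clauses, n_vars, max_flips=100000, seed=0):
    """Generic dynamic clause-weighting local search for SAT (illustration
    only). Clauses are lists of ints; returns a model dict or None."""
    rng = random.Random(seed)
    assign = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
    weights = [1] * len(clauses)

    def sat(c):
        return any(assign[abs(l)] == (l > 0) for l in c)

    def falsified():
        return [i for i, c in enumerate(clauses) if not sat(c)]

    def score(v):
        # weighted gain of flipping v: newly satisfied minus newly falsified
        assign[v] = not assign[v]
        after = {i: sat(c) for i, c in enumerate(clauses) if v in map(abs, c)}
        assign[v] = not assign[v]
        return sum(weights[i] * (ok_after - sat(clauses[i]))
                   for i, ok_after in after.items())

    for _ in range(max_flips):
        false_cls = falsified()
        if not false_cls:
            return assign
        candidates = {abs(l) for l in clauses[rng.choice(false_cls)]}
        best = max(candidates, key=score)
        if score(best) > 0:
            assign[best] = not assign[best]
        else:
            for i in false_cls:          # stuck: bump weights of falsified clauses
                weights[i] += 1
            assign[rng.choice(list(candidates))] ^= True  # random walk step
    return None

print(weighted_local_search([[1, 2], [-1, 2], [1, -2], [-1, -2, 3]], 3))
```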
A set of constraints is unsatisfiable if there is no solution that satisfies all of them. To analyse an unsatisfiable problem, the user needs to understand where the inconsistencies come from and how they can be repaired. Minimal unsatisfiable cores and correction sets are important subsets of constraints that enable such analysis. In this work, we propose a new algorithm for extracting minimal unsatisfiable cores and correction sets simultaneously. Building on top of the relaxation and strengthening framework, we introduce novel techniques for extracting these sets. Our new solver significantly outperforms several state-of-the-art algorithms on common benchmarks when it comes to extracting correction sets, and compares favorably on core extraction.
Explanations of machine learning (ML) predictions are of fundamental importance in different settings. Moreover, explanations should be succinct, to enable easy understanding by humans. Decision trees represent an often-used approach for developing explainable ML models, motivated by the natural mapping between decision tree paths and rules. Clearly, smaller trees correlate well with smaller rules, and so one challenge is to devise solutions for computing smallest-size decision trees given training data. Although simple to formulate, computing smallest-size decision trees turns out to be an extremely challenging problem, for which no practical solutions are known. This paper develops a SAT-based model for computing smallest-size decision trees given training data. In sharp contrast with past work, the proposed SAT model is shown to scale to publicly available datasets of practical interest.
In this work, we present a variant of DCA (Difference of Convex functions Algorithm) aimed at improving its convergence speed. The proposed algorithm, named Accelerated DCA (ADCA), consists of incorporating Nesterov's acceleration technique into DCA. We first investigate ADCA for solving the standard DC program and rigorously study its convergence properties and convergence rate. Secondly, we develop ADCA for a special case of the standard DC program whose objective function is the sum of a differentiable (possibly nonconvex) function with L-Lipschitz gradient and a nonsmooth DC function. We exploit the special structure of the problem to propose an efficient DC decomposition for which the corresponding ADCA scheme is inexpensive. As an application, we consider the sparse binary logistic regression problem. Numerical experiments on several benchmark datasets illustrate the efficiency of our algorithm and its superiority over well-known methods.
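For orientation, one common way to combine a DCA step with Nesterov-style extrapolation for a DC program $f = g - h$ with $g, h$ convex is sketched below. This is our own sketch in our own notation; the exact extrapolation and safeguards used in ADCA may differ.

```latex
% Sketch of a DCA step with Nesterov-style extrapolation for f = g - h.
t_{k+1} = \frac{1 + \sqrt{1 + 4 t_k^2}}{2}, \qquad
w^k = x^k + \frac{t_k - 1}{t_{k+1}} \left( x^k - x^{k-1} \right),
\qquad
y^k \in \partial h(w^k), \qquad
x^{k+1} \in \operatorname*{arg\,min}_x \left\{ \, g(x) - \langle y^k, x \rangle \, \right\}.
```

The extrapolated point $w^k$ replaces $x^k$ in the usual DCA subgradient step; a safeguard (e.g. falling back to $x^k$ when the objective would not decrease) is typically added to retain the convergence guarantees of plain DCA.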
New constraint-based algorithms have recently been proposed to solve Multi-Objective Combinatorial Optimization (MOCO) problems. These new methods are based on Minimal Correction Subsets (MCSs) or P-minimal models and have been shown to be successful at solving MOCO instances when the constraint set is hard to satisfy. However, if the constraints are easy to satisfy, constraint-based tools usually do not perform as well as stochastic methods. For solving such instances, algorithms should focus on dealing with the objective functions. This paper proposes the integration of stratification techniques into constraint-based algorithms for MOCO. It also shows how to diversify the stratification among the several objective criteria in order to better approximate the Pareto front of MOCO problems. An extensive experimental evaluation on publicly available MOCO instances shows that the new algorithm is competitive with stochastic methods and much more effective than existing constraint-based methods.
Multi-Valued Decision Diagrams (MDDs) are instrumental in modeling combinatorial problems with Constraint Programming. In this paper, we propose a related data structure called sMDD (semi-MDD) where the central layer of the diagrams is non-deterministic. We show that it is easy and efficient to transform any table (set of tuples) into an sMDD. We also introduce a new filtering algorithm, called Compact-MDD, which is based on bitwise operations, and can be applied to both MDDs and sMDDs. Our experimental results show the practical interest of our approach, both in terms of compression and filtering speed.
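The bitwise idea can be illustrated on a plain table constraint (a Compact-Table-style sketch of our own; Compact-MDD applies the same bit-parallel reasoning to the layers of an (s)MDD): each (variable, value) pair owns a bitmask over tuple indices, the currently valid tuples are obtained by AND-ing per-variable OR-masks, and a value is pruned when its mask no longer intersects the valid set.

```python
def bitwise_filter(table, domains):
    """Bitwise (Compact-Table style) filtering of a plain table constraint.
    'table' is a list of tuples, 'domains' a list of sets of values."""
    n = len(domains)
    # mask[x][v] has bit t set iff tuple t assigns value v to variable x
    mask = [{} for _ in range(n)]
    for t, tup in enumerate(table):
        for x, v in enumerate(tup):
            mask[x][v] = mask[x].setdefault(v, 0) | (1 << t)
    # tuples still valid = those compatible with every current domain
    valid = (1 << len(table)) - 1
    for x in range(n):
        allowed = 0
        for v in domains[x]:
            allowed |= mask[x].get(v, 0)
        valid &= allowed
    # prune values with no supporting valid tuple
    return [{v for v in domains[x] if mask[x].get(v, 0) & valid}
            for x in range(n)]

table = [(0, 0, 1), (0, 1, 1), (1, 1, 0)]
print(bitwise_filter(table, [{0, 1}, {1}, {0, 1}]))
```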
Constraint propagation during backtrack search significantly improves the performance of solving a Constraint Satisfaction Problem. While Generalized Arc Consistency (GAC) is the most popular level of propagation, higher-level consistencies (HLCs) are needed to solve difficult instances. Deciding when to enforce an HLC instead of GAC remains a topic of active research. We propose a simple and effective strategy that reactively triggers an HLC by monitoring search performance: when search starts thrashing, we trigger an HLC, then conservatively revert to GAC. We detect thrashing by counting the number of backtracks at each level of the search tree, and we geometrically adjust the frequency of triggering the HLC based on its filtering effectiveness. We validate our approach on benchmark problems using Partition-One Arc-Consistency as the HLC. However, our strategy is generic and can be used with other higher-level consistency algorithms.
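A minimal sketch of such a reactive trigger could look as follows; the parameter names and the simplified adaptation rule are our own assumptions, not the paper's exact policy.

```python
class HLCTrigger:
    """Sketch of a reactive higher-level-consistency trigger. Call
    on_backtrack(depth) from the search loop; when the backtrack count at a
    depth reaches the current threshold, run the HLC once, then call
    after_hlc() to adapt the threshold geometrically based on how much the
    HLC actually filtered."""

    def __init__(self, threshold=10, factor=2.0):
        self.threshold = threshold
        self.factor = factor
        self.backtracks = {}          # depth -> backtracks seen at that depth

    def on_backtrack(self, depth):
        self.backtracks[depth] = self.backtracks.get(depth, 0) + 1
        return self.backtracks[depth] >= self.threshold   # trigger the HLC?

    def after_hlc(self, removed_values, depth):
        # effective HLC call: trigger more eagerly; useless call: back off
        if removed_values > 0:
            self.threshold = max(1, self.threshold / self.factor)
        else:
            self.threshold *= self.factor
        self.backtracks[depth] = 0
```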
The alldifferent constraint is an essential ingredient of most Constraint Satisfaction Problems (CSPs). It is well known that enforcing generalized arc consistency (GAC) on alldifferent constraints can be reduced to the maximum matching problem in a value graph. The redundant edges, which do not appear in any maximum matching of the value graph, can and should be removed from the graph. Existing methods identify these redundant edges by computing the strongly connected components after finding a maximum matching of the graph. Here, we present a novel theorem for the identification of redundant edges. We show that some of the redundant edges can be detected immediately after finding a maximum matching. Based on this theoretical result, we present an efficient algorithm for processing alldifferent constraints. Experimental results on real problems show that our new algorithm significantly outperforms state-of-the-art approaches.
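As background, an edge of the value graph is redundant exactly when it belongs to no maximum matching. The naive reference check below makes that criterion executable; the SCC-based methods mentioned above, and the immediate detection rule contributed by the paper, exist precisely to avoid this per-edge recomputation.

```python
def max_matching(adj):
    """Kuhn's augmenting-path algorithm. adj maps each variable to its
    candidate values; returns a dict value -> matched variable."""
    match = {}
    def try_assign(x, seen):
        for v in adj[x]:
            if v in seen:
                continue
            seen.add(v)
            if v not in match or try_assign(match[v], seen):
                match[v] = x
                return True
        return False
    for x in adj:
        try_assign(x, set())
    return match

def redundant_edges(adj):
    """Edges (x, v) of the value graph that belong to no maximum matching.
    Naive reference check: give v to x and test whether the remaining
    variables still admit a matching of size n - 1."""
    n = len(adj)
    assert len(max_matching(adj)) == n, "alldifferent is unsatisfiable"
    redundant = []
    for x, vals in adj.items():
        for v in vals:
            rest = {y: [w for w in ws if w != v]
                    for y, ws in adj.items() if y != x}
            if len(max_matching(rest)) < n - 1:
                redundant.append((x, v))
    return redundant

# x1, x2 range over {a, b} and x3 over {a, b, c}: x3 must take c,
# so the edges (x3, a) and (x3, b) appear in no maximum matching.
adj = {"x1": ["a", "b"], "x2": ["a", "b"], "x3": ["a", "b", "c"]}
print(redundant_edges(adj))   # [('x3', 'a'), ('x3', 'b')]
```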