Total: 19

This paper presents Athanor, a novel local search solver that operates on abstract constraint specifications of combinatorial problems in the Essence language. It is unique in that it operates directly on the high-level, nested types in Essence, such as set of partitions or multiset of sequences, without refining such types into low-level representations. This approach has two main advantages. First, the structure present in the high-level types allows high-quality neighbourhoods for local search to be automatically derived. Second, it allows Athanor to scale much better than solvers that operate on the equivalent, but much larger, low-level representations. The paper details how Athanor operates, covering incremental evaluation, dynamic unrolling of quantified expressions and neighbourhood construction. A series of case studies shows the performance of Athanor, benchmarked against several local search solvers on a range of problem classes.

Frequent itemset mining is one of the most studied tasks in knowledge discovery. It is often reduced to mining the positive border of frequent itemsets, i.e. maximal frequent itemsets. Infrequent itemset mining, on the other hand, can be reduced to mining the negative border, i.e. minimal infrequent itemsets. We propose a generic framework based on constraint programming to mine both borders of frequent itemsets. One can easily decide which border to mine by setting a simple parameter. For this, we introduce two new global constraints, FREQUENTSUBS and INFREQUENTSUPERS, with complete polynomial propagators. We then consider the problem of mining borders with additional constraints. We prove that this problem is coNP-hard, ruling out the hope for the existence of a single CSP solving this problem (unless coNP ⊆ NP).
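
The two borders can be made concrete with a brute-force sketch. It is not the constraint-programming framework of the paper and does not model FREQUENTSUBS or INFREQUENTSUPERS; the function name `borders`, the toy transactions and the support threshold are assumptions chosen for illustration.

```python
from itertools import combinations

def borders(transactions, items, min_support):
    """Brute-force both borders of the frequent itemsets (exponential; illustration only)."""
    def support(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t)

    all_sets = [frozenset(c) for r in range(len(items) + 1)
                for c in combinations(sorted(items), r)]
    frequent = {s for s in all_sets if support(s) >= min_support}
    infrequent = [s for s in all_sets if s not in frequent]

    # Positive border: frequent itemsets with no frequent proper superset.
    positive = {s for s in frequent if not any(s < f for f in frequent)}
    # Negative border: infrequent itemsets whose immediate subsets are all frequent
    # (by anti-monotonicity of support, all their proper subsets are then frequent).
    negative = {s for s in infrequent if all(s - {i} in frequent for i in s)}
    return positive, negative

# Hypothetical toy database over the items a, b, c, d.
transactions = [frozenset("abc"), frozenset("ab"), frozenset("ac"), frozenset("d")]
positive, negative = borders(transactions, "abcd", min_support=2)
print(sorted("".join(sorted(s)) for s in positive))  # maximal frequent itemsets
print(sorted("".join(sorted(s)) for s in negative))  # minimal infrequent itemsets
```

On this toy database the maximal frequent itemsets are {a, b} and {a, c}, and the minimal infrequent itemsets are {d} and {b, c}.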

Sampling-based anticipatory algorithms can be very effective at solving online optimization problems under uncertainty, but their computational cost may be prohibitive in some cases. Given an arbitrary anticipatory algorithm, we present three methods that retain its solution quality at a fraction of the online computational cost, via a substantial degree of offline preparation. Our approaches are obtained by combining: 1) a simple technique to identify likely future outcomes based on past observations; 2) the (expensive) offline computation of a "contingency table"; and 3) an efficient solution-fixing heuristic. We ground our techniques on two case studies: an energy management system with uncertain renewable generation and load demand, and a traveling salesman problem with uncertain travel times. In both cases, our techniques achieve high solution quality, while substantially reducing the online computation time.

We study the predict+optimise problem, where machine learning and combinatorial optimisation must interact to achieve a common goal. These problems are important when optimisation needs to be performed on input parameters that are not fully observed but must instead be estimated using machine learning. Our contributions are two-fold: 1) we provide theoretical insight into the properties and computational complexity of predict+optimise problems in general, and 2) we develop a novel framework that, in contrast to related work, is guaranteed to compute the optimal parameters for a linear learning function given any ranking optimisation problem. We illustrate the applicability of our framework for the particular case of the unit-weighted knapsack predict+optimise problem and evaluate it on benchmarks from the literature.
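
A small sketch may help show why the unit-weighted knapsack is a ranking problem: with unit weights and capacity k, the optimum is simply the k items of highest value, so only the order induced by the predictions matters. The regret measure used below (true value of the optimal choice minus true value of the choice made from predictions) is an assumption for illustration, as are all names and numbers; this is not the framework of the paper.

```python
def top_k(values, k):
    """Indices of the k largest values: the optimum of a unit-weight knapsack of capacity k."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]

def regret(true_values, predicted_values, k):
    """True value lost by optimising over predictions instead of the true values."""
    chosen = top_k(predicted_values, k)
    best = top_k(true_values, k)
    return sum(true_values[i] for i in best) - sum(true_values[i] for i in chosen)

true_values = [10, 7, 3, 1]
good_ranking = [100, 50, 2, 1]  # numerically far off, but orders the items correctly
bad_ranking = [9, 2, 3, 1]      # numerically closer, but swaps items 1 and 2

print(regret(true_values, good_ranking, k=2))  # 0
print(regret(true_values, bad_ranking, k=2))   # 4
```

The second prediction has the smaller numerical error yet the larger regret, which is the kind of mismatch between prediction accuracy and decision quality that motivates studying prediction and optimisation together.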

The paper studies how to release data about a critical infrastructure network (e.g., a power network or a transportation network) without disclosing sensitive information that can be exploited by malevolent agents, while preserving the realism of the network. It proposes a novel obfuscation mechanism that combines several privacy-preserving building blocks with a bi-level optimization model to significantly improve accuracy. The obfuscation is evaluated for both realism and privacy properties on real energy and transportation networks. Experimental results show the obfuscation mechanism substantially reduces the potential damage of an attack exploiting the released data to harm the real network.

Modal logic S5 has found various applications in artificial intelligence. With the advances in modern SAT solvers, the SAT-based approach has shown great potential in solving the satisfiability problem of S5. The scale of the SAT encoding for S5 is strongly influenced by the upper bound on the number of possible worlds. In this paper, we present a novel SAT-based approach for the S5 satisfiability problem. We show a normal form for S5 formulas. Based on this normal form, a conflict graph can be derived whose chromatic number provides an upper bound on the number of possible worlds, and much unnecessary search space can be eliminated in this process. A heuristic graph coloring algorithm is adopted to balance efficiency and optimality. The number of possible worlds can be significantly reduced for many practical instances. Extensive experiments demonstrate that our approach outperforms state-of-the-art S5-SAT solvers.
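
The trade-off between efficiency and optimality can be illustrated with a generic greedy coloring. Any proper coloring uses at least as many colors as the chromatic number, so the color count of a heuristic coloring is a valid, possibly looser, bound that is cheap to compute. The sketch below is a standard largest-degree-first heuristic, not necessarily the one used in the paper, and the conflict graph is hypothetical: the abstract does not say how the graph is built from the normal form.

```python
def greedy_coloring(graph):
    """Color vertices in descending-degree order; returns {vertex: color index}."""
    order = sorted(graph, key=lambda v: len(graph[v]), reverse=True)
    color = {}
    for v in order:
        used = {color[u] for u in graph[v] if u in color}
        c = 0
        while c in used:  # smallest color not used by an already colored neighbour
            c += 1
        color[v] = c
    return color

# Hypothetical conflict graph: an edge joins two vertices that cannot share a color.
conflicts = {
    "d1": {"d2", "d3"},
    "d2": {"d1", "d3"},
    "d3": {"d1", "d2", "d4"},
    "d4": {"d3"},
}
coloring = greedy_coloring(conflicts)
bound = max(coloring.values()) + 1
print(bound)  # 3: the triangle d1, d2, d3 forces three colors
```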

The paper proposes Maximum Residue (MR) as a notion to evaluate the strength of a symmetry breaking method. We give a proof to improve the best known DoubleLex MR upper bound from m!n! - (m!+n!) to min(m!,n!) for an m × n matrix model. Our result implies that DoubleLex works well on matrix models where min(m, n) is relatively small. We further study the MR bounds of SwapNext and SwapAny, which are extensions to DoubleLex that additionally break a small number of composition symmetries. Such theoretical comparisons suggest general principles on selecting Lex-based symmetry breaking methods based on the dimensions of the matrix models. Our experiments confirm the theoretical predictions as well as the efficiency of these methods.
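
As a rough illustration of why DoubleLex is incomplete, the sketch below enumerates all row and column permutations of a small matrix and keeps the variants whose rows and columns are both in non-decreasing lexicographic order. The function names and the example matrix are assumptions, and the count it prints is only the kind of quantity MR is concerned with; the paper's exact definition of MR is not reproduced here.

```python
from itertools import permutations

def is_double_lex(matrix):
    """True if both the rows and the columns are in non-decreasing lexicographic order."""
    rows = [list(r) for r in matrix]
    cols = [list(c) for c in zip(*matrix)]
    rows_ok = all(rows[i] <= rows[i + 1] for i in range(len(rows) - 1))
    cols_ok = all(cols[j] <= cols[j + 1] for j in range(len(cols) - 1))
    return rows_ok and cols_ok

def double_lex_survivors(matrix):
    """Distinct row/column permutations of the matrix that satisfy DoubleLex (brute force)."""
    m, n = len(matrix), len(matrix[0])
    orbit = set()
    for rp in permutations(range(m)):
        for cp in permutations(range(n)):
            orbit.add(tuple(tuple(matrix[r][c] for c in cp) for r in rp))
    return sorted(x for x in orbit if is_double_lex(x))

example = [[0, 1],
           [0, 1],
           [1, 0]]
survivors = double_lex_survivors(example)
print(len(survivors))  # 2
for s in survivors:
    print(s)
```

For this 3 × 2 example two symmetric variants survive, so DoubleLex does not reduce the symmetry class to a single representative.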

Existing automated testing frameworks require multiple observations to be jointly diagnosed with the purpose of identifying common fault locations. This is the case for example with continuous integration tools. This paper shows that existing solutions fail to compute the set of minimal diagnoses, and as a result run times can increase by orders of magnitude. The paper proposes not only solutions to correct existing algorithms, but also conditions for improving their run times. Nevertheless, the diagnosis of multiple observations raises a number of important computational challenges, which even the corrected algorithms are often unable to cope with. As a result, the paper devises a novel algorithm for diagnosing multiple observations, which is shown to enable significant performance improvements in practice.

The Bouchitté-Todinca algorithm (BT), operating dynamic programming over the so-called potential maximal cliques (PMCs), yields a practically efficient approach to treewidth and generalized hypertreewidth. The enumeration of PMCs is a scalability bottleneck for BT in practice. We propose the use of declarative solvers for PMC enumeration as a substitute for the specialized PMC enumeration algorithms employed in current BT implementations. The presented Boolean satisfiability (SAT) and answer set programming (ASP) based PMC enumeration approaches open up new possibilities for improving the efficiency of BT in practice.

Low-rank methods for semi-definite programming (SDP) have gained a lot of interest recently, especially in machine learning applications. Their analysis often involves determinant-based or Schatten-norm penalties, which are difficult to implement in practice due to their high computational cost. In this paper, we propose Entropy-Penalized Semi-Definite Programming (EP-SDP), which provides a unified framework for a broad class of penalty functions used in practice to promote a low-rank solution. We show that EP-SDP problems admit an efficient numerical algorithm in which the gradient computation has (almost) linear time complexity; this makes it useful for many machine learning and optimization problems. We illustrate the practical efficiency of our approach on several combinatorial optimization and machine learning problems.

Integer programming (IP) is widely used within operations research to model and solve complex combinatorial problems such as personnel rostering and assignment problems. Modelling such problems is difficult for non-experts and expensive when hiring domain experts to perform the modelling. For many tasks, however, examples of working solutions are readily available. We propose ARNOLD, an approach that partially automates the modelling step by learning an integer program from example solutions. Contrary to existing alternatives, ARNOLD natively handles multi-dimensional quantities and non-linear operations, which are at the core of IP problems, and it only requires examples of feasible solutions. The main challenge is to efficiently explore the space of possible programs. Our approach pairs a general-to-specific traversal strategy with a nested lexicographic ordering in order to prune large portions of the space of candidate constraints while avoiding visiting the same candidate multiple times. Our empirical evaluation shows that ARNOLD can acquire models for a number of realistic benchmark problems.

A number of data mining problems on probabilistic networks can be modeled as Stochastic Constraint Optimization and Satisfaction Problems, i.e., problems that involve objectives or constraints with a stochastic component. Earlier methods for solving these problems used Ordered Binary Decision Diagrams (OBDDs) to represent constraints on probability distributions, which were decomposed into sets of smaller constraints and solved by Constraint Programming (CP) or Mixed Integer Programming (MIP) solvers. For the specific case of monotonic distributions, we propose an alternative method: a new propagator for a global OBDD-based constraint. We show that this propagator is (sub-)linear in the size of the OBDD, and maintains domain consistency. We experimentally evaluate the effectiveness of this global constraint in comparison to existing decomposition-based approaches, and show how this propagator can be used in combination with another data mining specific constraint present in CP systems. As test cases we use problems from the data mining literature.

Constraint optimization problems (COPs) on finite domains are typically solved via search. Many problems (e.g., 0-1 knapsack) involve redundant search, making a general constraint solver revisit the same subproblems again and again. Existing approaches use caching, symmetry breaking, subproblem dominance, or search with decomposition to prune the search space of constraint problems. In this paper we present a different approach, DPSolver, which uses dynamic programming (DP) to efficiently solve certain types of COPs. Given a COP modeled with MiniZinc, DPSolver first analyzes the model to decide whether the problem is efficiently solvable with DP. If so, DPSolver refactors the constraints and objective functions to model the problem as a DP problem. Finally, DPSolver feeds the refactored model to Gecode, a widely used constraint solver, to obtain the optimal solution. Our evaluation shows that DPSolver significantly improves the performance of constraint solving.
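
The redundancy in 0-1 knapsack can be seen in a plain memoised recursion: the subproblem "best value from item i onwards with this much capacity left" is reached along many different search paths but needs solving only once. The sketch below is ordinary Python dynamic programming, not DPSolver's refactoring of a MiniZinc model; the function name and the instance are assumptions.

```python
from functools import lru_cache

def knapsack(weights, values, capacity):
    """Optimal value of a 0-1 knapsack via memoised recursion over (item index, capacity left)."""
    @lru_cache(maxsize=None)
    def best(i, remaining):
        if i == len(weights):
            return 0
        skip = best(i + 1, remaining)  # leave item i out
        if weights[i] > remaining:
            return skip                # item i does not fit
        take = values[i] + best(i + 1, remaining - weights[i])
        return max(skip, take)

    return best(0, capacity)

# Hypothetical instance: items 0 and 1 together fill the capacity exactly.
print(knapsack(weights=[2, 3, 4, 5], values=[3, 4, 5, 6], capacity=5))  # 7
```

The cache bounds the work by the number of distinct (item, capacity) pairs rather than by the number of subsets of items.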

In this paper, we consider scheduling problems involving resources that must perform complex setup operations between the tasks they carry out. To deal with such problems, we introduce a simple yet efficient iterative two-layer decision process that alternates between the fast synthesis of high-level schedules based on a coarse-grain model of setup operations, and the production of detailed schedules based on a fine-grain model. Experiments on representative benchmarks of a multi-robot application show the efficiency of the approach.

The runtime performance of modern SAT solvers is deeply connected to the phase transition behavior of CNF formulas. While CNF solving has witnessed significant runtime improvement over the past two decades, the same does not hold for several other classes such as the conjunction of cardinality and XOR constraints, denoted as CARD-XOR formulas. The problem of determining satisfiability of CARD-XOR formulas is a fundamental problem with a wide variety of applications ranging from discrete integration in the field of artificial intelligence to maximum likelihood decoding in coding theory. The runtime behavior of random CARD-XOR formulas is unexplored in prior work. In this paper, we present the first rigorous empirical study to characterize the runtime behavior of 1-CARD-XOR formulas. We show empirical evidence of a surprising phase transition that follows a non-linear tradeoff between CARD and XOR constraints.

Given a Boolean formula F, the problem of model counting, also referred to as #SAT, seeks to compute the number of solutions of F. Model counting is a fundamental problem with a wide variety of applications ranging from planning and quantified information flow to probabilistic reasoning. Modern #SAT solvers tend to be based on static decomposition, dynamic decomposition, or a hybrid of the two. Although dynamic decomposition-based #SAT solvers share much of their architecture with SAT solvers, their core design and heuristics have remained constant for over a decade. In this paper, we revisit the architecture of the state-of-the-art dynamic decomposition-based #SAT tool, sharpSAT, and demonstrate that introducing a new notion of probabilistic component caching and the usage of universal hashing for exact model counting, along with the development of several new heuristics, can lead to significant performance improvement over state-of-the-art model counters. In particular, we develop GANAK, a new scalable probabilistic exact model counter that outperforms the state-of-the-art exact and approximate model counters sharpSAT and ApproxMC3, respectively, both in terms of PAR-2 score and the number of instances solved. Furthermore, in our experiments, the model count returned by GANAK was equal to the exact model count for all the benchmarks. Finally, we observe that recently proposed preprocessing techniques for model counting benefit exact model counters while hurting the performance of approximate model counters.
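
For readers new to #SAT, a brute-force counter makes the problem statement and the idea behind decomposition concrete: when a formula splits into components that share no variables, its model count is the product of the component counts. The sketch below is exponential and is in no way the architecture of sharpSAT or GANAK; clauses are lists of signed integers in the DIMACS style, and the function name is an assumption.

```python
from itertools import product

def count_models(num_vars, clauses):
    """Count satisfying assignments of a CNF formula by exhaustive enumeration."""
    count = 0
    for bits in product([False, True], repeat=num_vars):
        # Literal l is true when variable |l| has the polarity given by the sign of l.
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause) for clause in clauses):
            count += 1
    return count

# (x1 or x2) and (not x1 or x3)
print(count_models(3, [[1, 2], [-1, 3]]))  # 4

# Two components on disjoint variables: (x1 or x2) and (x3 or x4).
whole = count_models(4, [[1, 2], [3, 4]])
parts = count_models(2, [[1, 2]]) * count_models(2, [[1, 2]])
print(whole, parts)  # 9 9
```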

We unify search-based and compilation-based approaches to multi-agent path finding (MAPF) through satisfiability modulo theories (SMT). The task in MAPF is to navigate agents in an undirected graph to given goal vertices so that they do not collide. We rephrase Conflict-Based Search (CBS), one of the state-of-the-art algorithms for optimal MAPF solving, in terms of SMT. This idea combines SAT-based solving known from MDD-SAT, a SAT-based optimal MAPF solver, at the low level with the conflict elimination of CBS at the high level. Where standard CBS branches the search after a conflict, we refine the propositional model with a disjunctive constraint. Our novel algorithm, called SMT-CBS, hence does not branch at the high level but incrementally extends the propositional model. We experimentally compare SMT-CBS with CBS, ICBS, and MDD-SAT.
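
Both CBS and the SMT rephrasing rely on detecting a conflict between the current paths of two agents before resolving it (by branching in CBS, by adding a constraint in SMT-CBS). The sketch below shows such a check under assumed collision rules: two agents may not occupy the same vertex at the same time step, and may not swap positions along an edge. The exact rules used in the paper are not given in the abstract, and the function name and paths are hypothetical.

```python
def first_conflict(path_a, path_b):
    """Return ('vertex', t, v) or ('swap', t, (u, v)) for the earliest conflict, else None."""
    horizon = max(len(path_a), len(path_b))

    def at(path, t):
        # Assumption: an agent that has finished its path waits at its last vertex.
        return path[min(t, len(path) - 1)]

    for t in range(horizon):
        a, b = at(path_a, t), at(path_b, t)
        if a == b:
            return ("vertex", t, a)
        if t + 1 < horizon:
            a_next, b_next = at(path_a, t + 1), at(path_b, t + 1)
            if a_next == b and b_next == a:
                return ("swap", t, (a, a_next))
    return None

print(first_conflict(["A", "B", "C"], ["C", "B", "A"]))  # ('vertex', 1, 'B')
print(first_conflict(["A", "B"], ["B", "A"]))            # ('swap', 0, ('A', 'B'))
print(first_conflict(["A", "B"], ["C", "D"]))            # None
```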

Constraint-based reasoning methods thrive in solving problem instances with a tight solution space. On the other hand, evolutionary algorithms are usually effective when it is not hard to satisfy the problem constraints. This dichotomy has been observed in many optimization problems. In the particular case of Multi-Objective Combinatorial Optimization (MOCO), recently proposed constraint-based algorithms have been shown to outperform more established evolutionary approaches when a given problem instance is hard to satisfy. In this paper, we propose the integration of constraint-based procedures in evolutionary algorithms for solving MOCO. First, a new core-based smart mutation operator is applied to individuals that do not satisfy all problem constraints. Additionally, a new smart improvement operator based on Minimal Correction Subsets is used to improve the quality of the population. Experimental results clearly show that the integration of these operators greatly improves the multi-objective evolutionary algorithms MOEA/D and NSGAII. Moreover, even on problem instances with a tight solution space, the newly proposed algorithms outperform the state-of-the-art constraint-based approaches for MOCO.

We study the Maximum Satisfiability problem (MaxSAT). In particular, we derive a branching algorithm of running time O*(1.2989^m) for the MaxSAT problem, where m denotes the number of clauses in the given CNF formula. Our algorithm considerably improves the previous best result O*(1.3248^m) by Chen and Kanj [2004] published 15 years ago. For our purpose, we derive improved branching strategies for variables of degrees 3, 4, and 5. The worst case of our branching algorithm arises at variables of degree 4 that occur twice positively and twice negatively in the given CNF formula. To serve the branching rules and shrink the size of the CNF formula, we also propose a variety of reduction rules which can be exhaustively applied in polynomial time and, moreover, some of them solve a bottleneck of the previous best algorithm.
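
To fix the terminology, the sketch below computes the MaxSAT optimum of a tiny formula by exhaustive enumeration and counts how often each variable occurs positively and negatively. It assumes that the degree of a variable is its total number of occurrences, which matches the abstract's description of a degree-4 variable occurring twice with each polarity. It is not the branching algorithm of the paper; function names and the formula are hypothetical.

```python
from collections import Counter
from itertools import product

def max_sat(num_vars, clauses):
    """Maximum number of simultaneously satisfied clauses (exhaustive, exponential)."""
    best = 0
    for bits in product([False, True], repeat=num_vars):
        satisfied = sum(
            any(bits[abs(l) - 1] == (l > 0) for l in clause) for clause in clauses
        )
        best = max(best, satisfied)
    return best

def occurrences(clauses):
    """Positive and negative occurrence counts per variable."""
    pos, neg = Counter(), Counter()
    for clause in clauses:
        for l in clause:
            (pos if l > 0 else neg)[abs(l)] += 1
    return pos, neg

# x1 occurs twice positively and twice negatively, so it has degree 4.
clauses = [[1], [-1], [1, 2], [-1, -2]]
pos, neg = occurrences(clauses)
print(pos[1], neg[1])       # 2 2
print(max_sat(2, clauses))  # 3: the unit clauses x1 and not x1 cannot both hold
```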