Total: 33

In recent years, automated data-driven decision-making systems have enjoyed a tremendous success in a variety of fields (e.g., to make product recommendations, or to guide the production of entertainment). More recently, these algorithms are increasingly being used to assist socially sensitive decisionmaking (e.g., to decide who to admit into a degree program or to prioritize individuals for public housing). Yet, these automated tools may result in discriminative decision-making in the sense that they may treat individuals unfairly or unequally based on membership to a category or a minority, resulting in disparate treatment or disparate impact and violating both moral and ethical standards. This may happen when the training dataset is itself biased (e.g., if individuals belonging to a particular group have historically been discriminated upon). However, it may also happen when the training dataset is unbiased, if the errors made by the system affect individuals belonging to a category or minority differently (e.g., if misclassification rates for Blacks are higher than for Whites). In this paper, we unify the definitions of unfairness across classification and regression. We propose a versatile mixed-integer optimization framework for learning optimal and fair decision trees and variants thereof to prevent disparate treatment and/or disparate impact as appropriate. This translates to a flexible schema for designing fair and interpretable policies suitable for socially sensitive decision-making. We conduct extensive computational studies that show that our framework improves the state-of-the-art in the field (which typically relies on heuristics) to yield non-discriminative decisions at lower cost to overall accuracy.

We propose a simple and general online method to measure the search progress within the Branch-and-Bound algorithm, from which we estimate the size of the remaining search tree. We then show how this information can help solvers algorithmically at runtime by designing a restart strategy for MixedInteger Programming (MIP) solvers that decides whether to restart the search based on the current estimate of the number of remaining nodes in the tree. We refer to this type of algorithm as clairvoyant. Our clairvoyant restart strategy outperforms a state-of-the-art solver on a large set of publicly available MIP benchmark instances. It is implemented in the MIP solver SCIP and will be available in future releases.

We enumerate all circulant good matrices with odd orders divisible by 3 up to order 70. As a consequence of this we find a previously overlooked set of good matrices of order 27 and a new set of good matrices of order 57. We also find that circulant good matrices do not exist in the orders 51, 63, and 69, thereby finding three new counterexamples to the conjecture that such matrices exist in all odd orders. Additionally, we prove a new relationship between the entries of good matrices and exploit this relationship in our enumeration algorithm. Our method applies the SAT+CAS paradigm of combining computer algebra functionality with modern SAT solvers to efficiently search large spaces which are specified by both algebraic and logical constraints.

Finding tight bounds on the optimal solution is a critical element of practical solution methods for discrete optimization problems. In the last decade, decision diagrams (DDs) have brought a new perspective on obtaining upper and lower bounds that can be significantly better than classical bounding mechanisms, such as linear relaxations. It is well known that the quality of the bounds achieved through this flexible bounding method is highly reliant on the ordering of variables chosen for building the diagram, and finding an ordering that optimizes standard metrics is an NP-hard problem. In this paper, we propose an innovative and generic approach based on deep reinforcement learning for obtaining an ordering for tightening the bounds obtained with relaxed and restricted DDs. We apply the approach to both the Maximum Independent Set Problem and the Maximum Cut Problem. Experimental results on synthetic instances show that the deep reinforcement learning approach, by achieving tighter objective function bounds, generally outperforms ordering methods commonly used in the literature when the distribution of instances is known. To the best knowledge of the authors, this is the first paper to apply machine learning to directly improve relaxation bounds obtained by general-purpose bounding mechanisms for combinatorial optimization problems.

Currently, detecting and isolating faults in hybrid systems is often done manually with the help of human operators. In this paper we present a novel model-based diagnosis approach for automatically diagnosing hybrid systems. The approach has two parts: First, modelling dynamic system behaviour is done through well-known state space models using differential equations. Second, from the state space models we calculate Boolean residuals through an observer-pattern. The novelty lies in implementing the observer pattern through the use of a symbolic system description specified in satisfiability theory modulo linear arithmetic. With this, we create a static situation for the diagnosis algorithm and decouple modelling and diagnosis. Evaluating the system description generates one Boolean residual for each component. These residuals constitute the fault symptoms. To find the minimum cardinality diagnosis from these symptoms we employ Reiter’s diagnosis lattice. For the experimental evaluation we use a simulation of the Tennessee Eastman process and a simulation of a four-tank model. We show that the presented approach is able to identify all injected faults.

In real-world, many problems can be formulated as the alignment between two geometric patterns. Previously, a great amount of research focus on the alignment of 2D or 3D patterns, especially in the field of computer vision. Recently, the alignment of geometric patterns in high dimension finds several novel applications, and has attracted more and more attentions. However, the research is still rather limited in terms of algorithms. To the best of our knowledge, most existing approaches for high dimensional alignment are just simple extensions of their counterparts for 2D and 3D cases, and often suffer from the issues such as high complexities. In this paper, we propose an effective framework to compress the high dimensional geometric patterns and approximately preserve the alignment quality. As a consequence, existing alignment approach can be applied to the compressed geometric patterns and thus the time complexity is significantly reduced. Our idea is inspired by the observation that high dimensional data often has a low intrinsic dimension. We adopt the widely used notion “doubling dimension” to measure the extents of our compression and the resulting approximation. Finally, we test our method on both random and real datasets; the experimental results reveal that running the alignment algorithm on compressed patterns can achieve similar qualities, comparing with the results on the original patterns, but the running times (including the times cost for compression) are substantially lower.

Robust principal component analysis (RPCA) is a well-studied problem whose goal is to decompose a matrix into the sum of low-rank and sparse components. In this paper, we propose a nonconvex feasibility reformulation of RPCA problem and apply an alternating projection method to solve it. To the best of our knowledge, this is the first paper proposing a method that solves RPCA problem without considering any objective function, convex relaxation, or surrogate convex constraints. We demonstrate through extensive numerical experiments on a variety of applications, including shadow removal, background estimation, face detection, and galaxy evolution, that our approach matches and often significantly outperforms current state-of-the-art in various ways.

We study the parameterized complexity of Integer Quadratic Programming under two kinds of restrictions: explicit restrictions on the domain or coefficients, and structural restrictions on variable interactions. We argue that both kinds of restrictions are necessary to achieve tractability for Integer Quadratic Programming, and obtain four new algorithms for the problem that are tuned to possible explicit restrictions of instances that we may wish to solve. The presented algorithms are exact, deterministic, and complemented by appropriate lower bounds.

We formulate a new stochastic submodular maximization problem by introducing the performance-dependent costs of items. In this problem, we consider selecting items for the case where the performance of each item (i.e., how much an item contributes to the objective function) is decided randomly, and the cost of an item depends on its performance. The goal of the problem is to maximize the objective function subject to a budget constraint on the costs of the selected items. We present an adaptive algorithm for this problem with a theoretical guaran-√ tee that its expected objective value is at least (1−1/ 4e)/2 times the maximum value attained by any adaptive algorithms. We verify the performance of the algorithm through numerical experiments.

Constraint-based sequential pattern mining aims at identifying frequent patterns on a sequential database of items while observing constraints defined over the item attributes. We introduce novel techniques for constraint-based sequential pattern mining that rely on a multi-valued decision diagram (MDD) representation of the database. Specifically, our representation can accommodate multiple item attributes and various constraint types, including a number of non-monotone constraints. To evaluate the applicability of our approach, we develop an MDD-based prefix-projection algorithm and compare its performance against a typical generate-and-check variant, as well as a state-of-the-art constraint-based sequential pattern mining algorithm. Results show that our approach is competitive with or superior to these other methods in terms of scalability and efficiency.

Proximal gradient method has been playing an important role to solve many machine learning tasks, especially for the nonsmooth problems. However, in some machine learning problems such as the bandit model and the black-box learning problem, proximal gradient method could fail because the explicit gradients of these problems are difficult or infeasible to obtain. The gradient-free (zeroth-order) method can address these problems because only the objective function values are required in the optimization. Recently, the first zeroth-order proximal stochastic algorithm was proposed to solve the nonconvex nonsmooth problems. However, its convergence rate is O(1/√T) for the nonconvex problems, which is significantly slower than the best convergence rate O(T1) of the zerothorder stochastic algorithm, where T is the iteration number. To fill this gap, in the paper, we propose a class of faster zeroth-order proximal stochastic methods with the variance reduction techniques of SVRG and SAGA, which are denoted as ZO-ProxSVRG and ZO-ProxSAGA, respectively. In theoretical analysis, we address the main challenge that an unbiased estimate of the true gradient does not hold in the zerothorder case, which was required in previous theoretical analysis of both SVRG and SAGA. Moreover, we prove that both ZO-ProxSVRG and ZO-ProxSAGA algorithms have O(T1) convergence rates. Finally, the experimental results verify that our algorithms have a faster convergence rate than the existing zeroth-order proximal stochastic algorithm.

The growing range of applications of Machine Learning (ML) in a multitude of settings motivates the ability of computing small explanations for predictions made. Small explanations are generally accepted as easier for human decision makers to understand. Most earlier work on computing explanations is based on heuristic approaches, providing no guarantees of quality, in terms of how close such solutions are from cardinality- or subset-minimal explanations. This paper develops a constraint-agnostic solution for computing explanations for any ML model. The proposed solution exploits abductive reasoning, and imposes the requirement that the ML model can be represented as sets of constraints using some target constraint reasoning system for which the decision problem can be answered with some oracle. The experimental results, obtained on well-known datasets, validate the scalability of the proposed approach as well as the quality of the computed solutions.

Steiner tree is a classical NP-hard problem that has been extensively studied both theoretically and empirically. In theory, the fastest approach for inputs with a small number of terminals uses the dynamic programming, but in practice, stateof-the-art solvers are based on the branch-and-cut method. In this paper, we present a novel separator-based pruning technique for speeding up a theoretically fast DP algorithm. Our empirical evaluation shows that our pruned DP algorithm is quite effective against real-world instances admitting small separators, scales to more than a hundred terminals, and is competitive with a branch-and-cut solver.

Nonconvex and nonsmooth problems have recently attracted considerable attention in machine learning. However, developing efficient methods for the nonconvex and nonsmooth optimization problems with certain performance guarantee remains a challenge. Proximal coordinate descent (PCD) has been widely used for solving optimization problems, but the knowledge of PCD methods in the nonconvex setting is very limited. On the other hand, the asynchronous proximal coordinate descent (APCD) recently have received much attention in order to solve large-scale problems. However, the accelerated variants of APCD algorithms are rarely studied. In this paper, we extend APCD method to the accelerated algorithm (AAPCD) for nonsmooth and nonconvex problems that satisfies the sufficient descent property, by comparing between the function values at proximal update and a linear extrapolated point using a delay-aware momentum value. To the best of our knowledge, we are the first to provide stochastic and deterministic accelerated extension of APCD algorithms for general nonconvex and nonsmooth problems ensuring that for both bounded delays and unbounded delays every limit point is a critical point. By leveraging Kurdyka-Łojasiewicz property, we will show linear and sublinear convergence rates for the deterministic AAPCD with bounded delays. Numerical results demonstrate the practical efficiency of our algorithm in speed.

We present a recursive algorithm for projected model counting, i.e., the problem consisting in determining the number of models k∃X.Σk of a propositional formula Σ after eliminating from it a given set X of variables. Based on a ”standard” model counter, our algorithm projMC takes advantage of a disjunctive decomposition scheme of ∃X.Σ for computing k∃X.Σk. It also looks for disjoint components in its input for improving the computation. Our experiments show that in many cases projMC is significantly more efficient than the previous algorithms for projected model counting from the literature.

In this paper, we propose a class of robust stochastic subgradient methods for distributed learning from heterogeneous datasets at presence of an unknown number of Byzantine workers. The Byzantine workers, during the learning process, may send arbitrary incorrect messages to the master due to data corruptions, communication failures or malicious attacks, and consequently bias the learned model. The key to the proposed methods is a regularization term incorporated with the objective function so as to robustify the learning task and mitigate the negative effects of Byzantine attacks. The resultant subgradient-based algorithms are termed Byzantine-Robust Stochastic Aggregation methods, justifying our acronym RSA used henceforth. In contrast to most of the existing algorithms, RSA does not rely on the assumption that the data are independent and identically distributed (i.i.d.) on the workers, and hence fits for a wider class of applications. Theoretically, we show that: i) RSA converges to a near-optimal solution with the learning error dependent on the number of Byzantine workers; ii) the convergence rate of RSA under Byzantine attacks is the same as that of the stochastic gradient descent method, which is free of Byzantine attacks. Numerically, experiments on real dataset corroborate the competitive performance of RSA and a complexity reduction compared to the state-of-the-art alternatives.

We focus on empirical risk minimization with a composite regulariser, which has been widely applied in various machine learning tasks to introduce important structural information regarding the problem or data. In general, it is challenging to calculate the proximal operator with the composite regulariser. Recently, proximal average (PA) which involves a feasible proximal operator calculation is proposed to approximate composite regularisers. Augmented with the prevailing variance reducing (VR) stochastic methods (e.g. SVRG, SAGA), PA based algorithms would achieve a better performance. However, existing works require a fixed stepsize, which needs to be rather small to ensure that the PA approximation is sufficiently accurate. In the meantime, the smaller stepsize would incur many more iterations for convergence. In this paper, we propose two fast PA based VR stochastic methods – APA-SVRG and APA-SAGA. By initializing the stepsize with a much larger value and adaptively decreasing it, both of the proposed methods are proved to enjoy the (ô n log 1/ε + mo 1/ε) iteration complexity to achieve the accurate solutions, where m0 is the initial number of inner iterations and n is the number of samples. Moreover, experimental results demonstrate the superiority of the proposed algorithms.

Exploiting parallelism is becoming more and more important in designing efficient solvers for computationally hard problems. However, manually building parallel solvers typically requires considerable domain knowledge and plenty of human effort. As an alternative, automatic construction of parallel portfolios (ACPP) aims at automatically building effective parallel portfolios based on a given problem instance set and a given rich configuration space. One promising way to solve the ACPP problem is to explicitly group the instances into different subsets and promote a component solver to handle each of them. This paper investigates solving ACPP from this perspective, and especially studies how to obtain a good instance grouping. The experimental results on two widely studied problem domains, the boolean satisfiability problems (SAT) and the traveling salesman problems (TSP), showed that the parallel portfolios constructed by the proposed method could achieve consistently superior performances to the ones constructed by the state-of-the-art ACPP methods, and could even rival sophisticated hand-designed parallel solvers.

In this paper, we study the semidefinite affine rank feasibility problem, which consists in finding a positive semidefinite matrix of a given rank from its linear measurements. We consider the semidefinite programming relaxations of the problem with different objective functions and study their properties. In particular, we propose an analytical bound on the number of relaxations that are sufficient to solve in order to obtain a solution of a generic instance of the semidefinite affine rank feasibility problem or prove that there is no solution. This is followed by a heuristic algorithm based on semidefinite relaxation and an experimental proof of its performance on a large sample of synthetic data.

We revisit the Frank-Wolfe (FW) optimization under strongly convex constraint sets. We provide a faster convergence rate for FW without line search, showing that a previously overlooked variant of FW is indeed faster than the standard variant. With line search, we show that FW can converge to the global optimum, even for smooth functions that are not convex, but are quasi-convex and locally-Lipschitz. We also show that, for the general case of (smooth) non-convex functions, FW with line search converges with high probability to a stationary point at a rate of O(1/t), as long as the constraint set is strongly convex—one of the fastest convergence rates in non-convex optimization.

Dependency quantified Boolean formulas (DQBFs) are a powerful formalism, which subsumes quantified Boolean formulas (QBFs) and allows an explicit specification of dependencies of existential variables on universal variables. This enables a succinct encoding of decision problems in the NEXPTIME complexity class. As solving general DQBFs is NEXPTIME complete, in contrast to the PSPACE completeness of QBF solving, characterizing DQBF subclasses of lower computational complexity allows their effective solving and is of practical importance. Recently a DQBF proof calculus based on a notion of fork extension, in addition to resolution and universal reduction, was proposed by Rabe in 2017. We show that this calculus is in fact incomplete for general DQBFs, but complete for a subclass of DQBFs, where any two existential variables have either identical or disjoint dependency sets over the universal variables. We further characterize this DQBF subclass to be ΣP3 complete in the polynomial time hierarchy. Essentially using fork extension, a DQBF in this subclass can be converted to an equisatisfiable 3QBF with only a linear increase in formula size. We exploit this conversion for effective solving of this DQBF subclass and point out its potential as a general strategy for DQBF quantifier localization. Experimental results show that the method outperforms state-of-the-art DQBF solvers on a number of benchmarks, including the 2018 DQBF evaluation benchmarks.

Given a Boolean formula φ, the problem of model counting, also referred to as #SAT is to compute the number of solutions of φ. Model counting is a fundamental problem in artificial intelligence with a wide range of applications including probabilistic reasoning, decision making under uncertainty, quantified information flow, and the like. Motivated by the success of SAT solvers, there has been surge of interest in the design of hashing-based techniques for approximate model counting for the past decade. We profiled the state of the art approximate model counter ApproxMC2 and observed that over 99.99% of time is consumed by the underlying SAT solver, CryptoMiniSat. This observation motivated us to ask: Can we design an efficient underlying CNF-XOR SAT solver that can take advantage of the structure of hashing-based algorithms and would this lead to an efficient approximate model counter? The primary contribution of this paper is an affirmative answer to the above question. We present a novel architecture, called BIRD, to handle CNF-XOR formulas arising from hashingbased techniques. The resulting hashing-based approximate model counter, called ApproxMC3, employs the BIRD framework in its underlying SAT solver, CryptoMiniSat. To the best of our knowledge, we conducted the most comprehensive study of evaluation performance of counting algorithms involving 1896 benchmarks with computational effort totaling 86400 computational hours. Our experimental evaluation demonstrates significant runtime performance improvement for ApproxMC3 over ApproxMC2. In particular, we solve 648 benchmarks more than ApproxMC2, the state of the art approximate model counter and for all the formulas where both ApproxMC2 and ApproxMC3 did not timeout and took more than 1 seconds, the mean speedup is 284.40 – more than two orders of magnitude. Erratum: This research is supported in part by the National Research Foundation Singapore under its AI Singapore Programme (Award Number: [AISG-RP-2018-005])

In this paper, we study a problem from the realm of multicriteria decision making in which the goal is to select from a given set S of d-dimensional objects a minimum sized subset S0 with bounded regret. Thereby, regret measures the unhappiness of users which would like to select their favorite object from set S but now can only select their favorite object from the subset S0. Previous work focused on bounding the maximum regret which is determined by the most unhappy user. We propose to consider the average regret instead which is determined by the sum of (un)happiness of all possible users. We show that this regret measure comes with desirable properties as supermodularity which allows to construct approximation algorithms. Furthermore, we introduce the regret minimizing permutation problem and discuss extensions of our algorithms to the recently proposed k-regret measure. Our theoretical results are accompanied with experiments on a variety of inputs with d up to 7.

Current Maximum Satisfiability (MaxSAT) algorithms based on successive calls to a powerful Satisfiability (SAT) solver are now able to solve real-world instances in many application domains. Moreover, replacing the SAT solver with a Satisfiability Modulo Theories (SMT) solver enables effective MaxSMT algorithms. However, MaxSMT has seldom been used in debugging multi-threaded software. Multi-threaded programs are usually non-deterministic due to the huge number of possible thread operation schedules, which makes them much harder to debug than sequential programs. A recent approach to isolate the root cause of concurrency bugs in multi-threaded software is to produce a report that shows the differences between a failing and a non-failing execution. However, since they rely solely on heuristics, these reports can be unnecessarily large. Hence, reports may contain operations that are not relevant to the bug’s occurrence. This paper proposes the use of MaxSMT for the generation of minimal reports for multi-threaded software with concurrency bugs. The proposed techniques report situations that the existing techniques are not able to identify. Experimental results show that using MaxSMT can significantly improve the accuracy of the generated reports and, consequently, their usefulness in debugging the root cause of concurrency bugs.

Real world experiments are expensive, and thus it is important to reach a target in a minimum number of experiments. Experimental processes often involve control variables that change over time. Such problems can be formulated as functional optimisation problem. We develop a novel Bayesian optimisation framework for such functional optimisation of expensive black-box processes. We represent the control function using Bernstein polynomial basis and optimise in the coefficient space. We derive the theory and practice required to dynamically adjust the order of the polynomial degree, and show how prior information about shape can be integrated. We demonstrate the effectiveness of our approach for short polymer fibre design and optimising learning rate schedules for deep networks.