IJCAI.2020 - Uncertainty in AI | Cool Papers

#1 Lifted Hybrid Variational Inference

Authors: Yuqiao Chen ; Yibo Yang ; Sriraam Natarajan ; Nicholas Ruozzi

Lifted inference algorithms exploit model symmetry to reduce computational cost in probabilistic inference. However, most existing lifted inference algorithms operate only over discrete domains or continuous domains with restricted potential functions. We investigate two approximate lifted variational approaches that apply to domains with general hybrid potentials, and are expressive enough to capture multi-modality. We demonstrate that the proposed variational methods are highly scalable and can exploit approximate model symmetries even in the presence of a large amount of continuous evidence, outperforming existing message-passing-based approaches in a variety of settings. Additionally, we present a sufficient condition for the Bethe variational approximation to yield a non-trivial estimate over the marginal polytope.

#2 Learning Bayesian Networks Under Sparsity Constraints: A Parameterized Complexity Analysis

Authors: Niels Grüttemeier ; Christian Komusiewicz

We study the problem of learning the structure of an optimal Bayesian network when additional structural constraints are imposed on the network or on its moralized graph. More precisely, we consider the constraint that the moralized graph can be transformed to a graph from a sparse graph class Π by at most k vertex deletions. We show that for Π being the graphs with maximum degree 1, an optimal network can be computed in polynomial time when k is constant, extending previous work that gave an algorithm with such a running time for Π being the class of edgeless graphs [Korhonen & Parviainen, NIPS 2015]. We then show that further extensions or improvements are presumably impossible. For example, we show that when Π is the set of graphs in which each component has size at most three, then learning an optimal network is NP-hard even if k=0. Finally, we show that learning an optimal network with at most k edges in the moralized graph is presumably not fixed-parameter tractable with respect to k and that, in contrast, computing an optimal network with at most k arcs is fixed-parameter tractable in k.
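
As a minimal illustration of the constraint described above (not the authors' algorithm), the following sketch builds the moralized graph of a small DAG and finds, by brute force, how many vertex deletions are needed to reach maximum degree 1. The function names and the example network are hypothetical, and the exhaustive search is only practical for toy inputs.

```python
from itertools import combinations


def moralize(parents):
    """Return the undirected edge set of the moralized graph of a DAG.

    `parents` maps each node to the set of its parents.
    """
    edges = set()
    for child, pa in parents.items():
        # Keep every arc, dropping its direction.
        for p in pa:
            edges.add(frozenset((p, child)))
        # "Marry" every pair of parents that share this child.
        for p, q in combinations(sorted(pa), 2):
            edges.add(frozenset((p, q)))
    return edges


def max_degree(edges):
    """Largest number of edges incident to any single vertex."""
    degree = {}
    for edge in edges:
        for v in edge:
            degree[v] = degree.get(v, 0) + 1
    return max(degree.values(), default=0)


def min_deletions_to_max_degree(nodes, edges, d):
    """Smallest k such that deleting k vertices leaves maximum degree <= d."""
    nodes = sorted(nodes)
    for k in range(len(nodes) + 1):
        for removed in combinations(nodes, k):
            removed = set(removed)
            kept = [e for e in edges if not (e & removed)]
            if max_degree(kept) <= d:
                return k


# Hypothetical v-structure A -> C <- B: moralization adds the edge A - B.
dag = {"A": set(), "B": set(), "C": {"A", "B"}}
moral = moralize(dag)
k_needed = min_deletions_to_max_degree(dag.keys(), moral, d=1)
```

In this toy case the moralized graph is a triangle, so one vertex must be deleted before every remaining vertex has degree at most 1.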

#3 Approximate Weighted First-Order Model Counting: Exploiting Fast Approximate Model Counters and Symmetry

Authors: Timothy van Bremen ; Ondrej Kuzelka

We study the symmetric weighted first-order model counting task and present ApproxWFOMC, a novel anytime method for efficiently bounding the weighted first-order model count of a sentence given an unweighted first-order model counting oracle. The algorithm has applications to inference in a variety of first-order probabilistic representations, such as Markov logic networks and probabilistic logic programs. Crucially for many applications, no assumptions are made on the form of the input sentence. Instead, the algorithm makes use of the symmetry inherent in the problem by imposing cardinality constraints on the number of possible true groundings of a sentence's literals. Realising the first-order model counting oracle in practice using the approximate hashing-based model counter ApproxMC3, we show how our algorithm is competitive with existing approximate and exact techniques for inference in first-order probabilistic models. We additionally provide PAC guarantees on the accuracy of the bounds generated.
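
The role of cardinality constraints can be illustrated with a toy sketch. It is not ApproxWFOMC itself, and it assumes a single predicate whose ground atoms all share one weight for "true" and one for "false". Under that symmetry every model with the same number of true atoms has the same weight, so the weighted count reduces to a sum of unweighted counts, one per cardinality. The brute-force counter below is a hypothetical stand-in for the counting oracle, and all names are illustrative.

```python
from itertools import product


def count_models_with_cardinality(sentence, n_atoms, n_true):
    """Stand-in for an unweighted counting oracle (brute force here).

    Counts assignments that satisfy `sentence` and set exactly `n_true`
    of the `n_atoms` ground atoms to true.
    """
    return sum(
        1
        for world in product([False, True], repeat=n_atoms)
        if sum(world) == n_true and sentence(world)
    )


def weighted_count_by_cardinality(sentence, n_atoms, w_true, w_false):
    """Weighted model count as a sum over cardinalities of true atoms."""
    total = 0
    for n_true in range(n_atoms + 1):
        count = count_models_with_cardinality(sentence, n_atoms, n_true)
        # Every model in this stratum has the same weight.
        total += count * w_true**n_true * w_false ** (n_atoms - n_true)
    return total


def weighted_count_direct(sentence, n_atoms, w_true, w_false):
    """Reference computation: sum the weight of each model individually."""
    total = 0
    for world in product([False, True], repeat=n_atoms):
        if sentence(world):
            k = sum(world)
            total += w_true**k * w_false ** (n_atoms - k)
    return total


# Hypothetical sentence: "at least one of three ground atoms is true".
def at_least_one(world):
    return any(world)


by_cardinality = weighted_count_by_cardinality(at_least_one, 3, 2, 1)
direct = weighted_count_direct(at_least_one, 3, 2, 1)
```

Both routes give the same value on this example, which is the point of the decomposition: an approximate unweighted counter could replace the brute-force stand-in, at the cost of approximation error in each stratum.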

#4 Efficient and Robust High-Dimensional Linear Contextual Bandits

Authors: Cheng Chen ; Luo Luo ; Weinan Zhang ; Yong Yu ; Yijiang Lian

The linear contextual bandit problem is a sequential decision-making problem in which an agent decides among actions sequentially given their corresponding contexts. As large-scale data sets become increasingly common, we study linear contextual bandits in high-dimensional settings. Recent works focus on employing matrix sketching methods to accelerate contextual bandits. However, the matrix approximation error introduces additional terms into the regret bound. In this paper, we first propose a novel matrix sketching method called Spectral Compensation Frequent Directions (SCFD). We then propose an efficient approach for contextual bandits that adopts SCFD to approximate the covariance matrices. By maintaining and manipulating sketched matrices, our method needs only O(md) space and O(md) update time in each round, where d is the dimensionality of the data and m is the sketch size. Theoretical analysis shows that our method has better regret bounds than previous methods in high-dimensional cases. Experimental results demonstrate the effectiveness of our algorithm and verify our theoretical guarantees.
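
For readers unfamiliar with matrix sketching, the following is a minimal sketch of plain Frequent Directions, the standard technique that SCFD's name refers to. It does not reproduce the spectral compensation step or the bandit algorithm from the paper, and it runs a full SVD per row for clarity rather than achieving the O(md) update time quoted above. It assumes NumPy is available; the function name and test data are illustrative.

```python
import numpy as np


def frequent_directions(rows, m):
    """Sketch an (n x d) matrix A by an (m x d) matrix B with B^T B ~ A^T A.

    Only the m x d sketch is stored while streaming over the rows.
    """
    rows = np.asarray(rows, dtype=float)
    d = rows.shape[1]
    sketch = np.zeros((m, d))
    for row in rows:
        # The last row of the sketch is always zero at this point.
        sketch[-1] = row
        _, s, vt = np.linalg.svd(sketch, full_matrices=False)
        # Shrink all squared singular values by the smallest one,
        # which zeroes out the last direction and frees a row.
        shrunk = np.sqrt(np.maximum(s**2 - s[-1] ** 2, 0.0))
        sketch = np.zeros((m, d))
        sketch[: len(s)] = shrunk[:, None] * vt
    return sketch


# Illustrative data: 200 contexts of dimension 20, sketched with m = 5 rows.
rng = np.random.default_rng(0)
contexts = rng.standard_normal((200, 20))
sketch = frequent_directions(contexts, m=5)

# Spectral-norm error of the approximate covariance matrix.
error = np.linalg.norm(contexts.T @ contexts - sketch.T @ sketch, 2)
```

The quantity `error` is the kind of matrix approximation error that, according to the abstract, shows up as additional terms in the regret bound of sketched bandit methods.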

#5 Scaling Up AND/OR Abstraction Sampling

Authors: Kalev Kask ; Bobak Pezeshki ; Filjor Broka ; Alexander Ihler ; Rina Dechter

Abstraction Sampling (AS) is a recently introduced enhancement of Importance Sampling that exploits stratification by using a notion of abstractions: groupings of similar nodes into abstract states. It was previously shown that AS performs particularly well when sampling over an AND/OR search space; however, existing schemes were limited to "proper" abstractions in order to ensure unbiasedness, severely hindering scalability. In this paper, we introduce AOAS, a new Abstraction Sampling scheme on AND/OR search spaces that allows more flexible use of abstractions by circumventing the properness requirement. We analyze the properties of this new algorithm and, in an extensive empirical evaluation on five benchmarks and over 480 problems, comparing against other state-of-the-art algorithms, illustrate AOAS's properties and show that it provides a far more powerful and competitive Abstraction Sampling framework.

#6 Neural Belief Reasoner

Author: Haifeng Qian

This paper proposes a new generative model called neural belief reasoner (NBR). It differs from previous models in that it specifies a belief function rather than a probability distribution. Its implementation consists of neural networks, fuzzy-set operations and belief-function operations, and query-answering, sample-generation and training algorithms are presented. This paper studies NBR in two tasks. The first is a synthetic unsupervised-learning task, which demonstrates NBR's ability to perform multi-hop reasoning, reasoning with uncertainty and reasoning about conflicting information. The second is supervised learning: a robust MNIST classifier for 4 and 9, which is the most challenging pair of digits. This classifier needs no adversarial training, and it substantially exceeds the state of the art in adversarial robustness as measured by the L2 metric, while maintaining 99.1% accuracy on natural images.
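
To make the distinction between a belief function and a probability distribution concrete, here is a small sketch that assumes the standard Dempster-Shafer sense of the term. It is not NBR's implementation; the mass values and function names are invented for illustration. Mass may be assigned to sets of outcomes, so the support for an event is bracketed by a belief (lower) and a plausibility (upper) value instead of a single probability.

```python
def belief(mass, event):
    """Total mass of focal sets fully contained in `event`."""
    event = frozenset(event)
    return sum(m for focal, m in mass.items() if focal <= event)


def plausibility(mass, event):
    """Total mass of focal sets that intersect `event`."""
    event = frozenset(event)
    return sum(m for focal, m in mass.items() if focal & event)


# Hypothetical mass assignment over the digit labels "4" and "9".
# The mass on {"4", "9"} expresses ignorance between the two labels.
mass = {
    frozenset({"4"}): 0.6,
    frozenset({"9"}): 0.1,
    frozenset({"4", "9"}): 0.3,
}

bel_4 = belief(mass, {"4"})
pl_4 = plausibility(mass, {"4"})
```

With these made-up masses, the belief in "4" counts only the mass committed to "4" alone, while the plausibility also includes the uncommitted mass on the pair.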

#7 A Complete Characterization of Projectivity for Statistical Relational Models

Authors: Manfred Jaeger ; Oliver Schulte

A generative probabilistic model for relational data consists of a family of probability distributions for relational structures over domains of different sizes. In most existing statistical relational learning (SRL) frameworks, these models are not projective in the sense that the marginal of the distribution for size-n structures on induced substructures of size k […]

Keywords: Uncertainty in AI: Statistical Relational AI; Machine Learning: Relational Learning; Machine Learning: Probabilistic Machine Learning

#8 State Variable Effects in Graphical Event Models

Authors: Debarun Bhattacharjya ; Dharmashankar Subramanian ; Tian Gao

Many real-world domains involve co-evolving relationships between events, such as meals and exercise, and time-varying random variables, such as a patient's blood glucose levels. In this paper, we propose a general framework for modeling joint temporal dynamics involving continuous-time transitions of discrete state variables and irregular arrivals of events over the timeline. We show how conditional Markov processes (as represented by continuous-time Bayesian networks) and multivariate point processes (as represented by graphical event models) are among various processes that are covered by the framework. We introduce two simple, interpretable, yet practical joint models within the framework and compare them with relevant baselines on simulated and real-world datasets, using a graph search algorithm for learning. The experiments highlight the importance of jointly modeling event arrivals and state variable transitions to better fit joint temporal datasets, and the framework opens up possibilities for models involving even more complex dynamics whenever suitable.