| Total: 16

Fuzzy Answer Set Programming (FASP) combines the non-monotonic reasoning typical of Answer Set Programming with the capability of Fuzzy Logic to deal with imprecise information and paraconsistent reasoning. In the context of paraconsistent reasoning, the fundamental principle of minimal undefinedness states that truth degrees close to 0 and 1 should be preferred to those close to 0.5, to minimize the ambiguity of the scenario. The aim of this paper is to enforce such a principle in FASP through the minimization of a measure of undefinedness. Algorithms that minimize undefinedness of fuzzy answer sets are presented, and implemented.

A standard objective in partially-observable Markov decision processes (POMDPs) is to find a policy that maximizes the expected discounted-sum payoff. However, such policies may still permit unlikely but highly undesirable outcomes, which is problematic especially in safety-critical applications. Recently, there has been a surge of interest in POMDPs where the goal is to maximize the probability to ensure that the payoff is at least a given threshold, but these approaches do not consider any optimization beyond satisfying this threshold constraint. In this work we go beyond both the “expectation” and “threshold” approaches and consider a “guaranteed payoff optimization (GPO)” problem for POMDPs, where we are given a threshold t and the objective is to find a policy σ such that a) each possible outcome of σ yields a discounted-sum payoff of at least t, and b) the expected discounted-sum payoff of σ is optimal (or near-optimal) among all policies satisfying a). We present a practical approach to tackle the GPO problem and evaluate it on standard POMDP benchmarks.

Probabilistic modeling is one of the foundations of modern machine learning and artificial intelligence. In this paper, we propose a novel type of probabilistic models named latent dependency forest models (LDFMs). A LDFM models the dependencies between random variables with a forest structure that can change dynamically based on the variable values. It is therefore capable of modeling context-specific independence. We parameterize a LDFM using a first-order non-projective dependency grammar. Learning LDFMs from data can be formulated purely as a parameter learning problem, and hence the difficult problem of model structure learning is circumvented. Our experimental results show that LDFMs are competitive with existing probabilistic models.

Solving constrained combinatorial optimisation problems via MAP inference is often achieved by introducing extra potential functions for each constraint. This can result in very high order potentials, e.g. a 2nd-order objective with pairwise potentials and a quadratic constraint over all N variables would correspond to an unconstrained objective with an order-N potential. This limits the practicality of such an approach, since inference with high order potentials is tractable only for a few special classes of functions. We propose an approach which is able to solve constrained combinatorial problems using belief propagation without increasing the order. For example, in our scheme the 2nd-order problem above remains order 2 instead of order N. Experiments on applications ranging from foreground detection, image reconstruction, quadratic knapsack, and the M-best solutions problem demonstrate the effectiveness and efficiency of our method. Moreover, we show several situations in which our approach outperforms commercial solvers like CPLEX and others designed for specific constrained MAP inference problems.

Human motor behavior is naturally guided by sensing the environment. To predict such sensori-motor behavior, it is necessary to model what is sensed and how actions are chosen based on the obtained sensory measurements. Although several models of human sensing haven been proposed, rarely data of the assumed sensory measurements is available. This makes statistical estimation of sensor models problematic. To overcome this issue, we propose an abstract structural estimation approach building on the ideas of Herman et al.'s Simultaneous Estimation of Rewards and Dynamics (SERD). Assuming optimal fusion of sensory information and rational choice of actions the proposed method allows to infer sensor models even in absence of data of the sensory measurements. To the best of our knowledge, this work presents the first general approach for joint inference of sensor and policy models. Furthermore, we consider its concrete implementation in the important class of sensor scheduling linear quadratic Gaussian problems. Finally, the effectiveness of the approach is demonstrated for prediction of the behavior of automobile drivers. Specifically, we model the glance and steering behavior of driving in the presence of visually demanding secondary tasks. The results show, that prediction benefits from the inference of sensor models. This is the case, especially, if also information is considered, that is contained in gaze switching behavior.

We consider the setting of stochastic multiagent systems and formulate an automated verification framework for quantifying and reasoning about agents' trust. To capture human trust, we work with a cognitive notion of trust defined as a subjective evaluation that agent A makes about agent B's ability to complete a task, which in turn may lead to a decision by A to rely on B. We propose a probabilistic rational temporal logic PRTL*, which extends the logic PCTL* with reasoning about mental attitudes (beliefs, goals and intentions), and includes novel operators that can express concepts of social trust such as competence, disposition and dependence. The logic can express, for example, that "agent A will eventually trust agent B with probability at least p that B will be have in a way that ensures the successful completion of a given task". We study the complexity of the automated verification problem and, while the general problem is undecidable, we identify restrictions on the logic and the system that result in decidable, or even tractable, subproblems.

Nonparametric inference techniques provide promising tools for probabilistic reasoning in high-dimensional nonlinear systems.Most of these techniques embed distributions into reproducing kernel Hilbert spaces (RKHS) and rely on the kernel Bayes' rule (KBR) to manipulate the embeddings. However, the computational demands of the KBR scale poorly with the number of samples and the KBR often suffers from numerical instabilities. In this paper, we present the kernel Kalman rule (KKR) as an alternative to the KBR.The derivation of the KKR is based on recursive least squares, inspired by the derivation of the Kalman innovation update.We apply the KKR to filtering tasks where we use RKHS embeddings to represent the belief state, resulting in the kernel Kalman filter (KKF).We show on a nonlinear state estimation task with high dimensional observations that our approach provides a significantly improved estimation accuracy while the computational demands are significantly decreased.

We consider the problem of online learning in misspecified linear stochastic multi-armed bandit problems. Regret guarantees for state-of-the-art linear bandit algorithms such as Optimism in the Face of Uncertainty Linear bandit (OFUL) hold under the assumption that the arms expected rewards are perfectly linear in their features. It is, however, of interest to investigate the impact of potential misspecification in linear bandit models, where the expected rewards are perturbed away from the linear subspace determined by the arms features. Although OFUL has recently been shown to be robust to relatively small deviations from linearity, we show that any linear bandit algorithm that enjoys optimal regret performance in the perfectly linear setting (e.g., OFUL) must suffer linear regret under a sparse additive perturbation of the linear model. In an attempt to overcome this negative result,we define a natural class of bandit models characterized by a non-sparse deviation from linearity. We argue that the OFUL algorithm can fail to achieve sublinear regret even under models that have non-sparse deviation. We finally develop a novel bandit algorithm, comprising a hypothesis test for linearity followed by a decision to use either the OFUL or Upper Confidence Bound (UCB) algorithm. For perfectly linear bandit models, the algorithm provably exhibits OFULs favorable regret performance, while for misspecified models satisfying the non-sparse deviation property, the algorithm avoids the linear regret phenomenon and falls back on UCBs sublinear regret scaling. Numerical experiments on synthetic data, and on recommendation data from the public Yahoo! Learning toRank Challenge dataset, empirically support our findings.

The formalism of multi-objective influence diagrams has recently been developed for modeling and solving sequential decision problems under uncertainty and multiple objectives. Since utility values representing the decision maker's preferences are only partially ordered (e.g., by the Pareto order) we no longer have a unique maximal value of expected utility, but a set of them. Computing the set of maximal values of expected utility and the corresponding policies can be computationally very challenging. In this paper, we consider alternative notions of optimality, one of the most important one being the notion of possibly optimal, namely optimal in at least one scenario compatible with the inter-objective tradeoffs. We develop a variable elimination algorithm for computing the set of possibly optimal expected utility values, prove formally its correctness, and compare variants of the algorithm experimentally.

Several advanced applications of autonomous aerial vehicles in civilian and military contexts involve a searching agent with imperfect sensors that seeks to locate a mobile target in a given region. Effectively managing uncertainty is key to solving the related search problem, which is why all methods devised so far hinge on a probabilistic formulation of the problem and solve it through branch-and-bound algorithms, Bayesian filtering or POMDP solvers. In this paper, we consider a class of hard search tasks involving a target that exhibits an intentional evasive behaviour and moves over a large geographical area, i.e., a target that is particularly difficult to track down and uncertain to locate. We show that, even for such a complex problem, it is advantageous to compile its probabilistic structure into a deterministic model and use standard deterministic solvers to find solutions. In particular, we formulate the search problem for our uncooperative target both as a deterministic automated planning task and as a constraint programming task and show that in both cases our solution outperforms POMDPs methods.

We introduce new anytime search algorithms that combine best-first with depth-first search into hybrid schemes for Marginal MAP inference in graphical models. The main goal is to facilitate the generation of upper bounds (via the best-first part) alongside the lower bounds of solutions (via the depth-first part) in an anytime fashion. We compare against two of the best current state-of-the-art schemes and show that our best+depth search scheme produces higher quality solutions faster while also producing a bound on their accuracy, which can be used to measure solution quality during search. An extensive empirical evaluation demonstrates the effectiveness of our new methods which enjoy the strength of best-first (optimality of search) and of depth-first (memory robustness), leading to solutions for difficult instances where previous solvers were unable to find even a single solution.

Hybrid (mixed discrete and continuous) state and action Markov Decision Processes (HSA-MDPs) provide an expressive formalism for modeling stochastic and concurrent sequential decision-making problems. Existing solvers for HSA-MDPs are either limited to very restricted transition distributions, require knowledge of domain-specific basis functions to achieve good approximations, or do not scale. We explore a domain-independent approach based on the framework of hindsight optimization (HOP) for HSA-MDPs, which uses an upper bound on the finite-horizon action values for action selection. Our main contribution is a linear time reduction to a Mixed Integer Linear Program (MILP) that encodes the HOP objective, when the dynamics are specified as location-scale probability distributions parametrized by Piecewise Linear (PWL) functions of states and actions. In addition, we show how to use the same machinery to select actions based on a lower-bound generated by straight line plans. Our empirical results show that the HSA-HOP approach effectively scales to high-dimensional problems and outperforms baselines that are capable of scaling to such large hybrid MDPs.

Weighted model counting (WMC) has recently emerged as an effective and general approach to probabilistic inference, offering a computational framework for encoding a variety of formalisms, such as factor graphs and Bayesian networks.The advent of large-scale probabilistic knowledge bases has generated further interest in relational probabilistic representations, obtained by according weights to first-order formulas, whose semantics is given in terms of the ground theory, and solved by WMC. A fundamental limitation is that the domain of quantification, by construction and design, is assumed to be finite, which is at odds with areas such as vision and language understanding, where the existence of objects must be inferred from raw data. Dropping the finite-domain assumption has been known to improve the expressiveness of a first-order language for open-universe purposes, but these languages, so far, have eluded WMC approaches. In this paper, we revisit relational probabilistic models over an infinite domain, and establish a number of results that permit effective algorithms. We demonstrate this language on a number of examples, including a parameterized version of Pearl's Burglary-Earthquake-Alarm Bayesian network.

Temporally extended goals are critical to the specification of a diversity of real-world planning problems. Here we examine the problem of non-deterministic planning with temporally extended goals specified in linear temporal logic (LTL), interpreted over either finite or infinite traces. Unlike existing LTL planners, we place no restrictions on our LTL formulae beyond those necessary to distinguish finite from infinite interpretations. We generate plans by compiling LTL temporally extended goals into problem instances described in the Planning Domain Definition Language that are solved by a state-of-the-art fully observable non-deterministic planner. We propose several different compilations based on translations of LTL to (Büchi) alternating or (Büchi) non-deterministic finite state automata, and evaluate various properties of the competing approaches. We address a diverse spectrum of LTL planning problems that, to this point, had not been solvable using AI planning techniques, and do so in a manner that demonstrates highly competitive performance.

Belief Propagation (BP) is a widely used approximation for exact probabilistic inference in graphical models, such as Markov Random Fields (MRFs). In graphs with cycles, however, no exact convergence guarantees for BP are known, in general. For the case when all edges in the MRF carry the same symmetric, doubly stochastic potential, recent works have proposed to approximate BP by linearizing the update equations around default values, which was shown to work well for the problem of node classification. The present paper generalizes all prior work and derives an approach that approximates loopy BP on any pairwise MRF with the problem of solving a linear equation system. This approach combines exact convergence guarantees and a fast matrix implementation with the ability to model heterogenous networks. Experiments on synthetic graphs with planted edge potentials show that the linearization has comparable labeling accuracy as BP for graphs with weak potentials, while speeding-up inference by orders of magnitude.

Controlling for selection and confounding biases are two of the most challenging problems in the empirical sciences as well as in artificial intelligence tasks. Covariate adjustment (or, Backdoor Adjustment) is the most pervasive technique used for controlling confounding bias, but the same is oblivious to issues of sampling selection. In this paper, we introduce a generalized version of covariate adjustment that simultaneously controls for both confounding and selection biases. We first derive a sufficient and necessary condition for recovering causal effects using covariate adjustment from an observational distribution collected under preferential selection. We then relax this setting to consider cases when additional, unbiased measurements over a set of covariates are available for use (e.g., the age and gender distribution obtained from census data). Finally, we present a complete algorithm with polynomial delay to find all sets of admissible covariates for adjustment when confounding and selection biases are simultaneously present and unbiased data is available.