Reward is the driving force for reinforcement-learning agents. Here we set out to understand the expressivity of Markov reward as a way to capture tasks that we would want an agent to perform. We frame this study around three new abstract notions of "task": (1) a set of acceptable behaviors, (2) a partial ordering over behaviors, or (3) a partial ordering over trajectories. Our main results prove that while reward can express many of these tasks, there exist instances of each task type that no Markov reward function can capture. We then provide a set of polynomial-time algorithms that construct a Markov reward function that allows an agent to perform each task type, and that correctly determine when no such reward function exists.
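As a rough illustration of the flavor of such constructions (a sketch of ours, not the paper's algorithm), a state-based Markov reward that makes every acceptable policy outperform every unacceptable one can be sought via a linear-program feasibility check; the MDP, policies, and margin below are made-up toy inputs.

```python
# Hypothetical sketch: check whether a state-based Markov reward exists that
# makes every "acceptable" policy beat every unacceptable one at the start
# state. Toy MDP and policies are made up; this is not the paper's algorithm.
import numpy as np
from scipy.optimize import linprog

def policy_value_matrix(P_pi, gamma):
    """Return M such that V_pi = M @ r for a state-based reward vector r."""
    n = P_pi.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * P_pi)

def reward_exists(P_good, P_bad, gamma=0.9, start=0, margin=1e-3):
    """LP feasibility: find r with V_good(start) - V_bad(start) >= margin
    for every (good, bad) policy pair; returns a reward vector or None."""
    n = P_good[0].shape[0]
    A_ub, b_ub = [], []
    for Pg in P_good:
        mg = policy_value_matrix(Pg, gamma)[start]
        for Pb in P_bad:
            mb = policy_value_matrix(Pb, gamma)[start]
            A_ub.append(-(mg - mb))      # -(V_g - V_b) <= -margin
            b_ub.append(-margin)
    res = linprog(c=np.zeros(n), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(-1, 1)] * n, method="highs")
    return res.x if res.success else None

# Two-state toy example: one deterministic policy per class.
P_good = [np.array([[0.0, 1.0], [0.0, 1.0]])]   # drives toward state 1
P_bad  = [np.array([[1.0, 0.0], [1.0, 0.0]])]   # stays in state 0
print(reward_exists(P_good, P_bad))
```

If the LP is infeasible for the given inputs, no state-based reward achieves the stated margin; the paper's algorithms make the corresponding determination exactly and in polynomial time.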
For many years, link prediction on knowledge graphs has been a purely transductive task, not allowing for reasoning on unseen entities. Recently, increasing effort has been put into exploring semi- and fully inductive scenarios, enabling inference over unseen and emerging entities. Still, all these approaches only consider triple-based KGs, whereas their richer counterparts, hyper-relational KGs (e.g., Wikidata), have not yet been properly studied. In this work, we classify different inductive settings and study the benefits of employing hyper-relational KGs on a wide range of semi- and fully inductive link prediction tasks powered by recent advancements in graph neural networks. Our experiments on a novel set of benchmarks show that qualifiers over typed edges can lead to absolute performance improvements of 6% (for the Hits@10 metric) compared to triple-only baselines. Our code is available at https://github.com/mali-git/hyper_relational_ilp.
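For readers unfamiliar with the format, a hyper-relational fact attaches qualifier key-value pairs to a base triple; the Wikidata-style statement below (shown as a plain Python structure, purely for illustration) conveys the kind of input such models consume.

```python
# Illustrative hyper-relational statement: a base (head, relation, tail) triple
# plus qualifier pairs that contextualize it. Names are illustrative only.
statement = {
    "head": "Albert_Einstein",
    "relation": "educated_at",
    "tail": "ETH_Zurich",
    "qualifiers": [
        ("academic_degree", "Bachelor"),
        ("end_time", "1900"),
    ],
}
# A triple-only KG keeps just (head, relation, tail); hyper-relational models
# additionally encode the qualifier pairs when scoring links to unseen entities.
```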
Extending the popular Answer Set Programming (ASP) paradigm by introspective reasoning capacities has received increasing interest within the last years. Particular attention is given to the formalism of epistemic logic programs (ELPs) where standard rules are equipped with modal operators which make it possible to express conditions on literals for being known or possible, i.e., contained in all or some answer sets, respectively. ELPs thus deliver multiple collections of answer sets, known as world views. Employing ELPs for reasoning problems so far has mainly been restricted to standard decision problems (complexity analysis) and enumeration (development of systems) of world views. In this paper, we first establish quantitative reasoning for ELPs, where the acceptance of a certain set of literals depends on the number (proportion) of world views that are compatible with the set. Second, we present a novel system capable of efficiently solving the underlying counting problems required for quantitative reasoning. Our system exploits the graph-based measure treewidth by iteratively finding (graph) abstractions of ELPs.
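As a toy illustration of the quantity behind quantitative reasoning (our sketch, not the presented system), the snippet below computes the proportion of world views in which every queried literal is known, i.e., contained in all answer sets of that world view, used here as a simple stand-in for compatibility; the world views are hand-made.

```python
# Toy sketch of the counting quantity behind quantitative reasoning over ELPs:
# the proportion of world views in which every queried literal is "known"
# (contained in all answer sets of that world view). World views are made up.
def proportion_known(world_views, literals):
    compatible = sum(
        1 for wv in world_views
        if all(all(lit in answer_set for answer_set in wv) for lit in literals)
    )
    return compatible / len(world_views)

world_views = [
    [{"a", "b"}, {"a", "c"}],   # 'a' is known, 'b' is only possible
    [{"b"}, {"b", "c"}],        # 'b' is known
]
print(proportion_known(world_views, {"a"}))   # 0.5
```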
Existential rules are a very popular ontology-mediated query language for which the chase represents a generic computational approach for query answering. It is straightforward that existential rule queries exhibiting chase termination are decidable and can only recognize properties that are preserved under homomorphisms. This paper is an extended abstract of our eponymous publication at KR 2021 where we show the converse: every decidable query that is closed under homomorphism can be expressed by an existential rule set for which the standard chase universally terminates. Membership in this fragment is not decidable, but we show via a diagonalisation argument that this is unavoidable.
Modern SAT solvers are based on a paradigm named conflict-driven clause learning (CDCL), while local search is an important alternative. Although there have been attempts to combine these two methods, this work proposes deeper cooperation techniques. First, we relax the CDCL framework by extending promising branches to complete assignments and calling a local search solver to search for a model nearby. More importantly, the local search assignments and the conflict frequency of variables in local search are exploited in the phase selection and branching heuristics of CDCL. We use our techniques to improve three typical CDCL solvers (glucose, MapleLCMDistChronoBT and Kissat). Experiments on benchmarks from the Main tracks of SAT Competitions 2017-2020 and a real-world benchmark of spectrum allocation show that the techniques bring significant improvements, particularly on satisfiable instances. A resulting solver won the Main Track SAT category in SAT Competition 2020 and also performs very well on the spectrum allocation benchmark. As far as we know, this is the first work that meets the standard of the challenge ``Demonstrate the successful combination of stochastic search and systematic search techniques, by the creation of a new algorithm that outperforms the best previous examples of both approaches.'' (AAAI 1997) on standard application benchmarks.
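A minimal sketch (ours, not the solvers' code) of how local-search information can inform CDCL heuristics: a saved local-search assignment supplies the preferred phase, and per-variable conflict counts from the local-search run bias the branching order. The data structures and numbers are hypothetical stand-ins.

```python
# Hypothetical sketch of the CDCL/local-search cooperation described above:
# the local-search assignment seeds phase selection, and variables that were
# frequently in unsatisfied (conflicting) clauses during local search are
# branched on first. Real solvers integrate this inside glucose/Kissat.
def pick_phase(var, ls_assignment, saved_phase):
    # Prefer the polarity the local-search solver settled on, if available.
    return ls_assignment.get(var, saved_phase.get(var, True))

def pick_branch_variable(unassigned, ls_conflict_count, activity):
    # Blend CDCL activity (VSIDS-style) with local-search conflict frequency.
    return max(unassigned,
               key=lambda v: activity.get(v, 0.0) + ls_conflict_count.get(v, 0))

# Toy usage with made-up statistics from a hypothetical local-search run.
ls_assignment = {1: True, 2: False, 3: True}
ls_conflict_count = {1: 4, 2: 0, 3: 9}
activity = {1: 1.5, 2: 0.2, 3: 0.1}
print(pick_branch_variable({1, 2, 3}, ls_conflict_count, activity))  # 3
print(pick_phase(3, ls_assignment, saved_phase={}))                  # True
```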
We present a scalable planning algorithm for multi-agent sequential decision problems that require dynamic collaboration. Teams of agents need to coordinate decisions in many domains, but naive approaches fail due to the exponential growth of the joint action space with the number of agents. We circumvent this complexity through an anytime approach that allows us to trade computation for approximation quality and also dynamically coordinate actions. Our algorithm comprises three elements: online planning with Monte Carlo Tree Search (MCTS), factorizing local agent interactions with coordination graphs, and selecting optimal joint actions with the Max-Plus method. On the benchmark SysAdmin domain with static coordination graphs, our approach achieves comparable performance with much lower computation cost than the MCTS baselines. We also introduce a multi-drone delivery domain with dynamic, i.e., state-dependent coordination graphs, and demonstrate how our approach scales to large problems on this domain that are intractable for other MCTS methods.
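To illustrate the kind of computation a coordination graph enables (a simplified max-sum pass on a chain-shaped graph with made-up payoff tables, not the full Max-Plus routine used in the paper), coordinated joint actions can be selected from local payoff factors without enumerating the joint action space:

```python
# Hypothetical sketch: selecting a joint action on a chain-structured
# coordination graph 1 - 2 - 3 by simple max-sum message passing
# (exact on trees). Payoff tables below are made-up toy numbers.
import numpy as np

f12 = np.array([[3.0, 0.0], [0.0, 2.0]])   # payoff over (a1, a2)
f23 = np.array([[1.0, 4.0], [0.0, 0.0]])   # payoff over (a2, a3)

mu_1to2 = f12.max(axis=0)                  # message to agent 2 about a2
mu_3to2 = f23.max(axis=1)                  # message to agent 2 about a2

a2 = int(np.argmax(mu_1to2 + mu_3to2))     # agent 2 decides first
a1 = int(np.argmax(f12[:, a2]))            # neighbors condition on a2
a3 = int(np.argmax(f23[a2, :]))
print(a1, a2, a3)                          # coordinated joint action
```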
Machine-learning models contain information about the data they were trained on. This information leaks either through the model itself or through predictions made by the model. Consequently, when the training data contains sensitive attributes, assessing the amount of information leakage is paramount. We propose a method to quantify this leakage using the Fisher information of the model about the data. Unlike the worst-case a priori guarantees of differential privacy, Fisher information loss measures leakage with respect to specific examples, attributes, or sub-populations within the dataset. We motivate Fisher information loss through the Cramer-Rao bound and delineate the implied threat model. We provide efficient methods to compute Fisher information loss for output-perturbed generalized linear models. Finally, we empirically validate Fisher information loss as a useful measure of information leakage.
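To build intuition for the measure (a self-contained toy, not the paper's GLM computation): if an adversary observes y = f(x) + Gaussian noise with standard deviation sigma, the Fisher information y carries about a scalar attribute x is f'(x)^2 / sigma^2, and the Cramer-Rao bound lower-bounds the variance of any unbiased reconstruction of x by its inverse.

```python
# Toy sketch of Fisher information loss for an output-perturbed release:
# observe y = f(x) + N(0, sigma^2); the Fisher information of y about the
# scalar attribute x is (f'(x))^2 / sigma^2, so the Cramer-Rao bound gives
# a variance floor of sigma^2 / f'(x)^2 for any unbiased reconstruction of x.
# Self-contained illustration only, not the paper's GLM computation.
def fisher_information(fprime_x, sigma):
    return fprime_x ** 2 / sigma ** 2

def cramer_rao_lower_bound(fprime_x, sigma):
    return 1.0 / fisher_information(fprime_x, sigma)

# More noise (larger sigma) => less information leaked about x.
for sigma in (0.1, 1.0, 10.0):
    info = fisher_information(fprime_x=2.0, sigma=sigma)
    print(f"sigma={sigma:5.1f}  info={info:8.2f}  "
          f"reconstruction variance >= {cramer_rao_lower_bound(2.0, sigma):.2f}")
```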
Opportunities such as higher education can promote intergenerational mobility, leading individuals to achieve levels of socioeconomic status above that of their parents. In this work, which is an extended abstract of a longer paper in the proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, we develop a dynamic model for allocating such opportunities in a society that exhibits bottlenecks in mobility; the problem of optimal allocation reflects a trade-off between the benefits conferred by the opportunities in the current generation and the potential to elevate the socioeconomic status of recipients, shaping the composition of future generations in ways that allow them to benefit further from the opportunities. We show how optimal allocations in our model arise as solutions to continuous optimization problems over multiple generations, and we find in general that these optimal solutions can favor recipients of low socioeconomic status over slightly higher-performing individuals of high socioeconomic status --- a form of socioeconomic affirmative action that the society in our model discovers in the pursuit of purely payoff-maximizing goals. We characterize how the structure of the model can lead to either temporary or persistent affirmative action, and we consider extensions of the model with more complex processes modulating the movement between different levels of socioeconomic status.
Expectation maximization (EM) is the default algorithm for fitting probabilistic models with missing or latent variables, yet we lack a full understanding of its non-asymptotic convergence properties. Previous works show results along the lines of “EM converges at least as fast as gradient descent” by assuming the conditions for the convergence of gradient descent apply. This approach is not only loose, in that it does not capture that EM can make more progress than a gradient step, but the assumptions fail to hold for textbook examples of EM like Gaussian mixtures. In this work, we show that for the common setting of exponential family distributions, viewing EM as a mirror descent algorithm yields convergence rates in Kullback-Leibler (KL) divergence, and we show how the KL divergence is related to first-order stationarity via Bregman divergences. In contrast to previous works, the analysis is invariant to the choice of parametrization and holds with minimal assumptions. We also show applications of these ideas to local linear (and superlinear) convergence rates, generalized EM, and non-exponential family distributions.
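For concreteness, the textbook EM iteration for a two-component Gaussian mixture (a standard example, not code from the paper) alternates E-step responsibilities with closed-form M-step updates; it is this kind of iteration that the mirror-descent view analyzes.

```python
# Textbook EM for a 1-D mixture of two Gaussians with known unit variances,
# included for concreteness; not code from the paper. Each iteration is an
# E-step (posterior responsibilities) followed by a closed-form M-step.
import numpy as np

def em_two_gaussians(x, n_iters=50):
    pi, mu = 0.5, np.array([x.min(), x.max()])     # simple initialization
    for _ in range(n_iters):
        # E-step: responsibility of component 1 for each point.
        p1 = pi * np.exp(-0.5 * (x - mu[1]) ** 2)
        p0 = (1 - pi) * np.exp(-0.5 * (x - mu[0]) ** 2)
        r = p1 / (p0 + p1)
        # M-step: weighted means and mixing proportion.
        mu = np.array([np.sum((1 - r) * x) / np.sum(1 - r),
                       np.sum(r * x) / np.sum(r)])
        pi = r.mean()
    return pi, mu

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 700)])
print(em_two_gaussians(x))   # roughly (0.7, [-2, 3])
```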
Branch and Bound (BnB) has been successfully used to solve many combinatorial optimization problems. However, BnB MaxSAT solvers perform poorly when solving real-world and academic optimization problems. They are only competitive for random and some crafted instances. Thus, it is a prevailing opinion in the community that BnB is not really useful for practical MaxSAT solving. We refute this opinion by presenting a new BnB MaxSAT solver, called MaxCDCL, which combines clause learning and an efficient bounding procedure. MaxCDCL is among the top 5 out of a total of 15 exact solvers that participated in the 2020 MaxSAT Evaluation, solving several instances that other solvers cannot solve. Furthermore, MaxCDCL solves the highest number of instances from different MaxSAT Evaluations when combined with the best existing solvers.
We describe the deep learning-based COVID-19 cases predictor and the Pareto-optimal Non-Pharmaceutical Intervention (NPI) prescriptor developed by the winning team of the $500K XPRIZE Pandemic Response Challenge. The competition aimed to develop data-driven AI models that predict COVID-19 infection rates and prescribe NPI plans that governments, business leaders and organizations could implement to minimize harm when reopening their economies. In addition to the validation performed by XPRIZE with real data, our models were validated in a real-world scenario thanks to an ongoing collaboration with the Valencian Government in Spain. Our experience contributes to a necessary transition to more evidence-driven policy-making during a pandemic.
Neural link predictors are useful for identifying missing edges in large-scale Knowledge Graphs. However, it is still not clear how to use these models for answering more complex queries containing logical conjunctions (∧), disjunctions (∨), and existential quantifiers (∃). We propose a framework for efficiently answering complex queries on incomplete Knowledge Graphs. We translate each query into an end-to-end differentiable objective, where the truth value of each atom is computed by a pre-trained neural link predictor. We then analyse two solutions to the optimisation problem, one based on gradient descent and one on combinatorial search. In our experiments, the proposed approach produces more accurate results than state-of-the-art methods — black-box models trained on millions of generated queries — without the need for training on a large and diverse set of complex queries. Using orders of magnitude less training data, we obtain relative improvements ranging from 8% up to 40% in Hits@3 across multiple knowledge graphs. We find that it is possible to explain the outcome of our model in terms of the intermediate solutions identified for each of the complex query atoms. All our source code and datasets are available online (https://github.com/uclnlp/cqd).
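As a toy illustration of the continuous relaxation (the query, entities, and scores below are made up, and the choice of t-norms is an assumption rather than the paper's exact configuration), per-atom link-predictor scores can be combined into a query score with a t-norm for ∧ and a co-norm for ∨:

```python
# Toy sketch of scoring a complex query by combining atom scores with t-norms
# (product t-norm for conjunction, probabilistic sum for disjunction).
# The link-predictor scores below are made-up numbers, not a trained model.
def t_and(a, b):      # product t-norm for logical AND
    return a * b

def t_or(a, b):       # probabilistic sum (co-norm) for logical OR
    return a + b - a * b

# Query: ?Y : exists X . worksIn(alice, X) AND
#              (locatedIn(X, europe) OR locatedIn(X, asia)) AND employs(X, Y)
# Assume a pre-trained link predictor gave these atom scores for one candidate
# assignment of the existential variable X and the answer variable Y.
score_worksIn   = 0.9
score_in_europe = 0.2
score_in_asia   = 0.7
score_employs   = 0.8

query_score = t_and(t_and(score_worksIn, t_or(score_in_europe, score_in_asia)),
                    score_employs)
print(query_score)   # candidate answers are ranked by this continuous score
```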
We introduce Detect, Understand, Act (DUA), a neuro-symbolic reinforcement learning framework. The Detect component is composed of a traditional computer vision object detector and tracker. The Act component houses a set of options, high-level actions enacted by pre-trained deep reinforcement learning (DRL) policies. The Understand component provides a novel answer set programming (ASP) paradigm for effectively learning symbolic meta-policies over options using inductive logic programming (ILP). We evaluate our framework on the Animal-AI (AAI) competition testbed, a set of physical cognitive reasoning problems. Given a set of pre-trained DRL policies, DUA requires only a few examples to learn a meta-policy that allows it to improve the state of the art on several of the most challenging categories of the testbed. DUA constitutes the first holistic hybrid integration of computer vision, ILP and DRL applied to an AAI-like environment and sets the foundations for further use of ILP in complex DRL challenges.
Computing the gradient of stochastic Plackett-Luce (PL) ranking models for relevance and fairness metrics can be infeasible because it requires iterating over all possible permutations of items. In this paper, we introduce a novel algorithm, PL-Rank, which estimates the gradient of a PL ranking model through sampling. Unlike existing approaches, PL-Rank makes use of the specific structure of PL models and ranking metrics. Our experimental analysis shows that PL-Rank has greater sample efficiency and is computationally less costly than existing policy gradients, resulting in faster convergence at higher performance.
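For background, a standard way to sample rankings from a Plackett-Luce model (the sampling primitive an estimator like this relies on, not the PL-Rank estimator itself) is to perturb the items' log-scores with Gumbel noise and sort:

```python
# Standard Gumbel trick for sampling rankings from a Plackett-Luce model:
# add i.i.d. Gumbel noise to the items' log-scores and sort in descending
# order. Shown for background; this is not the PL-Rank estimator itself.
import numpy as np

def sample_pl_ranking(log_scores, rng):
    gumbel = rng.gumbel(size=log_scores.shape)
    return np.argsort(-(log_scores + gumbel))

rng = np.random.default_rng(0)
log_scores = np.log(np.array([5.0, 3.0, 1.0, 1.0]))   # item relevance scores
rankings = [tuple(sample_pl_ranking(log_scores, rng)) for _ in range(10000)]
# Item 0 should be ranked first about 5/10 of the time under the PL model.
print(sum(r[0] == 0 for r in rankings) / len(rankings))
```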
The shipping industry is an important component of global trade and the economy. In order to ensure law compliance and safety, it needs to be monitored. In this paper, we present a novel ship type classification model that combines vessel data transmitted via the Automatic Identification System with vessel imagery. The main components of our approach are the Faster R-CNN Deep Neural Network and a Neuro-Fuzzy system with IF-THEN rules. We evaluate our model using real-world data, showcase the advantages of this combination, and compare it with other methods. Results show that our model can increase prediction scores by up to 15.4% compared with the next best model we considered, while also maintaining a level of explainability, as opposed to common black-box approaches.
When considering two concepts in terms of extensional logic, their combination will often be trivial, returning an empty extension. Consider e.g. “a Fish Vehicle”, i.e., “a Vehicle which is also a Fish”. Still, people use sophisticated strategies to produce new, non-empty concepts. All these strategies involve the human ability to mend the conflicting attributes of the input concepts and to create new properties of the combination. We focus in particular on the case where a Head concept has superior ‘asymmetric’ control over steering the resulting combination (or hybridisation) with a Modifier concept. Specifically, we propose a dialogical model of the cognitive and logical mechanics of this asymmetric form of hybridisation. Its implementation is then evaluated using a combination of example ontologies.
Although heuristic search is one of the most successful approaches to classical planning, this planning paradigm does not apply straightforwardly to Generalized Planning (GP). This paper adapts the planning-as-heuristic-search paradigm to the particularities of GP and presents the first native heuristic search approach to GP. First, the paper defines a program-based solution space for GP that is independent of the number of planning instances in a GP problem and of the size of these instances. Second, the paper defines the BFGP algorithm for GP, which implements a best-first search in our program-based solution space and is guided by different evaluation and heuristic functions.
Online peer-to-peer support platforms enable conversations between millions of people who seek and provide mental health support. If successful, web-based mental health conversations could improve access to treatment and reduce the global disease burden. Psychologists have repeatedly demonstrated that empathy, the ability to understand and feel the emotions and experiences of others, is a key component leading to positive outcomes in supportive conversations. However, recent studies have shown that highly empathic conversations are rare in online mental health platforms. In this paper, we work towards improving empathy in online mental health support conversations. We introduce a new task of empathic rewriting which aims to transform low-empathy conversational posts to higher empathy. Learning such transformations is challenging and requires a deep understanding of empathy while maintaining conversation quality through text fluency and specificity to the conversational context. Here we propose Partner, a deep reinforcement learning (RL) agent that learns to make sentence-level edits to posts in order to increase the expressed level of empathy while maintaining conversation quality. Our RL agent leverages a policy network, based on a transformer language model adapted from GPT-2, which performs the dual task of generating candidate empathic sentences and adding those sentences at appropriate positions. Through a combination of automatic and human evaluation, we demonstrate that Partner successfully generates more empathic, specific, and diverse responses and outperforms NLP methods from related tasks such as style transfer and empathic dialogue generation.
When prototyping AI experiences (AIX), interface designers seek effective ways to support end-user tasks through AI capabilities. However, AI poses challenges to design due to its dynamic behavior in response to training data, end-user data, and feedback. Designers must consider AI's uncertainties and offer adaptations such as explainability, error recovery, and automation vs. human task control. Unfortunately, current prototyping tools assume a black-box view of AI, forcing designers to work with separate tools to explore machine learning models, understand model performance, and align interface choices with model behavior. This introduces friction to rapid and iterative prototyping. We propose Model-Informed Prototyping (MIP), a workflow for AIX design that combines model exploration with UI prototyping tasks. Our system, ProtoAI, allows designers to directly incorporate model outputs into interface designs, evaluate design choices across different inputs, and iteratively revise designs by analyzing model breakdowns.
In this paper, we describe a black-box sockpuppeting audit which we carried out to investigate the creation and bursting dynamics of misinformation filter bubbles on YouTube. Pre-programmed agents acting as YouTube users stimulated YouTube's recommender systems: they first watched a series of misinformation promoting videos (bubble creation) and then a series of misinformation debunking videos (bubble bursting). Meanwhile, the agents logged the videos recommended to them by YouTube. After manually annotating these recommendations, we were able to quantify the portion of misinformative videos among them. The results confirm the creation of filter bubbles (albeit not in all situations) and show that these bubbles can be burst by watching credible content. Drawing a direct comparison with a previous study, we do not see improvements in the overall quantity of misinformation recommended.
Current approaches for optimizing parameters in unrolled computation graphs suffer from high variance gradients, bias, slow updates, or large memory usage. We introduce a method called Persistent Evolution Strategies (PES), which divides the computation graph into a series of truncated unrolls, and performs an evolution strategies-based update step after each unroll. PES eliminates bias from these truncations by accumulating correction terms over the entire sequence of unrolls. PES allows for rapid parameter updates, has low memory usage, is unbiased, and has reasonable variance.
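A hypothetical sketch of the accumulation idea on a toy scalar unrolled system, based only on the description above and not on the authors' reference implementation: each particle keeps a persistent unroll state and the sum of all perturbations it has received, and that accumulated noise, rather than only the current one, weights the per-truncation loss in the evolution-strategies gradient estimate.

```python
# Hypothetical sketch of the PES idea on a toy scalar unrolled system, based
# on the abstract above: each particle keeps its own perturbed unroll state
# plus an accumulated perturbation, and the per-truncation ES gradient uses
# that accumulated noise as the correction term. Illustrative only; not the
# authors' reference implementation.
import numpy as np

def unroll(theta, s, K, target=1.0):
    """Run K steps of a toy linear system and return (unroll loss, new state)."""
    loss = 0.0
    for _ in range(K):
        s = 0.5 * s + theta
        loss += (s - target) ** 2
    return loss, s

rng = np.random.default_rng(0)
theta, sigma, lr = 0.0, 0.1, 0.02
N, K, n_truncations = 32, 4, 200          # particles, unroll length, truncations
states = np.zeros(N)                      # each particle's persistent unroll state
xi = np.zeros(N)                          # accumulated perturbations (the "memory")

for _ in range(n_truncations):
    eps = rng.normal(0.0, sigma, size=N)
    eps[N // 2:] = -eps[: N // 2]         # antithetic pairs for lower variance
    losses = np.empty(N)
    for i in range(N):                    # truncated unroll per particle
        losses[i], states[i] = unroll(theta + eps[i], states[i], K)
    xi += eps                             # accumulate noise across truncations
    grad = np.mean(xi * losses) / sigma ** 2
    theta -= lr * grad                    # parameter update after every unroll
print(theta)                              # should drift toward roughly 0.5
```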
We resolve the min-max complexity of distributed stochastic convex optimization (up to a log factor) in the intermittent communication setting, where M machines work in parallel over the course of R rounds of communication to optimize the objective, and during each round of communication, each machine may sequentially compute K stochastic gradient estimates. We present a novel lower bound with a matching upper bound that establishes an optimal algorithm.
Spatial data are ubiquitous and have transformed decision-making in many critical domains, including public health, agriculture, transportation, etc. While recent advances in machine learning offer promising ways to harness massive spatial datasets (e.g., satellite imagery), spatial heterogeneity -- a fundamental property of spatial data -- poses a major challenge as data distributions or generative processes often vary over space. Recent studies targeting this difficult problem either require a known space-partitioning as input, or can only support limited special cases (e.g., binary classification). Moreover, the heterogeneity patterns learned by these methods are locked to the locations of the training samples and cannot be applied to new locations. We propose a statistically-guided framework to adaptively partition data in space during training using distribution-driven optimization and transform a deep learning model (of the user's choice) into a heterogeneity-aware architecture. We also propose a spatial moderator to generalize learned patterns to new test regions. Experimental results on real-world datasets show that the framework can effectively capture footprints of heterogeneity and substantially improve prediction performance.
Signed languages are the primary means of communication for many deaf and hard of hearing individuals. Since signed languages exhibit all the fundamental linguistic properties of natural language, we believe that tools and theories of Natural Language Processing (NLP) are crucial for modeling them. However, existing research in Sign Language Processing (SLP) seldom attempts to explore and leverage the linguistic organization of signed languages. This position paper calls on the NLP community to include signed languages as a research area with high social and scientific impact. We first discuss the linguistic properties of signed languages to consider during their modeling. Then, we review the limitations of current SLP models and identify the open challenges to extend NLP to signed languages. Finally, we urge (1) the adoption of an efficient tokenization method; (2) the development of linguistically-informed models; (3) the collection of real-world signed language data; (4) the inclusion of local signed language communities as an active and leading voice in research.
Dense Retrieval (DR) has achieved state-of-the-art first-stage ranking effectiveness. However, the efficiency of most existing DR models is limited by the large memory cost of storing dense vectors and the time-consuming nearest neighbor search (NNS) in vector space. Therefore, we present RepCONC, a novel retrieval model that learns discrete Representations via CONstrained Clustering. RepCONC jointly trains dual-encoders and the Product Quantization (PQ) method to learn discrete document representations and enables fast approximate NNS with compact indexes. It models quantization as a constrained clustering process, which requires the document embeddings to be uniformly clustered around the quantization centroids. We theoretically demonstrate the importance of the uniform clustering constraint and derive an efficient approximate solution for constrained clustering by reducing it to an instance of the optimal transport problem. Extensive experiments on two popular ad-hoc retrieval benchmarks show that RepCONC substantially outperforms a wide range of existing retrieval models in terms of retrieval effectiveness, memory efficiency, and time efficiency.
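To illustrate the optimal-transport view (a generic Sinkhorn sketch with random embeddings and centroids, not RepCONC's implementation), documents can be softly assigned to quantization centroids under a roughly uniform cluster-size constraint by solving an entropy-regularized transport problem between uniform marginals:

```python
# Generic Sinkhorn sketch of the constrained-clustering step: softly assign
# document embeddings to quantization centroids while encouraging uniform
# cluster sizes, via entropy-regularized optimal transport between a uniform
# distribution over documents and a uniform one over centroids.
# Embeddings/centroids are random; this is not RepCONC's implementation.
import numpy as np

def sinkhorn_assign(embeddings, centroids, eps=0.05, n_iters=200):
    n, k = len(embeddings), len(centroids)
    cost = ((embeddings[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    cost = cost / cost.max()                      # normalize to avoid underflow
    K = np.exp(-cost / eps)                       # Gibbs kernel
    a, b = np.ones(n) / n, np.ones(k) / k         # uniform marginals
    u, v = np.ones(n), np.ones(k)
    for _ in range(n_iters):                      # Sinkhorn scaling iterations
        u = a / (K @ v)
        v = b / (K.T @ u)
    plan = u[:, None] * K * v[None, :]            # soft transport plan (n x k)
    return plan.argmax(axis=1)                    # hard code per document

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(12, 8))
centroids = rng.normal(size=(4, 8))
codes = sinkhorn_assign(embeddings, centroids)
print(np.bincount(codes, minlength=4))            # cluster sizes stay roughly balanced
```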