Total: 33

A logic program with function symbols is called finitely ground if there is a finite propositional logic program whose stable models are exactly the stable models of the original program. Finite groundability is an important property for logic programs with function symbols because it makes it feasible to compute such programs' stable models using traditional ASP solvers. In this paper, we introduce a new decidable class of finitely ground programs called POLY-bounded programs, which, to the best of our knowledge, strictly contains all decidable classes of finitely ground programs discovered so far in the literature. We also study the related complexity property for this class of programs. We prove that deciding whether a program is POLY-bounded is EXPTIME-complete.

In daily life, short texts are everywhere, especially since the emergence of social networks. There are countless short texts in online media such as Twitter and online Q&A sites. Discovering their topics is valuable in various application domains such as content recommendation and text characterization. Traditional topic models like LDA are widely applied to such tasks, but in the short-text scenario these models can get stuck due to the lack of words. Recently, a popular model named BTM used word co-occurrence relationships to address the sparsity problem and has proved effective. However, both BTM and its extensions ignore the internal relationships between words. From our perspective, more closely related words should appear in the same topic. Based on this idea, we propose a model named RIBS-TM which makes use of an RNN for relationship learning and IDF for filtering high-frequency words. Experiments on two real-world short-text datasets show the great utility of our model.
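The IDF-based filtering step mentioned above can be illustrated with a minimal sketch (the function names and the threshold are ours for illustration, not part of RIBS-TM): words occurring in most documents receive a low inverse document frequency and are dropped.

```python
from math import log

def idf(docs):
    """Inverse document frequency of each word over tokenized texts."""
    n = len(docs)
    df = {}
    for doc in docs:
        for w in set(doc):  # count each word once per document
            df[w] = df.get(w, 0) + 1
    return {w: log(n / d) for w, d in df.items()}

def filter_high_freq(docs, threshold):
    """Drop high-frequency words, i.e. those with IDF below threshold."""
    scores = idf(docs)
    return [[w for w in doc if scores[w] >= threshold] for doc in docs]
```

A word appearing in every document gets IDF log(1) = 0 and is filtered out at any positive threshold.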

HEX-programs extend answer-set programs (ASP) with external sources. In previous work, notions of equivalence of ASP programs under extensions have been developed. Most well-known are strong equivalence, which holds for programs P and Q if P ∪ R and Q ∪ R have the same answer sets for arbitrary programs R, and uniform equivalence, which holds if this is guaranteed for sets R of facts. More fine-grained approaches exist, which restrict the set of atoms in the added program R. In this paper we provide a characterization of equivalence of HEX-programs. Since well-known ASP extensions (e.g. constraint ASP) amount to special cases of HEX, the results are interesting beyond the particular formalism. Based on this, we further characterize inconsistency of programs wrt. program extensions. We then discuss possible applications of the results to algorithm improvements.

HEX-programs are an extension of answer set programming (ASP) towards external sources. To this end, external atoms provide a bidirectional interface between the program and an external source. Traditionally, HEX-programs are evaluated using a rewriting to ordinary ASP programs which guess truth values of external atoms; this yields answer set candidates whose guesses are verified by evaluating the source. Despite the integration of learning techniques into this approach, which reduce the number of candidates and of necessary verification calls, the remaining external calls are still expensive. In this paper we present an alternative approach based on inlining of external atoms, motivated by existing but less general approaches for specialized formalisms such as DL-programs. External atoms are then compiled away such that no verification calls are necessary. To this end, we make use of support sets, which describe conditions on input atoms that are sufficient to satisfy an external atom. The approach is implemented in the DLVHEX reasoner. Experiments show a significant performance gain.

Persuasion is an activity that involves one party (the persuader) trying to induce another party (the persuadee) to believe or do something. For this, it can be advantageous for the persuader to have a model of the persuadee. Recently, some proposals in the field of computational models of argument have been made for probabilistic models of what the persuadee knows about, or believes. However, these developments have not systematically harnessed established notions in decision theory for maximizing the outcome of a dialogue. To address this, we present a general framework for representing persuasion dialogues as a decision tree, and for using decision rules for selecting moves. Furthermore, we provide some empirical results showing how some well-known decision rules perform, and make observations about their general behaviour in the context of dialogues where there is uncertainty about the accuracy of the user model.

We consider the problem of identifying the causal direction between two discrete random variables using observational data. Unlike previous work, we keep the most general functional model but make an assumption on the unobserved exogenous variable: inspired by Occam's razor, we assume that the exogenous variable is simple in the true causal direction. We quantify simplicity using Rényi entropy. Our main result is that, under natural assumptions, if the exogenous variable has low H0 entropy (cardinality) in the true direction, it must have high H0 entropy in the wrong direction. We establish several algorithmic hardness results about estimating the minimum-entropy exogenous variable. We show that the problem of finding the exogenous variable with minimum H1 entropy (Shannon entropy) is equivalent to the problem of finding the minimum joint entropy given n marginal distributions, also known as the minimum entropy coupling problem. We propose an efficient greedy algorithm for the minimum entropy coupling problem that for n=2 provably finds a local optimum. This gives a greedy algorithm for finding the exogenous variable with minimum Shannon entropy. Our greedy entropy-based causal inference algorithm has similar performance to state-of-the-art additive noise models on real datasets. One advantage of our approach is that we make no use of the values of the random variables but only their distributions. Our method can therefore be used for causal inference on both ordinal and categorical data, unlike additive noise models.
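The greedy idea behind the minimum entropy coupling can be sketched for n = 2 as a largest-remaining-mass matching (a simplified illustration, not the authors' exact algorithm; the names are ours):

```python
from math import log2

def greedy_coupling(p, q, eps=1e-12):
    """Greedily build a joint distribution with marginals p and q by
    repeatedly matching the largest remaining masses, a heuristic for
    keeping the joint (Shannon) entropy low."""
    p, q = list(p), list(q)
    joint = {}  # (i, j) -> probability mass
    while True:
        i = max(range(len(p)), key=lambda k: p[k])
        j = max(range(len(q)), key=lambda k: q[k])
        m = min(p[i], q[j])
        if m <= eps:
            break
        joint[(i, j)] = joint.get((i, j), 0.0) + m
        p[i] -= m
        q[j] -= m
    return joint

def entropy(joint):
    """Shannon entropy (in bits) of a distribution given as a dict."""
    return -sum(v * log2(v) for v in joint.values() if v > 0)
```

For two uniform binary marginals the greedy coupling concentrates mass on a matching, giving a joint entropy of 1 bit instead of the 2 bits of the independent coupling.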

Taxonomy is indispensable in understanding natural language. A variety of large-scale, usage-based, data-driven lexical taxonomies have been constructed in recent years. The hypernym-hyponym relationship, which is considered the backbone of lexical taxonomies, can not only be used to categorize data but also enables generalization. In particular, we focus on one of the most prominent properties of the hypernym-hyponym relationship, namely transitivity, which has significant implications for many applications. We show that, unlike in human-crafted ontologies and taxonomies, transitivity does not always hold in data-driven lexical taxonomies. We introduce a supervised approach to detect whether transitivity holds for any given pair of hypernym-hyponym relationships. Besides solving the inference problem, we also use transitivity to derive new hypernym-hyponym relationships for data-driven lexical taxonomies. We conduct extensive experiments to show the effectiveness of our approach.
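Deriving new hypernym-hyponym pairs via transitivity can be sketched as a plain transitive closure (a naive baseline only; in the setting above each derived pair would additionally need to pass the supervised transitivity detector, and all names here are ours):

```python
from collections import defaultdict

def derive_hypernyms(edges):
    """Return hypernym pairs implied by transitivity that are not
    already among the given (hyponym, hypernym) edges."""
    succ = defaultdict(set)
    for child, parent in edges:
        succ[child].add(parent)
    closure = set()

    def walk(start, node):
        # follow parents upward, recording every reachable hypernym
        for parent in succ[node]:
            if (start, parent) not in closure and parent != start:
                closure.add((start, parent))
                walk(start, parent)

    for child in list(succ):
        walk(child, child)
    return closure - set(edges)
```

From ("poodle", "dog") and ("dog", "animal") the closure derives the new pair ("poodle", "animal"); the point of the paper is precisely that such derivations are not always valid in data-driven taxonomies.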

Knowledge bases (KBs) play an important role in artificial intelligence. Much effort has been devoted to constructing web-scale knowledge bases both manually and automatically. Compared with manually constructed KBs, automatically constructed KBs are broader but noisier. In this paper, we study the problem of improving the quality of automatically constructed web-scale knowledge bases, in particular lexical taxonomies of isA relationships. We find that these taxonomies usually contain cycles, which are often introduced by incorrect isA relations. Inspired by this observation, we introduce two kinds of models to detect incorrect isA relations from cycles. The first eliminates cycles by extracting directed acyclic graphs, and the second eliminates cycles by grouping nodes into different levels. We implement our models on Probase, a state-of-the-art, automatically constructed, web-scale taxonomy. After processing tens of millions of relations, our models eliminate 74 thousand incorrect relations with 91% accuracy.
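The observation that cycles signal incorrect isA relations can be illustrated with a standard depth-first cycle finder (a generic sketch, not either of the paper's two models; the names are ours):

```python
from collections import defaultdict

def find_cycle(edges):
    """Return one directed cycle (as a node list) in an isA edge list,
    or None if the graph is acyclic."""
    succ = defaultdict(list)
    nodes = set()
    for a, b in edges:
        succ[a].append(b)
        nodes.update((a, b))
    WHITE, GRAY, BLACK = 0, 1, 2
    color = dict.fromkeys(nodes, WHITE)
    stack = []

    def dfs(u):
        color[u] = GRAY
        stack.append(u)
        for v in succ[u]:
            if color[v] == GRAY:          # back edge closes a cycle
                return stack[stack.index(v):]
            if color[v] == WHITE:
                cyc = dfs(v)
                if cyc:
                    return cyc
        stack.pop()
        color[u] = BLACK
        return None

    for node in nodes:
        if color[node] == WHITE:
            cyc = dfs(node)
            if cyc:
                return cyc
    return None
```

Any cycle returned contains at least one incorrect isA edge; the paper's models then decide which edge to remove.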

With the large volume of new information created every day, determining the validity of information in a knowledge graph and filling in its missing parts are crucial tasks for many researchers and practitioners. To address this challenge, a number of knowledge graph completion methods have been developed using low-dimensional graph embeddings. Although researchers continue to improve these models using an increasingly complex feature space, we show that simple changes in the architecture of the underlying model can outperform state-of-the-art models without the need for complex feature engineering. In this work, we present a shared variable neural network model called ProjE that fills in missing information in a knowledge graph by learning joint embeddings of the knowledge graph's entities and edges, and through subtle, but important, changes to the standard loss function. In doing so, ProjE has a parameter size smaller than 11 out of 15 existing methods while performing 37% better than the current-best method on standard datasets. We also show, via a new fact checking task, that ProjE is capable of accurately determining the veracity of many declarative statements.

We study description logics (DLs) supporting number restrictions on transitive roles. We first take a look at SOQ and SON with binary and unary coding of numbers, and provide algorithms for the satisfiability problem and tight complexity bounds ranging from EXPTIME to NEXPTIME. We then show that by allowing for counting only up to one (functionality), inverse roles and role inclusions can be added without losing decidability. We finally investigate DLs of the DL-Lite-family, and show that, in the presence of role inclusions, the core fragment becomes undecidable.

Answer Set Programming (ASP) is a well-established formalism for nonmonotonic reasoning. An ASP program can have no answer set due to cyclic default negation. In this case, it is not possible to draw any conclusion, even if this is not intended. Recently, several paracoherent semantics have been proposed that address this issue, and several potential applications for these semantics have been identified. However, paracoherent semantics have essentially been inapplicable in practice, due to the lack of efficient algorithms and implementations. In this paper, this lack is addressed, and several different algorithms to compute semi-stable and semi-equilibrium models are proposed and implemented into an answer set solving framework. An empirical performance comparison among the new algorithms on benchmarks from ASP competitions is given as well.

Probabilistic databases (PDBs) are usually incomplete, e.g., containing only the facts that have been extracted from the Web with high confidence. However, missing facts are often treated as being false, which leads to unintuitive results when querying PDBs. Recently, open-world probabilistic databases (OpenPDBs) were proposed to address this issue by allowing probabilities of unknown facts to take any value from a fixed probability interval. In this paper, we extend OpenPDBs by Datalog+/- ontologies, under which both upper and lower probabilities of queries become even more informative, enabling us to distinguish queries that were indistinguishable before. We show that the dichotomy between P and PP in (Open)PDBs can be lifted to the case of first-order rewritable positive programs (without negative constraints); and that the problem can become NP^PP-complete, once negative constraints are allowed. We also propose an approximating semantics that circumvents the increase in complexity caused by negative constraints.

We present SAT encoding schemes for distance-based belief merging operators relying on the (possibly weighted) drastic distance or the Hamming distance between interpretations, and using sum, GMax (leximax) or GMin (leximin) as the aggregation function. In order to evaluate these encoding schemes, we generated benchmarks of a time-tabling problem and translated them into belief merging instances. Then, taking advantage of these schemes, we compiled the merged bases of the resulting instances into query-equivalent CNF formulae. Experiments show the benefits that can be gained from the proposed SAT encoding schemes. In particular, thanks to them, we succeeded in computing query-equivalent formulae for merging instances based on hundreds of variables, which are out of reach of previous implementations.
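The semantics of the sum/Hamming merging operator (one of those encoded) can be checked on small instances with a brute-force reference implementation (an illustration of the operator's definition only, not the SAT encoding; each base is represented here by a single model, and the names are ours):

```python
from itertools import product

def hamming(u, v):
    """Hamming distance between two interpretations (0/1 tuples)."""
    return sum(a != b for a, b in zip(u, v))

def merge_sum_hamming(profile, n_vars):
    """Models of the merged base: the interpretations minimizing the
    sum of Hamming distances to the profile's models."""
    best, models = None, []
    for w in product((0, 1), repeat=n_vars):
        cost = sum(hamming(w, m) for m in profile)
        if best is None or cost < best:
            best, models = cost, [w]
        elif cost == best:
            models.append(w)
    return models
```

With an odd number of single-model bases, sum/Hamming merging reduces to bitwise majority voting; the SAT encodings make the same computation feasible for hundreds of variables, where this enumeration is hopeless.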

Powerful formalisms for abstract argumentation have been proposed. Their complexity is often located beyond NP and ranges up to the third level of the polynomial hierarchy. The combined complexity of Answer-Set Programming (ASP) exactly matches this complexity when programs are restricted to predicates of bounded arity. In this paper, we exploit this coincidence and present novel efficient translations from abstract dialectical frameworks (ADFs) and GRAPPA to ASP. We also empirically compare our approach to other systems for ADF reasoning and report promising results.

TBox abduction explains why an observation is not entailed by a TBox, by computing multiple sets of axioms, called explanations, such that each explanation does not entail the observation alone, while appending an explanation to the TBox renders the observation entailed but does not introduce incoherence. Considering that practical explanations in TBox abduction are likely to mimic minimal explanations for TBox entailments, we introduce admissible explanations, which are subsets of those justifications for the observation that are instantiated from a finite set of justification patterns. A justification pattern is obtained from a minimal set of axioms responsible for a certain atomic concept inclusion by replacing all concept (resp. role) names with concept (resp. role) variables. The number of admissible explanations is finite but can still be so large that computing all admissible explanations is impractical. Thus, we introduce a variant of subset-minimality, written ⊆ds-minimality, which prefers fresh (concept or role) names over existing names. We propose efficient methods for computing all admissible ⊆ds-minimal explanations and for computing all justification patterns, respectively. Experimental results demonstrate that combining the proposed methods achieves a practical approach to TBox abduction.

In this paper, we present an algorithm and a tool for computing minimal, equivalent EL concepts wrt. a given ontology. Our tool can provide valuable support in manual development of ontologies and improve the quality of ontologies automatically generated by processes such as uniform interpolation, ontology learning, rewriting ontologies into simpler DLs, abduction and knowledge revision. Deciding whether there exist equivalent EL concepts of size less than k is known to be an NP-complete problem. We propose a minimisation algorithm that achieves reasonable computational performance also for larger ontologies and complex concepts. We evaluate our tool on several bio-medical ontologies with promising results.

Knowledge graphs play a significant role in many intelligent systems such as semantic search and recommendation systems. Recent works in the area of knowledge graph embeddings, such as TransE, TransH and TransR, have shown extremely competitive and promising results in relational learning. In this paper, we propose a novel extension of the translational embedding model to solve three main problems of the current models. Firstly, translational models are highly sensitive to hyperparameters such as margin and learning rate. Secondly, the translation principle allows only one spot in vector space for each golden triplet; thus, congestion of entities and relations in vector space may reduce precision. Lastly, the current models are not able to handle dynamic data, especially the introduction of new unseen entities/relations or the removal of triplets. To address these problems, we propose Parallel Universe TransE (puTransE), an adaptable and robust extension of the translational model. Our approach non-parametrically estimates the energy score of a triplet from multiple embedding spaces built via structurally and semantically aware triplet selection. Our proposed approach is simple, robust and parallelizable. Our experimental results show that our proposed approach outperforms TransE and many other embedding methods for link prediction on knowledge graphs, on both a public benchmark dataset and a real-world dynamic dataset.
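The translation principle criticized above is the standard TransE energy, which admits a one-line sketch (a minimal illustration; the embeddings here are plain lists rather than learned vectors, and the function name is ours):

```python
def transe_score(h, r, t):
    """TransE energy ||h + r - t||_2: lower means the triplet
    (head, relation, tail) is considered more plausible."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)) ** 0.5
```

A golden triplet drives h + r close to t, which is exactly why each triplet competes for a single spot in the vector space, the congestion problem that puTransE's multiple embedding spaces are meant to relieve.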

LPMLN is a recently introduced formalism that extends answer set programs by adopting the log-linear weight scheme of Markov Logic. This paper investigates the relationships between LPMLN and two other extensions of answer set programs: weak constraints to express a quantitative preference among answer sets, and P-log to incorporate probabilistic uncertainty. We present a translation of LPMLN into programs with weak constraints and a translation of P-log into LPMLN, which complement the existing translations in the opposite directions. The first translation allows us to compute the most probable stable models (i.e., MAP estimates) of LPMLN programs using standard ASP solvers. This result can be extended to other formalisms, such as Markov Logic, ProbLog, and Pearl's Causal Models, that are shown to be translatable into LPMLN. The second translation tells us how probabilistic nonmonotonicity (the ability of the reasoner to change his probabilistic model as a result of new information) of P-log can be represented in LPMLN, which yields a way to compute P-log using standard ASP solvers and MLN solvers.

Trivia is any fact about an entity which is interesting due to its unusualness, uniqueness or unexpectedness. Trivia could be successfully employed to promote user engagement in various product experiences featuring the given entity. A Knowledge Graph (KG) is a semantic network which encodes various facts about entities and their relationships. In this paper, we propose a novel approach called DBpedia Trivia Miner (DTM) to automatically mine trivia for entities of a given domain in KGs. The essence of DTM lies in learning an Interestingness Model (IM), for a given domain, from human annotated training data provided in the form of interesting facts from the KG. The IM thus learnt is applied to extract trivia for other entities of the same domain in the KG. We propose two different approaches for learning the IM - a) A Convolutional Neural Network (CNN) based approach and b) Fusion Based CNN (F-CNN) approach which combines both hand-crafted and CNN features. Experiments across two different domains - Bollywood Actors and Music Artists reveal that CNN automatically learns features which are relevant to the task and shows competitive performance relative to hand-crafted feature based baselines whereas F-CNN significantly improves the performance over the baseline approaches which use hand-crafted features alone. Overall, DTM achieves an F1 score of 0.81 and 0.65 in Bollywood Actors and Music Artists domains respectively.

The need to extend business process languages with the capability to model complex data objects along with the control-flow perspective has led to significant practical and theoretical advances in the field of Business Process Modeling (BPM). On the practical side, there are several suites for control-flow and data modeling; nonetheless, when it comes to formal verification, the data perspective is abstracted away due to the intrinsic difficulty of handling unbounded data. On the theoretical side, there is significant literature providing decidability results for expressive data-aware processes. However, these results struggle to produce a concrete impact, as they are far from real BPM architectures and, most of all, do not provide actual verification tools. In this paper we aim at bridging this gap: we provide a concrete framework which, on the one hand, being based on Petri nets and relational models, is close to the widely used BPM suites, and, on the other, is grounded on a solid formal basis that allows formal verification tasks to be performed. Moreover, we show how to encode our framework in an action language so as to perform reachability analysis using virtually any state-of-the-art planner.

Adding datatypes to ontology-mediated queries (OMQs) often makes query answering hard. As a consequence, the use of datatypes in OWL 2 QL has been severely restricted. In this paper we propose a new, non-uniform, way of analyzing the data complexity of OMQ answering with datatypes. Instead of restricting the ontology language, we aim at a classification of the patterns of datatype atoms in OMQs into those that can occur in non-tractable OMQs and those that only occur in tractable OMQs. To this end we establish a close link between OMQ answering with datatypes and constraint satisfaction problems over the datatypes. In a case study we apply this link to prove a P/coNP dichotomy for OMQs over DL-Lite extended with the datatype (Q, ≤). The proof employs a recent dichotomy result by Bodirsky and Kára for temporal constraint satisfaction problems.

Ontology-based data integration systems allow users to effectively access data sitting in multiple sources by means of queries over a global schema described by an ontology. In practice, data sources often contain sensitive information that the data owners want to keep inaccessible to users. In this paper, we formalize and study the problem of determining whether a given data integration system discloses a source query to an attacker. We consider disclosure on a particular dataset, and also whether a schema admits a dataset on which disclosure occurs. We provide lower and upper bounds on disclosure analysis, in the process introducing a number of techniques for analyzing logical privacy issues in ontology-based data integration.

Abstraction refinement is a recently introduced technique by which reasoning over large ABoxes is reduced to reasoning over small abstract ABoxes. Although the approach is sound for any classical Description Logic such as SROIQ, it is complete only for Horn ALCHOI. In this paper, we propose an extension of this method that is now complete for Horn SHOIF and also handles role- and equality-materialization. To show completeness, we use a tailored set of materialization rules that loosely decouple the ABox from the TBox. An empirical evaluation demonstrates that, despite the new features, the abstractions are still significantly smaller than the original ontologies and the materialization can be computed efficiently.

We study the problem of consistency checking for constraint networks over combined qualitative formalisms. We propose a framework which encompasses loose integrations and a form of spatio-temporal reasoning. In particular, we identify sufficient conditions ensuring the polynomiality of consistency checking, and we use them to find tractable subclasses.

We develop a general framework for agent abstraction based on the situation calculus and the ConGolog agent programming language. We assume that we have a high-level specification and a low-level specification of the agent, both represented as basic action theories. A refinement mapping specifies how each high-level action is implemented by a low-level ConGolog program and how each high-level fluent can be translated into a low-level formula. We define a notion of sound abstraction between such action theories in terms of the existence of a suitable bisimulation between their respective models. Sound abstractions have many useful properties that ensure that we can reason about the agent's actions (e.g., executability, projection, and planning) at the abstract level, and refine and concretely execute them at the low level. We also characterize the notion of complete abstraction where all actions (including exogenous ones) that the high level thinks can happen can in fact occur at the low level.