Total: 1
Large-scale probabilistic knowledge bases are becoming increasingly important in academia and industry alike. They are constantly extended with new data, powered by modern information extraction tools that associate probabilities with database tuples. In this paper, we revisit the semantics underlying such systems. In particular, the closed-world assumption of probabilistic databases, that facts not in the database have probability zero, clearly conflicts with their everyday use. To address this discrepancy, we propose an open-world probabilistic database semantics, which relaxes the probabilities of open facts to default intervals. For this open-world setting, we lift the existing data complexity dichotomy of probabilistic databases, and propose an efficient evaluation algorithm for unions of conjunctive queries. We also show that query evaluation can become harder for non-monotone queries.