Probabilistic Data with Continuous Distributions

28 Jan 2021  ·  Martin Grohe, Benjamin Lucien Kaminski, Joost-Pieter Katoen, Peter Lindner ·

Statistical models of real world data typically involve continuous probability distributions such as normal, Laplace, or exponential distributions. Such distributions are supported by many probabilistic modelling formalisms, including probabilistic database systems. Yet, the traditional theoretical framework of probabilistic databases focusses entirely on finite probabilistic databases. Only recently, we set out to develop the mathematical theory of infinite probabilistic databases. The present paper is an exposition of two recent papers which are cornerstones of this theory. In (Grohe, Lindner; ICDT 2020) we propose a very general framework for probabilistic databases, possibly involving continuous probability distributions, and show that queries have a well-defined semantics in this framework. In (Grohe, Kaminski, Katoen, Lindner; PODS 2020) we extend the declarative probabilistic programming language Generative Datalog, proposed by (B\'ar\'any et al.~2017) for discrete probability distributions, to continuous probability distributions and show that such programs yield generative models of continuous probabilistic databases.

PDF Abstract
No code implementations yet. Submit your code now

Categories


Databases

Datasets


  Add Datasets introduced or used in this paper