ProbDB: DBDesign

There are currently two main models for defining probabilistic tables: the tuple-level and the attribute-level model. See the Model page for more detailed information about these two models.

ProbDB can store tables under both models. The attribute-level model consumes less space, but it requires multiple classical tables for each probabilistic table and is therefore less convenient for calculations over the probabilistic attributes. The tuple-level model, on the other hand, requires more space, but because all the data are in the same table, it can be more convenient for computations.

Meta-data

In order to store and retrieve probabilistic data and to answer probabilistic queries, we use metadata tables that record which attributes represent the probability that a tuple appears in a possible world, and which tables are correlated.

Tuple-Level Model

The simplest way to store a probabilistic table is to use the tuple-level model. In this model, each tuple has its own probability of existing in a possible world, and any two tuples are independent. We use an extension of this model that also provides mutual exclusion between tuples. Each tuple has a special attribute, the tuple_id attribute. All tuples with the same tuple_id are mutually exclusive. Tuples with different tuple_id values are independent.

The table ptable1 is an example of a probabilistic table, with id as the tuple_id attribute and prob as the probability attribute.

ptable1
id	name	value1	prob
1	t1	3	0.7
1	t2	5	0.3
2	t3	0	0.8
2	t4	3	0.2

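In the table ptable1, the first two tuples are mutually exclusive, and so are the last two. The tuple t1, however, is independent of the tuple t3. Because the probabilities within each tuple_id group sum to 1, every world contains exactly one tuple from each group, which gives 4 possible worlds: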

World	Tuples
PW13	{t1, t3}
PW14	{t1, t4}
PW23	{t2, t3}
PW24	{t2, t4}
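
The enumeration above can be sketched in a few lines of Python. This is only an illustrative sketch, not ProbDB code: the names PTABLE1 and possible_worlds are hypothetical, and the handling of groups whose probabilities sum to less than 1 (the remaining mass is read as "no tuple of this group is present") is an assumption that the example table does not exercise.

```python
from collections import defaultdict
from itertools import product

# Rows of ptable1 as (id, name, value1, prob); id is the tuple_id attribute.
PTABLE1 = [
    (1, "t1", 3, 0.7),
    (1, "t2", 5, 0.3),
    (2, "t3", 0, 0.8),
    (2, "t4", 3, 0.2),
]


def possible_worlds(rows):
    """Return a list of (frozenset of tuple names, probability) pairs."""
    # Tuples sharing a tuple_id are mutually exclusive alternatives.
    groups = defaultdict(list)
    for tuple_id, name, _value, prob in rows:
        groups[tuple_id].append((name, prob))

    choices = []
    for tuple_id in sorted(groups):
        alternatives = list(groups[tuple_id])
        # Assumption: leftover probability mass means that no tuple
        # of this group appears in the world.
        missing = 1.0 - sum(p for _, p in alternatives)
        if missing > 1e-9:
            alternatives.append((None, missing))
        choices.append(alternatives)

    worlds = []
    # Groups are independent: a world picks one alternative per group,
    # and its probability is the product of the chosen probabilities.
    for combo in product(*choices):
        names = frozenset(n for n, _ in combo if n is not None)
        prob = 1.0
        for _, p in combo:
            prob *= p
        worlds.append((names, prob))
    return worlds


if __name__ == "__main__":
    for names, prob in possible_worlds(PTABLE1):
        print(sorted(names), round(prob, 4))
```

For ptable1 this yields the four worlds listed above, with probabilities 0.7 × 0.8 = 0.56 for PW13, 0.14 for PW14, 0.24 for PW23 and 0.06 for PW24, which sum to 1.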

We store the probabilistic metadata of the database in ordinary tables. The metadata of tuple-level tables is stored in the table tuple_level_tables, as follows:

tuple_level_tables
table_name	tuple_id_attr	prob_attr
ptable1	id	prob
...	...	...
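
As a rough illustration of how this metadata can drive a query, the following Python sketch uses the standard sqlite3 module with an in-memory database. The column types, the use of SQLite, and the helper names build_example_db, quote_ident and group_probabilities are assumptions made for the example. They do not describe how ProbDB itself is implemented.

```python
import sqlite3


def build_example_db():
    """Create an in-memory database holding ptable1 and its metadata."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE ptable1 (id INTEGER, name TEXT, value1 INTEGER, prob REAL);
        INSERT INTO ptable1 VALUES
            (1, 't1', 3, 0.7), (1, 't2', 5, 0.3),
            (2, 't3', 0, 0.8), (2, 't4', 3, 0.2);
        -- Column types are assumed; the page only gives the column names.
        CREATE TABLE tuple_level_tables (
            table_name TEXT PRIMARY KEY, tuple_id_attr TEXT, prob_attr TEXT);
        INSERT INTO tuple_level_tables VALUES ('ptable1', 'id', 'prob');
        """
    )
    return conn


def quote_ident(name):
    """Double-quote an SQL identifier so it can be interpolated safely."""
    return '"' + name.replace('"', '""') + '"'


def group_probabilities(conn, table_name):
    """Return (tuple_id, summed probability) for each exclusive group."""
    row = conn.execute(
        "SELECT tuple_id_attr, prob_attr FROM tuple_level_tables "
        "WHERE table_name = ?",
        (table_name,),
    ).fetchone()
    if row is None:
        raise ValueError(f"{table_name} is not a registered tuple-level table")
    id_attr, prob_attr = (quote_ident(c) for c in row)
    # Column names come from the metadata, not from hard-coded SQL.
    sql = (
        f"SELECT {id_attr}, SUM({prob_attr}) FROM {quote_ident(table_name)} "
        f"GROUP BY {id_attr} ORDER BY {id_attr}"
    )
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for tuple_id, total in group_probabilities(build_example_db(), "ptable1"):
        print(tuple_id, round(total, 4))
```

Looking up tuple_id_attr and prob_attr at query time means the same function works for any table registered in tuple_level_tables, whatever its columns are called. Summing the probabilities per group is a simple sanity check: a sum above 1 would indicate inconsistent data.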

Attribute-Level Model

With the attribute-level model, we assume that the tables have probabilistic attributes. A tuple of a probabilistic table can have probabilistic values, each of which is a set of mutually exclusive (tuple, probability) pairs.

Under this model, the content of the table ptable1 can be represented as the following probabilistic values:

values

name	value1	prob
t1	3	0.7
t2	5	0.3

name	value1	prob
t3	0	0.8
t4	3	0.2

In short, each probabilistic value in the attribute-level model looks like a tuple-level probabilistic table, and this is how we store it in the database.

The table attribute_level_tables stores the metadata of the probabilistic tables defined by the attribute-level model.

attribute_level_tables
table_name	tuple_id_attr	table_prob_attr
ptable1_attr_lev	values	ptable1
...	...	...

The table ptable1_attr_lev has one probabilistic attribute called values, which refers to the tuple_id attribute (id) of the table ptable1.

ptable1_attr_lev
id	values
1	1
2	2
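
To make the reference from values to ptable1 concrete, here is a minimal Python sketch using the standard sqlite3 module. It hard-codes the table and column names from the example instead of reading them from attribute_level_tables, and the column types and the helper names build_attr_level_db and expand_values are assumptions. Note that values is an SQL keyword, so the sketch double-quotes it.

```python
import sqlite3


def build_attr_level_db():
    """Create ptable1 and ptable1_attr_lev in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE ptable1 (id INTEGER, name TEXT, value1 INTEGER, prob REAL);
        INSERT INTO ptable1 VALUES
            (1, 't1', 3, 0.7), (1, 't2', 5, 0.3),
            (2, 't3', 0, 0.8), (2, 't4', 3, 0.2);
        -- "values" is quoted because VALUES is an SQL keyword.
        CREATE TABLE ptable1_attr_lev (id INTEGER, "values" INTEGER);
        INSERT INTO ptable1_attr_lev VALUES (1, 1), (2, 2);
        """
    )
    return conn


def expand_values(conn, row_id):
    """Return the (name, value1, prob) alternatives of one row's values."""
    # The values attribute holds a tuple_id of ptable1, so a join on that
    # id yields the mutually exclusive alternatives for the row.
    return conn.execute(
        'SELECT p.name, p.value1, p.prob '
        'FROM ptable1_attr_lev AS a JOIN ptable1 AS p ON p.id = a."values" '
        'WHERE a.id = ? ORDER BY p.name',
        (row_id,),
    ).fetchall()


if __name__ == "__main__":
    conn = build_attr_level_db()
    for row_id in (1, 2):
        print(row_id, expand_values(conn, row_id))
```

Each row of ptable1_attr_lev thus expands into the alternatives of one tuple_id group of ptable1: row 1 resolves to t1 and t2, and row 2 to t3 and t4.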