Domain algebra

Motivation

The domain algebra complements the relational algebra in the Aldat database programming language. It provides operations on relational attributes. For intellectual simplification, it does so in a way as isolated from the relations as possible, so that the programmer can think AbOUT attribute operations independently from relational operations.

When the attributes the domain algebra operates on are scalars, such as numbers, strings or boolean, the corresponding relations are in "first normal form". If the relations are "nested" (attributes may themselves be relations), the domain algebra subsumes the relational algebra so that relational operators can be applied to these attributes. (This requires that binary relational operators have explicit names, such as natjoin, natcomp and div.) The domain algebra so extended accomplishes any desired processing of nested relations. Recursive nesting (relations contain themselves) is supported by recursive domain algebra.

Overview

Domain algebra expressions create virtual attributes: to maintain independence from relations, these attributes do not belong to any relation until some particular relation is created by "actualizing" virtual attributes using the relational algebra. Actualization is the only point of contact between the domain and relational algebras. A virtual attribute May Be (mentally) associated with any relation or relational expression containing the attributes on which, directly or indirectly, the virtual attribute is defined by the domain algebra.

Domain algebra operators are either scalar or aggregate. Scalar expressions are self-evident, such as A + B (numerical attributes A and B), X cat Y (string attributes X and Y) or P and Q (boolean attributes P and Q). Any expression may be built up from them. This includes unary operators and functions such as -A or sin(A). A conditional expression, if...then...else..., is also a scalar operation. For nested relations, relational operators may be included in scalar domain algebra expressions, such as R natjoin S (relational attributes R and S). Scalar expressions must be able to compute values separately for each tuple of any relation they are actualized from.

Aggregate expressions apply to single attributes (or attribute expressions) and perform aggregates across all tuples of a relation. Thus red + of A sums all values of numeric attribute A in the relation it is actualized from; equiv + of A by G sums all values of A within groups of tuples having a single value for attribute G (of any type); fun + of A order Q gives the cumulative sum of A in an ordering of the tuples induced by the values of attribute Q (of any type); and par + of A order Q by G gives the corresponding group-by cumulative sum. These four families cover the whole of aggregate domain algebra. They may be used in any way that makes sense: operators other than + (such as *, and or natjoin) may be used; there may be lists of attributes after by and order (such as by G1,G2,G3 or order Q1,Q2); and any attribute anywhere may be replaced by an arbitrary attribute expression including further aggregate expressions. The only, and important, restriction is that operators following red and equiv must be both associative and commutative, or else the result is ambiguously defined, because relational tuples are not inherently ordered.

Virtual attributes are defined from attribute expressions by statements of the domain algebra e.g., let V be if A<B then A else B;

Definitions and Examples

  • A relation is a set of tuples. (Because it is a set, tuple order is irrelevant, and duplicate tuples are not allowed.)
  • A tuple is a mapping from a set of attributes to a set of values.
  • An attribute is an identifier.

The set of values that an attribute can map to is a subset of a domain.

  • A domain is a set of values. (Frequently it is characterized simply as a type, say the set of integers.)
  • (More specifically) a relation is a subset of the Cartesian product of its domains (i.e., of the domains its attribute are drawn from).
  • A database is a tuple.

First-normal-form ("flat") relation

  • relation R; attributes A, B, X, Y, P, Q; 3 tuples

R(

A

B

X

Y

P

Q)

1

1

"hi"

"there"

T

T

2

2

"bye"

"now"

T

F

3

2

"hi"

"there"

T

F

Domain algebra statements related to R:
let AB be A + B;
let XY be X cat Y;
let PQ be P and not Q;
let AA be red + of A;
let BB be red * of B;
let BA be equiv + of A by B;
let PP be red and of P;
let QQ be red or of Q;
let XQ be red or of Q by X;
let Bcum be fun + of B order A;

R showing the virtual attributes (outside the parentheses: they are not actual attributes of R):

R(

A

B

X

Y

P

Q)

AB

XY

PQ

AA

BB

BA

PP

QQ

XQ

Bcum

1

1

"hi"

"there"

T

T

2

"hithere"

F

6

4

1

T

T

T

1

2

2

"bye"

"now"

T

F

4

"byenow"

T

6

4

5

T

T

F

3

3

2

"hi"

"there"

T

F

5

"hithere"

T

6

4

5

T

T

T

5

Non-first-normal-form ("nested") relation

  • relation nestedPartOf; attributes A, SQ; SQ has attributes S, Q; nestedPartOf has 4 tuples

nestedPartOf

( A

wallplug

cover

fixture

plug

Since for nested relations the domain algebra must include relational algebra operators, we'll suppose three, the NATURAL join, natjoin, the set union, and the projection, [..] in...
First, we "unnest" nestedPartOf to the flat relation PartOf:

  • use the only special syntax for nested relations, the relation() operator of the domain algebra, to create a singleton relation with the attribute A;
  • use natjoin to get the cartesian product of this relation with SQ;
  • use reduction to get the union of the resulting relation over all four tuples of nestedPartOf, and do it "anonymously" (i.e., without giving The New attribute a name) to raise the nesting level;
  • project the resulting flat relation from nestedPartOf and call it PartOf.


let relA be relation(A);
let ASQ be relA natjoin SQ;
PartOf <- [red ujoin of ASQ] in nestedPartOf;

PartOf(

A

S

Q

)

wallplug

cover

1

wallplug

fixture

1

cover

screw

2

cover

plate

1

fixture

screw

2

fixture

plug

2

plug

connector

2

plug

mould

1

Here are the intermediate steps.

nestedPartOf

( A

wallplug

cover

fixture

plug

Second, we "nest" the flat relation PartOf back to nestedPartOf:

  • use relation(S,Q) to create a singleton relation on S and Q;
  • use equivalence reduction to take unions of these singletons for each value of A;
  • project A and the unions from PartOf. (The "anonymous" projection used

is explained a little later in the discussion of "level raising".)


let relSQ be relation(S,Q);
let SQ be equiv union of relSQ by A;
nestedPartOf <- [A,SQ] in PartOf;
Here are the intermediate steps.

PartOf(

A

S

Q

)

relSQ

SQ

(S

Q

)

(S

wallplug

cover

1

cover

1

cover

fixture

wallplug

fixture

1

fixture

1

cover

fixture

cover

screw

2

screw

2

screw

plate

cover

plate

1

plate

1

screw

plate

fixture

screw

2

screw

2

screw

plug

fixture

plug

2

plug

2

screw

plug

plug

connector

2

connector

2

connector

mould

plug

mould

1

mould

1

connector

mould

Suppose we want to know how many subassemblies (sum of Q in SQ) make up each assembly (A) in nestedPartOf. To add up all the values of Q for each SQ, we just use red + of Q. But
let count be red + of Q
gives a virtual attribute at the same level as Q

nestedPartOf

( A

wallplug

cover

fixture

plug

We could now make these counts available at the next level up, as an attribute of nestedPartOf, by a projection in the domain algebra,
let C be [count] in SQ;
but the result is still a nested relation.

nestedPartOf

( A

wallplug

cover

fixture

plug

We can see this clearly if we project it out from nestedPartOf
countPartsOf <- [C] in nestedPartOf;
giving the nested relation countPartsOf (C (count )).

It would be more appropriate to see count at the same level as SQ, i.e., as an attribute of nestedPartOf, one level up.

Such a "level raising" can be done when the attribute has only one value (which red guarantees): we make the middle attribute disappear by simply not naming it. Here is count "anonymously".
let count be[red + of Q] in SQ;

nestedPartOf

( A

wallplug

cover

fixture

plug

countPartsOf <- [count] in nestedPartOf;
now gives a flat relation, countPartsOf(count).

Third, we show recursive domain algebra. For this we use a recursively nested version of the above data.

assembly

(component

wallplug

To get a list of all components, we define cmpnt recursively in the domain algebra.
let cmpnt be component union [red union of cmpnt] in subassembly ;
Then
AllComponents <- [cmpnt] in assembly;
gives

AllComponents

(cmpnt

wallplug

cover

fixture

plate

screw

plug

mould

connector

Note that recursion is allowed in the domain algebra only if there is a level change in the recursive statement.

Summary

The domain algebra processes attributes independently of their participation in relations, with the connection between the two being made only when the virtual attributes defined by the domain algebra are actualized in a relation newly created by a statement of the relational algebra.

The domain algebra includes all operators of the relational algebra, so that attributes may be relations in nested relations. Recursion in the domain algebra is used to bridge levels of nesting in such relations, and supports recursively nested relations.

Further Reading

  • T. H. Merrett "Experience with the Domain Algebra" Proc. 3rd Internat. Conf. on Data and Knowledge Bases 1988 San Mateo CA, Morgan Kaufmann Publishers Inc. pp.335-346
  • T. H. Merrett "Aldat: a Retrospective on a Work in ProgresS" Information Systems 32(4),2007.