Relation (database)

From Justapedia, unleashing the power of collective wisdom
Jump to navigation Jump to search
Relation, tuple, and attribute represented as table, row, and column respectively.

In database theory, a relation, as originally defined by E. F. Codd,[1] is a set of tuples (d1, d2, ..., dn), where each element dj is a member of Dj, a data domain. Codd's original definition notwithstanding, and contrary to the usual definition in mathematics, there is no ordering to the elements of the tuples of a relation.[2][3] Instead, each element is termed an attribute value. An attribute is a name paired with a domain (nowadays more commonly referred to as a type or data type). An attribute value is an attribute name paired with an element of that attribute's domain, and a tuple is a set of attribute values in which no two distinct elements have the same name. Thus, in some accounts, a tuple is described as a function, mapping names to values.

A set of attributes in which no two distinct elements have the same name is called a heading. It follows from the above definitions that to every tuple there corresponds a unique heading, being the set of names from the tuple, paired with the domains from which the tuple's domain elements are taken. A set of tuples that all correspond to the same heading is called a body. A relation is thus a heading paired with a body, the heading of the relation being also the heading of each tuple in its body. The number of attributes constituting a heading is called the degree, which term also applies to tuples and relations. The term n-tuple refers to a tuple of degree n (n ≥ 0).

E. F. Codd used the term "relation" in its mathematical sense of a finitary relation, a set of tuples on some set of n sets S1S2, .... ,Sn.[4] Thus, an n-ary relation is interpreted, under the Closed-World Assumption, as the extension of some n-adic predicate: all and only those n-tuples whose values, substituted for corresponding free variables in the predicate, yield propositions that hold true, appear in the relation.

The term relation schema refers to a heading paired with a set of constraints defined in terms of that heading. A relation can thus be seen as an instantiation of a relation schema if it has the heading of that schema and it satisfies the applicable constraints.

Sometimes a relation schema is taken to include a name.[5][6] A relational database definition (database schema, sometimes referred to as a relational schema) can thus be thought of as a collection of named relation schemas.[7][8]

In implementations, the domain of each attribute is effectively a data type[9] and a named relation schema is effectively a relation variable (relvar for short).

In SQL, a database language for relational databases, relations are represented by tables, where each row of a table represents a single tuple, and where the values of each attribute form a column.

Examples

Below is an example of a relation having three named attributes: 'ID' from the domain of integers, and 'Name' and 'Address' from the domain of strings:

ID (Integer) Name (String) Address (String)
102 Yonezawa Akinori Naha, Okinawa
202 Nilay Patel Sendai, Miyagi
104 Murata Makoto Kumamoto, Kumamoto
152 Matsumoto Yukihiro Okinawa, Okinawa

A predicate for this relation, using the attribute names to denote free variables, might be "Employee number ID is known as Name and lives at Address". Examination of the relation tells us that there are just four tuples for which the predicate holds true. So, for example, employee 102 is known only by that name, Yonezawa Akinori, and does not live anywhere else but in Naha, Okinawa. Also, apart from the four employees shown, there is no other employee who has both a name and an address.

Under the definition of body, the tuples of a body do not appear in any particular order - one cannot say "The tuple of 'Murata Makoto' is above the tuple of 'Matsumoto Yukihiro'", nor can one say "The tuple of 'Yonezawa Akinori' is the first tuple." A similar comment applies to the rows of an SQL table.

Under the definition of heading, the attributes of an element do not appear in any particular order either, nor, therefore do the elements of a tuple. A similar comment does not apply here to SQL, which does define an ordering to the columns of a table.

Relation variables

A relational database consists of named relation variables (relvars) for the purposes of updating the database in response to changes in the real world. An update to a single relvar causes the body of the relation assigned to that variable to be replaced by a different set of tuples. Relvars are classified into two classes: base relation variables and derived relation variables, the latter also known as virtual relvars but usually referred to by the short term view.

A base relation variable is a relation variable which is not derived from any other relation variables. In SQL the term base table equates approximately to base relation variable.

A view can be defined by an expression using the operators of the relational algebra or the relational calculus. Such an expression operates on one or more relations and when evaluated yields another relation. The result is sometimes referred to as a "derived" relation when the operands are relations assigned to database variables. A view is defined by giving a name to such an expression, such that the name can subsequently be used as a variable name. (Note that the expression must then mention at least one base relation variable.)

By using a Data Definition Language (DDL), it is able to define base relation variables. In SQL, CREATE TABLE syntax is used to define base tables. The following is an example.

CREATE TABLE List_of_people (
 ID INTEGER,
 Name CHAR(40),
 Address CHAR(200),
 PRIMARY KEY (ID)
)

The Data Definition Language (DDL) is also used to define derived relation variables. In SQL, CREATE VIEW syntax is used to define a derived relation variable. The following is an example.

CREATE VIEW List_of_Okinawa_people AS (
 SELECT ID, Name, Address
  FROM List_of_people
  WHERE Address LIKE '%, Okinawa'
)

See also

References

  1. ^ E. F. Codd (Oct 1972). "Further normalization of the database relational model". Data Base Systems. Courant Institute: Prentice-Hall. ISBN 013196741X. R is a relation on these n domains if it is a set of elements of the form (d1, d2, ..., dn) where dj ∈ Dj for each j=1,2,...,n.
  2. ^ C.J. Date (May 2005). Database in Depth. O'Reilly. p. 42. ISBN 0-596-10012-4. ... tuples have no left-to-right ordering to their attributes ...
  3. ^ E.F. Codd (1990). The Relational Model for Database Management, Version 2. Addison-Wesley. p. 3. ISBN 0-201-14192-2. One reason for abandoning positional concepts altogether in the relations of the relational model is that it is not at all unusual to find database relations, each of which has as many as 50, 100, or even 150 columns.
  4. ^ Codd, Edgar F (June 1970). "A Relational Model of Data for Large Shared Data Banks" (PDF). Communications of the ACM. 13 (6): 377–87. doi:10.1145/362384.362685. The term relation is used here in its accepted mathematical sense
  5. ^ Jeffrey D. Ullman (1989). Principles of Database and Knowledge-Base Systems. Jeffrey Ullman. pp. 410–. Retrieved 28 November 2012.
  6. ^ Dennis Elliott Shasha; Philippe Bonnet (2003). Database Tuning: Principles, Experiments, and Troubleshooting Techniques. Morgan Kaufmann. p. 124. ISBN 978-1-55860-753-8.
  7. ^ Peter Rob; Carlos Coronel, Peter Rob (2009). Database Systems: Design, Implementation, and Management. Cengage Learning. pp. 190–. ISBN 978-1-4239-0201-0. Retrieved 28 November 2012.
  8. ^ T. A. Halpin; Antony J. Morgan (2008). Information Modeling and Relational Databases. Morgan Kaufmann. pp. 772–. ISBN 978-0-12-373568-3. Retrieved 28 November 2012.
  9. ^ Michael F. Worboys (1995). Gis: A Computing Perspective. Taylor & Francis. pp. 57–. ISBN 978-0-7484-0065-2. Retrieved 22 November 2012.