Relational Databases

[ Home ] [ Up ] [ Relational Databases ] [ Structured Query Language (SQL) ] [ What is JDBC? ] [ Connecting to Databases ] [ SQL Errors and Warnings ] [ Accessing Databases ] [ Creating and Populating a Database ] [ Transactions ] [ Prepared and Callable Statements ] [ Using Metadata ]

Relational databases implement the Relational Model formulated by E.F. Codd in his seminal paper "A Relational Model of Data for Large Shared Data Banks", Communications of the ACM, 1970.

The Relational Model

The major features of the Relational Model are:

Data is represented as a set of relations, or tables,
How data is stored in a database is independent of the relationships between the data. (the logical schema and the storage schema are independent of each other)
The model provides a mathematical basis for operations on the data.

Relational Terminology

Here is some terminology commonly used in relational databases:

Table (relation) : The basic logical organizational unit for data.
Data in a table is organized in:
- Rows (sometimes known as tuples)
- Columns (sometimes known as attributes), which have names.

For example, imagine a database which stores information about employees for a company. An Employees table might look like the following:

EmployeeID	Name	Department	Age	Salary
10063	Alfred J Prufrock	10	45	120000
204567	Angela Parfitt	30	36	78000
345890	Michael Solistes	20	28	90000
89567	Charles Schultz	20	56	230000
56435	Barbara Smith	30	48	134000
...
...
...

and a Departments table might look like:

DepartmentID	Name	Manager
10	Accounting	10063
20	Sales	89567
30	Information Technology	56435
...
...
...

Note that, in the above example, there are two columns which refer to rows in the other table:

The Department column in the Employees table contains DepartmentIDs to refer to the appropriate row in the Departments table.
The Manager column in the Departments table contains EmployeeIDs to refer to the appropriate row in the Employees table.

This is done in the relational model to avoid duplication of data, and to improve the data structure/organization.

Relational Data Types

Each column in a relational database table has a data type. The Relational Model emphasizes that each column's data type must be atomic; that is, a data type is indivisible and must be treated as a unit. Some common datatypes are:

INTEGER
REAL
CHARACTER(n)
CHARACTER VARYING(n)

Most relational databases today provide more complex types such as Binary Large Objects (BLOBs) and Character Large Objects (CLOBs), which may in fact be considered divisble by the user (although perhaps not by the database itself).

The value of a particular data type can:

Have a value of that data type, or:
Be NULL, which indicates no value, or unknown value.

Relational Operations

The Relational Model defines a number of operations that may be performed on tables, on rows, and on individual data elements.

Selection

The most important operation is selection. This is the operation of selecting one or more rows from a table which satisfy one or more predicates. A predicate is a statement about the truth of something (for example, whether the employee ID is equal to 1234, or whether his/her salary is higher than $20000.) The rows in the table are filtered on the basis of whether they satisfy the predicate(s).

Join

A common relational operation is that of a join. A join involves one or more tables, and involves combining rows and columns from those tables into a new set of data which also forms a (temporary) table with a new set of columns and rows. For example, if you wished to list all the departments and their managers in the above employees example, you would have to perform a combination of a join operation and a selection operation to come up with a table that looks something like:

Department	Manager
Accounting	Alfred J Prufrock
Sales	Charles Schultz
Information Technology	Barbara Smith
...
...

Union

It sometimes happens that two or more tables have the same structure of columns and data types. If you wished to produce a report that combined the data from the two tables, you would perform a union operation to accomplish this. For example, you might have a Customers table which could contain similar information to the Employees table (except of course for the Employee ID and Department columns). You could use a combination of selection, join and union to come up with a table that contains rows of data from both the Employees and the Customers tables.

Closure

A very important aspect of the relational model is the concept of closure, which says that the result of any operation on one or more relations is also a relation.

The Transaction Model

Most relational databases support the concept of a transaction. A transaction is a unit of work performed against a database which must either completely succeed or completely not succeed. For example, imagine that you work for a bank, and you wish to transfer money from one account to another account. To accomplish this, you would have to perform the following steps in the following sequence:

Withdraw the desired amount from the first account
Deposit that amount into the second account

But what if something went wrong after step 1) that prevented you from completing step 2)? Without the concept of a transaction, you would likely lose track of the money withdrawn, and the banks books would no longer balance. If you perform the two steps within a transaction, the database takes care to ensure that one of the following occurs:

(Eventually) both actions are completed,

or:

(Eventually) neither action is completed.

Commit vs Rollback

When a transaction is ready to complete all its operations, it asks the system to commit those actions.

When a transaction discovers difficulty with performing its actions, and decides that all those actions cannot be performed, then it asks the system to rollback any actions that it has (provisionally) performed.

The ACID Properties

It is generally agreed that a database transaction must have the following "ACID" properties:

Atomic: A transaction should either complete all its actions, or none of them should be performed. (For example, a money transfer is guaranteed either to completely succeed, or to have no effects on the system.)
Consistent: A transaction should only perform correct transformations between valid system states (For example, An employee should always have an employee ID and an associated department; A department should always have a Manager assigned, etc.)
Isolated: While a transaction is performing changes in a system's state, the data may at various points be inconsistent. Such inconsistency should not be visible to other transactions. The system must give a transation the illusion that it is running in isolation.
Durable: Once a transaction has committed, the changes it has made to the database must be preserved, even if the system crashes (due to either hardware or software failure).

The page was last updated February 19, 2008