Data integrity, Normalization Process and usage of triggers

.       Data Integrity

Data  Integrity is one of the cornerstones of the relational model and has been over the years incorporated in every RDBMS (Relational Database Management System) on the market. There are four types of integrity:
·         Domain integrity
·         Entity integrity
·         Referential integrity
·         Enterprise integrity

a.       Domain integrity
A domain defines the possible values of an attribute. Domain integrity rules govern these values. In a database system, the domain integrity is defined by:
The data type and the length
The NULL value acceptance
The allowable values, through techniques like check constraints or rules
The default value
b.      Entity integrity
The entity integrity states that every instance of an entity has to be uniquely identified. The existence of the primary key is the core of the entity integrity. If you defined a primary key for each entity, they follow the entity integrity rule. Some time, UNIQUE constraint is also employed to enforce entity integrity.
c.       Referential integrity
The referential integrity rules are enforced by the relationships between entities. As a starting point, the referential integrity rules state that a child instance cannot exist if there is no corresponding parent instance. For example, an order cannot exist without a matching customer, or an order detail cannot exist without the associated order.
d.      Enterprise integrity
The last type of integrity is enterprise integrity, also called business rules. These rules, generally implemented through programmatic methods, like stored procedures or triggers on the database server side, define the way the company works. For example, you can state that a customer cannot place a new order if he still owes more than $10,000, or that an order greater than $200,000 has to be approved by the sales manager before being shipped. Enterprise integrity is generally not defined in the data model, but rather in the function model.

In general, there are 2 types of implementation of Data  Integrity enforcement. Namely, Declarative Implementation and Procedural Implementation. The following is a table on how they are employed:


Integrity Type Declarative ImplementationProcedural Implementation
Domain integrityData type, nullability
DEFAULT constraint
CHECK constraint
UDTs
Default Rule ( MS recommends not to use Rules, as it will be faced out in the future version of database)
Entity integrityPRIMARY KEY constraint
UNIQUE constraint
UNIQUE index
Stored procedure
Trigger
Referential integrityFOREIGN KEY constraintStored procedure
rigger
Enterprise integrityNAStored procedure
rigger


             It is generally accepted practices  that always use Declarative Implementation whenever possible.

The concept of Data Integrity and guidelines described here are not specific to Microsoft or any specific RDBMS, rather for any RDBMS in general.  you can click the following links to find some reference in both MS SQL and Oracle:

From MSDN
http://msdn.microsoft.com/en-us/library/ms184276.aspx

from Oracle
http://download.oracle.com/docs/cd/B28359_01/server.111/b28318/data_int.htm#insertedID1

               
2.       The Relational Model and Normalization
Most of Modern  RDBMS engines support  up to Third Normal Form. The following are the academy definition of them:

·         A relational table is in First Normal Form (1NF) if:
                                                               i.      It has a primary key.
                                                             ii.      Each column is atomic.
                                                            iii.      There is no repeating group of columns.
·         A relational table is in Second Normal Form (2NF) if:
                                                               i.            It is in 1NF.
                                                             ii.            Every non-key column is fully functionally dependent on the primary key.
·         A relational table is in Third Normal Form (3NF) if:
                                                               i.            It is in 2NF.
                                                             ii.            Every non-key column is functionally dependent only on the primary key. In other words, a non-key column cannot be dependent on another non-key column.

A normalized database will be a database free of redundant data. However most of Database architects start with a normalized data model and then de-normalize the model for performance reason. A de-normalized database  will be a database with some redundant data.  It is important to note that application developer (who normally responsible of enforcing enterprise integrity, in another words, they write stored procedures) enjoy the benefit de-normalized database offers, but they do not need to worry about maintaining the redundant data.  triggers behind the tables or computed columns on the tables will be responsible for maintaining redundant data.

3.       Usage of the triggers
It is true that managing triggers could be challenging for some DBAs. Abuse of  triggers could cause poor performance, poor maintainability to the database. However use it  for appropriate purpose with well-defined guidelines would improve database performance and database Integrity and  development productivity. It is generally accepted practices  that triggers are only used for maintaining redundant data and establish audit trail of the transactional data.

No comments:

Post a Comment