Referential Integrity or Domain Integrity

Well, as the standard answer you get from any architect, it depends.

 A few questions to ask:

  
1)    Are the permissible values maintainable by user (admin user)?

2)    Is there any business logic key off from some specific permissible value?

 Normally, the answers to above 2 questions are exclusive of each other. If it needs to be maintainable, then there should not be business logic key off from some specific value; if there is some business logic key off from some specific permissible value, then it is not possible to make it maintainable.

Take an example of Citizenship. Suppose we want it to be maintainable, let’s imagine there is a table with the name of Citizenship with 2 columns (Id and Description). We allow user to add/ edit/ delete rows from this table.  The data is referenced in Person Table. Suppose we have 3 rows data in Citizenship table as following:
    1.   US citizen
    2.   US Green card holder
    3.   Foreign

Supposed one day, we got a new requirement saying that when the person is Foreign, we need to ask his or her Visa number and Visa Expiration date.
With the data as it is, one way or another, we got some logic in our code saying

If (Citizenship = 3 ‘Foreign)
{
//todo: ask for Visa information
}

 Okay, mission accomplished. We pet ourselves at back for the job well done.
Remember the data in Citizenship table is maintainable by user. Suppose one day, our Mr. User changed the data in the table into following:

1.    US citizen
2.    US Green card holder
3.    H1 Visa
4.    H4 Visa
5.    T1 Visa
6.    T2 Visa
7.    J1 Visa
8.    F1 Visa
9.    Visitor Visa

Will our logic continue to work? I bet you would agree with me when I say “it does not work. We have to do something more to make it work as expected”. This is what I meant “there are exclusive of each other”.

If the business logic meets the following criteria, Domain integrity is recommended:

  • If the permissible values are not maintainable by user, you can consider using domain integrity.
  • If there is some business logic key off from some specific permissible value, domain integrity is highly recommended.
If the business logic meets the following criteria, Referential integrity is recommended:
  • If there is no business logic key off from some specific permissible value, you can consider using referential integrity.
  • If the permissible values need to be maintainable by user, Referential is highly recommended.

What happen if the answers to these questions are as following and we do not know what they will be in the future?
  • There is no business logic key off from some specific permissible value.
  • The permissible values do not need to be maintainable by user.
In this case, either referential integrity or domain integrity works. We would make decision based on our bet on in the future which demission it will extended to. If the future is completely not visible to us at the point of time,  the recommendation is go with the simplest solution and domain integrity is simpler than referential integrity, so go with domain integrity.

 Typical candidates for domain integrity are something like Gender, Citizenship, and Status.

Typical candidates for referential integrity are something like City, Country and Company in the context of Address.

To use domain integrity, in most of cases, all it takes is define a check constant in the table defintion. The following is a simple example:

CREATE TABLE  Person
(
            Id Int IDENTITY(1,1) NOT NULL,
            LastName Varchar (20),
            FirstName Varchar (20),
            Gender tinyint NULL,
            CONSTRAINT PersonPK PRIMARY KEY CLUSTERED (Id),
            check (Gender =1 or Gender = 2)          
)

Let’s say we got 3 rows of data inserted into the table.

 insert into Person values ( 'Peter', 'Lu', 1)
insert into Person values ( 'John', 'Don', null)
insert into Person values ( 'Maggie', 'Yin', 2)

Let’s create a function as following:

CREATE  Function Gender (@GenderCode tinyint) RETURNS VARCHAR(10)

AS
BEGIN
RETURN
      CASE @GenderCode
            WHEN 1 THEN  'Male'
            WHEN 2 THEN  'Female'
            ELSE  'Unknown'
      END
END

This funciton will be even simplier in SQL/Server 2012 with 
CHOOSE ( index, val_1, val_2 [, val_n ] ) function

CREATE  Function Gender (@GenderCode tinyint) RETURNS VARCHAR(10)

AS
BEGIN
   RETURN CHOOSE ( @GenderCode  , 'Male', 'Female', 'Unknown')
END


When we issue the following query:

select
      id,
      LastName,
      FirstName,
      Gender as GenderCode,
      dbo.Gender(Gender) as  GenderDesc
from
      Person



We will get the rsult looks like the following:

1              Peter     Lu           1              Male
2              John      Don        NULL     Unknown
3              Maggie Yin          2              Female

In .net application side,

All it takes is define an enum as following:

 
public enum Gender: short
{
    Unknown = 0,
    Male = 1,
    Female = 2
}

Enum.GetValues<T> will provide a list of KeyValuePair, which can be used as data source for combo box.

Supposed we deccied to use Domain Integrity, and sometime down the road a new requirement comes to us saying it need to be maintainable. To change from Domain Integrity to Referental Integrity is very simple:

1) remove the check constant
2) add a master reference table and insert rows for permissible data
3) define a Foreign Key

Change from Referental Integrity  to  Domain Integrity is even simplier.

1) drop the Foreign Key
2) drop the reference table
3) add a check constant for permissible data
4) add a funciton for description

No comments:

Post a Comment