Thursday, December 19, 2013

Bad smell in code - Data Class

In the book titled “Refactoring Improving the design of existing code”, the author wrote the following with regard to the data class.

 
These are classes that have fields , getting and setting methods ( setters and getters) for the fields, and nothing else. Such class are dumb data holders and almost certainly be manipulated in far too much detail by other classes. In early stages these class may have public fields. If so, you should immediately apply Encapsulate Field before anyone notices. If you have collection fields, check to see whether they are properly encapsulated and apply Encapsulate Collection if they aren’t.  Use  Remove Setting Method ( setters) on any field that should not be changed.

Looking for where these getting and setting method are used by other classes. try to use Move Method to move behavior into the data class. If you can’t move a whole method, use Extract Method to create a method that can be moved. after a while you can start Hide Method and the getters and setters.

Data class are like children, they are okay as a starting point, but to participate as grownup objects, they need to take some responsibilities

DTO is another form of data class, ( all properties and fields, no behaviors).. which is used to pass data between layers. as different layers have their own Business Entities, Transformation is need to transform between BEs and DTOs.  Take an example, supposed we have a Data Layer Component  ( DLC) and a Business Layer Component ( BLC), there are business entities in respective layers ( let’s call them BE-DLC and BE-BLC respectively. Then in between BLC and DLC, we will have another set of classes call DTO-BLC-DLC).

 
In this design, when BLC talking to DLC they use DTO-BLC-DLC passing data back and forth.  when  they want to passing data to the other side, they transform their own Bes into the DTOs and then pass them over, on the receiving party, after receiving the data in DTO format, it will transform them into their own Bes.  The goal is this design is to achieve independency between layers. if one component need to change its own Bes,  it only need to change its own transformer, there will be no impact to other parties. however, if the DTOs need to be changed, both parties need to be changed.

 
In the book titled “Microsoft .Net Architecting Application for Enterprise”, the authors wrote the following regarding DTO:

The use of DTOs in conjunction with BO ( Business Entity in our case)  is one of those topics that can trigger an endless, and pointless, discussion within the team.  The Theory suggests that DTOs are used in all cases to reduce coupling between layers and to lend the system greater formal neatness. the practices, though, often reminds us that the complexity is generally high enough that we should avoid any unnecessary additions. As a practical rule, you probably don’t want to double the numbers of classes just to be neater in your implementation. in this case, a DTA might likely be the same as BOs.

 
I  personally strongly against the use of DTO when all layers are under your design. For one, it increases the complexity of the system for sure, 4 sets of classes are added to implement DTO. DTO itself,  one transformer at each side.  plus the BE classes is each layers as comparing only one set of Bes for the system.  Secondly it negatively impact the overall performance due to the time taking for transformation. The fact is that when all components are under one architect / design team, most likely they are the same or can be made same. hence, most of  the transformation is  just direct mapping. but these classes live in different namespaces, even if it is direct mapping, the transformation is still needed. in a way, you transform an apple from DLC to an apple in DTO and then transform the apple in DTO to the apple in BLC. through they are repenting the same apple from business point of view, technically they are distinct objects. you cannot do compare between them, you cannot assign an instance of  one to a field with the type of the other. in summary: it is pure over-engineering practice and should be avoid whenever possible. It is viewed as one of
anti-patterns, Over Engineering.
However, DTO concept will offer its promise when you deal with third parties in your system. let’s say your system needs to interface with Exchange Server, or your system need to interface with Facebook. , or your system need to interface with the system run  by government agency ( e.g. SSA).  In these cases, you are not in position to influence their BE/DTO design, whatever they give you, you just take, no question asked. On the other end, in most cases, you are only interested at a subset of their services whether it is BEs. or DTOs  in these scenario, you get the library for their DTOs from them or you code them yourself and add in your own transformation logic in your components. in this way the changes in the interface would be isolated in transformation logic. your Business Entity will be much stable  than otherwise.

Other than the proper employment of DTOs, we should avoid data class in all other scenario. Plain and simple “, they need to take some responsibilities”, failure in doing so, it will produce following  undesirable result in your system:

 

1)    High class coupling

2)    Duplicated business logic

3)    high codebase size

4)    inappropriate intimacy

5)    lower maintainability and high complexity

 

In summary, cost more time and money to develop and do maintain, produce lower quality product in all aspects.

 
the following table shows  some metrics of a project I know about:
 
Component
Cyclomatic Complexity
Maintainability Index
Lines Of Code
# of classes
BE (Business Entity)
6.329%
94%
66,259
1137
BLC ( Business Layer Component)
47.486%
69%
39,500
263
UI (User Interface)
21.894%
82%
398,612
406
SLC(Service Layer Component)
32.226%
69%
31,149
235
UIP ( User Interface Process)
41.095%
70%
30,293
135
DLC ( Data Layer Component)
18.537%
53%
66,327
220
It is a fairly large system, about one million LOCs.  From the figures here one could tell the BE could be more accurately described as "Data Entity" or "Data Class".  Together with other technical challenges,  it leads the project to some kind of undesired state.
 
 
Ok, one would ask, we have what we have here.  You listed all kind of reasons why it smells bad, and what kind of results I will face. more importantly, how we can go from here? the author offered a few technics in the paragraph I quoted above:


  1. Encapsulate Field
  2. Encapsulate Collection
  3. Remove Setters if can be made read only
  4. Move Method to move behavior into the data class
  5. Extract Method and move

I will add one more there, Encapsulate business logic as read only property or methods in BE classes whenever you can and remember to name them properly so that others in the team could be benefited from your work. .

 
Let’s collectively give our BE components some more intelligence and give them some more responsibilities. let’s gown it from baby infant into its adulthood .

No comments:

Post a Comment