Exception Management Strategy for Developing Enterprise Systems on the .NET Platform

1         History of Exception

In the Microsoft world before the .NET Framework, there was no such term as “exception”. Back then we spoke of “error handling” rather than “exception handling”. With the release of .NET, Microsoft introduced the concept of the exception. While this looks like a mere change in terminology, it carries a more fundamental meaning.
“Exception” means that something happened at run time that the programmer did not anticipate. It may or may not be caused by a programming bug.
“Error” suggests that something bad happened because we, as programmers, did not write good code.

From this we can conclude that “all errors are exceptions, but not all exceptions are errors”. This gives application developers a better mindset for managing exceptions and for using them to build more robust enterprise applications.
2         Example of Exceptions

From a real-life development project, I picked 10 logged (unhandled) exceptions at random:

1)        System.Data.SqlClient.SqlException, The INSERT statement conflicted with the FOREIGN KEY constraint "WORKSTATION_POS_ACTIVITY_TXN_STATUS_ACTIVITY_ID". The conflict occurred in database "PeterLu-DB", table "dbo.WORKSTATION_POS_ACTIVITY", column 'ACTIVITY_ID'.  The statement has been terminated. 
2)       System.InvalidCastException Unable to cast object of type 'PeterLu.MyDreamApp.DrivingLicense' to type 'PeterLu.MyDreamApp.TitleRegistration'.
3)       System.InvalidCastException: Conversion from string "    " to type 'Integer' is not valid
4)       System.InvalidOperationException The stored procedure 'MyStoredProcedureName' doesn't exist.
5)       System.Data.SqlClient.SqlException: Procedure or function 'MyStoredProcedureName' expects parameter '@I_OFFICE_ID', which was not supplied.
6)       System.IndexOutOfRangeException: CLASS_ID at PeterLu.MyDreamApp.DrivingLicense.DataAccess.DrivingLicenseDAL.GetCustomer(Int32 CustomerId)
7)       System.NullReferenceException, Object reference not set to an instance of an object. at PeterLu.MyDreamApp.Server.BusinessObjects.Transaction.GetTitleAndRegistrationTransactionSummary(TransactionBE Transaction) in C:\TFS\MyDreamApp\Server\BAL\Transaction.vb:line 9771
8)       System.InvalidCastException Conversion from string "" to type 'Long' is not valid. at PeterLu.MyDreamApp.Server.DataAccess.Transaction.TransactionDAL.GetSearchResult(DbDataReader dataReader) in C:\MyDreamApp\Server\DataAccess\Transaction\Transaction.vb:line 7218
9)       System.Web.Services.Protocols.SoapHeaderException.    at AgencyWebServiceV31.Agency3InitVerif(String ClientSftwrVer, String AlienNumber, String I94Number, String SevisId, String PassportNumber, String VisaNumber, String ReceiptNumber, String NaturalizationNumber, String CitizenshipNumber, String UserField, Int32[] BenefitCodes, String DocLastName, String DocFirstName, String DocMiddleName, String DocBirthDate, Int32 DocumentId, String DocOtherDesc, String DocExpDate)     at PeterLu.MyDreamApp.Server.BusinessProcessController.ThirdPartyInterface.Send(myRequest as Request)
10)   System.Data.SqlClient.SqlException, Cannot insert the value NULL into column 'TXN_STEP_ID', table 'MyDreamApp-DB.dbo.USER_ASSIGNMENT'; column does not allow nulls. INSERT fails.  The INSERT statement conflicted with the FOREIGN KEY constraint …The statement has been terminated.


Scanning this list, you would probably agree that “all errors are exceptions but not all exceptions are errors”. You would probably also agree that, in our case, most of the exceptions that show up during the development stage are caused by programming bugs.

3          Strategies in dealing with Exception
 
3.1       Strategies
There are three lines of defense in dealing with exceptions:

1)      Write better code and better test cases to prevent exceptions (exception prevention)
2)      If an exception is not preventable, handle it (exception handling)
3)      If an exception is not predictable, record it (exception logging)
3.2       Case study

Now let me use these 10 exceptions to show how these lines of defense apply.

Case 1: this could be a programming bug in the stored procedure. Assume the stored procedure inserts a record into WORKSTATION_POS_ACTIVITY and then inserts a record into WORKSTATION_POS_ACTIVITY_TXN_STATUS. If the first insert does not succeed, the second insert fails with the error we see here. The questions to ask are: 1) Why did the first insert fail? Is it caused by another programming bug? 2) If there is a valid business case in which the first insert can legitimately fail, how should the stored procedure handle that logic? 3) Do we have a unit test that exercises the DAL method? Conclusion: this exception is preventable. Now look at what we actually do: we just log it in the LOG table. When that happens, what state is the business transaction in? Handling the case this way cannot be considered acceptable in a commercial enterprise system.

Case 2: this is clearly a programming bug. How can we cast a DrivingLicenseBE to a TitleRegBE? And when it happened, all we did was log it. In this case, exception logging hides the bug from the developer at development time. It would be better if, in the development environment, exceptions were thrown in the developer's face so that they could be addressed as early as possible.

Case 3: here the column's data type in the database is varchar, while the data type of the corresponding BE property is integer. The first question is why there is such an inconsistency. Suppose we accept the data model as it is, because at this stage of the game we do not want to change the database model. The next question is: if we get a blank or DBNull from the database, what should we do? (Logging may not be the best answer.) Assume the client's response is “take it as zero”. Then this is a handleable exception, and we should handle it instead of logging it and leaving the system in an uncertain state.

Case 4: this could be a programming bug (a typo) or an environmental issue. The developer may have mistyped the name of the stored procedure, or the stored procedure may have been removed by the deployment process. (That is unlikely in the development environment, but possible in other environments.) When it happens in the development environment, as in this case, it is most likely caused by a typo. Hence it is preventable, and again it would be better if the exception were thrown in the developer's face rather than logged in the Log table. If it happens in the UAT or production environment, logging it helps the troubleshooting process.

Case 5: similar to case 4, it could go either way. Since this is the development environment, it is likely caused by a mismatch between the .NET code and the stored procedure code. Again, as a developer I would like to know about it right away instead of having to look it up in the Log table. Better still would be to write a unit test for the DAO method at development time.

Case 6: similar to case 5, it is likely caused by a mismatch between the .NET code and the stored procedure code. Fixing the code prevents the exception.

Case 7: this is a programming bug, and the programmer would see it as soon as they debug the code. Of course, it is preventable.

Case 8: very similar to case 3; it is handleable.

Case 9: this is the only case that is neither handleable nor preventable; logging may be the only thing we can do.

Case 10: this is caused by a programming bug and is preventable. Again, as a developer I would like to know about it right away instead of having to look it up in the Log table. Better still would be to write a unit test for the DAO method at development time.

Let’s summarize these 10 cases. We can make the following observations:
1)     Most of the exceptions are caused by programming bugs and are preventable.
2)     By logging them to the Log table, the developers missed the opportunity to fix these bugs in the development environment. There are two possible solutions:
a.    Do not log exceptions in development (controlled by a configuration setting)
b.    Develop unit test cases that execute the code while bypassing the exception logging process in the BPC
3)     The handleable exceptions were not handled where they should have been.
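
One way solution (a) could look in VB.NET is an exception filter. The sketch below is only an illustration under assumptions: the module, the RunBusinessProcess wrapper and the MYAPP_LOG_EXCEPTIONS environment variable are hypothetical names, and the console write stands in for the Log table. When the switch is off, the When clause evaluates to False, the exception is not caught at all, and the IDE can break on it as unhandled.

```vbnet
Imports System

Module DevelopmentLoggingSketch
    ' Hypothetical switch: set MYAPP_LOG_EXCEPTIONS=true outside development.
    Function LoggingEnabled() As Boolean
        Dim setting As String = Environment.GetEnvironmentVariable("MYAPP_LOG_EXCEPTIONS")
        Return String.Equals(setting, "true", StringComparison.OrdinalIgnoreCase)
    End Function

    ' Hypothetical wrapper playing the role of the logging code in the BPC.
    Sub RunBusinessProcess(work As Action)
        Try
            work()
        Catch ex As Exception When LoggingEnabled()
            ' Stand-in for writing the exception to the Log table.
            Console.Error.WriteLine("Logged: " & ex.ToString())
        End Try
    End Sub

    Sub Main()
        ' Simulates case 4. With the switch off, this surfaces as an unhandled exception.
        RunBusinessProcess(Sub() Throw New InvalidOperationException("The stored procedure doesn't exist."))
    End Sub
End Module
```

Run without the variable set, the program stops on the exception, which is the behavior wanted in development; with the variable set to true, the exception is logged and the program continues.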

With that, let’s move on to how each strategy is implemented in the enterprise application development process.
3.3       Exceptions prevention

Like many developers (programmers), you first write simple code for the functionality you are tasked with. You assume the world is perfect and everything behaves as it should.

Then you test your code. Sure enough, you get some exceptions (like the ones in case 4 or case 5, or any of the other cases for that matter). You take a closer look and fix your code. If you coded the stored procedure name wrong, correct it; if you forgot to provide a parameter to the stored procedure, provide it. And so on and so forth.

Then you test your code again, and this time you get some other exceptions (like those in cases 3, 6 and 8). You would want to look at the stored procedure and at the table definitions for the column types and column values. Then you would further modify your code, for example to use TryParse instead of Parse, or a System.Convert method instead of CType.

Note that up to now you have not used “try-catch-finally” at all. You are still in exception prevention mode. Because you do not have a “try-catch” block yet, and especially because you do not have “Catch ex As Exception”, every exception is thrown in your face the moment it occurs: by default, the IDE is set to “break when exceptions are unhandled” (go to <Debug><Exceptions> to check how it is set up in your IDE).
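
As a small illustration of staying in prevention mode, the sketch below replaces a CType conversion with Integer.TryParse for the blank string from case 3. The module and the ToIntegerOrDefault helper are hypothetical names, and the fallback of zero is just the “take it as zero” answer assumed in case 3.

```vbnet
Imports System

Module ParsingSketch
    ' Returns the parsed value, or the fallback when the text is blank or not numeric.
    Function ToIntegerOrDefault(text As String, fallback As Integer) As Integer
        Dim parsed As Integer
        If Integer.TryParse(text, parsed) Then
            Return parsed
        End If
        Return fallback
    End Function

    Sub Main()
        ' CType("    ", Integer) would throw InvalidCastException here, as in case 3.
        Console.WriteLine(ToIntegerOrDefault("    ", 0)) ' prints 0
        Console.WriteLine(ToIntegerOrDefault("42", 0))   ' prints 42
    End Sub
End Module
```

No try-catch block is involved, so any other unexpected exception still surfaces in the IDE.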

3.4       Exception Handling

You will soon find that there is only so much you can do with exception prevention; some exceptions are not preventable. That is when you move from exception prevention mode to exception handling mode. (That does not mean you never go back to exception prevention mode later.)

You will find that exceptions like the ones in cases 3 and 8 are not preventable within your scope of work. Ideally the database model should match the object model, but more often than not it does not. The following are a few possible reasons:
1)     The database model is under the database team's control, while the object model is under the development team's control. More often than not, the database model lags behind the object model, and sometimes the database model is finalized while the object model is still undergoing daily changes.
2)     The database model is determined by another part of the application, and you are not in a position to change it.

Let’s accept the fact that there will be some mismatches from time to time. These are a few common discrepancies:
1)     The data type of a column is varchar(n), while the data type of the corresponding property is a numeric type
2)     The column is varchar(100) in the database, while the data type of the corresponding property is String (rather than a string limited to 100 characters)
3)     The column is nullable in the database, while the data type of the corresponding property is not nullable
I am sure you have seen many more than are on this list. What do we do as developers? We deal with them through exception handling. What are the principles of exception handling?

Now, let’s establish some basic guidelines for exception handling (not exception logging):
1.     Exception handling should be as precise as possible.
a.     This means you should not just catch Exception; instead, you should catch specific exceptions such as SqlException, InvalidCastException or NullReferenceException and handle them.
b.    It also means you should not put a big block of code into the Try block.
2.     Exception handling should be done as close to the source of the exception as possible.
a.     This means you put specific lines of code into the try-catch block and handle them logically, so that the flow of the application is maintained.
b.    It also means you put the “try-catch” block where the exception occurs, wherever that is; sometimes that may be in the BE classes.
3.     Do not add exception handling logic until you have seen the exception being thrown. Do not add exception handling logic based on hypothetical assumptions. So the process of exception handling is:
                   1) Write a test case that makes the exception occur
                   2) Handle the thrown exception
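
To make guidelines 1 and 2 concrete, here is a minimal sketch of a precise, local handler for the conversion failure seen in case 8. The module and the ReadTransactionId helper are hypothetical, and returning zero is an assumed business rule, not something the exception itself dictates.

```vbnet
Imports System

Module NarrowCatchSketch
    ' Hypothetical helper: converts one raw column value to Long.
    Function ReadTransactionId(rawValue As Object) As Long
        Try
            ' Only the single line that can fail sits inside the Try block.
            Return CType(rawValue, Long)
        Catch ex As InvalidCastException
            ' Assumed business rule: treat an unconvertible value as zero.
            Return 0L
        End Try
    End Function

    Sub Main()
        Console.WriteLine(ReadTransactionId(""))     ' prints 0
        Console.WriteLine(ReadTransactionId("7218")) ' prints 7218
    End Sub
End Module
```

Because only InvalidCastException is caught, any other exception raised by this line still propagates and stays visible.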

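Now, let me go a bit deeper, using a Data Access Layer method of the kind behind the cases above: one that calls a stored procedure and builds entity objects from the result.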

First of all, let’s see what the possible exceptions are (those not caused by a programming bug).
1.     An exception because the database connection cannot be established
2.     An exception because the stored procedure cannot be found
3.     An exception due to a casting error while building the entity objects

The second step is to identify which exceptions you want to handle.
When the first type of exception is thrown, we do not even have a Database object to discard or close; when the second type is thrown, the Command object has not been established. In the development environment you want to check your connection string and stored procedure name; in the production environment you want to log the exception for troubleshooting. In both cases, you either do not want to handle it or are not able to handle it. Considering that the GC is running behind the scenes, you do not even need a finally block to close them. So “no catch”, “no finally”, hence no try.

The only exceptions we can and should handle (or, even better, try to prevent) are certain InvalidCastExceptions. We should take a closer look at the entity class and the database schema: do we have some kind of data type mismatch? Another kind of InvalidCastException can be caused by DBNull. In this case the column is nullable in the DB, but the property in the BE is not. When we try to cast DBNull to an integer, an exception is thrown.
Once you have identified the specific line of code that can cause the exception, you can put recovery logic in the catch block. Commonly used recovery logic is as follows:
1)     Set the value to zero in the BE when an InvalidCastException is thrown
2)     If the value is DBNull, set the property to null in the BE if the property is nullable
3)     Take another default value specified by the business requirements
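
The second recovery rule can often be applied without a catch block at all, by checking for DBNull before converting. The sketch below is an illustration only: it uses an in-memory DataTable in place of a real query result, the ToNullableInteger helper is a hypothetical name, and the TXN_STEP_ID column is borrowed from case 10 purely as an example.

```vbnet
Imports System
Imports System.Data

Module DbNullSketch
    ' Maps a nullable database column value to a nullable Integer property.
    Function ToNullableInteger(value As Object) As Integer?
        If value Is Nothing OrElse Convert.IsDBNull(value) Then
            Return Nothing
        End If
        Return Convert.ToInt32(value)
    End Function

    Sub Main()
        ' In-memory stand-in for rows returned by a stored procedure.
        Dim table As New DataTable()
        table.Columns.Add("TXN_STEP_ID", GetType(Integer))
        table.Rows.Add(DBNull.Value)
        table.Rows.Add(5)

        For Each row As DataRow In table.Rows
            Dim stepId As Integer? = ToNullableInteger(row("TXN_STEP_ID"))
            ' Prints "(null)" for the first row and "5" for the second.
            Console.WriteLine(If(stepId.HasValue, stepId.Value.ToString(), "(null)"))
        Next
    End Sub
End Module
```

If the BE property is not nullable and the business rule is “take it as zero”, stepId.GetValueOrDefault() gives the first recovery rule from the same helper.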
Sure enough, it takes more effort to complete the method. But it will also generate fewer bugs later on and create a far better user experience. Isn't that what a commercial enterprise product should do?
Then, for each exception, decide what logic is needed in the finally block. For this case, I believe we do not need a finally block to close the database or DataReader object.
One last argument. From time to time I am told: “To play it safe, we should put in a finally block to close the data reader object, just so the data reader is closed in any case.” It sounds logical. However, the only difference between closing the DataReader on a line after the while loop and closing it in a finally block arises when an unhandled exception happens. My take on this is: as a developer, isn't it in your interest to find out what exception happened and how to handle it? When some other exception happens, your program logic is not performing the duty it was designed to perform; in that situation, is an unclosed data reader really the most important matter? Finally, because of the “connected” nature of the data reader, we are better off using a DataSet instead, and the GC takes care of DataSet objects that go out of scope. In that case we have one less reason to put a finally block here.

3.5       Exception Logging

The final line of defense is exception logging or, to be precise, unhandled exception logging.

Once all handleable exceptions are handled, the leftover ones are unhandled exceptions.

The purposes of unhandled exception logging are as follows:
1.       Provide information for system troubleshooting
2.       Identify unhandled exceptions that could be prevented or handled. Based on the logged information, one can file a defect asking the developer to prevent or handle the exception

With these objectives in mind, it is commonly agreed in the community that
1)      The logged unhandled exceptions should be human-readable.
2)      They should be made available to the production support team and the development team.
3)      For the sake of product (system) quality, there should be some kind of review process that reviews the logged unhandled exceptions and creates defects to address them.

Now the last set of questions to answer: how to log and where to log? And besides logging, do we want to notify the users of the system?
1.       Logging of unhandled exceptions should happen at the boundary of the application domain.
2.       When the application interacts with humans, it needs to notify the users.
3.       Different application domains should log their unhandled exceptions independently, and the log entries should be able to reference one another.
4.       Within the same application domain, an unhandled exception should be logged only once.
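
If “application domain” is read as a .NET AppDomain, one possible way to log at the boundary, and only once, is the AppDomain.UnhandledException event. This is a sketch under that assumption: the module and handler names are hypothetical, the console write stands in for the real log store, and the generated Guid is just one way to give each entry an identifier that other logs could reference.

```vbnet
Imports System

Module BoundaryLoggingSketch
    Sub Main()
        ' Register a single handler at the boundary of the application domain.
        AddHandler AppDomain.CurrentDomain.UnhandledException, AddressOf LogUnhandled

        ' Simulated unhandled exception from somewhere deep in the application.
        Throw New InvalidOperationException("Simulated unhandled exception")
    End Sub

    Private Sub LogUnhandled(sender As Object, e As UnhandledExceptionEventArgs)
        Dim ex As Exception = TryCast(e.ExceptionObject, Exception)
        ' Hypothetical correlation id so that other logs can reference this entry.
        Dim correlationId As Guid = Guid.NewGuid()
        Console.Error.WriteLine("{0:u} [{1}] {2}", DateTime.UtcNow, correlationId,
                                If(ex IsNot Nothing, ex.ToString(), "Unknown error"))
    End Sub
End Module
```

The handler only records the exception; it does not swallow it, so the exception still reaches the .NET runtime after being logged.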

One should pay extra attention to unhandled exception logging in the development environment, making sure that unhandled exceptions are thrown through to the .NET runtime. This gives the IDE the opportunity to break into the code when an unhandled exception is thrown. By default, that is how the IDE is set up: “break when exceptions are unhandled” (go to <Debug><Exceptions> to check how it is set up in your IDE).
