THE ART OF DATA VALIDATION

Written by Stéphane Richard (Mystikshadows)

INTRODUCTION:

When you think of data validation, what's the first thing that comes to mind? Is it, by any chance, entering numbers when an application expects numbers from you, the user? If so, then you're already well on your way to understanding what data validation is all about. In essence, data validation is about having valid data for a given type of variable, and whether you are creating a business application or a game, good data validation will add a certain level of professionalism to your programming endeavours. There's more to data validation than simply validating the data, a lot more. As you'll see, there are a lot of things that can be done to help in the data validation process.

In this document, we will cover what data validation really means, how to use it effectively and, ultimately, how to minimize the use of data validation while still ensuring that the data is indeed valid. So get ready, the journey begins here.

DATA VALIDATION DEFINITIONS AND CONCEPTS:

Data validation, as explained above, is making sure that all data (whether entered by the user, read from a file or read from a database) is valid for its intended data type and stays valid throughout the application that is driving this data. What this means is that data validation, in order to be as successful as it can be, must be implemented in every part that gets the data, processes it, and saves or prints the results. Let's take the time to list those parts here and explain why they should be considered.

These are the three major areas where you'd want to be sure that data validation is applied properly: where the data is acquired, where it is processed, and where the results are saved or printed. Getting this right helps with the rest of the application, because you'll be able to rule out the data as the cause of errors that might slip into the application. That is, of course, once the data validation itself is debugged. Building these checks in during development, rather than after the program is done, will help minimize the burden of integrating data validation into your application. Let's take a look at the types of validation that exist and are commonly used, to give you an idea of where they can be used as well as how.
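To make the idea of checking at each of these areas concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the quantity-and-price example and the function names read_quantity, order_total and format_total are assumptions of this sketch, not part of any particular application. It simply shows one check where the data comes in, one before it is processed, and one before the result goes out.

```python
def read_quantity(raw: str) -> int:
    """Input stage: turn raw text into a whole number or raise ValueError."""
    text = raw.strip()
    # Accept plain ASCII digits only, so "three", "-3" and "2.5" are rejected.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a whole number: {raw!r}")
    return int(text)


def order_total(quantity: int, unit_price_cents: int) -> int:
    """Processing stage: re-check the values before computing with them."""
    if quantity <= 0 or unit_price_cents < 0:
        raise ValueError("quantity must be positive and price non-negative")
    return quantity * unit_price_cents


def format_total(total_cents: int) -> str:
    """Output stage: refuse to save or print a result that cannot be right."""
    if total_cents < 0:
        raise ValueError("total cannot be negative")
    return f"{total_cents // 100}.{total_cents % 100:02d}"
```

With these three pieces, format_total(order_total(read_quantity(" 3 "), 250)) gives "7.50", while read_quantity("three") stops the bad value at the door with a ValueError. The checks overlap on purpose: data that was valid when it was entered can still be changed by a bug before it is processed or saved.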

THE MANY TYPES OF DATA VALIDATION:

Why different types of data validation? The answer is simple. Essentially everything depends on the reason why you'd want to do data validation, its degree of importance (as far as the data available to the application is concerned) and the actual type of the data as well. Here's a quick example just to put things into focus. Let's say you're making a mortgage calculation program (that will be used by professionals all over the continent). Typically, this kind of application would get initial data for the calculation of the mortgage itself: a principal amount, the length of the loan, an interest rate, the date of the first payment, and the number of compounding periods per year. This would be in addition to the client's own information, such as address, telephone number and the like. Once the loan-related data is entered, the software would offer to generate an amortization table showing, for each payment, the payment amount, the amount that goes to interest, the amount that goes towards the capital itself, and some running totals at each payment period.

If the initial data ends up being wrong, can you imagine just how bad the amortization table would be at the end of the calculation? Don't you think that the entry screen should be 100% sure of its data before it goes ahead and generates the amortization table? The answer to this question ultimately must be yes, for the sake of both the customer and the financial institution that will be loaning the money. With this in mind, let's see what we can do to help the software make sure that the data is valid all the way.
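As a rough illustration of what "being sure of its data" could look like for the loan fields just listed, here is a small Python sketch. The LoanInput class, the validate_loan function and especially the limits (a term of 1 to 40 years, a rate of at most 30 percent, the allowed compounding periods) are assumptions I made up for the example. A real application would take its limits from the business rules of the institution using it.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class LoanInput:
    principal: float        # amount borrowed
    term_years: int         # length of the loan
    annual_rate_pct: float  # interest rate, in percent
    first_payment: date     # date of the first payment
    periods_per_year: int   # compounding periods per year


def validate_loan(loan: LoanInput, today: date) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if loan.principal <= 0:
        problems.append("principal must be greater than zero")
    if not 1 <= loan.term_years <= 40:              # assumed limit
        problems.append("term must be between 1 and 40 years")
    if not 0 < loan.annual_rate_pct <= 30:          # assumed limit
        problems.append("interest rate must be above 0 and at most 30 percent")
    if loan.first_payment < today:
        problems.append("first payment date cannot be in the past")
    if loan.periods_per_year not in (1, 2, 4, 12):  # assumed allowed values
        problems.append("compounding periods per year must be 1, 2, 4 or 12")
    return problems
```

The function collects every problem instead of stopping at the first one, so an entry screen could show the user all the fields that need fixing in one pass. The amortization table would only be generated when the returned list is empty.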

We will be using the example of a mortgage calculation application because it, and other directly or indirectly related businesses such as long-term rental agencies, banks and real estate firms, depend on the data being accurate, for the reasons mentioned above and a couple more that we'll see later in this document. So let's get right to it. The first thing we'll do is look at the different types of data validation that can be performed and describe them.

Many times people just don't bother implementing data validation at all while they are creating their application, and this usually results in very bad things happening that really could have been avoided with proper data validation. In a big project, the right data validation should be established at the screen design phase of development, as this is where you'd usually have time to talk about the role of each form and how important data validation would be for it. Let's see now what is available as far as tips and tricks to help eliminate, or at least minimize, the risk of error in your application.

REMOVING/MINIMIZING HUMAN ERRORS:

As mentioned, there are ways around data validation. Ever heard the expression "prevention is the best policy"? Well, as far as data validation is concerned, this saying definitely applies. The best place to prevent human error is of course at the data entry screen level. From there, giving the user the right means of entering their data can, in most cases, eliminate the need for data validation later on. Indeed, a good data entry screen will be equipped with all that's needed to validate the data long before it can become dangerous to other parts of the application. Depending on the data type at hand, different types of field-specific validation can be made available for different purposes. Let's see the different techniques and I'll explain what kind of data they apply to and how to use them effectively.
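One way to picture prevention at the entry screen is a field that only lets the user pick from a fixed list, so a mistyped value can never reach the rest of the program. The Python sketch below is an illustration of my own, not a technique taken from a specific library. The choose function and its ask and show parameters are hypothetical names, and a console prompt stands in for what would be a drop-down list on a real form.

```python
from typing import Callable, Sequence


def choose(prompt: str, options: Sequence[str],
           ask: Callable[[str], str] = input,
           show: Callable[[str], None] = print) -> str:
    """Keep asking until the user picks one of the listed options by number."""
    while True:
        # Show the numbered choices so the user never has to type the value.
        for number, option in enumerate(options, start=1):
            show(f"  {number}. {option}")
        answer = ask(prompt).strip()
        # Only a plain number within range is accepted; anything else re-prompts.
        if answer.isascii() and answer.isdigit() and 1 <= int(answer) <= len(options):
            return options[int(answer) - 1]
        show(f"Please enter a number from 1 to {len(options)}.")
```

A call such as choose("Payment frequency: ", ["weekly", "monthly", "yearly"]) can only ever return one of those three strings, so the code that receives the result has nothing left to validate for that field. The ask and show parameters exist only so the function can be exercised without a real keyboard and screen.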

The best thing I can say at this point is to use common sense when evaluating the need for data validation. Typically it's not worth implementing data validation on small projects that you make for yourself: since they're your creations, chances are you'll know what data you need to enter and you can deal with the consequences of wrong data. For a game (depending on how widespread the game is) you might want to consider it, if only to avoid answering the same questions over and over from users who entered something the game didn't expect. For business applications and commercial applications of any type, you'd usually want to be as sure as possible about data validation. There's no room to mess around in the commercial application industry, and if you can't be sure of what you are acquiring from the user, how can you guarantee the results for the businesses that will be using it? Inserting data validation where appropriate lifts your application's overall quality to a higher level, and in today's computer world, any level higher is an excellent thing. It gives your application the edge it needs to make itself worthy of consideration by those who need an application like yours.

Many developers today already have libraries, in many languages, to take care of at least part of the data validation process. Some even offer them for use in your own applications, so you could find them and download them for your own needs if you don't want to code them yourself. Personally, I like to use data validation in any project. Even in those I do for myself, I try to put myself in the user's shoes, think of how I'd like to see things done, and then make it happen. Sure, when you're creating the application (even one for yourself) you might think you'll always know what goes where. But what if life takes you away from that application for a couple of years and you come back to it then? Would you remember everything that goes everywhere? The moral of the story is that data validation can even protect you from yourself.

THE FINAL WORD:

And there you have it. Another aspect of the development process explained clearly, I hope. As a final word, I can say that data validation, in many cases, has helped me even in ways I didn't think it could. I think we've all heard about the KISS development method (Keep It Simple, Stupid), and data validation is one of the best tools you have to help keep things simple for the potential users of your application. Of course, the simpler you make it for your users, the harder (and consequently the longer) it usually is for you to create the solution. That goes without saying, but the rewards of such an effort are greater than the effort you put into the data validation process. Users will like your application, for one thing. They will also tend to use your application correctly, which results in fewer support phone calls and emails. That should always be your ultimate goal: to make something that won't cause you to stay up all night answering questions about your application.

Like everything else I write, I can only hope I was clear enough. If I wasn't, I would like to know; that's how one betters oneself, after all. So if there's a part of this article that just doesn't seem clear to you, email me. I'll see what I can do to make it clearer, and the change will be added to this document. Happy coding.

MystikShadows
Stéphane Richard
srichard@adaworld.com