Computers are used almost everywhere in the modern world. However, a computer can only produce correct information if the data it is given is correct. If a machine is given incorrect or unreliable data, it will produce incorrect and unreliable information.
Because our world revolves around data, we need it to be accurate and reliable. Only precise data is useful to us. It is no good having data that might be correct; it has to be correct if we want correct information.
To stay reliable, data needs to be up to date and complete. Old data is useless to us: if a person moved house and we were still sending information to their old address, they would simply not get it. Data also has to be complete. Incomplete data, such as a postcode missing a letter or number, is equally useless because we can’t be sure the person will ever get the information we send them.
Data is most often corrupted during data capture (surveys, questionnaires, school forms and so on). Data capture is simply the collecting of data for computer processing: the computer needs data to process, and data capture is how we get it. The most common way of collecting data is via data capture forms, and a badly designed form can cause many problems when the computer tries to process the data. Data capture forms need to be complete, easy to read and easy to fill in. They need to tell the user what to do, and they need to be simple, useful and accurate.
Suppose you were filling in a form for your school and it said ‘Date of Birth’ but gave no indication of the format expected (for example dd/mm/yy). The answers would then have to be sorted through before being entered into the computer, which takes up more time.
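As a minimal sketch of why a stated format helps, the Python below accepts a date of birth only when it matches dd/mm/yy. The function name `parse_date_of_birth` and the sample values are illustrative assumptions, not part of any particular form system.

```python
from datetime import date, datetime
from typing import Optional

# Assumed format for this sketch: dd/mm/yy, as a form might state it.
DOB_FORMAT = "%d/%m/%y"


def parse_date_of_birth(text: str) -> Optional[date]:
    """Return a date if the text matches dd/mm/yy, otherwise None."""
    try:
        return datetime.strptime(text.strip(), DOB_FORMAT).date()
    except ValueError:
        # Wrong layout, or an impossible date such as the 31st of February.
        return None


print(parse_date_of_birth("25/12/90"))  # 1990-12-25
print(parse_date_of_birth("12/25/90"))  # None (there is no month 25)
```

A two-digit year is itself ambiguous, so a form that asks for dd/mm/yyyy would probably be the safer choice.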
Address fields also need to be split up. It is no good having three free-form lines for an address if the computer only needs to process, say, the postcode. Fields should therefore be set out so that the data you need can be accessed easily and is already in the format you require.
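One way to picture split-up address fields is a record with one named field per item, so the postcode can be read and checked on its own. This is only a sketch: the `Address` class, its field names and the sample addresses are hypothetical, and the pattern is a simplified approximation of UK-style postcodes rather than the full official rules.

```python
import re
from dataclasses import dataclass

# Simplified UK-style postcode pattern (an assumption for illustration only).
POSTCODE_PATTERN = re.compile(r"^[A-Z]{1,2}[0-9][A-Z0-9]? ?[0-9][A-Z]{2}$")


@dataclass
class Address:
    """An address stored as separate fields instead of free-form lines."""
    house: str
    street: str
    town: str
    postcode: str

    def postcode_looks_complete(self) -> bool:
        """Completeness check: is the postcode the expected shape?"""
        cleaned = self.postcode.strip().upper()
        return POSTCODE_PATTERN.match(cleaned) is not None


good = Address("10", "High Street", "Leeds", "LS1 4AP")
bad = Address("10", "High Street", "Leeds", "LS1 4A")  # last letter missing

print(good.postcode)                   # LS1 4AP
print(good.postcode_looks_complete())  # True
print(bad.postcode_looks_complete())   # False
```

Because the postcode has its own field, the program never has to search through address lines to find it.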
There are three methods by which data can be captured: Manual, Semi-Automatic and Automatic.
* Manual – Paper-based data capture forms, such as the questionnaires and surveys you are asked to fill in when you buy certain products.
* Semi-Automatic – Data capture is shared between manual and automatic methods. Optical Mark Recognition works this way: when you answer a questionnaire by filling in a certain box, you have entered the data manually, but it is input and processed automatically.
* Automatic – Data is captured and processed by a computer with no human intervention. A good example is a weather station, which has devices that capture and process temperatures automatically.
Each of these has its advantages. Automatic data capture is generally faster and more accurate than manual data capture because there is less room for human error. Manual data capture, however, is sometimes the only practical way of capturing the data (handwritten names and addresses, for example, are difficult for computers to read reliably). Semi-Automatic capture offers the best of both worlds: it has the speed of automatic processing and can handle things that fully automatic capture cannot.
However, data capture can be unreliable whether it is automatic or manual, so data still has to be checked to make sure it is accurate and complete.
There are two main ways in which data can be checked: Verification and Validation.
Verification compares data with the same data elsewhere to make sure the two match. An example is proofreading, which is done manually rather than by a computerized process.
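The same comparison can also be carried out by a program. The sketch below uses double entry, a technique the text does not mention: the data is typed in twice and any fields that differ are flagged. The function name and the sample records are hypothetical.

```python
def verify_double_entry(first: dict, second: dict) -> list:
    """Return the names of fields whose values differ between two entries."""
    fields = sorted(set(first) | set(second))
    return [name for name in fields if first.get(name) != second.get(name)]


# The same (made-up) form typed in twice, with a slip in the second copy.
entry_one = {"name": "A Smith", "postcode": "LS1 4AP"}
entry_two = {"name": "A Smith", "postcode": "LS1 4PA"}

print(verify_double_entry(entry_one, entry_two))  # ['postcode']
print(verify_double_entry(entry_one, entry_one))  # []
```

A mismatch only shows that the two copies disagree. A person still has to go back to the original form to see which one is right.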
Validation ensures that data is sensible by performing checks such as:
* Range checks
* Data type checks
* Hash totals
* Check Digits
* Checks for transposition errors (two characters swapped round)
* Checks for transcription errors (a character copied incorrectly)
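The first three checks in the list can be sketched in a few lines of Python. The function names, the age limits and the student ID values are all made up for illustration. A hash total is taken here to mean the sum of a field that has no meaning as a total and is only used to compare a batch before and after entry.

```python
def is_whole_number(text: str) -> bool:
    """Data type check: can the text be read as a whole number?"""
    try:
        int(text)
        return True
    except ValueError:
        return False


def in_range(value: int, low: int, high: int) -> bool:
    """Range check: is the value between low and high inclusive?"""
    return low <= value <= high


def hash_total(records: list, field: str) -> int:
    """Hash total: add up one field across a whole batch of records."""
    return sum(int(record[field]) for record in records)


print(is_whole_number("14"))      # True
print(is_whole_number("twelve"))  # False
print(in_range(14, 11, 18))       # True  (assumed age limits of 11 to 18)
print(in_range(41, 11, 18))       # False

# Batch as written on paper, and the same batch after being typed in.
on_paper = [{"student_id": "1001"}, {"student_id": "1002"}, {"student_id": "1003"}]
typed_in = [{"student_id": "1001"}, {"student_id": "1002"}, {"student_id": "1030"}]

print(hash_total(on_paper, "student_id"))  # 3006
print(hash_total(typed_in, "student_id"))  # 3033, so something was mistyped
```

None of these checks proves that a value is correct. They only show that it is the right kind of value and falls within sensible limits.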
Validation is usually done by computers, using built-in programs such as a spell checker.
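A check digit shows how a computer can catch transposition and transcription errors without being told the right answer. The sketch below uses the ISBN-10 scheme purely as a familiar example; the text does not say which scheme it has in mind, and the function names are illustrative.

```python
def isbn10_check_digit(first_nine: str) -> str:
    """Work out the ISBN-10 check digit from the first nine digits."""
    # Weights run from 10 down to 2 across the nine digits.
    total = sum((10 - i) * int(digit) for i, digit in enumerate(first_nine))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def is_valid_isbn10(code: str) -> bool:
    """Return True if the last character matches the calculated check digit."""
    if len(code) != 10 or not code[:9].isdigit():
        return False
    return isbn10_check_digit(code[:9]) == code[9].upper()


print(is_valid_isbn10("0306406152"))  # True  (entered correctly)
print(is_valid_isbn10("0306046152"))  # False (transposition: 4 and 0 swapped)
print(is_valid_isbn10("0308406152"))  # False (transcription: 6 typed as 8)
```

Because each position has a different weight, swapping two neighbouring digits or mistyping a single digit changes the total, so the check digit no longer matches.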
Using a semi-computerized system is usually best for most data capture and processing. This way we can check for things that a computer can’t check for, and the computer can speed up our checks by doing calculations that would take us much longer.