Conventional data quality tools are designed for specialist data administrators working in concert with data stewards, the technically oriented business users. The conventional approach to data quality is typically centralized within the IT management group and depends on specialist analysts and batch processes. Periodically, an administrator profiles existing data and applies software tools to normalize and standardize entities such as street and business names. Once the data is standardized, additional routines attempt to identify duplicate records and errors, and specialists then initiate business processes to merge the previously fragmented data.
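As a rough illustration of that batch workflow, the sketch below (standard-library Python; the records, field names, and normalization rules are all hypothetical) profiles a small customer table, applies simple standardization rules to street and business names, and flags likely duplicates for a steward to review. Production tools use far richer rule sets, but the shape of the process is the same.

```python
# Minimal sketch of a conventional batch data-quality pass (hypothetical data).
# Steps: profile -> standardize -> flag likely duplicates for a steward to review.
from collections import Counter
from difflib import SequenceMatcher

records = [
    {"id": 1, "business": "Acme Corp.",       "street": "100 Main Street"},
    {"id": 2, "business": "ACME Corporation", "street": "100 Main St"},
    {"id": 3, "business": "Bravo Ltd",        "street": "22 Oak Avenue"},
]

# 1. Profile: how many distinct spellings exist per field?
for field in ("business", "street"):
    values = Counter(r[field] for r in records)
    print(f"{field}: {len(values)} distinct values across {len(records)} records")

# 2. Standardize: apply simple normalization rules (illustrative, not exhaustive).
ABBREVIATIONS = {"street": "st", "avenue": "ave", "corporation": "corp", "corp.": "corp"}

def standardize(text):
    words = text.lower().replace(",", "").split()
    return " ".join(ABBREVIATIONS.get(w, w) for w in words)

for r in records:
    r["business_std"] = standardize(r["business"])
    r["street_std"] = standardize(r["street"])

# 3. Match: flag pairs whose standardized fields are near-identical.
def similar(a, b, threshold=0.85):
    return SequenceMatcher(None, a, b).ratio() >= threshold

for i, a in enumerate(records):
    for b in records[i + 1:]:
        if similar(a["business_std"], b["business_std"]) and similar(a["street_std"], b["street_std"]):
            print(f"Possible duplicate: record {a['id']} and record {b['id']}")
```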
Obviously, this conventional approach is a start. By profiling and analyzing data, organizations discover (often for the first time) how serious their data quality issues really are. By standardizing data, some of the inconsistencies that cause conventional software to fail are eliminated.
But how did these data quality issues arise in the first place? And how can organizations prevent them from recurring? Conventional approaches don't really address either question.
Data quality problems don't happen by accident. They usually reflect one or more of the following:
• Errors at the point of data entry
• Inconsistencies and differences among internal systems
• Inconsistencies between third-party data and the various internal data sources
The conventional data quality approach doesn't address any of these problems at the root; it cleans up problems after the fact. A better alternative combines the conventional approach with one that is more dynamic, less centralized, and not dependent on rigid data standardization: one that attacks data quality issues at the root cause, in real time.
So one key to data quality is to empower the people responsible for data entry to find what they're looking for at the start of the process. Users need an easy-to-use facility, built into their existing business processes, that works reliably despite typical data errors and inconsistencies. The trick is to ensure that users find the right match the first time: studies show that front-line personnel resist having to submit multiple requests. Conventional, batch-oriented data quality software doesn't help here at all. Nor should users have to work in a separate GUI; the matching facility should be integrated with the application they're already using and require minimal change to the existing workflow.
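One way to picture such a point-of-entry lookup is the small sketch below (Python standard library; the function name, customer list, and thresholds are hypothetical). Before a new record is created, the entry form asks for likely existing matches so the user can pick one on the first attempt, even when the typed value contains a typo.

```python
# Sketch of a point-of-entry lookup (hypothetical names and data): before creating
# a new customer record, the entry form asks for likely existing matches so the
# user can select one instead of creating a near-duplicate.
from difflib import get_close_matches

EXISTING_CUSTOMERS = {
    "acme corp": 1001,
    "bravo ltd": 1002,
    "consolidated widgets inc": 1003,
}

def suggest_matches(user_input, max_results=3, cutoff=0.6):
    """Return (name, id) pairs that plausibly match the typed value, typos and all."""
    key = user_input.strip().lower()
    names = get_close_matches(key, EXISTING_CUSTOMERS.keys(), n=max_results, cutoff=cutoff)
    return [(name, EXISTING_CUSTOMERS[name]) for name in names]

# The entry form would call this as the user types; a mistyped "Acmee Corp"
# still surfaces the existing record on the first attempt.
print(suggest_matches("Acmee Corp"))   # -> [('acme corp', 1001)]
```

Because the lookup is a simple call behind the existing form, it fits into the current workflow rather than adding a separate screen or a second pass.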
The most significant business benefits lie downstream, in the economies of scale and efficiency gains that go unrealized today because organizations throw people and repeated work effort at the many problems created by poor data quality.