Archive for December, 2009

Data quality is for life, not just for Christmas

Thursday, December 10th, 2009

As Christmas rushes towards us, we’re once again reminded that those considering pets as gifts must keep in mind the ongoing responsibility they represent: “A dog is for life, not just for Christmas”. In considering this recently, I was struck that the adage could similarly be applied to data quality (without meaning to trivialise the original message). Data quality is not a one-off exercise, a gift to users or marketing campaigns, but an ongoing commitment that requires management buy-in and appropriate resourcing.

It’s well known that data decays rapidly, particularly in B2B, which must contend with individuals being promoted, changing jobs and moving companies, together with mergers, acquisitions, wind-ups and more. I often refer to this as the “Data Half Life”: the period of time it takes for half of a database or list to become out of date, which can be two years or less. It’s this fact that makes data quality maintenance an ongoing task and not simply something that can be done once, ahead of a big campaign or new system implementation.
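
To put a number on that, the decay behaves much like a radioactive half-life: the valid fraction of the database halves every “half life” period. Here’s a rough, purely illustrative sketch in Python (the two-year half-life is simply the example figure above; your own rate will vary by market and data source):

```
# Illustrative "Data Half Life" model: record validity decaying
# exponentially, by analogy with radioactive half-life. The two-year
# figure is the example from the text, not a universal constant.

def fraction_still_valid(years_elapsed, half_life_years=2.0):
    """Estimate the fraction of records still accurate after a given period."""
    return 0.5 ** (years_elapsed / half_life_years)

for years in (1, 2, 3, 5):
    print(f"After {years} year(s): {fraction_still_valid(years):.0%} of records still valid")
```

On that model, barely half the records are usable after two years and under a fifth after five, which is why a single clean-up buys so little time.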

Yet time and again, I’m asked how best to “clean up” a database in exactly such a situation, or I hear of efforts to do so. I’m not saying such an undertaking shouldn’t be made – it’s certainly better to do so than not – but the effort and expense is substantially wasted if it’s conducted on an ad hoc or piecemeal basis. Data immediately starts to decay: contacts move, addresses change, new records are added, inevitable duplicates are created, standardisation rules are disregarded, fields are left incomplete and other issues creep in. Very soon the data is in the same state as it was before “the big clean” took place.

It’s tempting, then, to suggest undertaking a batch cleanse on a regular basis, recognising these problems and trying to stay on top of them. Depending on the nature of your database, this could well be a viable approach, and might be quite cost-effective, particularly if you contract a bureau or data management supplier on an annual basis, say. But unless your database is a relatively static campaign management system that can be taken offline whilst such an operation is undertaken – which could take several days – this approach presents its own issues. Considerations here include what to do with data that changes in the live system whilst a copy is away being cleansed, and how to extract and reload updates whilst handling the merging of any identified duplicates.
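
One common way to tackle the first of those considerations is timestamp-based reconciliation: take a snapshot, send it off, and on its return apply the cleansed values only to records untouched in the interim, flagging the rest for manual review. A minimal sketch follows – the record structure and field names here are hypothetical, not any particular product’s schema:

```
from datetime import datetime

def reconcile(live_records, cleansed_records, snapshot_time: datetime):
    """Apply cleansed values only where the live record is unchanged."""
    needs_review = []
    for record_id, cleansed in cleansed_records.items():
        live = live_records.get(record_id)
        if live is None:
            continue  # deleted whilst the data was away being cleansed
        if live["last_modified"] <= snapshot_time:
            live.update(cleansed)           # untouched since extract: safe to apply
        else:
            needs_review.append(record_id)  # edited in the interim: merge by hand
    # Note: duplicates merged by the cleansing supplier would also need their
    # surviving record IDs mapped back onto any interim transactions.
    return needs_review
```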

Far better, though, is an approach to data quality management that builds quality into the heart of an organisation’s processes and operations – something along the lines I outlined here some time ago, incorporating Gartner’s “data quality firewall” concept. (This suggests that just as a network firewall should protect against unwanted intrusion, a data quality firewall should prevent bad data from reaching an organisation’s systems.) Ideally, one of the growing number of data quality software platforms should be deployed in order to create a framework for this environment (recognising that neither the issue nor the solution is solely one of technology). Competition in this area continues to erode the cost of such solutions, and indeed open-source vendor Talend even offer a version of their Talend Open Studio product as a free download.
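
In practice, the firewall amounts to validating and standardising every record at the point of entry, before it reaches the database, rather than repairing it afterwards. A minimal illustrative sketch – the rules shown are deliberately simple stand-ins for a real rule set, not any vendor’s actual API:

```
import re

# Illustrative "data quality firewall": validate and standardise an
# inbound contact record before it is allowed into the database.

EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def firewall(record):
    errors = []
    if not EMAIL_PATTERN.match(record.get("email", "")):
        errors.append("invalid email")
    if not record.get("company", "").strip():
        errors.append("missing company")
    if errors:
        raise ValueError(f"Rejected at the firewall: {', '.join(errors)}")
    # Standardise before the record ever reaches the database
    record["email"] = record["email"].strip().lower()
    record["company"] = record["company"].strip()
    return record
```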

Adopting this way of managing data quality is a longer-term play that may lack the one-off satisfaction of a quick clean-up, and one that will require maintenance, nurturing and care long after the initial “gift” of project approval is received. But just like a dog, this is a gift that will keep on giving in terms of operational effectiveness and business agility, making rapid and flexible campaign execution a reality and not a chore.