Archive for the ‘Data quality’ Category

Does Data Quality matter?

Friday, June 11th, 2010

Very touched to have been asked to speak at the recent British Computer Society/Data Management Association joint meeting on the topic of Does Data Quality matter? Rather than writing a lot of words here, I thought I’d just share the slides from the talk!

Does data quality matter? View from the business

View more presentations from Percassity Solutions.

When to stop flogging a dead horse

Tuesday, April 13th, 2010

There’s a strong tendency when planning a data selection for a forthcoming campaign or programme to pull as much as possible in order to maximise the reach of the activity and corresponding response. This is nearly always self-defeating however, and not least when it comes to using every record meeting your selection criteria, regardless of how long ago it was collected or when any kind of response was last received. Even if such data is not obviously out of date, there are many reasons to exclude it from ongoing activity.

Although this is likely to be more of an issue for email activity than for relatively more expensive direct mail, it’s still applicable to both. The greater cost involved with DM creates a natural incentive to fine-tune selections ahead of launching a campaign. Even so, it’s extraordinary how poorly targeted such activity can often still be, with the obvious parameter of data age not taken into account.

The seemingly next-to-nothing cost of email though makes it easy to think that there is no impact to using all available data, but as we all know (even if we don’t necessarily acknowledge it) this is not the case. Diligent email marketers will of course remove bounced email addresses from their lists in order to maintain a clean database and eliminate records known to be no longer active (although not always, see Email bounces and database updates). And it goes without saying that opt-outs and unsubscribes must be removed in order to maintain privacy compliance. Other than that, if you’ve got a usable record, use it, right?

Well, an obvious effect of taking this approach is to actually diminish your percentage open rates, since the opens that you do achieve will be diluted by all those disengaged recipients. Now you might be thinking that this is just damned lies and statistics, since the overall number of opens isn’t changed by the total number of recipients. If you’re monitoring these metrics however, they will be giving you a false, and unnecessarily pessimistic, impression. It will be much harder to achieve improvements due to the dead weight of those recipients who are never going to look at what you send them.
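To put some numbers on the dilution effect, here is a small worked example. The list sizes and open count are invented purely for illustration, and `open_rate` is a hypothetical helper rather than anything from a particular email platform.

```python
def open_rate(opens: int, recipients: int) -> float:
    """Return opens as a percentage of recipients."""
    if recipients <= 0:
        raise ValueError("recipients must be positive")
    return 100.0 * opens / recipients


# Hypothetical figures: the same 4,000 opens, with and without
# 30,000 dormant recipients who never open anything.
engaged = 20_000
dormant = 30_000
opens = 4_000

full_list_rate = open_rate(opens, engaged + dormant)
engaged_only_rate = open_rate(opens, engaged)

print(f"Full list:    {full_list_rate:.1f}%")     # 8.0%
print(f"Engaged only: {engaged_only_rate:.1f}%")  # 20.0%
```

The number of opens is identical in both cases; only the denominator changes, which is exactly why the full-list figure paints an unnecessarily gloomy picture.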

Continuing to market to an artificially inflated list also obscures the number of people you’re actually reaching. The absolute open and click rates are crucial of course, but continuing to hope that non-responsive recipients will at some point come to life again may mask deeper issues with your database. Perhaps you should be looking for fresh subscribers or prospects via external data acquisition or increased social media activity to encourage opt-in. (Don’t just rush out and rent a list though – see the point on Data acquisition in my recent post How to take advantage of a recovery.)

How then should you go about honing your list selection when preparing a new campaign? Well obviously it goes without saying that your activity should be carefully targeted at individuals meeting relevant criteria across role, industry, interest, behaviour and so on. A quick and easy way to eliminate the unresponsive element of your database however is to apply a filter I and others often refer to as “recency” (accepting this is a made-up word!). This is by no means rocket science, but takes a little discipline and good data management. Put simply, those individuals in your database that have not responded or interacted in any way for a defined period of time, usually 2-3 years, should be excluded from activity going forwards. Even if their email address is still in use they’re simply never going to respond and are just skewing your results as discussed. The minuscule possibility that they will respond in the future is just not worth the negative impact of continuing to include these recipients in your activity.

The trick here of course is the ability to effectively determine who these non-responders are. You will need the outcomes of your email and other direct activity to be fed back to your database in order to readily make a selection based on these criteria. As well as email opens and clicks, you should also take into account website log-in if applicable, event attendance, purchase (obviously) and any other behaviour you can identify and track. Increasingly, this might include social media activity, such as Twitter or Facebook. It’s quite possible that lack of actual response to email doesn’t mean lack of interest, but you need to demonstrate this, not just make an assumption. The ability to make this part of your selection criteria clearly needs to be a “production” capability, built in to your marketing operations, and not a hugely labour-intensive task for every campaign execution.

It’s worth noting also that the lack of response to marketing activity could itself be used as a trigger for some other kind of follow-up, particularly for high value contacts. If a past customer or senior-level prospect has stopped responding, a quick call using a low-cost resource (i.e. not an expensive Inside Sales rep) to check their status could be worthwhile. Maybe the contact has left and been replaced, changed roles or allowed your company to fall off their radar. You might be able to re-engage, but if not, move on.

Recency should be a field in your database that is constantly calculated based on all the criteria outlined above, which can be readily included in a selection. Just to make the point, this is completely different from “last edit date”, which can often be set when a record in a database is merely viewed, regardless of whether a real change was made or activity performed by the contact. Implementing this simple addition to your campaign selection will have an instant, positive effect on your marketing metrics and save you from flogging dead horses.
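A minimal sketch of how such a recency field might be calculated and used in a selection is shown here. The field names (`last_email_open`, `last_purchase` and so on), the dictionary-based contact records and the 730-day cut-off are all assumptions for illustration; your own database will have its own structure.

```python
from datetime import date, timedelta
from typing import Optional

# Hypothetical activity fields fed back from campaigns and other systems.
# "last_edit" is deliberately NOT in this list.
ACTIVITY_FIELDS = (
    "last_email_open",
    "last_email_click",
    "last_web_login",
    "last_event_attended",
    "last_purchase",
)


def recency(contact: dict) -> Optional[date]:
    """Return the most recent genuine activity date, or None if there is none."""
    dates = [contact[f] for f in ACTIVITY_FIELDS if contact.get(f) is not None]
    return max(dates) if dates else None


def select_recent(contacts: list, today: date, max_age_days: int = 730) -> list:
    """Keep only contacts whose recency falls within the cut-off period."""
    cutoff = today - timedelta(days=max_age_days)
    selected = []
    for contact in contacts:
        last_active = recency(contact)
        if last_active is not None and last_active >= cutoff:
            selected.append(contact)
    return selected


contacts = [
    {"email": "a@example.com", "last_email_open": date(2010, 1, 15),
     "last_edit": date(2010, 4, 1)},
    {"email": "b@example.com", "last_purchase": date(2006, 5, 2),
     "last_edit": date(2010, 4, 1)},
    {"email": "c@example.com", "last_edit": date(2010, 4, 1)},
]

selection = select_recent(contacts, today=date(2010, 4, 13))
print([c["email"] for c in selection])  # ['a@example.com']
```

Note that all three records were "edited" recently, yet only one shows real activity inside the window. This sketch also drops contacts with no recorded activity at all, which is a design choice: newly acquired records may deserve a grace period based on when they were collected.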

External marketing service provider or internal database?

Friday, March 12th, 2010

A recent discussion in the pages of Database Marketing magazine regarding the merits of in-house versus out-sourced data management was reassuringly familiar. I’ve been involved in many debates as to the best approach over the years, with no definitive answer being reached. It depends, of course, on the circumstances of the organisation and its marketing requirements.

It may be possible, with a database that doesn’t involve too many feeds or updates, to hold it externally and undertake batch cleansing. Increasingly though, the proliferation of data sources, update frequency and links to other systems means such a stand-alone approach isn’t feasible. In addition, data cleansing can’t be viewed as simply an occasional process, but one of continuous improvement.

This doesn’t necessarily dictate an in-house or external solution, but whatever the solution is, it must be able to integrate with other corporate systems and data sources. Enquiries captured on the website need to be stored in the marketing database for inclusion in ongoing nurturing activity, campaign outcomes fed back for tracking and measurement and qualified leads passed to Sales, with the eventual results recorded for ROI analysis.

Marketing systems and processes must increasingly be integrated with the wider enterprise and both MSPs and solution vendors must ensure this is what they are delivering.

Data quality is for life not just for Christmas

Thursday, December 10th, 2009

As Christmas rushes towards us, we’re once again reminded that those considering pets as gifts must keep in mind the ongoing responsibility they represent: “A dog is for life, not just for Christmas”. In considering this recently, I was struck that the adage could similarly be applied to data quality (without meaning to trivialise the original message). Data quality is not a one off exercise, a gift to users or marketing campaigns, but an ongoing commitment that requires management buy-in and appropriate resourcing.

It’s well known that data decays rapidly, particularly in B2B which must contend with individuals being promoted, changing jobs, moving companies and so on, together with mergers, acquisitions, wind-ups and more. I often refer to this as the “Data Half Life”, the period of time it takes for half of a database or list to become out-of-date, which can be two years or less. It’s this fact that makes data quality maintenance an ongoing task and not simply something that can be done once, ahead of a big campaign or new system implementation.
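To get a feel for what a half life implies, the arithmetic can be sketched as follows. This assumes records go out of date at a constant proportional rate (simple exponential decay), which is a modelling assumption rather than a measured fact, and `fraction_current` is a hypothetical helper.

```python
def fraction_current(elapsed_years: float, half_life_years: float = 2.0) -> float:
    """Estimate the share of records still current after a given time.

    Assumes exponential decay: half the remaining good records go
    out of date in every half-life period.
    """
    if half_life_years <= 0:
        raise ValueError("half_life_years must be positive")
    return 0.5 ** (elapsed_years / half_life_years)


# With a two-year half life:
after_one_year = fraction_current(1.0)    # about 0.71
after_two_years = fraction_current(2.0)   # 0.5 by definition
after_four_years = fraction_current(4.0)  # 0.25

for years in (1, 2, 4):
    print(f"After {years} year(s): {fraction_current(years):.0%} still current")
```

Under these assumptions, nearly three in ten records are out of date within a year of a "big clean", which is why a one-off exercise doesn't hold for long.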

Yet time and again, I’m asked how best to “clean-up” a database in exactly such a situation, or I hear of efforts to do so. I’m not saying such an undertaking shouldn’t be made (it’s certainly better to do so than not), but the effort and expense are substantially wasted if it’s conducted on an ad hoc or piecemeal basis. Data immediately starts to decay, as contacts move, addresses change, new records are added and inevitable duplicates created, standardisation rules disregarded, fields not properly completed and other issues creep in. Very soon the data is in the same state as it was before “the big clean” took place.

It’s tempting to suggest undertaking a batch cleanse on a regular basis then, recognising these problems and trying to stay on top of them. Depending on the nature of your database, this could well be a viable approach, and might be quite cost effective, particularly if you contract a bureau or data management supplier on an annual basis, say. Unless your database is a relatively static campaign management system that can be taken offline whilst such an operation is undertaken – which could be several days – this approach presents its own issues. Considerations here include what to do with data that changes in the system whilst it’s away being cleansed, and how to extract and reload updates whilst handling the merging of any identified duplicates.

Far better though is an approach to data quality management that builds quality into the heart of an organisation’s processes and operations. Something along the lines that I outlined here some time ago and which incorporates Gartner’s “data quality firewall” concept. (This suggests that just as a network firewall should protect against unwanted intrusion, a data quality firewall should prevent bad data from reaching an organisation’s systems.) Ideally, one of the growing number of data quality software platforms should be deployed in order to create a framework for this environment (recognising that neither the issue nor the solution is solely one of technology). Competition in this area continues to erode the cost of such solutions, and indeed open-source vendor Talend even offer a version of their Talend Open Studio product as a free download.
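As a very rough illustration of the firewall idea, the sketch below checks incoming records before they are allowed anywhere near the database and quarantines anything that fails. The rules, field names and the deliberately simple email pattern are all assumptions for the example; a real data quality platform would apply far richer checks.

```python
import re

# Deliberately loose pattern: something@something.something
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
REQUIRED_FIELDS = ("first_name", "last_name", "email")


def normalise_email(record: dict) -> str:
    """Return the record's email trimmed and lower-cased ('' if absent)."""
    return str(record.get("email") or "").strip().lower()


def check_record(record: dict, known_emails: set) -> list:
    """Return a list of problems found; an empty list means the record passes."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not str(record.get(field) or "").strip():
            problems.append(f"missing {field}")
    email = normalise_email(record)
    if email and not EMAIL_PATTERN.match(email):
        problems.append("malformed email")
    if email and email in known_emails:
        problems.append("duplicate email")
    return problems


def firewall(incoming: list, known_emails: set) -> tuple:
    """Split incoming records into accepted and quarantined (with reasons)."""
    accepted, quarantined = [], []
    seen = set(known_emails)  # assumed to be lower-cased already
    for record in incoming:
        problems = check_record(record, seen)
        if problems:
            quarantined.append((record, problems))
        else:
            accepted.append(record)
            seen.add(normalise_email(record))
    return accepted, quarantined


incoming = [
    {"first_name": "Ann", "last_name": "Lee", "email": "Ann.Lee@example.com"},
    {"first_name": "Bob", "last_name": "", "email": "bob-at-example.com"},
    {"first_name": "Cy", "last_name": "Ng", "email": "cy@example.com"},
]
accepted, quarantined = firewall(incoming, known_emails={"cy@example.com"})

for record, problems in quarantined:
    print(record["first_name"], "->", ", ".join(problems))
```

Quarantining rather than silently discarding means someone can review and correct rejected records, which fits the point that this is as much about process as technology.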

Adopting this way of managing data quality is a longer term play that may lack the one-off satisfaction of a quick clean-up and require maintenance, nurturing and care long after the initial “gift” of project approval is received. But just like a dog, this is a gift that will keep on giving in terms of operational effectiveness and business agility, making rapid and flexible campaign execution a reality and not a chore.

Email bounces and database updates

Friday, August 28th, 2009

Commencing an engagement earlier in the summer with a company for which I had previously worked, I was issued with an Exchange account for internal communications whilst on-site. Not surprisingly, my external email address was the same as it had been when I was employed there, since it adopted a standard format comprising my first name and surname together with the company’s domain. What did surprise me though, eighteen months after leaving the company, was the steady stream of emails I began to receive from lists to which I had been subscribed before I left.

Now perhaps I should have diligently ensured, before moving on, that I had unsubscribed from these lists or informed their senders of my change of address. The reality though is that this is often harder than it seems, between keeping track of the lists to which you have subscribed and knowing how to advise your new details. It’s usually not the highest priority when moving on either.

These emails sent to my old address would certainly have been bouncing back to the originator for quite some time. The failure, or conscious decision, by these senders not to process these bounces and use them as an opportunity to update their databases is astonishing. Across the entirety of their databases and subscriber lists, given the rate of decay of business data, these senders must experience significant volumes of email delivery failures.

Just as with spam, it’s tempting to dismiss such considerations on the grounds that the cost of continuing to send to dead addresses is minimal, the effort of doing something about it substantial and the overall impact negligible. This is not the case however, and persisting in sending to bounced addresses can lead to deliverability issues and represents a missed opportunity for database management.

Repeatedly sending to non-existent addresses and incurring the bounce back messages this generates gets noticed and can lead to being placed on spam offender lists. This could cause all email to be blocked by spam filters with obvious dire consequences for campaign effectiveness. You may not even know that there is a problem, except for the rather disappointing response rates.

Failing to update marketing databases with bounced addresses also means that the opportunity to track the fact that the record itself may be invalid is also lost. If other activity is being driven from the database, such as DM, then significant cost can be incurred sending to contacts who are no longer there. Acting on email bounces also offers the opportunity to proactively update the database. If an individual represented a high value contact (someone in a senior position or a frequent purchaser), perhaps it’s worth a call to establish where they’ve moved in order to re-establish contact or identify a replacement?
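Feeding bounces back into the database could look something like the sketch below. The `Contact` structure, the `high_value` flag and the idea of receiving a plain list of hard-bounced addresses are assumptions for illustration; the actual bounce report format depends entirely on your email platform.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    email: str
    high_value: bool = False
    email_valid: bool = True
    hard_bounces: int = 0


def process_bounces(contacts: dict, bounced_emails: list, threshold: int = 1) -> list:
    """Record hard bounces, suppress dead addresses and return follow-up calls.

    `contacts` is keyed by lower-cased email address. Contacts reaching the
    bounce threshold are marked invalid; high value ones are returned so
    that someone can call to find out where they went.
    """
    follow_up = []
    for email in bounced_emails:
        contact = contacts.get(email.strip().lower())
        if contact is None:
            continue  # bounce for an address we don't hold
        contact.hard_bounces += 1
        if contact.hard_bounces >= threshold and contact.email_valid:
            contact.email_valid = False
            if contact.high_value:
                follow_up.append(contact)
    return follow_up


contacts = {
    "ceo@example.com": Contact("ceo@example.com", high_value=True),
    "info@example.com": Contact("info@example.com"),
}
follow_up = process_bounces(
    contacts, ["CEO@example.com", "info@example.com", "unknown@example.com"]
)
print([c.email for c in follow_up])  # ['ceo@example.com']
```

Once `email_valid` is false, the same flag can be used to hold back DM and other activity for that record until it has been checked.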

I’m not complaining that I’m receiving some of these emails again, and it may even be to some of the senders’ benefit in the end. But the likelihood of this situation arising is tiny and the potential negative impact significant. There’s no excuse for bad practice.

On good form

Wednesday, April 29th, 2009

I wanted to briefly mention a great new resource for anyone involved in online data collection, brought to us by international data quality and addressing guru, Graham Rhind. “Better data quality from your web form” is a free download ebook in pdf format that is designed to help achieve effective international name and address Internet data collection. In the spirit of full disclosure I should mention that Graham asked me to take a look at the book before he published it and as such I can say it’s an invaluable source of information.

Exhibiting Graham’s customary thorough and comprehensive coverage of the topic, the book includes guidance on name and address capture, use of pick-lists and other form elements, usability and data validation. Long-standing readers of my blog will know that web forms are something of a hot topic for me and I hope this book will help curb some of the worst examples of bad practice out there!

The book is available for download from Graham’s site, and whilst you’re there you should take a look at the wealth of additional information he makes available.

How data quality equals more revenue

Thursday, April 2nd, 2009

Writing in his “Optimize Your Data Quality” blog recently, Jan-Erik Ingvaldsen of data quality solution developer Omikron referenced an article about a piece of research that’s a must-have for anyone building their data quality business case.

In their recent research study “The Impact of Bad Data on Demand Creation”, sales and marketing advisory firm SiriusDecisions assert that following best practices in data quality led directly to a 66 percent increase in revenue. Whilst I’ve outlined some generic business case drivers in the past (see “Building a data quality business case”), this is the kind of quantitative study that can really grab C-level attention when you’re trying to justify investment in data quality. The research outlines how addressing quality issues early on in the data life-cycle has an almost exponential benefit in cost efficiency and highlights the importance of collaboration in driving quality improvements.

“It is something that your organization simply can’t afford not to do,” says SiriusDecisions’ senior director of research, Jonathan Block. No argument here!

Dimensions of data quality

Thursday, January 15th, 2009

I mentioned recently that I had signed up to the International Association for Information and Data Quality (IAIDQ), who run webinars from time to time as part of the membership package. One such session took place yesterday, in the form of an “Ask-the-Expert” presentation by Danette McGilvray of US-based information quality consultants Granite Falls Consulting. Danette outlined 12 Dimensions of Data Quality, including considerations such as integrity, accuracy and coverage. Although all are crucial to attaining high quality data, I particularly liked no. 12, “Transactability”, defined as “A measure of the degree to which data will produce the desired business transaction or outcome.”

In some regards this is a “softer” dimension than more quantitative ones like validity, but is at the same time what all data quality should really be about. DQ shouldn’t be seen as something that’s just good to do, but something with an ultimate goal, namely allowing us to do business the way we need to.

International Association for Information and Data Quality

Thursday, November 13th, 2008

Also at the Data Management and Information Quality Conference (see previous post) I signed up with the snappily titled International Association for Information and Data Quality (IAIDQ), a worldwide organisation devoted to the pursuit and promotion of data quality. Professing that “All those impacted by poor data and information quality, and those who just want to learn more, are welcome”, I can recommend membership to anyone with an interest in this area. Registration is not expensive (I even received a free mug, although I can’t vouch for whether the offer is still available!) and provides access to a wealth of information and resources, including occasional webinars and online tutorials.

6 data quality rules

Tuesday, November 11th, 2008

Fine oratory (such as that of a certain Barack Obama) often deploys what’s known as the “tricolon” or rule of threes – three words or phrases that create a pleasing cadence and drive a message home. Whilst the oratory of information quality guru Larry English at the recent Data Management and Information Quality Conference may not quite have been of President-elect standard (though still pretty good!), here are two sets of three rules relating to data quality that I picked up – a double tricolon, if you will!

3 Steps to better data

  1. Understand – perform interactive analysis (profiling) to establish what you’ve got and where any issues lie.
  2. Improve – apply change to both underlying data and processes to enhance data quality and address the issues identified in step 1.
  3. Protect and control – on an ongoing, business as usual basis ensure that issues are identified and improvements made.
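For a flavour of what the profiling in step 1 involves, here is a minimal sketch that reports how well each field is populated and which values dominate. The record layout and the `profile` function are hypothetical; dedicated profiling tools go much further than this.

```python
from collections import Counter


def profile(records: list, fields: tuple) -> dict:
    """Summarise fill rate, distinct values and top values for each field."""
    total = len(records)
    report = {}
    for field in fields:
        values = [str(r.get(field) or "").strip() for r in records]
        filled = [v for v in values if v]
        report[field] = {
            "fill_rate": len(filled) / total if total else 0.0,
            "distinct": len(set(filled)),
            "most_common": Counter(filled).most_common(3),
        }
    return report


records = [
    {"country": "UK"},
    {"country": "UK "},
    {"country": "United Kingdom"},
    {"country": ""},
]
report = profile(records, fields=("country",))
print(report["country"])
```

Even this tiny example surfaces two typical issues: an unpopulated field, and the same country recorded in two different ways.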

Handle information as you would any asset

  1. Acquire – data, like other assets, must always come from somewhere, be it an external source such as a list broker or in-house data capture. Consider what these sources are and how they’re selected.
  2. Manage – assets such as plant, equipment and stock all need managing (especially if your stock is perishable, like data), so take care of it in the same way.
  3. Dispose – at the end of its lifecycle, an asset is disposed of as it no longer has value or performs its task. Data decays and loses value, so plan for its disposal.