Archive for the ‘Data quality’ Category

The third rail – sales order processing databases

Thursday, October 9th, 2008

I’ve written a lot about integrating sales and marketing databases (posts too numerous to link – search on “integration” in the sidebar), but so far I haven’t mentioned the third source in the marketing data ecosystem – order processing systems. Order processing systems are where the sales orders that leads and opportunities (hopefully!) eventually turn into are captured, where invoices are created and ultimately where prospects are converted to customer status. Such a system may also be known as an enterprise resource planning (ERP) system, and may handle financials, human resources and other functions (possibly even CRM).

The reason these systems are important within a marketing operations context is that they are generally the system of record for whether an organisation is a customer, and if so, what its purchase history is. Although the sales and marketing systems should have a view of completed opportunities and closed deals, there is inevitably a disconnect between what was supposed to have been sold and what was actually booked. Put starkly, once the deal is clinched, Sales’ enthusiasm for making sure it is accurately reflected in the SFA system wanes considerably; commissions are likely to be calculated based on what the order processing system says.

Care needs to be taken in designing order processing links, though. Here are some considerations:

  • Is the feed uni- or bi-directional? In other words, does the marketing database simply receive updates of customer status and possibly purchase history? Such feeds are often one-way, as the owner of the order system will jealously guard their data integrity – not unreasonably, as it represents the “real” customer database for the company. However, if there is no feedback mechanism, then it may not be possible to correct issues with the data, such as missing address elements, inconsistent country values or duplicates.
  • How does the order system handle accounts and organisations? As a result of the different imperatives of ordering systems (delivery, invoicing, credit accounts), data is frequently held in a way that is inconsistent with that of the marketing database. If different departments of the same organisation have made separate purchases, for instance, the order system may create separate records, which the marketing database will perceive as duplicates. Take care in removing these duplicates from the marketing database, however; not only might they simply turn up again with the next order system update, but you will lose the account number reference in the marketing database, which might be a crucial external reference.
  • What purchase history data is available? If the feed is at “account” level (which may not be the same as unique organisations) it may include the most recent order, invoice or contract date. That might be enough to derive a “customer” status, such as having ordered within a specified time frame or being within a maintenance contract (see the sketch after this list), but may not include any information on what was ordered. On the other hand, you might be faced with a feed of every order or invoice, which is considerably more challenging to integrate.
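As a rough illustration of the account-level case, here’s a minimal sketch of deriving a “customer” status from a feed carrying last order and contract dates. The field names and the 24-month activity window are assumptions for the example, not a description of any particular order system.

    from datetime import date, timedelta

    # Assumed feed fields: last order date and maintenance contract end date per account.
    # The 24-month "active customer" window is illustrative only.
    ACTIVE_WINDOW = timedelta(days=730)

    def derive_customer_status(last_order_date, contract_end_date, today=None):
        """Return 'customer' if the account ordered recently or holds a live
        maintenance contract, 'lapsed' if its history is stale, else 'prospect'."""
        today = today or date.today()
        if contract_end_date and contract_end_date >= today:
            return "customer"
        if last_order_date:
            return "customer" if today - last_order_date <= ACTIVE_WINDOW else "lapsed"
        return "prospect"

    # An account whose last order is old but whose maintenance contract is still live:
    print(derive_customer_status(date(2005, 10, 1), date(2009, 3, 31), today=date(2008, 10, 9)))
    # -> customer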

Unlike the third rail of an electric railway, which you avoid touching for good reason, order processing systems tend to be avoided even though they’re a crucial source of marketing data. Which isn’t to say you won’t get a shock if you try to integrate one!

Enhance and advance

Monday, August 11th, 2008

I’ve written before (see Using reference sources in data quality maintenance) about the benefits of matching marketing data, particularly organisations, to external reference data for data quality improvement. We’ve just signed an agreement with Dun and Bradstreet to match a core subset of our database with their global database and enhance it with key attributes, such as industry, size and enterprise relationship. The plan is to refresh the matched data on a monthly basis so that we always have the most up-to-date view.

The data we’re enhancing consists of customers from the last few years together with Sales’ key prospects. By developing a better understanding of these organisations, we can not only target them more effectively, for instance by undertaking industry selections, but also better understand the interrelationships between organisations. We may have received a lead or have an existing relationship with a subsidiary that could be leveraged into the parent organisation, for instance. Our marketing activity can become more advanced, in terms of targeting and segmentation, as a result of this intelligence.

De-duplication is also a key benefit, as I’ve said before, as D&B are able to match using previous names and alternative trade styles, together with other sophisticated techniques, highlighting duplicates that would otherwise not be evident. Again, this can bring together otherwise hidden relationships and opportunities.

The drawback with D&B is that they’re quite expensive, and matching/enhancing hundreds of thousands of records is prohibitive. Although we’re enhancing our core data, some of the benefits I’ve outlined are lost when working with a subset; we don’t know whether the records we’ve chosen are duplicated or related to others in the database. I’m hoping to discuss with D&B the idea of matching our entire database (an inexpensive activity at a few cents a record) and then enhancing only those records in which we’re interested, specifically those related to our core dataset. This isn’t a standard service D&B offer, and it can be a challenge to have them move outside their usual modus operandi, but hopefully they can be persuaded! I’ll let you know how I get on.
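To make the idea concrete, here’s a minimal sketch of how that selection might work, assuming the full-database match returns a DUNS number and a global ultimate (corporate family) DUNS for each record. The record layout is hypothetical – this is how I imagine the logic, not a description of any D&B service.

    # Hypothetical post-match output: (record_id, duns, global_ultimate_duns)
    matched = [
        ("A001", "123456789", "111111111"),
        ("A002", "987654321", "111111111"),  # same corporate family as A001
        ("A003", "555555555", "222222222"),  # unrelated family
    ]

    core_ids = {"A001"}  # the core subset we are already paying to enhance

    # Corporate families represented in the core data...
    core_families = {family for rec_id, duns, family in matched if rec_id in core_ids}

    # ...and therefore the wider set of records worth enhancing.
    to_enhance = [rec_id for rec_id, duns, family in matched if family in core_families]
    print(to_enhance)  # ['A001', 'A002'] – A003 is left alone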

Address to impress – smart web form data collection

Thursday, April 17th, 2008

I’ve written previously about the importance of address management (see International address management) in maintaining data quality, and I mentioned that we planned to implement a new set of web enquiry forms with an address auto-completion feature (see Using web visits to build contact profiles). Well, I’m pleased to say the forms are online and working very nicely, improving not only the quality of address capture but also the user experience. Reducing the keystrokes required to complete a form, I believe, leaves the enquirer with more goodwill to answer a few more profile-building questions.

The easiest way to see how the forms work is to try them for yourself, so take a look at the UK form and try filling it in. Once you’ve completed the postal code, the system looks up the address in the background, and as you start typing the first few characters of the street address, it presents options as to what the address should be. Once you type enough for a definitive selection, the address is completed (or you can pick from the list). In the UK, many business postal codes are sufficiently specific that the address is completed without typing any further, except perhaps for a street number.
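For the curious, here’s a minimal server-side sketch of the look-up step. The endpoint URL, parameters and response fields are placeholders I’ve invented for illustration – they are not Postcode Anywhere’s actual API, which the forms call via AJAX.

    import json
    import urllib.parse
    import urllib.request

    # Placeholder endpoint and key – not the real Postcode Anywhere service.
    LOOKUP_URL = "https://example.com/address-lookup"
    API_KEY = "YOUR-KEY"

    def candidate_addresses(postcode, street_prefix=""):
        """Return candidate addresses for a postcode, narrowed by whatever the
        user has typed of the street address so far."""
        params = urllib.parse.urlencode({"postcode": postcode, "key": API_KEY})
        with urllib.request.urlopen(f"{LOOKUP_URL}?{params}") as response:
            candidates = json.load(response)  # assume a JSON list of address dicts
        prefix = street_prefix.lower()
        return [a for a in candidates if a["street"].lower().startswith(prefix)]

    # The form calls this on each keystroke and auto-completes the address as
    # soon as the candidate list narrows to a single entry.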

The forms work across nearly all of our local EMEA sites and are localised for each one. In fact, on the UK form linked above, if you change the country and language options, the address field labels change to match. Unfortunately we’re not quite slick enough to change the entire form, but if you arrive via the relevant local site, the page is fully localised, with the address elements driven by the addressing solution.

The address look-up solution is powered by UK specialists Postcode Anywhere, who support the system via a simple AJAX-based web service. The service is charged on a per-click basis and is remarkably inexpensive, with credit packs covering several thousand look-ups available for just a few hundred pounds. Due to technical resource constraints, the forms themselves are actually hosted by my old friends at CRM Technologies, but we’ve tried to make the overall experience as seamless as possible.

A number of potential enhancements have already presented themselves, in particular the ability to perform an organisation look-up on the fly and pre-populate profile fields such as revenue, number of employees and industry, based on Dun & Bradstreet data. This will even include the DUNS number, adding to the reliability with which we can match web data capture back into the marketing database. I hope to report on progress soon!

Building a data quality business case

Tuesday, September 11th, 2007

Offering general advice on putting together a business case for a data quality initiative is challenging, because the business benefits and therefore payback are so dependent on specific circumstances. Here, however, are some key areas around which I’m constructing our justification, which I’ve tried to make sufficiently generic to be of wider use.

  • Website, Internet and miscellaneous data capture – our process relies on extensive manual effort for transposing/re-keying data, with very limited validation and standardisation, particularly for international data. Ironically, this has some benefits, as there is a human element involved in matching incoming contacts to existing data, but it’s hugely time-consuming. If there’s any manual effort involved in your process, it’s an obvious source of efficiency gains and savings, not to mention quality improvement.
  • Address quality and duplication – based on various initial data quality assessments (such as outlined in Data health check previously), there is a 10% undeliverable and 3% duplication rate among contacts in our database. Based on even a single direct mail execution per year, the waste in terms of undelivered and duplicated mail pieces is significant (a rough worked example follows this list).
  • Campaign execution – list preparation effort (identification, selection, cleaning) can be greatly increased by poor data, whilst only limited targeting and segmentation may be possible. According to a recent Aberdeen Group study (“Customer Data Quality: Roadmap for Growth and Profitability”, June 2007), “89% of Best-in-Class firms reported positive performance in the time necessary in preparing customer data” after improving their data quality.
  • Legal and best practice compliance – the ability to reliably match new and existing data is crucial to recognising and observing privacy and other communication preferences. The reputational impact of not respecting contact preferences, together with the risk of legal non-compliance (especially in Europe), creates exposure to litigation or prosecution with potentially substantial penalties.
  • Lead quality and qualification – time savings and effectiveness benefits through more complete and informative leads (such as full contact details and organisation profile).
  • Time savings for general query resolution – reporting anomalies, data queries etc.
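To give a feel for the arithmetic behind the address quality and duplication point, here’s a rough worked example; the mailing volume and cost per piece are assumptions purely for illustration, so substitute your own figures.

    # Assumed figures for illustration only.
    contacts_mailed = 100_000
    cost_per_piece = 1.50        # print, postage and fulfilment, in pounds

    undeliverable_rate = 0.10    # from the data health check
    duplication_rate = 0.03

    wasted_pieces = contacts_mailed * (undeliverable_rate + duplication_rate)
    wasted_spend = wasted_pieces * cost_per_piece
    print(f"{wasted_pieces:,.0f} wasted pieces, roughly £{wasted_spend:,.0f} per mailing")
    # 13,000 wasted pieces, roughly £19,500 per mailing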

6 data quality solution requirements

Wednesday, July 18th, 2007

Further to the “vision” I outlined last month, the RFP for our data quality system has been released to vendors (see recent post “Data quality – a vision”). Whilst data quality is by no means purely a technology problem, nor one that technology alone can solve, I do believe that a good software solution can create a platform around which the right processes can be built to achieve and maintain better data quality. These are the key requirements to which I’ve asked the vendor short list to respond. (A list of data quality suppliers is available on my main website.)

  1. Data profiling/validation rule generation – appropriate analysis of existing data structures and content in order to determine rules for ongoing data validation and exceptions reporting.
  2. Initial database address standardisation and de-duplication – perform initial address standardisation to local postal authority conventions, appending an address quality score to each record. Conduct organisation-level de-duplication on the standardised data at country and site level; once organisations have been de-duped, conduct an individual-level de-dupe within each organisation (a simplified sketch follows this list).
  3. Operational data processing – ongoing ad hoc data loads from internal and external sources, requiring address standardisation and merge/append (i.e. de-duplication) processing for loading to the main database. Monitoring and reporting of data validity and rule compliance.
  4. Monitoring and maintenance – proactive identification of data quality issues resulting from invalid data loads or user updates. Present data requiring review/correction to appropriate users in order that amendments can be made and then prepared for loading back into the central database.
  5. Profiling and metrics – ongoing data quality metrics (consistency, completeness, frequency counts, scoring) and intervention reporting (duplicates identified and removed, automated validity amendments, manual corrections) based on set rules. Presentation via “dashboard” type report for easy review.
  6. Online data capture – real-time validation, standardisation and enhancement of data captured via web-based forms, including contact name and job title, email, telephone number and other elements. Apply formatting to all data (capitalisation etc) and to telephone numbers (local presentation conventions). Process captured data for merge/append to the main database.
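As an aside on requirement 2, here’s a simplified sketch of the two-pass de-duplication I have in mind – organisations first at country and site level, then individuals within each surviving organisation. The exact matching on normalised names is just a stand-in for whatever fuzzy matching the vendors propose.

    from collections import defaultdict

    # Toy records: (id, country, organisation/site name, contact name).
    records = [
        (1, "UK", "Acme Ltd",  "Jane Smith"),
        (2, "UK", "ACME LTD.", "Jane Smith"),  # duplicate organisation and contact
        (3, "UK", "Acme Ltd",  "Bob Jones"),
        (4, "DE", "Acme GmbH", "Jane Smith"),  # different country/site – kept
    ]

    def org_key(rec):
        _, country, org, _ = rec
        return (country, org.lower().rstrip(". "))

    # Pass 1: group (de-dupe) organisations at country and site level.
    orgs = defaultdict(list)
    for rec in records:
        orgs[org_key(rec)].append(rec)

    # Pass 2: within each surviving organisation, de-dupe individuals by name.
    survivors = []
    for members in orgs.values():
        seen = set()
        for rec in members:
            contact = rec[3].lower()
            if contact not in seen:
                seen.add(contact)
                survivors.append(rec)

    print(survivors)  # records 1, 3 and 4 survive; record 2 folds into record 1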

Whilst I’m waiting for proposals from the vendors, the next step is to develop the business case for the project itself, to which I’ll return here soon.

Data quality – a vision

Thursday, June 21st, 2007

Here’s my vision for our data quality solution, for your inspiration!

  • Data quality platform
    • Validate and standardise online data capture (inc address look-ups)
    • Monitor marketing data quality and highlight issues for action
    • Process all data loads (de-dupe, merge, append)
  • Data enhancement
    • Individual and organisation enhancements including role, industry, IT usage
  • Campaign execution
    • High degree of data and process “readiness” for selection, segmentation, tracking and reporting
    • “Outcome” feedback wherever possible (esp. email)
  • Lead management
    • Scoring and pass-over threshold
    • Maintenance of a single lead per contact through identification and merging of subsequent responses
  • Privacy
    • Adherence to compliance and best practice

I’ll outline how these translate into a set of requirements for a data quality solution soon.

Data health check

Tuesday, June 12th, 2007

In getting to grips with a new database, a good first step is to have a complimentary data health check undertaken by one of the several suppliers who offer such a service. Obviously these are designed as a hook to draw you into their services, so try and choose someone you think you’ll use in the future. Having worked with Global Database Management on some other initiatives recently, I supplied a dataset to them to put through their process. As their name suggests, international data is a particular strong point for them, including addressing and personal names, so I was keen to see results on the database for which I’m newly responsible.

Consisting of a well-presented PDF document, the quality assessment covered address, naming, country coverage and duplicates, with statistics and commentary on various aspects of the database. The results were much as I expected, some of the validity checks in particular making for amusing reading, with entries such as “***update address ***”, “not existing individuals” and “hugh jarse”! (If you have a few minutes and want to bring a smile to your face, try looking for cartoon character names in your database – there are bound to be a few!) Their quantification, though, will be invaluable in terms of determining how to approach the issues and, in particular, constructing a business case. I’ll return to this as work progresses.
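Incidentally, if you do fancy hunting for cartoon characters in your own data, a two-minute script along these lines will do it; the list of joke names is obviously just a starter.

    # Quick-and-dirty scan for joke entries – extend the list as you see fit.
    SUSPECT_NAMES = {"mickey mouse", "donald duck", "homer simpson", "hugh jarse"}

    contacts = ["Jane Smith", "Mickey Mouse", "Bob Jones"]  # e.g. pulled from your database

    suspects = [c for c in contacts if c.lower() in SUSPECT_NAMES]
    print(suspects)  # ['Mickey Mouse']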

Institute of Direct Marketing Data Council Summit

Saturday, March 3rd, 2007

This week saw the first Institute of Direct Marketing Data Council Summit, a day of presentations from such luminaries as Sean Kelly, Steve Wills and Huw Davis along with practitioners from BP, the AA and others. Themed “Data Management Strategies that Create Competitive Advantage”, the conference was intended to address issues such as building competitive advantage through customer intelligence and insight, using data to improve the customer experience and demonstrating the value of data to the board.

Chair for the day Ian Lovett, of data consultancy Blue Sheep, opened with a stern warning to the direct marketing profession that the growing consumer perception of environmental damage and intrusion from wasteful direct mail was creating a political will to introduce ever greater restrictions on privacy and data use, such as requiring opt-in for all marketing. Better targeting and management of data quality were needed to demonstrate that direct marketing is a responsible and considerate discipline that can be trusted with personal data. “Love your data,” said Ian: “Clean it, use it and don’t abuse it!”

Other themes running through the day were the idea that marketing has failed to keep up with the technology available to it, the growing recognition of the strategic value of data and, a topic close to my heart, the creation of central insight departments in marketing organisations. Presenting a retail segmentation case study, Sean Kelly suggested that the failure to create a marketing intelligence capability based on the latest technology before building operational capability is the single greatest reason for CRM failure. He likened it to having the ability to talk but without a brain to control what to say! Peter Mouncey from the Cranfield University School of Management echoed this, saying that organisations’ data strategies lag behind their CRM strategies, adding that they must be aligned to be effective.

Rosemary Albinson from BP and Steve Wills (yep, the guy I quote on my homepage) both commented that marketers must become comfortable with the hard data of marketing results and spend time working in insight in order to progress to the boardroom. At the same time, the scale of the task should be acknowledged; explaining marketing analytics to an accomplished scientist, Rosemary Albinson was told that it seemed more complicated than his field of climatology! Based on experience from his Customer Insight Forum, Steve Wills also outlined his thinking regarding the management of insight and his vision for a dedicated function led by an Insight Director, a position he says is becoming increasingly common. Christine Bailey, also from the Cranfield University School of Management, commented that it helps to put a central insight team in place.

There were a few different takes on a definition of insight, from Steve Wills’ “embedded knowledge” to Christine Bailey’s multiple sources of actionable customer data. And although he couldn’t be there himself, former GE CEO Jack Welch was quoted as saying (and here I paraphrase) that competitive advantage is derived from the ability to learn faster and act faster than the competition.

The day was closed out by Huw Davis, always entertaining and worth listening to. He talked about the opportunities and challenges of international data strategies, particularly in the developing markets of China, India and elsewhere. Clearly the data infrastructure in those countries isn’t quite what we’re used to, but the opportunities for direct marketing are immense. Huw is also about to launch a new analytics business utilising lower-cost analyst resources in Asia with a UK account team – you read it here first!

All told, quite an interesting day that reinforced some of my thoughts in this area and provided some interesting tips on data quality programmes and data warehouse projects. I’d better get back to immersing myself in marketing insight – next stop the boardroom!

Tracking address changes and out of business companies

Monday, January 15th, 2007

Over lunch a little while ago, I bemoaned the lack of international business change-of-address and out-of-business data. Whilst UK Changes has a strong UK-based offering, it doesn’t extend to Europe or beyond. I was immediately corrected though, and told that such services do exist in other countries, if you know where to look.

Having been told “where to look” we set about running some sample data against various datasets to establish the extent of moved and deceased businesses. We tested the UK, Germany, France and the Netherlands with several other countries also available.

Inevitably, the methodology for each country was different, with results presented in varying formats. You begin to understand US colleagues’ exasperation when contemplating working across Europe! This variation, though, has as much to do with how business data is collected in each country. Whilst in the UK Companies House maintains a list only of incorporated businesses, in France all traders are required by law to register, which of course makes it very much easier to track smaller businesses there than in the UK.

The results themselves, whilst not altogether surprising, were still very interesting. Broadly grouped into movers with and without a new address and businesses no longer trading, the match rate for some countries was quite high. This suggests significant wasted direct marketing expenditure, as well as distorted database metrics. Further analysis is clearly warranted, but at a fairly modest cost of about £15,000 to tackle the countries tested, this could represent a worthwhile investment in data quality.

Using reference sources in data quality maintenance

Monday, December 11th, 2006

A key tenet of data quality management is the use of reference data to ensure consistency and validity. An obvious example is a pick-list (Mr, Mrs, Ms etc) for the Title field on a contact entry screen, ensuring standardised data capture and preventing invalid entries – or even misinterpretation of the field itself, for instance a job title being inadvertently entered (it happens!).

A more sophisticated example is the use of address look-up and enhancement solutions. Referring to postal address reference data maintained by national postal administrations helps ensure the accuracy of address data. The same is also possible with organisation data, by referring to an external source of comprehensive business data to verify the companies on a database. Matching to such an external source and appending key data such as size, industry or number of installed PCs is of course not new, and is a good way of understanding more about the organisations on your database and undertaking customer profiling. The benefits go wider than this, though, and can contribute to general data quality enhancement.

Matching two organisation records on your database that appear to be different from each other to an external source may reveal that they are in fact the same. The resulting de-duplication at organisation level may in turn reveal contact duplicates that can also be removed, with obvious benefits to mailing costs and brand perception. It may also be possible to identify that an organisation on your database is part of a group of companies with whom you are already doing business, with potential gains in relation to privacy permissions and messaging. The ability to append phone numbers is a real win too, avoiding having to look up numbers as part of any telemarketing you may be undertaking.
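By way of illustration, here’s a minimal sketch of that organisation-level match and the duplicates it can surface. The reference data and identifiers are made up for the example, and a real match would of course use far more sophisticated techniques than a simple name look-up.

    from collections import defaultdict

    # Made-up reference source: normalised names (including previous names)
    # mapped to a single organisation identifier.
    reference = {
        "acme ltd": "REF-001",
        "acme limited": "REF-001",       # previous name, same organisation
        "acme holdings plc": "REF-002",  # parent company
    }

    database = [
        {"id": 10, "name": "Acme Ltd"},
        {"id": 11, "name": "ACME Limited"},  # hidden duplicate of record 10
    ]

    # Match each database record to the reference source.
    for org in database:
        org["ref_id"] = reference.get(org["name"].lower())

    # Records sharing a reference id are the same organisation, however
    # different they look, so they can be merged (and their contacts de-duped).
    by_ref = defaultdict(list)
    for org in database:
        if org["ref_id"]:
            by_ref[org["ref_id"]].append(org["id"])

    duplicates = {ref: ids for ref, ids in by_ref.items() if len(ids) > 1}
    print(duplicates)  # {'REF-001': [10, 11]}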