Archive for July, 2007

6 data quality solution requirements

Wednesday, July 18th, 2007

Further to the “vision” I outlined last month, the RFP for our data quality system has been released to vendors (see recent post “Data quality – a vision”). Whilst data quality is by no means purely a technology problem, nor one that technology alone can solve, I do believe that a good software solution can create a platform around which the right processes can be built to achieve and maintain better data quality. These are the key requirements to which I’ve asked the vendor short list to respond. (A list of data quality suppliers is available on my main website.)

  1. Data profiling/validation rule generation – appropriate analysis of existing data structures and content in order to determine rules for ongoing data validation and exceptions reporting.
  2. Initial database address standardisation and de-duplication – perform initial address standardisation to local postal authority conventions, appending an address quality score to each record. Conduct organisation-level de-duplication on the standardised data at country and site level; once organisations have been de-duplicated, conduct an individual-level de-dupe within each organisation (a rough matching sketch follows this list).
  3. Operational data processing – ongoing ad hoc data loads from internal and external sources, requiring address standardisation and merge/append (i.e. de-duplication) processing for loading to the main database. Monitoring and reporting of data validity and rule compliance.
  4. Monitoring and maintenance – proactive identification of data quality issues resulting from invalid data loads or user updates. Present data requiring review/correction to appropriate users in order that amendments can be made and then prepared for loading back into the central database.
  5. Profiling and metrics – ongoing data quality metrics (consistency, completeness, frequency counts, scoring) and intervention reporting (duplicates identified and removed, automated validity amendments, manual corrections) based on set rules. Presentation via a “dashboard”-style report for easy review (see the metrics sketch after this list).
  6. Online data capture – real-time validation, standardisation and enhancement of data captured via web-based forms, including contact name and job title, email, telephone number and other elements. Apply formatting to all data (capitalisation, etc.) and to telephone numbers (local presentation conventions), along the lines sketched after this list. Process captured data to be merge/appended to the main database.
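To give a flavour of the matching behind requirement 2, here is a minimal sketch in Python. The field names, the similarity measure and the 0.9 threshold are my own assumptions for illustration; the real standardisation and matching logic would come from whichever vendor’s solution we choose.

    # A minimal organisation-level de-duplication sketch (illustrative only).
    # Field names, thresholds and the similarity measure are assumptions,
    # not anything prescribed in the RFP.
    from difflib import SequenceMatcher
    from itertools import combinations

    def normalise(name: str) -> str:
        """Crude standardisation: lower-case and strip common suffixes."""
        name = name.lower().strip()
        for suffix in (" ltd", " limited", " plc", " inc"):
            if name.endswith(suffix):
                name = name[: -len(suffix)]
        return name

    def likely_duplicates(records, threshold=0.9):
        """Pair up records in the same country/postcode whose standardised
        organisation names are very similar."""
        pairs = []
        for a, b in combinations(records, 2):
            if (a["country"], a["postcode"]) != (b["country"], b["postcode"]):
                continue  # only compare within the same site-level key
            score = SequenceMatcher(None, normalise(a["org"]), normalise(b["org"])).ratio()
            if score >= threshold:
                pairs.append((a["id"], b["id"], round(score, 2)))
        return pairs

    records = [
        {"id": 1, "org": "Acme Ltd", "country": "GB", "postcode": "SW1A 1AA"},
        {"id": 2, "org": "ACME Limited", "country": "GB", "postcode": "SW1A 1AA"},
        {"id": 3, "org": "Acme GmbH", "country": "DE", "postcode": "10115"},
    ]
    print(likely_duplicates(records))  # [(1, 2, 1.0)]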
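For requirement 5, the basic completeness and frequency-count metrics a dashboard might surface can be approximated in a few lines; the fields profiled here are again just examples, with the real rules coming out of the profiling work in requirement 1.

    # A rough sketch of completeness and frequency-count metrics (requirement 5).
    # The fields being profiled are placeholders for whatever the profiling
    # exercise identifies as important.
    from collections import Counter

    def completeness(records, fields):
        """Percentage of records with a non-empty value for each field."""
        total = len(records)
        return {
            f: round(100 * sum(1 for r in records if r.get(f)) / total, 1)
            for f in fields
        }

    records = [
        {"email": "a@example.com", "country": "GB", "job_title": "CIO"},
        {"email": "", "country": "GB", "job_title": ""},
        {"email": "c@example.com", "country": "FR", "job_title": "CMO"},
    ]
    print(completeness(records, ["email", "country", "job_title"]))
    # {'email': 66.7, 'country': 100.0, 'job_title': 66.7}
    print(Counter(r["country"] for r in records))  # Counter({'GB': 2, 'FR': 1})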
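And for requirement 6, the sort of real-time formatting and validation applied to web-captured data looks something like the following; the capitalisation and email rules shown are deliberately naive placeholders rather than anything a vendor has proposed.

    # Illustrative formatting/validation for web-captured contact data
    # (requirement 6). These rules are naive placeholders, not vendor features.
    import re

    EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

    def format_contact(raw):
        cleaned = {
            "name": raw.get("name", "").strip().title(),   # simple capitalisation
            "email": raw.get("email", "").strip().lower(),
        }
        cleaned["email_valid"] = bool(EMAIL_RE.match(cleaned["email"]))
        return cleaned

    print(format_contact({"name": "  jane SMITH ", "email": "Jane.Smith@Example.COM"}))
    # {'name': 'Jane Smith', 'email': 'jane.smith@example.com', 'email_valid': True}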

Whilst I’m waiting for proposals from the vendors, the next step is to develop the business case for the project itself, to which I’ll return here soon.

The broken Salesforce.com leads model

Tuesday, July 3rd, 2007

Now, I realise that criticising Salesforce.com is tantamount to heresy in this day and age, but I’m not averse to controversy here. What is the source of this cynicism? Well, several hours locked in meeting rooms and on conference calls arguing about the definition of a lead doesn’t help. It has, though, made me realise that the way Salesforce.com treats leads is woefully simplistic, with substantial knock-on effects:

  • Salesforce.com treats every response as a lead. This means that every whitepaper download, webinar registration or information request from the same person creates a new lead. This drives Sales insane, as they perceive the person responding to be the lead and basically ignore the subsequent responses, branding them irrelevant. The blame of course falls on Marketing, whose protestations of “It’s SALESforce.com, don’t blame us!” go unheeded.
  • As well as the impact on sales reps’ mental wellbeing, this lead proliferation causes rampant contact duplication. Any given individual could now have multiple leads associated with them, which are in effect separate records. I know of one company that suddenly discovered tens of thousands of lead records sitting in Salesforce.com that had been dumped there over time, representing horrendous duplication but also a complete loss of value, as it was so difficult to know what was worth keeping and what was not.
  • The absence of any linking together of leads for the same individual makes it very difficult for Sales to review response history and formulate an intelligent, holistic follow-up to a contact, further reducing the value of a lead.
  • Multiple leads per individual also make analysis very difficult, especially understanding lead conversion rates. If fifty leads from ten people convert into two opportunities, it looks like a 4% conversion rate, but really it’s 20% at the contact level (a quick calculation follows this list). Making your lead conversion appear a fifth as good as it actually is seems unnecessarily self-deprecating, and does nothing to support the argument for marketing investment.
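To make the arithmetic concrete, here is a quick sketch using the made-up numbers from the last bullet:

    # Lead-level vs contact-level conversion, using the made-up numbers above.
    leads, people, opportunities = 50, 10, 2

    lead_level = opportunities / leads      # 0.04 -> looks like "4% conversion"
    contact_level = opportunities / people  # 0.20 -> 20% once leads are rolled
                                            # up to one per person
    print(f"{lead_level:.0%} vs {contact_level:.0%}")  # 4% vs 20%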

In the discussions in which I’ve been involved regarding the definition of a lead, I’ve been trying to say that a lead is a person, not just a response or other action of an individual. In fact, I’d go further and say that a lead fundamentally represents a potential piece of business, to which multiple people could be related from within the decision-making unit of a particular project in an organisation. Another word for this of course is opportunity, which is a completely different entity in Salesforce.com and handled much better (including the ability to link multiple contacts).
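As a rough sketch of the structure I’m arguing for (and emphatically not Salesforce.com’s actual object model), imagine responses hanging off a single contact, and contacts linked to the potential piece of business:

    # A sketch of the structure argued for above: responses belong to one
    # contact, and contacts link to a potential piece of business. This is
    # an illustration of the argument, not Salesforce.com's data model.
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Response:
        date: str
        activity: str  # e.g. "whitepaper download", "webinar registration"

    @dataclass
    class Contact:
        name: str
        responses: List[Response] = field(default_factory=list)

    @dataclass
    class Opportunity:  # the potential piece of business
        account: str
        contacts: List[Contact] = field(default_factory=list)

    jane = Contact("Jane Smith", [Response("2007-06-01", "whitepaper download"),
                                  Response("2007-06-20", "webinar registration")])
    opp = Opportunity("Acme Ltd", [jane])
    # One person, one record, with the full response history in one place -
    # rather than two unrelated lead records for the same individual.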

I understand that a lead is a simpler entity that may come to nothing, and therefore shouldn’t be burdened with unnecessary complexity, but over-simplification results in a solution that is not fit for purpose. And one that keeps me trapped in meeting rooms longer than I’d like.