6 data quality solution requirements

Further to the “vision” I outlined last month (see recent post “Data quality – a vision”), the RFP for our data quality system has been released to vendors. Whilst data quality is by no means a technology problem, or one that technology alone can solve, I do believe that a good software solution can provide a platform around which the right processes can be built to achieve and maintain better data quality. These are the key requirements to which I’ve asked the shortlisted vendors to respond. (A list of data quality suppliers is available on my main website.)

  1. Data profiling/validation rule generation – appropriate analysis of existing data structures and content in order to determine rules for ongoing data validation and exceptions reporting.
  2. Initial database address standardisation and de-duplication – perform initial address standardisation to local postal authority conventions, appending an address quality score to each record. Conduct organisation-level de-duplication on the standardised data at country and site level; once organisations have been de-duplicated, conduct individual-level de-duplication within each organisation.
  3. Operational data processing – ongoing ad hoc data loads from internal and external sources, requiring address standardisation and merge/append (i.e. de-duplication) processing before loading to the main database. Monitoring and reporting of data validity and rule compliance.
  4. Monitoring and maintenance – proactive identification of data quality issues resulting from invalid data loads or user updates. Present data requiring review/correction to the appropriate users so that amendments can be made and the corrected data prepared for loading back into the central database.
  5. Profiling and metrics – ongoing data quality metrics (consistency, completeness, frequency counts, scoring) and intervention reporting (duplicates identified and removed, automated validity amendments, manual corrections) based on set rules. Presentation via “dashboard” type report for easy review.
  6. Online data capture – real-time validation, standardisation and enhancement of data captured via web-based forms, including contact name and job title, email, telephone number and other elements. Apply formatting to all captured data (capitalisation etc.) and present telephone numbers according to local conventions. Prepare captured data for merge/append processing into the main database.
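
To make requirements 2 and 5 a little more concrete, here is a minimal Python sketch of the two-stage de-duplication (organisations at country and site level, then individuals within each organisation) followed by a simple completeness metric. It is purely illustrative and not taken from any vendor's product: the field names (`org`, `country`, `site`, `name`, `email`), the helper functions and the sample records are all assumptions, and exact matching on normalised text stands in for the fuzzy matching a real tool would apply.

```python
def normalise(text):
    """Lower-case, trim and collapse whitespace so trivially different values compare equal."""
    return " ".join(text.lower().split())


def dedupe(records, key_fields):
    """Keep the first record seen for each normalised key.

    Returns (kept_records, number_of_duplicates_removed).
    """
    seen = set()
    kept = []
    for rec in records:
        key = tuple(normalise(rec.get(field) or "") for field in key_fields)
        if key in seen:
            continue
        seen.add(key)
        kept.append(rec)
    return kept, len(records) - len(kept)


def completeness(records, fields):
    """Proportion of records holding a non-empty value, per field."""
    total = len(records)
    result = {}
    for field in fields:
        filled = sum(1 for rec in records if (rec.get(field) or "").strip())
        result[field] = filled / total if total else 0.0
    return result


# Hypothetical sample data, invented for illustration only.
contacts = [
    {"org": "Acme Ltd", "country": "GB", "site": "London", "name": "Jane Smith", "email": "jane@example.com"},
    {"org": "ACME  ltd", "country": "gb", "site": "london", "name": "jane smith", "email": ""},
    {"org": "Acme Ltd", "country": "GB", "site": "Leeds", "name": "Jane Smith", "email": "jane.smith@example.com"},
    {"org": "Acme Ltd", "country": "GB", "site": "London", "name": "Raj Patel", "email": ""},
]

# Stage 1: distinct organisations at country and site level.
org_sites, org_dupes = dedupe(contacts, ["org", "country", "site"])

# Stage 2: distinct individuals within each organisation site.
people, people_dupes = dedupe(contacts, ["org", "country", "site", "name"])

# Intervention and quality figures of the kind a dashboard might show.
print("organisation sites:", len(org_sites))
print("individuals kept:", len(people), "duplicates removed:", people_dupes)
print("completeness:", completeness(people, ["email"]))
```

In this sketch the duplicate counts double as the “intervention reporting” figures from requirement 5, while `completeness` gives one of the ongoing quality metrics; a production solution would add match scoring and survivorship rules rather than simply keeping the first record.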

Whilst I’m waiting for proposals from the vendors, the next step is to develop the business case for the project itself, to which I’ll return here soon.
