2008 Jun 01
Technical Measures for Data Quality Investments
David M. Raab
DM Review
June 2008
Last month’s column presented several return on investment calculations for data quality projects. These were the measures that business people look at: profit per customer, promotion effectiveness, value per response, return on promotion expense. Let’s look at technical measures of data quality for those same cases.

– profit per customer. An automobile dealer made service history available to salespeople while a new car purchase was being negotiated. The business value came from targeted offers to increase use of the highly profitable service department.

Technical data quality measures included:

– speed of access: this is the time it takes the salesperson to retrieve data for a customer. Multiple queries may be needed before the system returns a satisfactory result, and salespeople will not bother if it takes too much effort. Elapsed time would be gathered from system logs.

– match rate: this is the proportion of successful matches returned by the system. There are separate statistics for correct matches, false matches, and missed matches. Match accuracy is often difficult to measure because the correct answer is not known. But in this case, the customer will know whether she has previously used the service department. The salesperson should therefore know when to look and keep trying until the system returns a match. This means the most important measure is “correct results returned on the first try,” as shown by the number of successful single-search sessions. Successful searches are followed by a request to view the underlying data. Abandoned searches are not.

– service data quality: this includes all quality components—accuracy, completeness, consistency, currency, and suitability to task. Since the service history is derived from the service department’s billing system, it should be reasonably accurate, current and complete. This would be confirmed by the company’s normal auditing functions. Consistency is measured by profiling the data over time to identify unexpected values or value distributions. Profiling can also detect improper or fraudulent billing—something the service manager may or may not be particularly eager to explore.

Suitability to task is a particular challenge, since the data is being used for something other than its original purpose. The system must summarize the raw service data to show aggregate purchases, changes in usage patterns, types of work (e.g. all routine maintenance or only major repairs), and inferences about customer needs (high mileage, off-road travel, heavy loads, etc.). Summarization depends on the core data quality measures of accuracy, completeness and consistency.

Even summarized data can be difficult for a salesperson to interpret, so the system should also recommend a best offer. Recommendation quality is measured by tracking how many recommendations are presented by the salespeople, how many of these are accepted by the customers, and their long-term impact on customer profitability. Presentations and acceptances can be measured directly so long as salespeople record their results. Long-term impact requires tracking customers over time.
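The "correct results returned on the first try" measure described above can be sketched from search logs. This is a minimal illustration, not the dealer's actual system: the log format (a session id paired with a "search" or "view" action, in chronological order) and the function name are assumptions made for the example.

```python
from collections import defaultdict


def first_try_success_rate(events):
    """Summarize search sessions from a hypothetical log.

    events: iterable of (session_id, action) pairs in chronological order,
    where action is "search" or "view" (a request to see the underlying data).
    """
    sessions = defaultdict(list)
    for session_id, action in events:
        sessions[session_id].append(action)

    total = len(sessions)
    first_try = 0
    abandoned = 0
    for actions in sessions.values():
        if "view" not in actions:
            # Searches never followed by a view count as abandoned.
            abandoned += 1
            continue
        # Count the searches needed before the first view request.
        searches_before_view = actions[:actions.index("view")].count("search")
        if searches_before_view == 1:
            first_try += 1

    return {
        "sessions": total,
        "first_try_rate": first_try / total if total else 0.0,
        "abandon_rate": abandoned / total if total else 0.0,
    }


# Illustrative data only: s1 and s4 succeed on the first search,
# s2 needs two searches, s3 is abandoned.
log = [
    ("s1", "search"), ("s1", "view"),
    ("s2", "search"), ("s2", "search"), ("s2", "view"),
    ("s3", "search"),
    ("s4", "search"), ("s4", "view"),
]
result = first_try_success_rate(log)
```

The same session grouping could feed the speed-of-access measure if each log entry also carried a timestamp.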

Similar technical data quality measures apply to the other three cases. Space is limited, so, briefly:

– promotion effectiveness. This was a project to improve accuracy of a packaged goods manufacturer’s lists of distributor contacts. The business value was better execution of retail promotions. Technical data quality measures include:

– list accuracy: determined by random telephone calls to the distributors to verify the names on the existing lists. Returned mail and rejected email addresses may also provide information.

– update speed: determined by tracking how often the sales force provides list updates. This will identify salespeople who are not participating.

– value per response. This described an online marketer’s project to reduce bad debt and improve product recommendations through better real-time access to customer history. Technical data quality measures include:

– match rates with internal systems: measures include the percentage of successful matches, the percentage of confident matches (using a system-generated confidence score), and the percentage of multiple matches (more than one customer record matches a single input). Here, independent validation of match accuracy may not be available.

– match rates from external sources: confidence scores may not be available, so the only measure is the match rate itself. Some verification is needed to measure false matches, a particular issue with external vendors who are paid by the number of hits.

– quality of results from internal systems. Completeness is measured by the scope of data provided: purchases, payments, returns, refunds, and service interactions. These may originate in several different systems. Currency is measured by how long it takes a new transaction to become available. It can range from milliseconds to a month.

Important non-data quality measures include response time and prediction accuracy.
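The three internal match-rate measures listed above (successful, confident, and multiple matches) can be computed from lookup results roughly as follows. This is a sketch under stated assumptions: each lookup is represented as a list of confidence scores, one per candidate record returned, and the 0.9 threshold for a "confident" match is an arbitrary illustrative value.

```python
def match_rate_summary(results, confidence_threshold=0.9):
    """Compute match-rate measures from hypothetical lookup results.

    results: list with one entry per lookup; each entry is a list of
    system-generated confidence scores, one per matching customer record
    (an empty list means no match was found).
    """
    total = len(results)
    if total == 0:
        return {"matched": 0.0, "confident": 0.0, "multiple": 0.0}

    # Any lookup that returned at least one record.
    matched = sum(1 for scores in results if scores)
    # Lookups whose best candidate meets the (assumed) threshold.
    confident = sum(
        1 for scores in results
        if scores and max(scores) >= confidence_threshold
    )
    # Lookups where more than one customer record matched the input.
    multiple = sum(1 for scores in results if len(scores) > 1)

    return {
        "matched": matched / total,
        "confident": confident / total,
        "multiple": multiple / total,
    }


# Illustrative data only: five lookups with varying outcomes.
lookups = [[0.97], [0.55], [], [0.92, 0.91], [0.40, 0.35]]
summary = match_rate_summary(lookups)
```

None of these figures says whether a match is correct. As noted above, independent validation may not be available, so they describe system behavior rather than accuracy.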

– return on promotion. This described a direct response marketer who used lifetime value to optimize promotion spending. Technical data quality measures included:

– cost data: accuracy, completeness, and consistency. The most important measure is percentage of missing values, since many marketers fail to record the necessary information in the marketing system. Another key measure is variation between the marketing system and the accounting system, since entries in the marketing system may not be revised to reflect actuals.

– customer integration: accuracy and completeness. Records for the same customer are set up independently in several systems and then merged. Measures of incomplete merges include refunds without a corresponding purchase, and repurchases without an initial order.
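Two of the measures in this case, the percentage of missing cost values and the signs of incomplete customer merges, lend themselves to simple checks. The sketch below assumes a hypothetical record layout: promotions as dictionaries with a "cost" field that may be None, and transactions as (customer_id, kind) pairs where kind is "initial_order", "repurchase", or "refund". These names are illustrative, not taken from any particular system.

```python
from collections import defaultdict


def missing_cost_rate(promotions):
    """Share of promotion records with no cost recorded."""
    if not promotions:
        return 0.0
    missing = sum(1 for promo in promotions if promo.get("cost") is None)
    return missing / len(promotions)


def incomplete_merge_indicators(transactions):
    """Find customers whose merged history looks incomplete.

    Returns two sorted lists of customer ids:
    refunds with no purchase of any kind, and repurchases with no
    initial order on file.
    """
    kinds_by_customer = defaultdict(set)
    for customer_id, kind in transactions:
        kinds_by_customer[customer_id].add(kind)

    purchases = {"initial_order", "repurchase"}
    refund_without_purchase = sorted(
        customer for customer, kinds in kinds_by_customer.items()
        if "refund" in kinds and not (kinds & purchases)
    )
    repurchase_without_initial = sorted(
        customer for customer, kinds in kinds_by_customer.items()
        if "repurchase" in kinds and "initial_order" not in kinds
    )
    return refund_without_purchase, repurchase_without_initial


# Illustrative data only.
promos = [{"cost": 1200.0}, {"cost": None}, {"cost": 800.0}, {"cost": 450.0}]
cost_gap = missing_cost_rate(promos)

history = [
    ("c1", "initial_order"), ("c1", "repurchase"), ("c1", "refund"),
    ("c2", "refund"),
    ("c3", "repurchase"),
    ("c4", "initial_order"),
]
orphan_refunds, orphan_repurchases = incomplete_merge_indicators(history)
```

A flagged customer is only a candidate for review: the missing purchase may sit under an unmerged duplicate record, or it may simply predate the data retained in the system.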

* * *

David M. Raab is a Principal at Raab Associates Inc., a consultancy specializing in marketing technology and analytics. He can be reached at draab@raabassociates.com.
