Validity vs. Correctness Continued: Accuracy Percentages

Yesterday I shared some thoughts on the differences between data validity and data correctness, and why validity is a good start but ultimately not the right measure of quality. Today I am still ruminating on what data correctness, or accuracy, really means.

For example, I have long been thinking about the existence (or, more accurately, the nonexistence) of benchmarks for data quality methods and tools, especially when it comes to data accuracy. I often see both vendors and their customers reporting “accuracy percentages” (e.g. “our customer data is 99% accurate”), and I wonder what is meant by accuracy and how those percentages are calculated and verified.
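To make the question concrete, here is a minimal sketch of one way such a percentage might be produced: audit a sample of stored values against independently verified values and report the match rate. The records, the verification source, and the function name are all hypothetical illustrations, not anything a particular vendor actually does.

```python
# Illustrative sketch (hypothetical data): compute an "accuracy
# percentage" by comparing stored values to independently verified
# values for a sample of records.
import random

def accuracy_percentage(records, verified, sample_size=None, seed=0):
    """Return the percentage of sampled records whose stored value
    exactly matches the verified value.

    records  -- dict mapping record id -> stored value
    verified -- dict mapping record id -> verified ("true") value
    """
    ids = sorted(verified)
    if sample_size is not None:
        random.Random(seed).shuffle(ids)
        ids = ids[:sample_size]
    matches = sum(1 for i in ids if records.get(i) == verified[i])
    return 100.0 * matches / len(ids)

# Hypothetical audit: 4 of 5 sampled postal codes match verification.
stored  = {1: "02134", 2: "90210", 3: "10001", 4: "60601", 5: "98101"}
audited = {1: "02134", 2: "90210", 3: "10001", 4: "60601", 5: "98109"}
print(accuracy_percentage(stored, audited))  # → 80.0
```

Even this toy version exposes the unanswered questions behind a headline number: what counts as a match, how the verified source was obtained, and whether the sample is representative.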


Initial Thoughts: Data Quality of Non-Persistent Data Elements

February 22, 2011
Filed under: Data Quality, Performance Measures 

Last week I attended the Data Warehousing Institute’s World Conference in Las Vegas, where I taught a class on Practical Data Quality Management. As part of the discussion on critical data elements, I suggested some alternatives for qualifying a data element as “critical,” including “presence on a published report.” That led to an interesting notion which often does not occur to many data analysts but deserves attention: monitoring the quality of non-persistent data, that is, data elements derived on the fly and never stored in the database.