Initial Thoughts: Data Quality of Non-Persistent Data Elements

February 22, 2011 by · 1 Comment
Filed under: Data Quality, Performance Measures 

Last week I attended the Data Warehousing Institute’s World Conference in Las Vegas, teaching a class on Practical Data Quality Management. As part of the discussion on critical data elements, I suggested some alternatives for qualifying a data element as “critical,” including “presence on a published report.” In turn, I pointed out a notion that does not occur to many data analysts but does require some attention: monitoring the quality of non-persistent data.

Graphical Presentation of Impacts of Poor Data Quality

January 13, 2011 by · 2 Comments
Filed under: Data Analysis, Data Governance, Metrics 

After having led or participated in a number of data quality assessments, I continue to think about good ways to present results of the analysis that convey the severity of specific issues while also allowing the reader to compare the different issues. I will admit that I am not a “visualization” person, nor do I advocate creating dashboards and scorecards as the end product of a data quality activity. Rather, the scorecard is the means to an end, which is the prioritization of the issues so that the most effective use of resources can yield the maximum benefit.
That being said, I do think that radar charts are one good visualization paradigm. A radar chart allows you to map multiple variables in a two-dimensional view that conveys comparative information. Here is an example:

This example portrays the measures of severity for four different value driver areas for a single data quality issue. By looking at this graph, you can quickly see that incomplete dates have a high financial impact, but relatively low risk and productivity impacts. I am still experimenting with these types of images, and tinkering with Excel to figure out how to get multiple axes represented in a single graph so that I can overlay the impact dimension with a “remediation suitability” dimension that presents the time to value, cost to resolve, and staff effort. Together, these would provide a summary of the severity of the issue and the feasibility of its resolution. If you have some suggestions, let me know, and when I figure it out I will post a follow-up.
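
For anyone who wants to experiment outside of Excel, here is a rough sketch in Python of one possible way to put both dimensions on a single radar chart: scale every measure to a common 0-to-1 range and treat each one as its own spoke. This is only an illustration, not the method behind the chart above. The function names (`normalize`, `radar_vertices`), all of the scores, and the "worst case" ceilings are made-up assumptions, and the fourth value driver area is a placeholder because it is not named in this post.

```python
import math


def normalize(raw, worst):
    """Scale raw measures in different units onto 0..1 (capped at 1.0)."""
    return {name: min(raw[name] / worst[name], 1.0) for name in raw}


def radar_vertices(scores):
    """Place one point per measure on evenly spaced spokes.

    The first spoke points straight up and the rest follow clockwise.
    Returns a list of (name, x, y) tuples that any plotting tool can connect.
    """
    names = list(scores)
    count = len(names)
    points = []
    for i, name in enumerate(names):
        angle = math.pi / 2 - 2 * math.pi * i / count
        radius = scores[name]
        points.append((name, radius * math.cos(angle), radius * math.sin(angle)))
    return points


# Hypothetical impact scores for "incomplete dates", already on a 0..1 scale.
impact = {
    "financial": 0.9,
    "risk": 0.2,
    "productivity": 0.3,
    "fourth area (placeholder)": 0.5,
}

# Hypothetical remediation measures in their own units, with assumed ceilings.
remediation_raw = {"time to value": 6, "cost to resolve": 40, "staff effort": 15}
remediation_worst = {"time to value": 12, "cost to resolve": 100, "staff effort": 60}
remediation = normalize(remediation_raw, remediation_worst)

# One chart, seven spokes: four for impact, three for remediation suitability.
combined = {**impact, **remediation}
for name, x, y in radar_vertices(combined):
    print(f"{name:28s} x={x:+.3f} y={y:+.3f}")
```

The printed coordinates can be connected into a polygon in whatever charting tool is at hand. Whether a single seven-spoke chart reads better than two overlaid polygons is a judgment call I have not settled.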