Graphical Presentation of Impacts of Poor Data Quality
After having led or participated in a number of data quality assessments, I continue to think about good ways to present the results of the analysis that convey the severity of specific issues while also allowing the reader to compare the different issues. I will admit that I am not a “visualization” person, nor do I advocate creating dashboards and scorecards as the end product of a data quality activity. Rather, the scorecard is a means to an end, which is the prioritization of the issues so that resources can be used most effectively for the maximum benefit.
That being said, I do think that radar charts are one good visualization paradigm. A radar chart allows you to map multiple variables in a 2-dimensional view that conveys comparative information. Here is an example:
This example portrays the measures of severity for four different value driver areas for a single data quality issue. By looking at this graph, you can quickly see that incomplete dates have a high financial impact, but relatively low risk and productivity impacts. I am still experimenting with these types of images, and tinkering with Excel to figure out how to get multiple axes represented in a single graph so that I can overlay the impact dimension with a “remediation suitability” dimension that presents the time to value, cost to resolve, and staff effort. Together, the two would provide a summary of the severity of the issue and the feasibility of its resolution. If you have some suggestions, let me know, and when I figure it out I will post a follow-up.
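For readers who want to try the overlay idea outside of Excel, here is one possible sketch in plain Python. It is not the method behind the chart above. It places each variable on its own axis around a common center, scales the score along that axis, and draws the impact and remediation polygons on top of each other as an SVG image. Everything specific in it is an assumption made for illustration: the function names `radar_points` and `radar_svg`, the 0 to 10 scoring scale, the placeholder name for the fourth value driver, and all of the scores.

```python
import math


def radar_points(values, max_value=10.0, radius=100.0, cx=120.0, cy=120.0):
    """Return one (x, y) vertex per value.

    Axis 0 points straight up and the remaining axes follow clockwise.
    A value equal to max_value lands on the outer edge of the chart.
    """
    n = len(values)
    points = []
    for i, value in enumerate(values):
        # Subtract 90 degrees so the first axis points up (SVG y grows downward).
        angle = 2 * math.pi * i / n - math.pi / 2
        r = radius * value / max_value
        x = round(cx + r * math.cos(angle), 2)
        y = round(cy + r * math.sin(angle), 2)
        points.append((x, y))
    return points


def radar_svg(series, size=240):
    """Build an SVG string with one translucent polygon per named series."""
    polygons = []
    for name, (scores, color) in series.items():
        vertices = radar_points(list(scores.values()))
        pts = " ".join(f"{x},{y}" for x, y in vertices)
        polygons.append(
            f'<polygon points="{pts}" fill="{color}" fill-opacity="0.3" '
            f'stroke="{color}"><title>{name}</title></polygon>'
        )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" height="{size}">'
        + "".join(polygons)
        + "</svg>"
    )


# Hypothetical scores on an assumed 0-10 scale, for illustration only.
impact = {"financial": 8, "risk": 3, "productivity": 2, "fourth driver (placeholder)": 4}
remediation = {"time to value": 5, "cost to resolve": 6, "staff effort": 4}

svg = radar_svg({
    "impact": (impact, "crimson"),
    "remediation suitability": (remediation, "steelblue"),
})
```

Writing `svg` to a file with an `.svg` extension and opening it in a browser shows the two shapes overlaid. Note that the two polygons use different numbers of axes, so this sketch omits axis lines and labels; a fuller version would need to draw them per series to keep the picture readable.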