Papers on the Value of Data Quality Improvement

Thanks to my friends at Informatica, I have been able to set aside some time to organize and document some ideas about the different types of business impacts related to data quality improvement, and how those impacts can be isolated, organized, measured, and communicated to a variety of stakeholders across the organization.

There are four papers:

Understanding the Financial Value of Data Quality Improvement
Improved Risk Management via Data Quality Improvement
Improving Organization Productivity through Improved Data Quality
Increasing Confidence and Satisfaction Through Improved Data Quality (link coming soon)

Exploiting the Crisis: Driving Data Quality Using Anecdotes

February 3, 2011
Filed under: Business Impacts, Data Quality 

Anyone who regularly reads this web site as well as my other media outlets knows that I am an advocate of clearly defining measures for assessing how poor data quality impacts the business. In fact, one of the main challenges of establishing a data quality program is effectively communicating the value of improved data quality to senior managers. You might think that showing some specific failures and their impacts would be enough to make the argument to invest in improvements, but unfortunately this is often not the case.
Management’s attention is grabbed far more dramatically by catastrophic events. Acute disasters linked to data issues, ongoing horror stories, and even entertaining anecdotes resonate with many people in the organization, both because of the drama and because they give motivated individuals a chance to react to the problem with a heroic effort that appears to save the day.

Data Edits, Data Quality Controls, and a Performance Question

January 28, 2011
Filed under: Data Quality 

In a number of places I have used the term “data quality control” in the context of a service that inspects the compliance of a data instance (at different levels of granularity, ranging from a single data value to an entire table or file) with a defined data quality rule. The objective of the control is to ensure that the expectations of the process consuming the data are validated before the data is exchanged.

Sometimes when I have used this term, people respond by saying “oh yes, data edits.” But I think there is a difference between a data edit and a control. In most of the cases I have seen, an edit is intended to compare a value to some expectation and then change the value if it doesn’t agree with the intended target’s expectation. This is quite different from a control, which generates a notification or other type of event indicating a missed expectation. Another major difference is that the edit introduces inconsistency between the supplier of the data and the consumer of the data; while the transformation may benefit the consumer in the short run, it may lead to issues later when there is a sudden need for reconciliation between the target and the source.

A different consideration is that data edits are a surreptitious attempt to harmonize data to meet expectations, and they do so without the knowledge of the data supplier. The control at least notifies a data steward that there is a potential discrepancy, and allows the data steward to invoke the appropriate policy as well as notify both the supplier and the consumer that some transformation needs to be made (or not).
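
To make the distinction concrete, here is a minimal sketch in Python; the rule and function names are hypothetical illustrations, not drawn from any particular tool. The edit silently transforms a nonconforming value, while the control leaves the value untouched and emits an event that a data steward can act on.

from dataclasses import dataclass
from typing import Callable, Optional

# A data quality rule returns True when a value meets the expectation.
Rule = Callable[[str], bool]

def is_valid_state_code(value: str) -> bool:
    # Example expectation: a two-letter uppercase US state code.
    return len(value) == 2 and value.isalpha() and value.isupper()

def data_edit(value: str, rule: Rule) -> str:
    # An edit silently changes the value to meet the target's expectation,
    # introducing inconsistency between the supplier and the consumer.
    if not rule(value):
        return value.strip().upper()[:2]  # transformed in place, no record kept
    return value

@dataclass
class ControlEvent:
    # A notification raised by a control, routed to a data steward.
    rule_name: str
    offending_value: str

def data_quality_control(value: str, rule: Rule) -> Optional[ControlEvent]:
    # A control never alters the data: it validates the expectation and
    # emits an event so the steward can invoke the appropriate policy.
    if not rule(value):
        return ControlEvent(rule.__name__, value)
    return None  # expectation met; the data flows through unchanged

print(data_edit("ny ", is_valid_state_code))             # -> 'NY'
print(data_quality_control("ny ", is_valid_state_code))  # -> ControlEvent(...)

Note that after the edit runs, the consumer sees “NY” while the supplier still holds “ny ” – exactly the inconsistency that complicates later reconciliation.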

One other question often crops up about controls: how do they affect performance? This is a very different question; in fact, a recent client mentioned to me that they used to have controls in place but removed them because they slowed processing down. It boils down to a decision about what is more important: ensuring high-quality data or ensuring high throughput. The corresponding questions to ask center on the cumulative business impacts. Observance of processing-time service level agreements may be contractually imposed (with corresponding penalties for missing the SLA), which may even suggest that allowing some errors through and patching the process downstream is less impactful than incurring the SLA non-observance penalty costs.
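
A back-of-the-envelope comparison can frame that decision. Every figure below is an invented assumption for illustration, not a benchmark:

records_per_day = 1_000_000
error_rate = 0.002                 # assumed: 0.2% of records fail the rule
downstream_fix_cost = 15.00        # assumed cost to patch one error downstream
control_overhead_minutes = 45      # assumed delay added by running the controls
sla_penalty_per_minute = 1_000.00  # assumed contractual penalty for missing the SLA

cost_of_errors = records_per_day * error_rate * downstream_fix_cost    # $30,000/day
cost_of_controls = control_overhead_minutes * sla_penalty_per_minute   # $45,000/day

print(f"Patch errors downstream: ${cost_of_errors:,.2f}/day")
print(f"Run controls, miss SLA:  ${cost_of_controls:,.2f}/day")

Under these particular assumptions, letting the errors through and patching downstream is cheaper; with different volumes, error rates, or penalties the controls win, which is why the cumulative impacts need to be measured rather than guessed.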

March 1, 2, 3 David Loshin Events – Strategic Business Value from Enterprise Data

January 19, 2011
Filed under: Events 

I have been invited by the data quality and MDM tool company Ataccama to be the guest speaker at a series of breakfast seminar events in early March at the following locations:

March 1 Bridgewater, NJ

March 2 Chicago, IL

March 3 Charlotte, NC

The topic is “Strategic Business Value from your Enterprise Data,” and I will be discussing aspects of business value drivers for Data Quality and MDM. I believe that attendees will also get a copy of my book “Master Data Management.”

I participated in a few similar events at the end of 2010 and found that some of the attendees posed some extremely interesting challenges, and I hope to share some new insights at these upcoming events!

Optimized Maintenance and Physical Asset Data Quality

December 2, 2010
Filed under: Business Impacts, Business Intelligence, Data Analysis, Data Quality, Master Data, Metrics, Performance Measures 

It would be unusual to find a company that does not use some physical facility from which to conduct business. Even the leaders and managers of home-based and virtual businesses have to sit down somewhere, whether to access the internet, make a phone call, check email, or pack an order and arrange for its delivery. Consequently, every company eventually incurs overhead and administrative costs associated with running the business, such as rent and facility maintenance, as well as telephone, internet, furniture, hardware, and software purchase/leasing and maintenance.

Today’s thoughts are about that last item: the costs associated with building, furniture, machinery, software, and grounds maintenance. Effective asset maintenance requires a balance – one would like to optimize the program so that the most judicious allocation of resources yields the longest lifetime for acquired or managed assets.

As an example, how often do offices need to be painted? When you deal with one or two rooms, that is not a significant question, but when you manage a global corporation with hundreds of office buildings in scores of countries, the “office painting schedule” influences a number of other decisions: bulk purchasing of required materials (e.g., paint and brushes), competitive engagement of contractors to do the work, temporary office space for staff while their offices are being painted, and so on, all of which provide wide opportunities for cost reduction and increased productivity.

And data quality fits in through the data associated with both the inventory of assets requiring maintenance and the information used to manage the maintenance program. In fact, this presents an interesting master data management opportunity, since it involves consolidating a significant amount of data from potentially many sources around commonly used and shared data concepts such as “Asset.” The “Asset” concept can be organized hierarchically in relation to the different types of assets, each of which exists in a variety of representations and each of which is subject to analysis for maintenance optimization. Here are some examples (with a sketch of the hierarchy after the list):

  • Fixed assets (real property, office buildings, grounds, motor vehicles, large manufacturing machinery, other plant/facility items)
  • Computer assets (desktops, printers, laptops, scanners)
  • Telephony (PBX, handsets, mobile phones)
  • Furniture (desks, bookcases, chairs, couches, tables)
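
One way to represent that hierarchy is sketched below in Python; the type names simply echo the list above, and nothing here reflects any particular MDM product:

from dataclasses import dataclass, field
from typing import List

@dataclass
class AssetType:
    # A node in the hierarchical "Asset" concept; each type can be refined
    # into subtypes, each of which is analyzed for maintenance optimization.
    name: str
    subtypes: List["AssetType"] = field(default_factory=list)

    def add(self, name: str) -> "AssetType":
        child = AssetType(name)
        self.subtypes.append(child)
        return child

def leaves(node: AssetType) -> List[str]:
    # Collect the most specific asset types, the usual unit of analysis.
    if not node.subtypes:
        return [node.name]
    return [name for child in node.subtypes for name in leaves(child)]

asset = AssetType("Asset")
fixed = asset.add("Fixed assets")
for t in ("Real property", "Office buildings", "Motor vehicles"):
    fixed.add(t)
computer = asset.add("Computer assets")
for t in ("Desktops", "Printers", "Laptops", "Scanners"):
    computer.add(t)

print(leaves(asset))  # the leaf types rolled up under the shared "Asset" concept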

I think you see where I am going here: errors in asset data lead to improper analyses with respect to maintenance of those assets, such as arranging for a delivery truck’s oil to be changed twice in the same week, or painting some offices twice in a six-month period while other offices remain unpainted for years. Therefore, there is a direct dependence between the quality of asset data and the costs associated with asset maintenance.
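
As a small illustration of how such an error might be caught, here is a hypothetical check (the records, identifiers, and field names are invented for the example) that flags an asset scheduled for the same maintenance task twice because it was keyed inconsistently in two source systems:

from collections import Counter

# Hypothetical maintenance queue: the same truck appears twice because its
# VIN was keyed inconsistently across two source systems.
maintenance_queue = [
    {"asset_id": "TRK-1049",  "vin": "1FDUF5GT3BEA70214", "task": "oil change"},
    {"asset_id": "TRK-1049A", "vin": "1fduf5gt3bea70214", "task": "oil change"},
]

# Normalize the identifying attribute, then flag any asset scheduled
# for the same task more than once.
counts = Counter((rec["vin"].upper(), rec["task"]) for rec in maintenance_queue)
duplicates = {key: n for key, n in counts.items() if n > 1}

print(duplicates)  # {('1FDUF5GT3BEA70214', 'oil change'): 2}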
