Data Governance and Quality: Data Reuse vs. Data Repurposing

February 22, 2012

I have been assembling a slide deck for an upcoming TDWI web seminar on Strategic Planning and the World of Big Data, and I am finding that I sometimes use two different terms (“data reuse” and “data repurposing,” in case you ignored the title of this post) interchangeably when in fact those two terms could have slightly different meanings or intents. So should I be cavalier and use them as synonyms?

When I thought about it, I did see some clarity in differentiating the definitions:

  • “data reuse” means taking a data asset and using it more than once for the same purpose.
  • “data repurposing” means taking a data asset previously used for one (or more) specific purpose(s) and using that data set for a completely different purpose.

Read more

Download Updated Version of “The Analytics Revolution”

I recently updated a white paper I did for IBM called “The Analytics Revolution – Optimizing Reporting and Analytics to Make Actionable Intelligence Pervasive.” Click here to download this revised masterpiece.

Managing History in Master Data Management

Yesterday was the first of a series of breakfast presentations I am giving with Ataccama (a data quality and MDM tools company) on the value of master data management, data quality, and data governance. One of the attendees works for a company that has invested a significant amount of budget and effort in MDM, yet is now finding challenges in managing history, slowly changing reference concepts, and the associated semantics.
Read more

Model-Driven Design: The Start or the End?

February 4, 2011

I had two interesting briefings this week. One was from a company called Orchestra Networks, which provides a tool for model-driven master data management; the other was from a company called Collibra, which provides a model-driven tool for capturing metadata and semantics.
Read more

Data Quality and Data Profiling

Data warehousing, business intelligence, application renovation – these are all situations that will eventually lead to some need for data integration or migration. But data integration is difficult to do if you are unfamiliar with what is actually sitting in your data. Older data sets are frequently undocumented, and even the documentation of newer systems often does not track with reality as system changes are deployed. There is a need for methods to analyze your data to assess the “ground truth” as a prelude to any data integration or migration initiative. And this is done using a tool called a data profiler.

Data profiling has become such a commonly used piece of technology that it is often equated with the concept of data quality assurance itself. But as data profiling has emerged as a critical commodity tool, it is better seen as a set of technical capabilities that can be applied in support of numerous information management programs, including data quality assessment, data quality validation, metadata management, ETL processing, migrations, and modernization projects. The value of data profiling lies in the ability to integrate the capabilities of a technical tool with knowledge of how to apply what can be learned in support of a program’s goals.
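To make that concrete, here is a minimal sketch (in plain Python, not tied to any particular profiling product) of the kind of column-level statistics a profiler gathers on a first pass; the sample “states” column and its values are hypothetical.

```python
from collections import Counter

def profile_column(values):
    """Compute basic column-level statistics: row and null counts,
    distinct cardinality, min/max, and the most frequent values."""
    non_null = [v for v in values if v is not None and v != ""]
    freq = Counter(non_null)
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(freq),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        "top_values": freq.most_common(3),
    }

# Hypothetical state-code column pulled from a legacy customer table
states = ["MD", "NY", "MD", None, "md", "XX", "NY", ""]
print(profile_column(states))
```

Even statistics this simple surface the questions a migration team needs answered: the mixed-case “md” hints at inconsistent entry standards, and “XX” is a candidate default value or anomaly.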

Data profiling incorporates a collection of analysis and assessment algorithms that, when applied in the proper context, will shed some light on what potential issues exist within a data set. If you want to learn more about data profiling, chapter 14 of my book considers some of the analyses and algorithms that are performed and how those analyses are used to provide value in a number of application contexts, including assessment of potential anomalies, business rule discovery, business rule validation, validation of metadata, and data model validation.
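As a rough illustration of one of those analyses (my sketch here, not an excerpt from the book), consider pattern-frequency analysis, a common technique for business rule discovery: each value is abstracted into a character-class pattern, the dominant pattern suggests a candidate rule, and rare patterns are flagged as potential anomalies. The ZIP code column shown is hypothetical.

```python
import re
from collections import Counter

def value_pattern(value):
    """Abstract a value into a character-class pattern:
    digits become 9, letters become A, punctuation is kept as-is."""
    return re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", value))

def discover_patterns(values):
    """Tally pattern frequencies across a column; the most common
    patterns are candidate business rules, the rare ones are suspects."""
    return Counter(value_pattern(v) for v in values if v)

# Hypothetical ZIP code column: the dominant patterns suggest a rule
# ("ZIP matches 99999 or 99999-9999"); the outliers warrant review.
zips = ["21046", "10012", "21046-1100", "2104", "ABCDE"]
for pattern, count in discover_patterns(zips).most_common():
    print(pattern, count)
```

Running this against a real column and then checking new records against the discovered patterns is one path from profiling (rule discovery) to ongoing data quality validation.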