Data Governance and Quality: Data Reuse vs. Data Repurposing : The Practitioner's Guide to Data Quality Improvement

Data Governance and Quality: Data Reuse vs. Data Repurposing

February 22, 2012 by
Filed under: Business Impacts, Data Governance, Data Quality 

I have been assembling a slide deck for an upcoming TDWI web seminar on Strategic Planning and the World of Big Data, and I am finding that I sometimes use two different terms (“data reuse” and “data repurposing,” in case you ignored the title of this post) interchangeably when in fact those two terms could have slightly different meanings or intents. So should I be cavalier and use them as synonyms?

When I thought about it, I did see some clarity in differentiating the definitions:

  • “data reuse” means taking a data asset and using it more than once for the same purpose.
  • “data repurposing” means taking a data asset previously used for one (or more) specific purpose(s) and using that data set for a completely different purpose.

For example, if we have an application that uses the customer database to generate address labels for a marketing campaign mailing this morning, and then later in the day we use the same customer database to generate address labels for a second marketing campaign for the afternoon mail pickup, I would call that “reuse.” On the other hand, taking that same customer data set and combining it with sales transactions from the last month to classify customers by transaction and sales volume as part of an overall profiling algorithm would be an example of “repurposing”: taking the same data but using it for a different purpose.

The question boils down to the governance aspects of assessing data quality requirements. For multiple instances of reuse, are all the quality expectations going to be identical? Alternatively, when a data set is repurposed, whose responsibility is it to document data quality rules and acceptability thresholds as well as integrate validation of the data into upstream processes?
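To make the question concrete, here is a minimal Python sketch of what documenting data quality rules and acceptability thresholds per purpose might look like. Everything in it is an assumption made for illustration: the `QualityRule` class, the `assess` helper, the field names, the sample records, and the threshold values are hypothetical and do not come from any particular tool or methodology.

```python
from dataclasses import dataclass
from typing import Callable

Record = dict[str, str]


@dataclass
class QualityRule:
    """One documented expectation for one purpose (hypothetical structure)."""
    name: str
    check: Callable[[Record], bool]  # returns True when a record passes
    threshold: float                 # minimum acceptable pass rate, 0.0 to 1.0


def assess(records: list[Record], rules: list[QualityRule]) -> dict[str, tuple[float, bool]]:
    """For each rule, return the observed pass rate and whether it meets the threshold."""
    results = {}
    for rule in rules:
        passed = sum(1 for record in records if rule.check(record))
        rate = passed / len(records) if records else 0.0
        results[rule.name] = (rate, rate >= rule.threshold)
    return results


# Made-up sample of the shared customer data set.
customers = [
    {"id": "1", "street": "12 Elm St", "postal_code": "20901", "last_month_sales": "250.00"},
    {"id": "2", "street": "9 Oak Ave", "postal_code": "20902", "last_month_sales": ""},
    {"id": "3", "street": "4 Pine Rd", "postal_code": "20903", "last_month_sales": "75.50"},
    {"id": "4", "street": "", "postal_code": "20904", "last_month_sales": "120.00"},
]

# Reuse: both mailing campaigns can share this one rule set.
mailing_rules = [
    QualityRule("has_street", lambda r: bool(r["street"].strip()), 0.75),
    QualityRule("has_postal_code", lambda r: bool(r["postal_code"].strip()), 0.75),
]

# Repurposing: customer profiling cares about different fields and tolerances.
profiling_rules = [
    QualityRule("has_sales_amount", lambda r: bool(r["last_month_sales"].strip()), 0.90),
]

mailing_report = assess(customers, mailing_rules)
profiling_report = assess(customers, profiling_rules)

for purpose, report in (("mailing", mailing_report), ("profiling", profiling_report)):
    for rule_name, (rate, acceptable) in report.items():
        print(f"{purpose}: {rule_name} pass rate {rate:.2f}, acceptable={acceptable}")
```

With these made-up numbers the same records are acceptable for the mailing-label purpose but fall short for the profiling purpose. The sketch does not answer who owns the second rule set; it only shows that each purpose needs its expectations written down somewhere.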

And even more of an issue: what does one do if the repurposing is very far from origination? If we grab a data set from a public web site that has been through a number of transformations, the information in the data set may be subject to very different interpretations than when the data instances in the sources were originally created. That makes the problem even more difficult: are we allowed to modify (AKA “correct”) data values that don’t meet our needs? Or are we constrained to use the data set as is because corrections alter the data, potentially affecting its repurposability (I think that is a new word I just invented)?

In either case, providing a definition for both terms distinguishes the usage scenarios, and at the very least allows me to use both terms in the same blog entry or presentation slide.


3 Comments on Data Governance and Quality: Data Reuse vs. Data Repurposing

  1. Henrik Liliendahl Sørensen on Wed, 22nd Feb 2012 3:11 PM
     Good musings David. Makes me pose the question: Is data of high quality if they are “fit for purpose of use” or “fit for repurposing”?

  2. Fit for repurposing « Liliendahl on Data Quality on Thu, 23rd Feb 2012 10:38 AM
     […] by a blog post by David Loshin called Data Governance and Quality: Data Reuse vs. Data Repurposing I was, perhaps a bit off topic, inspired to pose the question about, if data are of high quality if […]

  3. Max Gano on Wed, 29th Feb 2012 7:35 PM
     Great topic, David, especially in regards to how you are defining data re-purposing. I have often used the Fit-For-Use analysis method from the Data Governance Institute. It really helps unravel the complex mix of challenges raised when multiple downstream consumers re-purpose data from a common source. Too often this occurs as a sort of “secret second life” of data that only comes to light when things go wrong. What works for reuse may be entirely different from what works for re-purposing. And the greatest challenge of all is that no one perspective is more correct than another. Keeps things VERY interesting. But one way to stay in front is to understand the need to be prepared to support Fit-For-Use early on.