iOS6, Apple Maps, and the Biggest Data Quality Story This Year

September 28, 2012 by · 1 Comment
Filed under: Data Governance, Data Quality 

As anticipated, Apple’s recent release of iOS6 replaced the incumbent Google Maps application with Apple’s homegrown version. Excitement has quickly degenerated into disappointment (at best) and anger (at worst) over the flaws in Apple’s version. And as of this morning, a quick scan of Google News turned up almost 2,000 articles on Apple’s mea culpa, which culminated in a personal message from CEO Tim Cook stating:

“At Apple, we strive to make world-class products that deliver the best experience possible to our customers. With the launch of our new Maps last week, we fell short on this commitment. We are extremely sorry for the frustration this has caused our customers and we are doing everything we can to make Maps better.”

It looks like quite a firestorm over what is basically a data quality issue… Read more

Using Data Integration Testing for Reconciling Production Data Assets

In my last post, we started to discuss the need for fundamental processes and tools for institutionalizing data testing. While the software development practice has embraced testing as a critical gating factor for the release of newly developed capabilities, this testing often centers on functionality, sometimes to the exclusion of a broad-based survey of the underlying data asset to ensure that values did not (or would not) incorrectly change as a result.

In fact, the need for testing existing production data assets goes beyond the scope of newly developed software. Modifications are constantly applied within an organization – acquired applications are upgraded, internal operating environments are enhanced and updated, additional functionality is turned on and deployed, hardware systems are swapped out and in, and internal processes may change. Yet there are limitations in effectively verifying that interoperable components that create, touch, or modify data are not impacted. The challenge of maintaining consistency across the application infrastructure can be daunting, let alone assuring consistency in the information results. Read more

Best Practices for Data Integration Testing Series – Instituting Good Practices for Data Testing

August 3, 2012 by · 2 Comments
Filed under: Data Governance, Data Profiling, Data Quality, Metadata 

I have been asked by folks at Informatica to share some thoughts about best practices for data integration, and this is the first of a series on data testing.

It is rare, if not impossible, to develop software that is completely free of errors and bugs. Early in my career as a software engineer, I spent a significant amount of time on “bug duty” – the task of looking at the list of reported product errors and evaluating them one-by-one to try to identify the cause of the bug and then come up with a plan for correcting the program so that the application error is eliminated. And the software development process is one that, over time, has been the subject of significant scrutiny in relation to product quality assurance.

In fact, the state of software quality and testing is quite mature. Well-defined processes have been accepted as general best practices, and there are organized methods for evaluating the capability and maturity of software quality methodologies. Yet given that every application is a combination of programs applied to input data to generate output information, it is curious that the testing practices for data integration and sharing remain largely ungoverned manual procedures. Read more

Response to “Eight Problems with Big Data”

April 26, 2012 by · Leave a Comment
Filed under: Analytics, Business Impacts, Data Analysis 

After reading Jay Stanley’s ACLU article on “Eight Problems with Big Data,” it is worth reflecting on what could be construed as a fear-mongering indictment of the use of big data analytics and the implication that big data analytics and its implementation of data mining algorithms are tantamount to all-out invasion of privacy. What is interesting, though, is the presumption that privacy advocates have been “grappling” with data mining since “not long after 9/11,” yet data mining was already quite a mature discipline by that point in time, as was the general use of customer data for marketing, sales, and other business purposes. Raising an alarm about “big data” and “data mining” today is akin to shutting the barn door decades after the horses have bolted. Read more

Data Governance and Quality: Data Reuse vs. Data Repurposing

February 22, 2012 by · 3 Comments
Filed under: Business Impacts, Data Governance, Data Quality 

I have been assembling a slide deck for an upcoming TDWI web seminar on Strategic Planning and the World of Big Data, and I am finding that I might sometimes use two different terms (“data reuse” and “data repurposing,” in case you ignored the title of this post) interchangeably when in fact those two terms could have slightly different meanings or intents. So should I be cavalier and use them as synonyms?

When I thought about it, I did see some clarity in differentiating the definitions:

  • “data reuse” means taking a data asset and using it more than once for the same purpose.
  • “data repurposing” means taking a data asset previously used for one (or more) specific purpose(s) and using that data set for a completely different purpose. Read more
