As anticipated, as part of Apple’s recent release of iOS 6, the incumbent Google Maps application was replaced by Apple’s homegrown version. Excitement has quickly degenerated into disappointment (at best) and anger (at worst) over the flaws in Apple’s version. As of this morning, a quick scan of Google News turned up almost 2,000 articles reflecting Apple’s mea culpa, culminating in a personal message from CEO Tim Cook stating:
“At Apple, we strive to make world-class products that deliver the best experience possible to our customers. With the launch of our new Maps last week, we fell short on this commitment. We are extremely sorry for the frustration this has caused our customers and we are doing everything we can to make Maps better.”
It looks like quite a firestorm over what is basically a data quality issue… Read more
Filed under: Business Intelligence, Data Analysis, Data Governance, Data Integration, Data Quality
In my last post, we started to discuss the need for fundamental processes and tools for institutionalizing data testing. While the software development practice has embraced testing as a critical gating factor for releasing newly developed capabilities, that testing often centers on functionality, sometimes to the exclusion of a broad-based survey of the underlying data asset to verify that values did not (or would not) change incorrectly as a result.
In fact, the need for testing existing production data assets goes beyond the scope of newly developed software. Modifications are constantly applied within an organization – acquired applications are upgraded, internal operating environments are enhanced and updated, additional functionality is turned on and deployed, hardware systems are swapped out and in, and internal processes may change. Yet there are limitations in effectively verifying that interoperable components that create, touch, or modify data are not impacted. The challenge of maintaining consistency across the application infrastructure can be daunting, let alone assuring consistency in the information results. Read more
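As a rough illustration of the kind of automated data regression test being argued for, here is a minimal sketch (my own, not from the original post). The file name customers.csv and the profiled measures are hypothetical stand-ins for any production data asset touched by an upgrade or environment change.

```python
# A minimal sketch of a data regression test: capture a coarse profile of a
# data set before a change and compare it to the profile taken afterward.
import pandas as pd

def profile(df: pd.DataFrame) -> dict:
    """Capture a coarse profile: row count, null counts, and numeric column sums."""
    return {
        "row_count": len(df),
        "null_counts": df.isna().sum().to_dict(),
        "numeric_sums": df.select_dtypes("number").sum().to_dict(),
    }

def assert_unchanged(before: dict, after: dict) -> None:
    """Fail loudly if any profiled measure drifted after the change."""
    for key in before:
        if before[key] != after[key]:
            raise AssertionError(f"Data drift detected in {key}: {before[key]} -> {after[key]}")

# Hypothetical usage: profile the asset, apply the upgrade or patch, profile again.
baseline = profile(pd.read_csv("customers.csv"))
# ... apply the application upgrade, environment update, or process change ...
assert_unchanged(baseline, profile(pd.read_csv("customers.csv")))
```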
Filed under: Data Governance, Data Profiling, Data Quality, Metadata
I have been asked by folks at Informatica to share some thoughts about best practices for data integration, and this is the first of a series on data testing.
It is rare, if not impossible, to develop software that is completely free of errors and bugs. Early in my career as a software engineer, I spent a significant amount of time on “bug duty” – working through the list of reported product errors one by one, trying to identify the cause of each bug and then coming up with a plan for correcting the program so that the application error is eliminated. And over time, the software development process has been the subject of significant scrutiny in relation to product quality assurance.
In fact, the state of software quality and testing is quite mature. Well-defined processes have been accepted as general best practices, and there are organized methods for evaluating software quality methodology capabilities and maturity. Yet given that every application is a combination of programs applied to input data to generate output information, it is curious that the testing practices for data integration and sharing remain largely ungoverned, manual procedures. Read more
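To make the contrast concrete, here is a hedged sketch (my own assumption of what such governance could look like, not the post’s prescription) of data integration checks expressed as repeatable, unit-style tests rather than manual inspection. The database file and table names (warehouse.db, staging_orders, orders, customers) are hypothetical.

```python
# Hypothetical automated checks against a target loaded by a data integration job.
import sqlite3
import unittest

class DataIntegrationTest(unittest.TestCase):
    def setUp(self):
        # Assumed target database produced by the integration process.
        self.conn = sqlite3.connect("warehouse.db")

    def test_no_rows_lost(self):
        # Row counts in source staging and target tables should agree after the load.
        src = self.conn.execute("SELECT COUNT(*) FROM staging_orders").fetchone()[0]
        tgt = self.conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
        self.assertEqual(src, tgt, "row counts differ between source and target")

    def test_no_orphaned_keys(self):
        # Every order should reference a known customer.
        orphans = self.conn.execute(
            "SELECT COUNT(*) FROM orders o "
            "LEFT JOIN customers c ON o.customer_id = c.customer_id "
            "WHERE c.customer_id IS NULL").fetchone()[0]
        self.assertEqual(orphans, 0, "orders reference missing customers")

if __name__ == "__main__":
    unittest.main()
```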
After reading Jay Stanley’s ACLU article on “Eight Problems with Big Data,” it is worth reflecting on what could be construed as a fear-mongering indictment of the use of big data analytics and the implication that big data analytics and its implementation of data mining algorithms are tantamount to all-out invasion of privacy. What is interesting, though, is the presumption that privacy advocates have been “grappling” with data mining since “not long after 9/11,” yet data mining was already quite a mature discipline by that point in time, as was the general use of customer data for marketing, sales, and other business purposes. Raising an alarm about “big data” and “data mining” today is akin to shutting the barn door decades after the horses have bolted. Read more
I have been assembling a slide deck for an upcoming TDWI web seminar on Strategic Planning and the World of Big Data, and I am finding that I sometimes use two different terms (“data reuse” and “data repurposing,” in case you skipped the title of this post) interchangeably, when in fact those two terms could have slightly different meanings or intents. So should I be cavalier and use them as synonyms?
When I thought about it, I did see some clarity in differentiating the definitions:
- “data reuse” means taking a data asset and using it more than once for the same purpose.
- “data repurposing” means taking a data asset previously used for one (or more) specific purpose(s) and using that data set for a completely different purpose. Read more