In my last post, we started to discuss the need for fundamental processes and tools for institutionalizing data testing. While software development practice has embraced testing as a critical gating factor for releasing newly developed capabilities, that testing often centers on functionality, sometimes to the exclusion of a broad-based survey of the underlying data asset to ensure that values did not (or would not) change incorrectly as a result.
In fact, the need for testing existing production data assets goes beyond the scope of newly developed software. Modifications are constantly applied within an organization – acquired applications are upgraded, internal operating environments are enhanced and updated, additional functionality is turned on and deployed, hardware systems are swapped in and out, and internal processes may change. Yet we have only limited means of verifying that the interoperable components that create, touch, or modify data are unaffected. The challenge of maintaining consistency across the application infrastructure can be daunting, let alone assuring consistency in the information results.
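To make the idea of institutionalized data testing concrete, here is a minimal sketch of one common approach: profile a data set before a change, re-profile it afterward, and flag any drift in the profile. The column name and the particular metrics (row count, null count, distinct count) are illustrative assumptions on my part, not a prescribed standard.

```python
# Minimal data regression test: compare column profiles before and
# after a system change, and report any metric that drifted.

def profile(rows, column):
    """Summarize one column: row count, null count, distinct values."""
    values = [r.get(column) for r in rows]
    return {
        "rows": len(values),
        "nulls": sum(1 for v in values if v is None),
        "distinct": len(set(v for v in values if v is not None)),
    }

def drift(before, after):
    """Return the metrics whose values changed between two profiles."""
    return {k: (before[k], after[k]) for k in before if before[k] != after[k]}

# Hypothetical example: the same customer table before and after an upgrade.
pre = [{"customer_id": 1}, {"customer_id": 2}, {"customer_id": None}]
post = [{"customer_id": 1}, {"customer_id": 2}, {"customer_id": 2}]

print(drift(profile(pre, "customer_id"), profile(post, "customer_id")))
# → {'nulls': (1, 0)}  (a null vanished; the profile caught the change)
```

A functional test of the upgrade could easily pass while a check like this fails, which is exactly the gap between testing functionality and testing the data asset itself.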
The other day I had a conversation with a prospective client who mentioned that the company is looking at changing its key processing system, and that one of the potential vendors had told them they would have to clean up their data before it could be migrated into the new system. Intrigued by this comment, this person did a bunch of research on data cleansing and asked me whether it made sense. After a few questions, I learned that the vendor claimed that unless the data were “clean,” the new system would not work right. Of course, my curiosity was piqued by this comment, since in my opinion, before you “clean” (or rather, in this case, transform/normalize) the data for a target system, don’t you need to know which system you are planning to migrate to? And if they had not yet selected a vendor system, how would they know what they needed to “clean”?
This got me thinking about the link between data migration and data quality. In a number of client situations, the company is considering a large investment in a new system – a new contract administration system, a new pricing system, a new sales system – requiring a significant $$$ investment. And in each of these cases, the quality of the legacy data is raised as a technical hurdle to be cleared, rather than as a key component of making the new system meet the business needs of the organization. So this has triggered a few more questions about system replacement, data migration, and data cleansing:
• What is the intent of the new system?
• What features of the old system were inadequate? How were they related to the quality of the data?
• What are the features of the new system that are expected to alleviate those shortcomings? What are the dependencies on the existing data?
• What other business processes will derive value from the data created or modified within the new system?
• What is the target model? Is metadata available at the data element level?
• Who is assessing the target system data requirements?
• What process is in place for source to target mapping?
• What process is in place for programming the transformations?
• What do you do with data instances that do not transform properly? Is there a remediation process?
• What cleansing needs to be done? Is that different from transformation?
• What processes are in place for validating source data against target model expectations?
• What is the data migration plan?
• Will both systems need to run at the same time until the new system is validated?
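Several of the questions above – source-to-target mapping, programming the transformations, handling instances that do not transform properly, and validating source data against target-model expectations – can be sketched together in a few lines of code. Everything here is hypothetical: the field names, the mapping, and the validation rules are placeholders standing in for whatever the chosen target system would actually require.

```python
# Sketch of one migration pass: map source fields to target fields,
# transform values, validate against target-model expectations, and
# route failing records to a remediation queue instead of loading them.

TARGET_MAPPING = {
    "cust_nm": "customer_name",  # source field -> target field
    "ph": "phone",
}

def transform(record):
    """Map source fields to target fields, normalizing as we go."""
    out = {}
    for src, tgt in TARGET_MAPPING.items():
        value = record.get(src)
        if tgt == "customer_name" and value:
            value = value.strip().title()
        out[tgt] = value
    return out

def validate(record):
    """Check a transformed record against target-model expectations."""
    errors = []
    if not record.get("customer_name"):
        errors.append("customer_name is required")
    if record.get("phone") and not record["phone"].isdigit():
        errors.append("phone must be numeric")
    return errors

def migrate(source_records):
    loaded, remediation = [], []
    for rec in source_records:
        candidate = transform(rec)
        errors = validate(candidate)
        if errors:
            remediation.append({"record": rec, "errors": errors})
        else:
            loaded.append(candidate)
    return loaded, remediation

loaded, queue = migrate([
    {"cust_nm": "  acme corp ", "ph": "5551234"},
    {"cust_nm": "", "ph": "555-1234"},
])
# The first record loads cleanly; the second lands in the remediation
# queue with its errors, rather than silently corrupting the target.
```

Notice that none of this can be written until the target model and its data requirements are known – which is precisely why "clean your data first, pick the system later" is backwards.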
Any thoughts on additions to the list? Please feel free to post further questions by adding a comment…