Data Quality, Data Cleansing, Data Migration: Some Questions

February 1, 2011 by
Filed under: Data Quality 

The other day I had a conversation with a prospective client who mentioned that the company is looking at replacing its key processing system and had been told by one of the potential vendors that the data had to be cleaned up before it could be migrated into the new system. Intrigued by this comment, this person did a bunch of research about data cleansing and asked me whether this made sense. After a few questions, I learned that the vendor claimed that unless the data were “clean,” the new system would not work right. Of course, my curiosity was piqued by this comment, since in my opinion, before you “clean” (or rather, in this case, transform/normalize) the data for a target system, don’t you need to know which system you are planning to migrate to? And if they had not yet selected a vendor system, how would they know what they needed to “clean”?

This got me thinking about the link between data migration and data quality. In a number of client situations, the company is considering a new system (a new contract administration system, a new pricing system, a new sales system) that requires a significant investment. In each of these cases, the quality of the legacy data is raised as a technical hurdle that must be jumped, as opposed to a key component of making the new system meet the business needs of the organization. This has triggered a few more questions about system replacement, data migration, and data cleansing:

• What is the intent of the new system?
• What features of the old system were inadequate? How were they related to the quality of the data?
• What are the features of the new system that are expected to alleviate those shortcomings? What are the dependencies on the existing data?
• What other business processes will derive value from the data created or modified within the new system?
• What is the target model? Is metadata available at the data element level?
• Who is assessing the target system data requirements?
• What process is in place for source to target mapping?
• What process is in place for programming the transformations?
• What do you do with data instances that do not transform properly? Is there a remediation process?
• What cleansing needs to be done? Is that different from transformation?
• What processes are in place for validating source data against target model expectations?
• What is the data migration plan?
• Will both systems need to run at the same time until the new system is validated?
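
To make a few of these questions concrete (source-to-target mapping, programming the transformations, validating against target model expectations, and remediation), here is a minimal sketch in Python. Everything in it is an assumption for illustration: the legacy field names (CUST_NM, ST, CONTRACT_DT), the target field names, and the rules themselves are hypothetical and do not come from any particular vendor system. The point is only that the validation rules cannot be written until the target model is known.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable


@dataclass
class FieldRule:
    """One row of a (hypothetical) source-to-target mapping."""
    source: str                        # legacy field name
    target: str                        # target model field name
    transform: Callable[[str], str]    # normalizes the legacy value
    validate: Callable[[str], bool]    # target model expectation


def to_iso_date(value: str) -> str:
    # Assumes the legacy system stored dates as MM/DD/YYYY.
    # strptime raises ValueError for anything else.
    return datetime.strptime(value.strip(), "%m/%d/%Y").date().isoformat()


# Hypothetical mapping; a real one depends on the selected target system.
RULES = [
    FieldRule("CUST_NM", "customer_name",
              lambda s: " ".join(s.split()),
              lambda v: 0 < len(v) <= 60),
    FieldRule("ST", "state_code",
              lambda s: s.strip().upper(),
              lambda v: len(v) == 2 and v.isalpha()),
    FieldRule("CONTRACT_DT", "contract_date",
              to_iso_date,
              lambda v: len(v) == 10),
]


def migrate(records, rules):
    """Split records into those that map cleanly and those needing remediation."""
    migrated, remediation = [], []
    for record in records:
        out, problems = {}, []
        for rule in rules:
            raw = record.get(rule.source)
            if raw is None:
                problems.append(f"{rule.source}: missing")
                continue
            try:
                value = rule.transform(raw)
            except ValueError as exc:
                problems.append(f"{rule.source}: {exc}")
                continue
            if rule.validate(value):
                out[rule.target] = value
            else:
                problems.append(f"{rule.source}: {value!r} fails target rule")
        if problems:
            # Keep the untouched source record so someone can review it.
            remediation.append((record, problems))
        else:
            migrated.append(out)
    return migrated, remediation


# Made-up sample records, for illustration only.
records = [
    {"CUST_NM": "  acme   corp ", "ST": "md", "CONTRACT_DT": "02/01/2011"},
    {"CUST_NM": "Widgets Inc", "ST": "Maryland", "CONTRACT_DT": "2011-02-01"},
]
migrated, remediation = migrate(records, RULES)
```

In this sketch the first record transforms cleanly, while the second lands in the remediation list with two problems (a state value that is not a two-letter code and a date in an unexpected format). Note that neither value is "dirty" in the abstract; they simply fail the expectations of this particular, assumed target model.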

Any thoughts of adding to the list? Please feel free to post additional questions by adding a comment…


2 Comments on Data Quality, Data Cleansing, Data Migration: Some Questions

    […] This post was mentioned on Twitter by murnane, Jose-Norberto Mazón and Lyndsay Wise, David Loshin. David Loshin said: Some comments and questions about #dataquality, data cleansing, and data migration at […]

  1. Dylan Jones on Tue, 1st Feb 2011 10:00 AM

    Great discussion, David, thanks for starting it.

    This is something I’ve deliberated on many times. I’ve actually created a large checklist for data migration that covers every phase of the migration, far too many questions to copy over so here is the link:

    For me, the big mistake everyone makes is to focus on data quality as dictated by their DQ products, not the strategy required by the business.

    Also, there is not enough focus on the transition strategy (i.e., not just the data itself but business services, etc.) and the window of opportunity.

    Data migration is one of the most poorly understood disciplines, even more so than data quality, hence the reason for such a detailed checklist.
