Scoping the Information Management Practice

Even if the dividing lines for data management are not always well defined in practice, it is possible to organize the different aspects of information management within a virtual stack that suggests the interfaces and dependencies across functional layers, which we will examine from the bottom up.


The Need for Operational Synchronization

Over time, organizations have employed a variety of strategies for managing data assets in accordance with the specific needs of the different business applications in operation. For the most part, applications were designed to achieve specific objectives within each business function. Correspondingly, any data necessary for the business function would be managed locally, while any data deemed critical to the organization would be subsumed into a centralized repository.

Yet this approach to centralizing data has come under scrutiny.

Managing Information Consistency and Trust During System Migrations and Data Migrations

November 29, 2012
Filed under: Data Integration, Data Quality, Metadata 

If you have been following this series of articles about data validation and testing, you will (hopefully) have concluded that there are many scenarios in which large volumes of data are moved (using a variety of methods), and that in each of these scenarios the choices made in developing a framework for data movement can introduce errors. One of our discussions (both in the article and in conversation with Informatica’s Ash Parikh) focused on data integration testing for production data sets, while another centered on verifying existing extraction/transformation/loading methods for data integration (you can also listen to that conversation).

In practice, though, both of these cases are specific instances of a more general notion of migration. There are basically two kinds of migrations: data migrations and system migrations. A data migration involves moving the data from one environment to another similar environment, while a system migration involves transitioning from one instance of an application to what is likely a completely different application.


Best Practices for Data Integration Testing Series – Instituting Good Practices for Data Testing

August 3, 2012
Filed under: Data Governance, Data Profiling, Data Quality, Metadata 

I have been asked by folks at Informatica to share some thoughts about best practices for data integration, and this is the first of a series on data testing.

It is rare, if not impossible, to develop software that is completely free of errors and bugs. Early in my career as a software engineer, I spent a significant amount of time on “bug duty”: the task of looking at the list of reported product errors and evaluating them one by one to identify the cause of each bug and then come up with a plan for correcting the program so that the application error is eliminated. Over time, the software development process has been the subject of significant scrutiny in relation to product quality assurance.

In fact, the state of software quality and testing is quite mature. Well-defined processes have been accepted as general best practices, and there are organized methods for evaluating software quality methodology capabilities and maturity. Yet when all applications are a combination of programs applied to input data to generate output information, it is curious that the testing practices for data integration and sharing remain largely ungoverned manual procedures.

Cloud Computing and Data Security

At the recent DGIQ (Data Governance and Information Quality) conference, I had the opportunity to chat with Ian Rowlands, Senior Director of Strategy at ASG, about historical trends in computing. In particular, we discussed how concepts such as centralization and distribution have come into and then out of vogue, and I pointed out that the new trend towards the “cloud” was essentially a reboot of the old concept of time sharing on a mainframe.
