Hi Folks, I recently published a new technical paper on using auxiliary processors on IBM System z class machines to support virtualization of mainframe data, which lets you bypass extracting the data before using it for reporting and analysis. You can access the paper, which was sponsored by Rocket Software, via this link. Please email me or post comments to let me know what you think!
A few years ago I was configuring a test to compare data transformation and loading across a variety of target platforms. Essentially, I was hoping to assess the comparative performance of different data management schemes: open source relational databases, enterprise relational databases, columnar data stores, and other NoSQL-style schemes. To do this, I had two constraints to overcome. The first was the need for a data set massive enough to really push the envelope when evaluating different aspects of performance. The second was a little subtler: I needed the data set to exhibit the kinds of data errors and inconsistencies that simulate a real-life scenario.
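One way to approach that second constraint is to generate clean synthetic records and then deliberately inject realistic defects: missing values, malformed fields, inconsistent casing, and duplicate entries. Here is a minimal Python sketch along those lines; the record fields, error types, and error rates are my own illustrative assumptions, not details from the actual test I was configuring:

```python
import random
import string

random.seed(42)  # reproducible runs

def make_record(i):
    """Generate one clean synthetic customer record."""
    name = "".join(random.choices(string.ascii_uppercase, k=8))
    return {"id": i, "name": name, "zip": f"{random.randint(10000, 99999)}"}

def inject_errors(record, error_rate=0.1):
    """Randomly corrupt a record to simulate real-world data problems:
    missing values, truncated ZIP codes, and inconsistent casing."""
    r = dict(record)
    if random.random() < error_rate:
        r["name"] = None                       # missing value
    if random.random() < error_rate:
        r["zip"] = r["zip"][:3]                # truncated / malformed ZIP
    if random.random() < error_rate:
        r["name"] = (r["name"] or "").lower()  # inconsistent casing
    return r

def generate(n, error_rate=0.1):
    """Build n dirty records, then append duplicates of a small slice."""
    records = [inject_errors(make_record(i), error_rate) for i in range(n)]
    dupes = random.sample(records, max(1, n // 50))  # ~2% duplicate rows
    return records + dupes

data = generate(1000)
print(len(data))  # 1020: 1000 records plus 20 injected duplicates
```

Scaling `n` up (and tuning the error rate per field) gives a data set that stresses both load performance and the cleansing logic of each target platform.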
Come on, seriously?
Poor data quality and Hurricane Irene
Just to prove to a content aggregator that I am serious about social media, I am putting this code:
inside a blog post.
I frequently monitor the price of my books on Amazon and noticed that this afternoon the Practitioner's Guide to Data Quality Improvement was selling for $33.45, a 44% discount. If you have been waiting to buy the book, now is a good time; this is the lowest price I have seen so far.
Here are some aspects I have tried to cover in the book:
- Building a business case for instituting a data quality program;
- Assessing levels of data quality maturity;
- Guidelines and techniques for evaluating data quality and identifying metrics tied to the achievement of business objectives;
- Techniques for measuring, reporting, and taking action based on these metrics; and
- Policies and processes for exploiting data quality tools and technologies for data quality improvement.
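To make the measuring-and-reporting idea concrete, here is a minimal Python sketch of two common data quality metrics, completeness and validity, scored against a service-level threshold. The field names, validity rule, and threshold are illustrative assumptions on my part, not examples taken from the book:

```python
def completeness(records, field):
    """Fraction of records where the field is present and non-empty."""
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)

def validity(records, field, predicate):
    """Fraction of records whose field value satisfies a validity rule."""
    ok = sum(1 for r in records if predicate(r.get(field)))
    return ok / len(records)

# Hypothetical sample data with one malformed and one missing ZIP
records = [
    {"id": 1, "zip": "02139"},
    {"id": 2, "zip": "0213"},   # malformed: only four digits
    {"id": 3, "zip": None},     # missing value
]

zip_complete = completeness(records, "zip")
zip_valid = validity(
    records, "zip",
    lambda z: isinstance(z, str) and len(z) == 5 and z.isdigit(),
)

# Compare each score against an agreed threshold and flag for action
THRESHOLD = 0.95  # assumed service-level target
for name, score in [("zip completeness", zip_complete),
                    ("zip validity", zip_valid)]:
    status = "OK" if score >= THRESHOLD else "ACTION NEEDED"
    print(f"{name}: {score:.2%} [{status}]")
```

Tracking scores like these over time, and tying each rule to a business objective, is what turns raw measurement into the reporting and remediation loop described above.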