What is now generally referred to as “data integration” is a set of disciplines that evolved from the methods used to populate the data systems powering business intelligence: extracting data from one or more operational systems, transferring it to a staging area for cleansing, consolidation, transformation, and reorganization, and then loading it into the target data warehouse. This process is usually referred to as ETL: extraction, transformation, and loading.
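To make the three stages concrete, here is a minimal sketch of an ETL run, assuming a hypothetical `orders` table in the operational system and a hypothetical `fact_orders` table in the warehouse, with SQLite standing in for both. The table names, columns, and cleansing rules are illustrative assumptions, not taken from any particular tool.

```python
import sqlite3

def make_demo_source():
    """Build a hypothetical operational system with messy order data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, name TEXT, amount TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            (1, "  ada lovelace ", "19.99"),
            (2, "GRACE HOPPER", "5"),
            (2, "GRACE HOPPER", "5"),    # duplicate record
            (3, "alan turing", None),    # missing amount
        ],
    )
    return conn

def extract(source):
    """Extraction: pull raw rows out of the operational system."""
    return source.execute("SELECT id, name, amount FROM orders").fetchall()

def transform(rows):
    """Transformation: cleanse and consolidate rows in the staging area."""
    staged = {}
    for order_id, name, amount in rows:
        if amount is None:
            continue  # cleansing: drop rows with no amount
        # Consolidation: keying by id collapses duplicate records.
        staged[order_id] = (order_id, name.strip().title(), round(float(amount), 2))
    return list(staged.values())

def load(warehouse, rows):
    """Loading: write the prepared rows into the target table."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    warehouse.executemany("INSERT OR REPLACE INTO fact_orders VALUES (?, ?, ?)", rows)
    warehouse.commit()

def run_etl(source, warehouse):
    """Run extraction, transformation, and loading in sequence."""
    load(warehouse, transform(extract(source)))
```

Calling `run_etl(make_demo_source(), sqlite3.connect(":memory:"))` leaves two cleaned rows in `fact_orders`. Every rule in this sketch is tied to one source and one target, so each new source would need its own version of these functions.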
In the early days of data warehousing, the ETL scripts were, as one might politely say, “hand-crafted.” Put more plainly, each script was custom-coded for its originating source, the transformation tasks to be applied, and the subsequent consolidation, integration, and loading. And despite the evolution of rule-driven and metadata-driven ETL tools that automate the development of ETL scripts, much time has been spent writing (and rewriting) data integration scripts to extract data from different sources, apply transformations, and then load the results into a target data warehouse or an analytical appliance.