Business data is inherently susceptible to change with the passage of time and impacts the business in different ways. In data warehouses, the effect of time on our dimension records and facts requires careful study for the repository to meet the business intelligence objective of delivering up-to-date information to decision-makers.
Question is, how best to handle these changes?
Developing a dimension table or model that captures the different states of your data with respect to time is a key objective of an Enterprise Data Warehouse. For measures in our fact tables, we can use date dimensions and link them using foreign keys. For dimensions, the complexity of handling changes increases greatly. Each step of the Slowly Changing Dimension (SCD) flow must be hand-coded using multiple, complex SQL statements. The implementation is lengthy and complex, and affects the business’ ability to maintain its data quickly and reliably – which is always a critical consideration.
Slowly Changing Dimensions in Astera Centerprise
Compared to the traditional hand-coded approach to the slowly changing dimension flow, Astera offers an automated implementation using a completely drag-and-drop interface. Source system data is mapped to an SCD object in Centerprise, which pushes system-generated SQL statements directly to the target data warehouse (Read: Pushdown Optimization Mode in Centerprise) based on the field layouts defined by the user. Each column in the user’s table can be designated as Surrogate Key, Business Key, SCD1, SCD2, etc. (see below) within the component’s properties in Centerprise. The platform handles the update strategy, performance considerations, routing, and complex joins automatically on the backend, as long as the SCD Field Types are defined correctly.
Automating Type 1 & 2 Slowly Changing Dimension Implementation
Centerprise supports Slowly Changing Dimension Type 1 and Type 2 to update records with and without maintaining history.
SCD Type 1
This type deals with updates in the dimensional table, for cases when preserving history is not a consideration and you need to replace the old values in your table with recent ones.
To use Slowly Changing Dimension Type 1 in Centerprise, you can mark your column as ‘SCD1 – Update’ in the Layout Fields menu of the SCD object in Centerprise.
SCD Type 2
This type deals with changes in your dimension that need to be tracked. A new record is inserted with each change, and the existing record is marked as expired, by date, version, or status.
To use Slowly Changing Dimension Type 2 in Centerprise, mark your chosen column as ‘SCD2 – Update and Insert.’
Push-Down Optimization
Once the layout is defined and flow executed, the Astera SCD transformation generates the SQL code necessary to compare, join, route, and insert data in your target dimension and pushes the transformation logic down to a database, such as SQL Server, for processing.
Using this approach, the maintenance of large dimension attributes is significantly faster because all the processing is done by the database rather than the Centerprise server performing the operations and going back and forth between the database to read, compare, and write the data.
To learn more about the automated Slowly Changing Dimensions component in Centerprise and how to use it to manage your dimensions, download the white paper: How to Manage Slowly Changing Dimensions Using Centerprise.
Authors:
- Iqbal Ahmed