An SCD Data Warehouse is the crux to tracking changes in data. Change is ever-present in every business. From an HR perspective, employees are promoted and gain new titles, while remuneration and insurance policies are rewritten with each passing year. In Marketing, product lines are renamed, merged, or launched to great fanfare. In Sales, new clients arrive, and key accounts are transferred from one agent to another.
The question is, do your reports account for this constant evolution? Is your BI architecture built to provide both a current and historical view of your data, or do you only get periodic snapshots of each source system?
For someone building a data warehouse, the best way to deal with changing attributes is through slowly changing dimensions. Slowly changing dimensions in a data warehouse store both current and previous information over time. These tables reflect changes as they occur within selected attributes in each dimension. Depending on your requirements, they can overwrite existing entries, or create additional fields to show previous versions of a record.
Astera’s new data warehouse builder (ADWB) provides a complete data model designer that makes it easy to set up your dimension tables and assign different SCD types to selected attributes. Let’s take a look at some of the use cases our product can help you tackle.
Correcting Erroneous Records in an SCD Data Warehouse
Any source system that relies upon manual data entry is prone to human error. From names and addresses to quotes and estimates, there are various records across your source systems that may need to be fixed for accuracy. In such cases, maintaining a previous version of a record in the data warehouse would only add useless complexity to your data. The best solution is to treat entries in these fields as SCD-1 attributes, where current values overwrite the previous data.
Sales Data Mart Built Using The DWB Data Model Designer
Let’s say you’ve used DWB to engineer a dimensional model of your sales system, and you’ve noticed some incorrect customer names in the latest batch of updates. Just open up the Customer dimension in the data model designer and set the Customer Name as SCD1 – Update from the dropdown menu.
Setting SCD Type For CustomerName Attribute
Now, when corrections are made in the source database, the corresponding record will be updated in the data warehouse, as shown below. This is called type 1 dimension in a data warehouse.
Customer Dimension Table Before Correction
Customer Dimension Table with SCD1 Applied to Customer Name Field
Comparing Current vs. Past Data in a Dimensional Data Warehouse
One of the enterprise data warehouse’s key functions is supporting long-term strategic analysis, which requires changes within an attribute to be preserved and tracked over time.
Say your company recently merged several district branches and created new regions for its store outlets. If you want to track how sales performance is affected by the reorganization, you could treat the District field as a type 2 slowly changing dimension. This will ensure that changes to store districts are shown as new records within the table.
In DWB, you can apply the SCD2 – Update and Insert option for the District field in the Store dimension table. As changes are made to the source attribute, new versions of the record would be created and inserted into the dimension table, with a unique surrogate key assigned to each one. Based on the type of analysis the company wants to carry out, you can also select the Effective Date and Expiration Date options so that any changes in store regions can be tracked to a specific period.
The table below illustrates how these configurations would affect the output of your dimension table.
Existing Table
Updated Table – SCD2 (Effective Date and Expiration Date)
Tracking Changes in Status
Certain departments, such as HR, expect frequent changes to records within their systems. For example, an employee’s designation can change multiple times a year. In these cases, companies generally want to be able to maintain the history of data while still being able to quickly query it based on the most current attribute value.
In DWB, you could choose to deal with this situation by setting up the Job Title as an SCD6 field. This means that the table would use elements of SCD1, SCD2, and SCD3 to record changes in this attribute. These elements are as follows:
- A row is added to track changes in the attribute as they occur (SCD2)
- An additional column shows the current value for the attribute (SCD3)
- The current value field will be overwritten to show the updated attribute value (SCD1)
The example below shows how changes to the Job_Title field would appear if SCD6 is implemented.
Current Table
Updated Table – SCD6 (Active Flag, Effective Date, and Expiration Date)
Automate Your Enterprise SCD Data Warehouse
Astera’s metadata-driven Data Warehouse Builder can help you design, build, and deploy your data warehouse project in days. With a full range of dimensional modeling features and a robust ETL platform to fast-track data mapping, loading, and preparation, we give you a single platform with all the tools you need to make your EDW implementation a success.
Get a look at how our product can help you solve your use case today. Click here to get in touch with our technical team.