Overview
Data warehouses were originally built to be updated on a yearly, monthly, or weekly basis. In today’s business environment, enterprises can no longer wait for data to make its way to back-end data stores – they need to capture data directly from core transactional systems as it is being gathered. Change Data Capture (CDC) is a set of software design patterns used to determine and track data that has changed so that action can be taken using the changed data without delay.
Astera’s data management suite of products supports a variety of change data capture strategies, both batch and real-time, enabling a business to select an update strategy that optimizes the overarching data integration processes. This is especially important when data needs to be copied from production databases to an analytics data warehouse without disrupting the regular flow of data, a disruption that occurs when users are forced to wait for batch runs. Change data capture (CDC) streamlines modern analytics by leveraging event-driven data and making data integration more agile to deliver increased operational efficiency.
How Change Data Capture Works
Change Data Capture technology allows users to select the fields that need to be audited, and then capture database inserts, updates, and deletions automatically to make a record available of what changed, where, and when, in simple relational tables. These “change” tables contain the metadata necessary to understand the changes in the right context, ultimately driving better business decisions.
Identify
Astera’s CDC technology lets the user choose from multiple options for identifying changes: database triggers, timestamps, and log tables. These change data capture approaches can also be combined into a hybrid read strategy, configured visually by defining business rules that tell the system which changes to identify and how.
Integration flows can be built using CDC strategies to “listen” to changes for subsequent propagation. CDC strategy can also be selected while configuring data warehouse load settings in the Astera platform.
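As a rough illustration of the timestamp option, the sketch below polls a table for rows modified since the last run. It is not Astera's implementation: SQLite stands in for the source database, and the `customers` table, the `updated_at` column, and the `fetch_changes_since` helper are hypothetical names chosen for the example.

```python
import sqlite3

def fetch_changes_since(conn, last_seen):
    """Return rows modified after the last_seen watermark, plus the new watermark."""
    rows = conn.execute(
        "SELECT id, name, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    # Advance the watermark to the newest timestamp seen, or keep the old one.
    new_watermark = rows[-1][2] if rows else last_seen
    return rows, new_watermark

# Hypothetical source table with a last-modified timestamp column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ada", 100), (2, "Grace", 105)],
)

# First run picks up every row; later runs pick up only newer modifications.
changes, watermark = fetch_changes_since(conn, 0)
conn.execute("UPDATE customers SET name = 'Ada L.', updated_at = 110 WHERE id = 1")
later_changes, later_watermark = fetch_changes_since(conn, watermark)
```

A timestamp column on its own cannot reveal deleted rows, which is one reason trigger-based and log-based options exist alongside it.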
Capture
Database INSERTs, UPDATEs, and DELETEs applied to tracked SQL Server tables are captured by CDC, which auto-creates change tables that record what changed, where, and when. These change tables contain columns that reflect the column structure of the source table, along with the metadata needed to understand the changes that have been made.
CDC creates a mirror of the tracked table, with additional columns for metadata, and uses it to monitor changes.
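The idea of a change table that mirrors the tracked table and adds metadata columns can be sketched with database triggers. This is a minimal illustration only, assuming SQLite in place of SQL Server; the `orders` and `orders_changes` tables, their columns, and the trigger names are hypothetical and do not reflect the objects Astera or SQL Server actually create.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);

-- Change table: mirrors the source columns and adds metadata columns.
CREATE TABLE orders_changes (
    change_id  INTEGER PRIMARY KEY AUTOINCREMENT,
    operation  TEXT NOT NULL,                           -- what kind of change
    changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP, -- when it happened
    id         INTEGER,                                 -- mirrored source column
    status     TEXT                                     -- mirrored source column
);

CREATE TRIGGER orders_ins AFTER INSERT ON orders BEGIN
    INSERT INTO orders_changes (operation, id, status)
    VALUES ('INSERT', NEW.id, NEW.status);
END;
CREATE TRIGGER orders_upd AFTER UPDATE ON orders BEGIN
    INSERT INTO orders_changes (operation, id, status)
    VALUES ('UPDATE', NEW.id, NEW.status);
END;
CREATE TRIGGER orders_del AFTER DELETE ON orders BEGIN
    INSERT INTO orders_changes (operation, id, status)
    VALUES ('DELETE', OLD.id, OLD.status);
END;
""")

# Each statement against the tracked table leaves one row in the change table.
conn.execute("INSERT INTO orders VALUES (1, 'new')")
conn.execute("UPDATE orders SET status = 'shipped' WHERE id = 1")
conn.execute("DELETE FROM orders WHERE id = 1")

change_log = conn.execute(
    "SELECT operation, id, status FROM orders_changes ORDER BY change_id"
).fetchall()
```

Here the DELETE row records the last known values of the deleted record, so a downstream consumer can still tell which row disappeared.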
Deliver
Once the changes are identified and recorded, Astera’s powerful, parallel-processing engine uses Extract, Transform, and Load (ETL) processes automatically on the backend to load changed data from the SQL Server source tables to the data warehouse or data mart, either per transaction or in aggregates.
Because CDC captures changes made at a data source and applies them throughout the enterprise, it deals only with changed data and so minimizes the resources required for ETL processes.
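To make the resource argument concrete, the sketch below applies a short list of change records to a target instead of reloading the whole table. It is a simplified illustration, not Astera's engine: the target is a plain dictionary keyed by primary key, and the record layout (`operation`, `id`, `row`) and the `apply_changes` function are assumptions made for the example.

```python
def apply_changes(target, changes):
    """Apply ordered change records to a target keyed by primary key."""
    for change in changes:
        key = change["id"]
        if change["operation"] == "DELETE":
            # Remove the row if present; ignore deletes for unknown keys.
            target.pop(key, None)
        else:
            # INSERT and UPDATE are both treated as upserts.
            target[key] = change["row"]
    return target

# Hypothetical warehouse state before the load.
warehouse = {1: {"status": "new"}, 2: {"status": "new"}}

# Only three records travel to the target, however large the source table is.
changes = [
    {"operation": "UPDATE", "id": 1, "row": {"status": "shipped"}},
    {"operation": "DELETE", "id": 2, "row": None},
    {"operation": "INSERT", "id": 3, "row": {"status": "new"}},
]
apply_changes(warehouse, changes)
```

The work done is proportional to the number of changes rather than the size of the source table, which is the saving described above.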
Recognize Business Events as They Occur and Capture Automatically
Change Data Capture ensures data synchronicity and facilitates real-time data integration using Astera’s industrial-strength ETL engine with advanced automation capabilities. It enables seamless propagation of new database entries to associated applications such as shipping and invoicing systems.
Key features of DWAccelerator include:
Meet Modern Requirements of Real-time Data Integration
The evolution of data-driven applications requires an increasingly agile, modern approach to data integration. Astera’s CDC technology captures data directly from core transactional systems as it is being gathered, therefore enabling real-time data integration.
Create CDC Configurations Quickly and Easily
With the drag-and-drop, code-free environment of our solutions, configuring the right CDC strategy is quick and easy. Users only need to select ‘Incremental Load’ as the desired option for the required entities and define other settings to incrementally load new data from a source to a data warehouse.
Minimize the Resources Required for ETL Processes
Batch-oriented, bulk-load Extract-Transform-Load processes can disrupt production and consume massive processing power. Use CDC instead to capture only the changed data in source systems and deliver the changes across the enterprise.
Get Up-to-Date Data Faster to Make Better Business Decisions
Enterprises can no longer wait for data to arrive from backend stores since the value of business decisions is based primarily on their timeliness. To ensure this, Astera offers your employees the most current, complete, and accurate version of information with its Change Data Capture (CDC) technology.
Choice of Batch, Near Real-time, and Real-time CDC Strategies
Depending on business requirements and environment, configure CDC easily to load changed data in batches, in defined increments, or deliver the stream in real-time. The technology is flexible enough for all your business strategies.
Build Integration Flows that “Listen” to Database Changes
Astera’s visual builder provides users a simple drag-and-drop interface to build reusable dataflows that can be adapted to fit any data integration requirement, including the capability to “listen” to changes in database fields and update target systems.
Set Up CDC Using Automated Database Triggers
Astera’s CDC automatically applies database triggers and other change identification objects like timestamps, version history, and current status indicators on source systems to watch for changes.
Scale Up Your CDC Configuration to Accommodate Massive Data Volumes
Leverage the advanced processing power of Astera’s high-performance engine that takes full advantage of multicore and multiprocessor hardware to scale up your CDC configurations for massive data volume handling.
Choose Whether to Extract All Updates or a Select Few
The CDC feature can be configured to capture any combination of changes, whether you want to use it to load the source database in full or just set it up to watch a select few fields for changes and update them continually in the data warehouse.
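Watching only a select few fields can be pictured as comparing the before and after images of a row on just those columns. The sketch below is an assumption-laden illustration rather than the product's mechanism; the `changed_fields` helper and the sample column names are hypothetical.

```python
def changed_fields(before, after, watched):
    """Return {column: (old, new)} for watched columns whose values differ."""
    return {
        column: (before.get(column), after.get(column))
        for column in watched
        if before.get(column) != after.get(column)
    }

# Hypothetical before and after images of one source row.
before = {"id": 1, "email": "a@example.com", "last_login": "mon"}
after = {"id": 1, "email": "b@example.com", "last_login": "tue"}

# Only the watched column is reported, even though last_login also changed.
email_changes = changed_fields(before, after, ["email"])
# A watched column that did not change yields nothing to propagate.
id_changes = changed_fields(before, after, ["id"])
```

Rows for which the result is empty can be skipped entirely, so updates to unwatched fields never reach the data warehouse.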
Capture Metadata Automatically to Understand Context of Changes
In addition to watching for and capturing data changes, CDC captures changes to the database structure made through Data Definition Language (DDL) statements and records this metadata in a separate audit table column.