Creating a Complex Dataflow in Centerprise – Part 1
Part 1 –Join Transformations and Functions
Our last post (Creating an Integration Flow in Centerprise) described how to create a simple dataflow in Centerprise. In this two-part blog we will show you how to build a more complex dataflow including maps, transformations, data quality rules, and data profiling.
The figure below shows a more complex dataflow.
In this example we are working with two source files, one contains information about home loans and the other contains information about the property tax for the corresponding home loans. We need to combine these two pieces of data and do some conversions by running some calculations on attributes. In the end we want to route the data to two different destination tables, depending on the origin of the home loan: if it is from California it goes to the California Loans table, otherwise it goes to the Out-of-State Loans table. Alongside this, we need to check the data quality for the loan data and again for the tax data. We also need to profile the tax data so that it can be sent to an Excel file and output as a report.
In order to design the dataflow shown above, we begin by clicking on the New Dataflow button to create a new dataflow. First we look at the data—both loan data and tax data. In the previous blog, Creating Simple Dataflows, we learned how to create our source simply by dragging and dropping from the toolbox onto the designer and specifying properties. However, there is also a shortcut to create sources directly. Simply drag and drop the Loans and Tax Excel files directly from the Explorer window to the designer.
Centerprise does the rest. It has created the source, knows where the file comes from, and has done the layout. When you click on the chevron you can see all the data columns from the source file.
Click on preview and you can see all your data in the preview window.
Now do the same thing with the Tax file. When you preview your tax data, you can see the property tax information for each of the loans.
Next we want to combine the two sources. To do this we use the Join transformation. Drag and drop the Join transformation onto the designer.
When you click on the chevron, you can see that the transformation doesn’t have any elements.
We want to take all the elements from both the Loans and the Tax sources and combine them in the Join transformation. Drag and drop the Loans top node into the Join window. You can see that Centerprise has automatically created and mapped all the fields.
To add the two Tax fields to the join, drag and drop each field to the Join window and Centerprise automatically adds the fields and maps them.
Note that since there are now two LoanId fields, Centerprise has appended the one from the Tax source to LoanID_1.
Now we have all the fields required for the join. If we right click on the Join window and select Properties, we can see all the fields from both Loans and Taxes.
Click on the blue arrow at the top left of the window to go to the next page, where we will specify what kind of join we want. Choose a simple inner join, then in the Sort Left and Sort Right inputs specify the key that will be used for the join. For Loans it is the LoanID and for Taxes it is the LoanID_1.
Click OK and our join is ready. When we preview the data we can see that for each of the loans the property tax and loan information is joined.
So with a few clicks we have joined our two sources.
The next step is to use our join as a source for our transformation and maps. Drag and drop the Expression Map from the toolbox onto the designer.
This is used to do calculations and any kind of combining of data. In this example we see that the Loans information has the Borrower Name, State, and Zip Code. We want to combine these three fields into one field and call it “Address” in our destination. Since we are going to be routing to two different destinations, our natural next step is to add a router.
Drag and drop a router from the toolbox onto the designer. The router becomes the next destination.
Next, drag and drop the three fields we want to combine (Borrower Name, State, and Zip Code) from our Join window to the expression window.
Then open the expression properties window, click on the blue arrow next button and we are presented with the rules writer, which allows us to write any kind of rule. You can see the functions drop down menu has a large selection of functions that can be used for writing rules such as logical, conversion, date/time, name and address parsing, math, etc.
In this example we have a very simple concatenation so we will write the rule starting with Name, then a comma, then State, then a space, then the Zip Code, which is an integer. Since we are doing a concatenation of the strings we will use a conversion function to convert the Zip Code from an integer to string.
Click on OK and our value is ready for output. We take this value and drag and drop it to our destination. You can see the value is now in the destination.
At this point we can do a preview and see how our data is really going to work. You can see that the Name, State, and Zip Code have been combined the way we wanted: Name, comma, State, space, Zip Code. This is how you can write simple rules and simple calculations for data conversion.
Next we want to create a function. We start by dragging and dropping a function from the toolbox onto the designer.
We have the Name field in our join, but our destination uses FirstName and LastName fields, so we need to take the Name field and split it into FirstName and LastName. For that we will use the Name Parsing function. Click on the function properties and choose Name and Address Parsing from the drop-down menu. Then select the Parse Name function and click OK.
When you expand the function, you can see that a list of possible name related field options is available.
Drag and drop the name field from the Join window to the left side of the function to create the input, which then we have the options on the right side for the output. Drag and drop FirstName and LastName fields from the function window to the destination.
When you preview, you can see that Centerprise has taken the names from the transformation and split them into first name and last name.
This is how you can use functions and expressions. Part 2 of this blog coming next week will explain how to route the data we have transformed to multiple destinations.