Mapping: SCD Type-2 When Rows are Repeated

Mapping: SCD Type-2 When Rows are Repeated

Posted by: Sandeep Roy

A Mapping example that demonstrates how to implement the Slowly Changing Dimension (Type-2) when source data has multiple occurrence of the same dimension key.

Overview

Mapping: SCD Type-2 When Rows are Repeated
SCD Type 2 tracks historical data by creating multiple records for a given natural key in the dimensional tables with separate surrogate keys.The mapping example to apply SCD type-2 on source data.Suppose a company is maintaining a table with the employee location details. An example of the data is given below: EMPLOYEE TABLE
E_ID E_NAME E_DOB E_LOCATION START_DATE
100Chandler Bing26/06/1978Texas01/01/2000
100Chandler Bing26/06/1978Las Vegas01/01/2006
100Chandler Bing26/06/1978New York01/01/2009
Now the mapping will load the historical data in SCD Type-2 dimension table. So the data will look like as shown below after it has been passed through mapping.
SEQ E_ID E_NAME E_DOB E_LOCATION E_START_DATE END_DATE FLAG
101100Chandler Bing26/06/1978Texas01/01/200012/31/2005N
102100Chandler Bing26/06/1978Las Vegas01/01/200612/31/2008N
103100Chandler Bing26/06/1978New York01/01/200912/31/9999Y
Note: The End_Date with 12/31/9999 in row indicates the current tuple version. Download Contents
  • PowerCenter mapping XML
  • Oracle scripts to create a source and target.

Features

Mapping to apply SCD type-2 on source data where you have multiple records with the same natural key in the same run where we can capture all the records. You Can apply this during our initial load of dimension table.Get all our historic data in SCD-2 type format. Accurately report on historical data from first day. This code can be modified to fit any data on which we want to implement SCD -2.We just need the natural key and the way to determine the order of records like inserted date. Same code can be used to initial load as well as normal load. Although the initial load can be slowed due to the volume of the records. Optimal use of Dynamic Lookup transformation.

  • Informatica PowerCenter 9.1.0 and above.

Resources

Support

Sandeep Roy is working with Accenture as technical lead. He is a certified Informatica Developer and Administrator with more than 6 years of experience in using Informatica PowerCenter version from Informatica 6 to Informatica 9.Inquisitive by nature, always in lookout to gain knowledge in any sphere of life specially technical. Always keeps upgrading himself in Data warehousing technologies like Informatica, Oracle etc.The major goal is to be Data warehouse Architect with major focus on Informatica.If you have any questions about the block, please contact him on sandeeproy164@gmail.com