Your shopping cart is empty!
One of the main functions of an Extract, Transform, and Load (ETL) tool is to transform data. The transformation step is the most vital stage of building a structured data warehouse. With this post, I’d like to help you get a better understanding of the major transformation types in ETL. Here they are:
Today I will tell you about the following ETL transformation types: format revision, decoding of fields, calculated and derived values, splitting of single fields, and merging of information.
Format revision. Fields can contain numeric and text data types. If they do, you need to standardize and change the data type to text to provide values that could be correctly perceived by the users. The length of fields can also be different and you can standardize it.
Decoding of fields. In multiple source systems, the same data items are described by a variety of field values. Also, many legacy systems are notorious for using cryptic codes to represent business values. This ETL transformation type changes codes into values that make sense to the end-users.
Calculated and derived values. Sometimes you have to calculate the total cost and the profit margin before data can be stored in the data warehouse, which is an example of the calculated value. You may also want to store customer’s age separately—that would be an example of the derived value.
Splitting single fields. The first name, middle name, and last name, as well as some other values, were stored as a large text in a single field in the earlier legacy systems. You need to store individual components of names and addresses in separate fields in your data repository to improve the operating performance by indexing and analyzing individual components.
Merging of information. This type of data transformation in ETL does not literally mean the merging of several fields to create a single field. In this case, merging of information stands for establishing the connection between different fields, such as product price, description, package types, and viewing these fields as a single entity.
The transform stage applies to a series of rules or functions to the extracted data from the source to derive the data for loading into the end target. Some data sources will require very little or even no manipulation of data. In other cases, one or more of the following transformation types may be required to meet the business and technical needs of the target database: