Data Conversion in SSIS

Data conversion transformation or derived column transformation? It depends and you've also got other options.

When it comes to converting data from one type to another in SSIS, you have a few choices. We are going to look at a few of them and discuss the differences. With all of these, we are assuming that the input data type is compatible with the new output data type -- in other words, we are not converting the word "one" to the number 1, which would require fancier logic.

So first up, there is the appropriately named Data Conversion tranformation. This task will take one or more input columns and create new columns with the desired data type. It is simple to use: just select your input column or columns, name your output column and chose the new data type (see Fig. 1).

Data Conversion Transformation
Figure 1. Data Conversion Transformation is easy as clicking on what you need. (Click image to view larger version.)

You can also use the Derived Column transformation to convert data. This transformation is also pretty simple, you select an input column and then chose whether to replace this column or create a new output column. Then you need to apply an expression in order to come up with the new data for the output column. This expression can be a cast statement (see Fig. 2).

Derived Column transformation
Figure 2. Derived Column transformation allows a bit more flexibility in columns you can choose. (Click image to view larger version.)

As you can see, we have created a new column with a new data type, functionally the same as the Data Conversion transformation. Once thing to note, you can also choose to replace a column but when you do this, SSIS wants the new column to maintain the original data type. As shown in Fig. 3, even though we have applied to DT_I8 cast, the data is still a four-byte signed integer. This is just something to be aware of.

Still 4 bytes
Figure 3. Thing don't always appear to be as they look. (Click image to view larger version.)

So why use one over the other? Well the Data Conversion transformation will just take an input, convert the type and provide a new output column. If you use the Derived Column transformation you get to apply an expression to the data, which allows you to do more complex conversions or perform additional manipulations on the data. In fact, using IF...THEN logic in an expression, you could convert the word "one" to the number 1 with a Derived Column transformation.

By no means, are these the only two methods for conversion, but they are the two most common transformations used to convert data. You can also build custom script transformations or even use CAST or CONVERT statements in your source T-SQL if you so want. In the end, the method you chose will depend on your specific requirements, but it is good practice to understand all the options before making a call on which one is the best.

About the Author

Eric Johnson, SQL Server MVP, is the owner of Consortio Services in Colorado Springs providing IT systems management and technology consulting. He is also the President of the Colorado Springs SQL Server User Group. He can be contacted at www.consortioservices.com.

Featured

comments powered by Disqus

Subscribe on YouTube