SSIS: Conditional Split

Getting to know a transformation that allows you to split the data in more ways than one.

SSIS has a handy little transformation called the Conditional Split. This transformation allows you to examine data and then, based on values in your data, split your data and send it down different paths in your data flow. This can be helpful in several ways and I'll cover some examples in this column.

You could load a non-relational file into a relational database using the Conditional Split to separate the parent rows from the child rows. You could use it to weed out invalid data, sending it down an error-handling path. Maybe your model needs data from different regions loaded into different tables, which would be another great use of the Conditional Split. Basically, anytime you have one data set where some data needs to be handled differently from other data, think Conditional Split. Let's look at the actual transformation. Take the data flow in Fig. 1 as an example.

Data flow for a Conditional Split
Figure 1. Data flow for a Conditional Split. (Click image to view larger version.)

The data flow loads sales figures from a flat file and it also has to calculate goals for each region it is loading. The trouble is, each region has different targets as to whether they met their goal. Naturally, we use a conditional split to separate the data by region, calculate each goal with a derived column, and bring the data back together with a union all.

Now, check out Fig. 2 for a closer at the conditional split itself.

Conditional Split Transformation Editor
Figure 2. Conditional Split Transformation Editor. (Click image to view larger version.)

Here, the conditional split is pretty simple: You use an expression to define an output. In this case, our expression looks at the TerritoryID column and, depending on the value, assigns the data to the correct output. There is even a default output that we called Unknown Region. If none of the other outputs are a match, data will end up in the default output. This means that all the data that comes into the conditional split will end up coming out as well. It is also worth noting that your outputs are order dependent, meaning if your data matches the expression for more than one output, the first one it hits in order will be where it goes. Let's look at the next example (see Fig. 3) for more of an explanation as to what's happening.

Conditional Split results
Figure 3. The resulting scores from a Conditional Split. (Click image to view larger version.)

Based on the criteria in our Conditional Split, you'll notice that we have split data based on a sales score. There are four output expressions, equal to 100, 90 through 99, 80 through 89, and below 80. Data that is equal to 100 is also greater than 80 and great than 90, but since our equal to 100 expression is at the top, the data will hit that output first. If it doesn't equal 100, it will be evaluated by the next expression which is >= 90 and so on. Because of the way data is matched on the first and only the first expression, you don't have to define ranges such as ">=90 and <= 99".

Conditional Splits are pretty easy and powerful. The next time you need to separate data in your SSIS package, give Conditional Splits a shot.

About the Author

Eric Johnson, SQL Server MVP, is the owner of Consortio Services in Colorado Springs providing IT systems management and technology consulting. He is also the President of the Colorado Springs SQL Server User Group. He can be contacted at www.consortioservices.com.