Good Data from Data Profiling Task

Get some real useful information from SQL Server Integration Services' Data Profiling Task feature.

SQL Server 2008 Integration Services ships with the wonderful new Data Profiling Task. Much as the name implies, it profiles the data in a given table and returns a ton of useful information. The task gathers statistics for candidate keys, column length, NULL data ratio, data patterns, and much more. All of the results are written out to XML. The task makes more sense when you look at it, so let’s jump in.

The configuration of the task is pretty simple, you select the types of profiles that you want to run and then set the specific options for that profile. In Fig. 1, I selected Column Length Distribution for all columns (*) in the HumanResources.Employee table. This profile will gather length statistics on the columns, such as Minimum and Maximum Length, and provide a distribution of all of the lengths found. On the general page of the task, you also need to provide a destination, which can be a file connection or a variable.

Once you set up the task with all the data profiles you want to run, you simply execute the SSIS package and let it work its magic. If you wrote to a variable, you’re on your own to do something with the data within the SSIS package. On the other hand, if you wrote to a file, the SQL Server 2008 Client Tools come complete with the Data Profile Viewer, which can be found at in the program menu under Microsoft SQL Server 2008 | Integration Services. This is a pretty simple application that allows you to browse the XML file created by the Data Profiling Task.

Running Data Profiling Task to gather column length data
Figure 1. Use the Data Profiling Task to do things like gather data on tables you'll need to manage that you may not have seen before. Here's a sample showing how to get column length data for a table called HumanResources.Employee. (Click image to view larger version.)


Quiesced snapshots in VMware Tools
Figure 2. Running the Data Profile Task as I specified in Fig. 1 will bring up these results in XML format. (Click image to view larger version.)

Fig. 2 shows the results for the Column Length Distribution profile I ran earlier. There's quite a bit of information returned, and this is just one of the profiles. If you take some time and dig into all the profile types, you'll find that this task provides a large amount of information. If you’re the creative type, you can run this task and write the data to a variable, which can be further queried by your SSIS package.

With a little work, the Data Profiling Task tool is one that can give you lots of information the next time you're presented with a new database that you know nothing about.

About the Author

Eric Johnson, SQL Server MVP, is the owner of Consortio Services in Colorado Springs providing IT systems management and technology consulting. He is also the President of the Colorado Springs SQL Server User Group. He can be contacted at


  • OneDrive Users To Get Storage Options, Plus New Personal Vault

    Microsoft announced a few OneDrive enhancements, including storage-option additions, plus a new "Personal Vault" feature for added security assurance.

  • Cloud Services Starting To Overtake On-Prem Database Management Systems

    Database management system (DBMS) growth is happening more on the cloud services side than on the traditional "on-premises" side, according to a report by Gartner Inc.

  • How To Replace an Aging Domain Controller

    If the hardware behind your domain controllers has become outdated, here's a step-by-step guide to performing a hardware refresh.

  • Azure Backup for SQL Server 2008 Available at Preview Stage

    Microsoft added the option of using the Azure Backup service to provide recovery support for SQL Server 2008 and SQL Server 2008 R2 when those workloads are hosted on Azure virtual machines.

comments powered by Disqus

Office 365 Watch

Sign up for our newsletter.

Terms and Privacy Policy consent

I agree to this site's Privacy Policy.