SSIS 2016 CTP 3 and Data Flow Column LineageID

Back When We Used to Carve Our Own Chips Out of Wood…

Back in the old days (2005), SSIS Data Flow LineageIDs were created at design-time. Here’s a screenshot of an SSIS 2005 package’s XML:

SSIS2016Lineage03

When an error occurred, the ID property of the column in error was supplied to the ErrorColumn column field of the Error Output, as shown here:

SSIS2016Lineage04

Although it wasn’t simple, it was possible to use the value of the ID supplied in the ErrorColumn value to identify the offending column in a Data Flow Task. It was a manual process that involved:

· Opening the package’s XML

· Finding the column ID value shown in the ErrorColumn field of the Error Output

· Reading the LineageID value for that same column

· Tracing the LineageID back to its original assignment in the Data Flow Task:

SSIS2016Lineage05

The LineageID “49” maps to the “name” column in the image at the top of this post. The “name” column raised an error because someone (me) made the destination column way too small to hold any name values.

Back in the day, some folks came up with some pretty clever ways to automate identifying the name of the column causing SSIS Data Flow errors.

In Later Versions of SSIS…

The LineageID property of Data Flow columns changed in later versions of SSIS. In SSIS 2012, the LineageID property is there, but the value looks more like an SSIS package path than its SSIS 2005 counterpart.:

SSIS2016Lineag1a

The LineageID attribute in SSIS 2016 CTP 3.3, however, appears similarly:

SSIS2016Lineage_LineageID

A New Beginning…

SSIS 2016 CTP3.3 offers a solution. First, there are now two new columns in the SSIS Data Flow Component Error Output – ErrorCode – Description and ErrorColumn – Description:

SSIS2016Lineage06a

The new columns provide extremely useful (plain language) error metadata that will, in my opinion, greatly reduce the amount of time required to identify data-related load failures in the Data Flow Task.

But that’s not all. If you configure a log to capture the DiagnosticEx event, you will receive a message that provides the Data Flow column ID. To have a look, add a new log to an SSIS package that contains a configured Data Flow Task. On the Details tab, select the DiagnosticEx event:

SSIS2016LineageDiagnosticEx

When the package runs, a DiagnosticEx event will record XML describing the DTS:PipelineColumnMap. Viewing the XML in either a text file or SQL Server is no fun, but if you copy the XML and paste it into an XML file in Visual Studio (SSDT), you can format it nicely, as shown below:

SSIS2016Lineage07a

It’s possible to call a new method (GetIdentificationStringByID<) on the ComponentMetadata class in a Script Component, passing it the LineageID of a Data Flow column, and get the IdentificationString of the column. You can learn more about the GetIdentificationStringByID method – and everything I’ve written in this post – by reading Bo Fan’s (excellent) blog post on the subject here.

:{>

Andy Leonard

andyleonard.blog

Christian, husband, dad, grandpa, Data Philosopher, Data Engineer, Azure Data Factory, SSIS guy, and farmer. I was cloud before cloud was cool. :{>

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.