SSIS Design Pattern: Use Script Tasks for ETL Instrumentation

I consider scripting in SQL Server Integration Services (SSIS) my data integration “get out of jail free” card. Why? If I cannot find a way to accomplish some requirement using SSIS’s built-in objects – tasks, containers, Data Flow components, etc. – I can usually write .Net code to fulfill the requirement.

As stated in the SSIS Academy short course titled Troubleshooting SSIS, I almost always add to the Control Flow an SSIS Script Task that logs the initial state of parameters and variables when package execution starts. I find this information invaluable when troubleshooting failed SSIS executions.

Configuring a Script Task for ETL Instrumentation

Begin by adding an SSIS Script Task to the Control Flow:

Open the editor.
Click inside the ReadOnlyVariables property value textbox. An ellipsis is available:

Click the ellipsis to open the Select Variables window. Select the System::TaskName and System::PackageName variables:

Click the OK button to close the Select Variables window. Note the two variables we selected now appear in the ReadOnlyVariables value textbox:

Select the variables and copy them to the clipboard:

.Net Scripting

Click the Edit Script button to open the Visual Studio Tools for Applications (VSTA) .Net code editor. Scroll to public void Main() and select the text “TODO: Add your code here”:

(click to enlarge)

Paste the contents of the clipboard:

I do this because mistyping variable names results in an error that’s not very intuitive. Also, .Net design-time validation has no way to test variable names for accuracy. It’s better to copy and paste than risk an error, in my opinion.

Declaring a .Net Variable

Begin a .Net variable declaration by typing “string “. Select the System::PackageName variable name from the comment and copy it to the clipboard:

Paste the clipboard contents after “string ” as shown here:

The .Net string variable declaration appears as shown:

Now, System::PackageName is not a valid .Net variable name.
System::PackageName is a valid SSIS variable name where “System” is the namespace and “PackageName” is the variable name.

Edit the .Net variable so it is named “packageName” as shown:

Initializing a .Net Variable

Our goal is to read the value of the SSIS variable System::PackageName into the .Net string variable named packageName. Assigning an initial value to a variable is known as initializing the variable.

To begin initializing the .Net string variable named packageName, add the “=” sign after the variable name (packageName). In C# (and similar languages), a single equal sign is an assignment operator.

Type “Dts.” – as shown next – and note IntelliSense presents a list of available methods and objects. Scroll to the object named “Variables” as shown:

You can append “Variables” to “Dts.” by double-clicking Variables or by pressing the Tab key when Variables is selected (as shown above). Add a beginning-bracket – [ – followed by a double-quote – “. Note the complementary closing bracket – ] – and double-quote – ” – are added automagically. Plus the cursor is in perfect position to paste the contents of clipboard (again):

Move to the end of this line of .Net code and type “.v”. IntelliSense kicks in – although my screen-capture software makes the context menu opaque – and the Value property is selected as shown:

This is important.

Next type “.t”. Notice the Value property is completed by IntelliSense – again, automagically – and IntelliSense also displays the only option available that begins with the letter “t” – ToString:

Complete the statement by typing “();”:

Rinse, Repeat

Use the same .Net variable declaration and initialization to declare and initialize the taskName .Net string variable:

Next, declare a .Net string variable named subComponent and use the packageName and taskName .Net string variables to initialize the value of subComponent:

Exercise your .Net variable declaration and initialization skills even more by declaring two more variables:

  1. An int variable named informationCode and initialized to 1001
  2. A bool variable named fireAgain initialized to true:

Finally, declare a .Net string variable named description and initialize it with the string

“I am ” + packageName

ETL Instrumentation via Information Event Message

Configure and raise an Information event using the following .Net statement:

Dts.Events.FireInformation(informationCode, subComponent, description, “”, 0, ref fireAgain);

Click the screenshot below to enlarge the image. Note the Dts.Events.FireInformation method takes six arguments:

  1. informationCode [int]
  2. subComponent [string]
  3. description [string]
  4. helpFile [string]
  5. helpContext [int]
  6. fireAgain [bool]
(click to enlarge)

We use the .Net variables informationCode, subComponent, description, and fireAgain to supply four of the arguments. We supply literal values – “” and 0, respectively – to the helpFile and helpContext arguments.

When complete, our Main() method contains the following .Net code:

public void Main()
{
// System::TaskName,System::PackageName
string packageName = Dts.Variables[“System::PackageName”].Value.ToString();
string taskName = Dts.Variables[“System::TaskName”].Value.ToString();
string subComponent = packageName + “.” + taskName;
int informationCode = 1001;
bool fireAgain = true;

string description = “I am ” + packageName;
Dts.Events.FireInformation(informationCode, subComponent, description, “”, 0, ref fireAgain);

Dts.TaskResult = (int)ScriptResults.Success;
}

Test It!

Close the VstaProjects code editor by closing the window, then close the Script Task Editor by clicking the OK button.

Execute the SSIS package in the SQL Server Data Tools (SSDT) debugger by pressing the F5 key. If all goes as planned, you should see a successful debug execution:

Click the Progress tab to view the Information message:

(click to enlarge)

Conclusion

ETL Instrumentation is an important part of data integration development with SSIS. It increases development time (slightly) to add ETL Instrumentation such as this but it’s worth it. When troubleshooting an SSIS package at 2:00 AM some months after deployment, ETL Instrumentation will save time.

The additional time spent during development is the opposite of technical debt, it’s a technical investment.

Learn more!

Get live, online training – from me! Check out the Enterprise Data & Analytics Training page.

Check out Introduction to Troubleshooting SSIS – a free sample of SSIS Self-Help at SSIS Academy!

Join me For Expert SSIS Training!

I’m honored to announce Expert SSIS – a course from Enterprise Data & Analytics!

The next delivery is 01-02 Apr 2019, 9:00 AM – 4:30 PM ET.

Data integration is the foundation of data science, business intelligence, and enterprise data warehousing. This instructor-led training class is specifically designed for SQL Server Integration Services (SSIS) professionals responsible for developing, deploying, and managing data integration at enterprise-scale.

You will learn to improve data integration with SSIS by:

  • Building faster data integration.
  • Making data integration execution more manageable.
  • Building data integration faster.

Agenda

  1. SSIS Design Patterns for Performance – how to build SSIS packages that execute and load data faster by tuning SSIS data flows and implementing performance patterns.
  2. SSIS Deployment, Configuration, Execution, and Monitoring – the “Ops” part of DevOps with SSIS using out-of-the-box tools and the latest utilities.
  3. Automation – how to use Business Intelligence Markup Language (Biml) to improve SSIS quality and reduce development time.

I hope to see you there!

PS – Want to Learn More About Azure Data Factory?

Follow Andy Leonard’s SSIS Training page for more information.

My Learning Resources for SSIS

People often ask me, “Andy, where can I learn SSIS or learn more about it?” Well, you’re at one such location – my blog! I’ve been blogging about SSIS for about 10 years as I type this. There’s a lot posts here about SSIS – that’s the good news. The bad news is: There’s a lot of posts here about SSIS.

Some other places you can learn about SSIS from me:

I hope to see you at one of these events. If you read this blog, come up and introduce yourself. I’m the fat guy with a braided beard!

:{>

ETL Instrumentation: Logging SSIS Variable Values

During the December 2017 delivery of Expert SSIS, I was asked if there is an SSIS Catalog Logging Mode that will display the value of variables. I responded that I wasn’t aware of a logging level that accomplishes this, but then – as sometimes happens – I could not let it go. It is an excellent question. So I dug in…

Answer #1: YesBut

Here’s one way to do it: Open the Variables window in SSIS. Click the Grid Options button on the Variables window toolbar:

When the Variable Grid Options dialog displays, check the “Raise event when variable value changes” checkbox:

Click the OK button to return to the Variables window and note the new property: “Raise Change Event.” Set Raise Change Event to True:

Once deployed, the value of the variable will be logged in the SSIS Catalog at startup and each time the variable value changes:

Please note: the VariableValueChanged event is captured using the SSIS Catalog Basic Logging level.

Why isn’t this a resounding “Yes! Yes, you can log all variables values in the SSIS Catalog!”? Well, because this requires developers to open existing packages, open Variables, check the checkbox to even be able to see the property, and then change the property value – for every. single. variable.

Answer #2: Use a Script Task

You can add a Script Task to your SSIS package – similar to the Script Task shown below:

Open the Script Task Editor and click the ellipsis in the ReadOnlyVariables value textbox to open the Select Variables dialog. I like to select the System::PackageName and System::TaskName variables each time I use a Script Task for logging purposes (and I use Script Tasks for logging a lot). Then select the variable(s) for which you wish to log values:

When I design Production Data Integration solutions with SSIS, I log the values of parameters and variables at the start of each execution, and each variable used inside a Loop Container or Foreach Loop Container. Why? I don’t want to guess or dig or rely on defaults; I want the package to tell me what those values were when it executed. I consider this a best practice.

I add C# similar to that shown here to perform variable value logging:

Once deployed and executed, the Information Event raised in the Script Task publishes a message consumed by SSIS Catalog Basic Logging (the default SSIS Catalog logging level) as shown below:

I know many SSIS developers are uncomfortable using Script Tasks (and Script Components). While I understand the discomfort, I urge SSIS developers to learn enough .Net (C# or VB) to become comfortable using Script Tasks and Components in SSIS. In my opinion, .Net easily doubles your data integration capabilities with SSIS.

:{>

Learn more:
Expert SSIS Training (Live, Online)
IESSIS1: Immersion Event on Learning SQL Server Integration Services (In-person)