The very first data integration / data engineering framework I ever wrote was for Data Transformation Services, or DTS. The DTS framework had one job: manage connections. I don’t recall all the details, but I remember DTS included a task that allowed packages to retrieve settings from INI files. INI files are key-value files, so I simply added entries with identical keys and different values – values that matched connection strings for each lifecycle tier – and placed each version of the INI file in the same location on every server in the lifecycle.
The next framework I wrote was for SSIS. I stored metadata in tables – including connections metadata – and created a concept I called an SSIS Application. An SSIS application is, according to my definition, a “collection of SSIS packages that execute in a pre-determined order.”
The SSIS Framework File Community Edition is very similar to this first framework, except for the connections management.
Obtain the Code
To obtain the free and open-source project code for SSIS Framework File Community Edition, browse to the github repo (github.com/aleonard763/SSIS-Framework-File-Community-Edition), click the Code dropdown, and then click “Download ZIP”:
Setup the SSISFrameworkDB Database
After decompressing the zip file:
- Use Windows Explorer to navigate to the SSMS solution file named SSISFrameworkDB.ssmssln. The file may be found in the <unzip-directory>\SSIS-Framework-Community-Edition-File\SSIS-Framework-Community-Edition-File\SSISFrameworkDB directory.
- Right-click the SSISFrameworkDB.ssmssln file.
- Click Open:
The solution should open in SQL Server Management Studio (SSMS). To view solution details, click View>Solution Explorer to open Solutions Explorer:
When Solution Explorer opens, double-click the Query named 0-Build.sql to open:
Update the path to the T-SQL script files. Note the script files are the <unzip-directory>\SSIS-Framework-Community-Edition-File\SSIS-Framework-Community-Edition-File\SSISFrameworkDB directory\SSISFrameworkDB directory:
Click Query>SQLCMD Mode to turn on SQLCMD Mode:
When SQLCMD Mode is enabled, the background color for some lines of code in the 0-Build.sql script changes to gray to indicate these lines contain SQLCMD instructions:
SQLCMD includes variables, and this script uses a variable named “ScriptPath” to manage the path to the additional T-SQL scripts in the SSMS database project (“E:\github\SSIS-Framework-Community-Edition-File\SSISFrameworkDB\SSISFrameworkDB\” in this example).
The trailing backslash for the ScriptPath variable value is vital.
When 0-Build.sql is executed, the value of the ScriptPath SQLCMD variable is used as the first part of the “run” (“:r”) SQLCMD command (“:r $(ScriptPath)1-Create_SSISFrameworkDB.database.sql” for example).
Execute the 0-Build.sql script and confirm the Messages tab appears similar to:
If you forget to turn on SQLCMD Mode, the 0-Build.sql script fails and returns messages similar to:
0-Build.sql is designed to be re-executable (idempotent), so if you re-execute the script, the script should succeed, and the messages should appear as shown here:
Confirm the SSISFrameworkDB database is setup by refreshing SSMS Object Explorer:
The SSISFrameworkDB database has been deployed. The next step is to add SSIS Framework application and package metadata.
Adding SSIS Framework Application and Package Metadata
One compelling reason to use an SSIS Framework is encapsulation. Once metadata is configured, a collection of SSIS packages may be executed using a single command. While there are other ways to accomplish encapsulation, I believe the combination of software design best practices (data engineering development is software development) – particularly separation of concerns (decoupling) by building small, unit-of-work-sized SSIS packages, and driving execution with metadata stored in a database is an elegant approach to addressing data engineering complexity.
SSIS Application, Defined
An SSIS application is a construct I coined for a collection of SSIS packages that execute in a prescribed order. Now, that order may be parallel in some cases and serial in others. One key concept is the execution order is prescribed.
SSIS Application Packages
The core of the SSIS framework is an SSIS Application Package. An SSIS application is a collection of SSIS packages, so the application-to-package cardinality appears one-to-many. But that’s not entirely accurate.
Consider the case of a utility SSIS package – such as ArchiveFile.dtsx, an SSIS package I’ve been dragging around for over 15 years. ArchiveFile.dtsx is a utility SSIS package that, well, archives data files. Depending on the version, ArchiveFile.dtsx contains either variables or parameters that inform the package of the source location and name of the data file, and the archive location. There are more parameter / variables, but you get the idea.
Because ArchiveFile.dtsx may be used in many SSIS applications, the package-to-application cardinality (and by extension, the application-to-package cardinality) is actually many-to-many. Hence the need for a bridge or resolver table to maintain third-normal form: ApplicationPackages.
The ApplicationPackages table maps an SSIS package into an SSIS application. Application packages are the workers of the SSIS framework.
Included in the repo are two test SSIS projects named FileFrameworkTest1 and FileFrameworkTest2:
Also included in the repo are three files
The script named 90-Add_metadata.sql is similar to 0-Build.sql in that both scripts use SQLCMD Mode to call “child” scripts:
As with 0-Build.sql, edit the ScriptPath SQLCMD variable value.
In addition, open the script named 92-Add_packages.metadata.sql and edit the T-SQL parameter named @PackageFolder:
Remember, the backslash at the end of the ScriptPath variable and @PackageFolder parameter are vital.
Save your edits and then execute the script named 90-Add_metadata.sql (remembering to turn on SQLCMD Mode!). The messages tab should reflect the initial insert for SSIS application and package metadata:
Like 0-Build.sql, the 90-Add_metadata.sql script is re-executable. The message reflects the metadata already exists:
The SSMS data project includes a script named 93-Select_applicationpackages.metadata.sql to select the metadata for the test SSIS application:
Open and execute the 93-Select_applicationpackages.metadata.sql script and view the results:
Review the metadata returned from the query carefully. This is your metadata. The PackageFolder values will be specific to your environment.
The next step is to test the framework.
Construct the Command
When I test SSIS Framework File Community Edition, I prefer to build the command line in a text editor. I’m using Notepad++. Feel free to use Notepad or any text editor you prefer. I caution against more advanced applications like Microsoft Word because Word changes the characters used for double-quotes, etc.
There are three parts to the command line:
- The DtExec.exe command
- The Parent.dtsx SSIS package
- SSIS Package Execution Settings
I installed SQL Server on the E: drive of my virtual machine, so the path to DtExec.exe on my VM is E:\Program Files\Microsoft SQL Server\140\DTS\Binn\DTExec.exe, as shown here:
I want to use DtExec.exe to execute the Parent.dtsx SSIS package from the file system – located in the repo in the SSIS solution found in at <unzip-directory>\SSIS-Framework-Community-Edition-File\SSIS Framework Community Edition File\SSIS Framework Community Edition File\ folder:
I need to inform the DtExec.exe engine that I am sending it a file path, which is accomplished using the /FILE switch just before supplying the full path to the Parent.dtsx SSIS package.
SSIS Package Execution Settings
Here’s an image of the three parts in Notepad++, with the three parts labeled:
The settings part reads:
/SET “\Package.Variables[ApplicationName].Properties[Value]”;”File Framework Test” /REP EWI
There are two switches, “/SET” and “/REP”.
It’s best to read the /SET switch (mostly) from right to left. /SET overrides the value property of the SSIS package variable named “ApplicationName”, setting it to “File Framework Test”.
The /REP switch informs DtExec’s reporting to the console (command prompt) window. The values “EWI” represent Error, Warning, and Information, respectively. This setting limits the amount of information sent to the console during execution.
To limit the console messages to a minimum, configure the output for Errors only (E) or None (N).
The full command line reads:
“E:\Program Files\Microsoft SQL Server\140\DTS\Binn\DTExec.exe” /FILE “E:\github\SSIS-Framework-Community-Edition-File\SSIS Framework Community Edition File\SSIS Framework Community Edition File\Parent.dtsx” /SET “\Package.Variables[ApplicationName].Properties[Value]”;”File Framework Test” /REP EWI
One Observation Before We Continue…
The SSIS package named Child2.dtsx in the FileFrameworkTest2 SSIS project is designed to fail when initially executed. The purpose of this design is to demonstrate metadata-configurable Fault Tolerance included in SSIS Framework File Community Edition.
You do not want to include carriage-returns and line feeds (CRLF) characters when you copy the command line for execution. You want to copy one long command line to the clipboard so that you can properly paste it into the command prompt window:
Copy the one-long-line command to the clipboard.
Open a command prompt window and paste the command:
Press the Enter key to execute the command. If all goes as planned, your results should appear similar to those seen here:
If you click the image above, it will open in a new window, which will allow you to read most of the messages. the most important message is found near the bottom:
The execution of the SSIS framework application “File Framework Test” failed.
If you examine the output in detail, you will note only two packages executed – Child1.dtsx and Child2.dtsx – and that Child2.dtsx failed.
Child2 Is Designed to Fail
By default, execution of the SSIS framework application “File Framework Test” is designed to fail.
Why does execution of the SSIS framework application “File Framework Test” fail?
Open the SSIS solution named “SSIS Framework Community Edition File” included in the repo:
When the SSIS solution opens, expand the FileFrameworkTest2 SSIS project in Solution Explorer, and double-click the Child2.dtsx SSIS package to open the visual editor:
The Script Task named “SCR Succeed Package?” reads the value of the SSIS package variable named “SucceedPackage,” which is defaulted False:
By design and by default, Child2.dtsx is configured to fail execution each and every time.
Testing SSIS Framework File Community Edition Fault Tolerance
One way to test fault tolerance in SSIS Framework File Community Edition is to expand SSISFrameworkDB tables in SSMS Object Explorer, right-click cfg.ApplicationPackages, and then click “Edit Top 200 Rows”:
When the editor displays, edit the FailApplicationOnPackageFailure bit value for the Child2.dtsx package. If you’ve followed these instructions carefully, the ApplicationPackageID (and PackageID) for the Child2.dtsx package should be 2, but you can discover the ApplicationPackageID of Child2 by executing the query shared earlier, the one stored in the script file named “93-Select_applicationpackages.metadata.sql”.
Change the FailApplicationOnPackageFailure bit value from True to False for the Child2.dtsx application package:
I prefer to close the data editing tab after the edit is complete.
Next, re-execute the test. The important result appears at the bottom of the messages returned to the console:
In this section, we find an example of implementing fault tolerance using SSIS Framework File Community Edition. Individual child packages may be configured to fail without stopping the SSIS application execution, allowing for the remaining packages in the application to execute.
SSIS Framework File Community Edition is somewhat complex to configure but, once configured, simplifies data engineering with SSIS.
I believe the artifacts included in the github repo provide a solid foundation from which to work – especially the metadata management scripts.
As always, I welcome your feedback and comments.