Why I Built SSIS Catalog Compare, by Andy Leonard

A few weeks ago I published a blog post titled Why I Built DILM Suite, by Andy Leonard – an excerpt from the book Data Integration Life Cycle Management with SSIS. This post is another excerpt of the same book. In fact, this is the chapter that follows the chapter on Catalog Browser (which is the subject of the Why I Built DILM Suite, by Andy Leonard post).

This is a rather long blog post – one I considered breaking into parts. I pondered “chunking it up” for a while before deciding that the material as written is as granular as possible, topic-wise. And so I left it thus. Enjoy!

Viewing the SSIS Catalog contents including SSIS Catalog Projects, Packages, Environments, Environment Variables, References, and Reference Mappings – along with SSIS Package and SSIS Catalog Properties – is helpful. But what if you want to see the differences between the contents of one SSIS Catalog instance and another? Or, perhaps even more useful, how can you know the enterprise QA and Production SSIS Catalog instances match?

Why I Built SSIS Catalog Compare, by Andy Leonard

My team built a fairly complex data integration solution for a client using SSIS. We tested the solution in two DevOps tiers, Test and UAT (User Acceptance Testing). We identified some issues and corrected them. It was then time to deploy to Production. My team and I were on standby during the Production deployment and initial Production tests.

Initially, everything failed.

We identified the root causes and corrected the issues, but we had egg on our collective faces with the business – and for good reason. We had assured them “we were doing it right.” We were, with one exception. The enterprise data architect identified the gap with one very good question, “How can we know the Production and UAT SSIS Catalogs are the same?” My response at the time, “[Pregnant pause while thinking… then] I don’t know.”

I built SSIS Catalog Compare so I – and my customers, and you – can answer that question.

SSIS Catalog Compare

You can begin to think of SSIS Catalog Compare as two Catalog Browsers. Like Catalog Browser, SSIS Catalog Compare uses Catalog Base, a custom Catalog object. In SSIS Catalog Compare, Catalog Base is used to populate two trees – each representing a different SSIS Catalog – as shown in Figure 7-1 (click to enlarge):


Figure 7-1. SSIS Catalog Compare

Once two Catalogs are loaded they may be compared by clicking the Compare button. As shown in Figure 7-2, SSIS Catalog Compare uses italics node font to indicate a difference has been detected beneath a node and a different background color to indicate artifacts that are present in one SSIS Catalog but missing from the other (click to enlarge):


Figure 7-2. After a Compare Operation

The Test Catalog Folder in the VmDemo\Demo Catalog does not exist in the vmDemo\Dev Catalog, hence the background color on the vmDemo\Demo \SSISDB\Test node. The “deeper differences” indications (italics node fonts) shown in Figure 7-2 indicate differences within the Framework Catalog Folders and between the Catalog Properties.

7.1   Expanding the Differences

One time-saver is Expand Differences as shown in Figure 7-3 (click to enlarge):


Figure 7-3. Expand Differences

Expand Differences will expand nodes above differences and the nodes that are different. In large SSIS Catalog Projects, Catalog Projects with lots of configurations metadata, or both; Expand Differences can reduce the number of nodes expanded. Expanding the differences in the Framework Catalog Folder for both Catalogs, we see the Parent.dtsx SSIS Package SSISDB Connection Manager’s ConnectionString property is configured for each server’s local SQL Server instance as shown in Figure 7-4 (click to enlarge):


Figure 7-4. Difference ConnectionString Property Values

This is a really good thing because, if these values matched, Framework Application executions in one Catalog would start SSIS executions in another Catalog. That may be bad.

7.2   Catalog Properties

We didn’t discuss Catalog Properties in the section about Catalog Browser because I wanted to cover Catalog Properties here in more detail. If we collapse the Framework Catalog Folder and expand Catalog Properties, we see the difference is the Schema Build property shown in Figure 7-5:


Figure 7-5. Different Values for the SSIS Catalog Schema Build Property

Please note from the property just beneath Schema Build that both of these SSIS Catalogs are SSIS 2016. vmDemo\Demo’s Schema Build version tells us that this Catalog is the General Availability (or GA, formerly known as RTM [Release to Manufacturing]) version of SSIS 2016. vmDemo\Dev’s Catalog is an SSIS 2016 SP1 Schema Build version.

The Catalog Base object is built to allow SSIS Catalog Compare to compare SSIS Catalogs from different releases and versions of SQL Server. For example, I could compare the contents of an SSIS 2012 Catalog to those of an SSIS 2016 Catalog. SSIS 2016 added a couple properties to the Catalog Properties, so you would see a couple properties marked on the 2016 side of the compare as missing from the 2012 side. But the compare operation would succeed and produce accurate and useful results.

7.3   SSIS Catalog Compare Scripting

In many enterprises with two or more IT professionals, the professional who develops software is not permitted to deploy that software. Large enterprises have entire teams dedicated to Release Management. Lifecycle management will help enterprises of any size reduce downtime and improve code supportability and maintainability. This holds for SSIS as data integration lifecycle management is just as vital as web, GUI, and middle-tier software lifecycle management.

SSIS Catalog Compare is designed to support DevOps and enterprise DILM.

You can script individual artifacts using SSIS Catalog Compare, but perhaps the more useful (and quicker) functionality is to script a Catalog Folder and all its contents as shown in Figure 7-6:


Figure 7-6. Preparing to Generate Scripts for a Catalog Folder and Its Contents

When you click “Generate Scripts for Folder and Contents,” SSIS Catalog Compare prompts you for a file system folder in which to store the scripts as show in Figure 7-7:


Figure 7-7. Selecting a Target File System Folder for the Scripts

When you select a file system folder, SSIS Catalog Compare generates the scripts and ISPAC files to create the Catalog Folder and all its contents as shown in Figure 7-8:


Figure 7-8. Viewing the Target File System Folder for the Scripts

The scripts are numbered in an order that represents dependencies. For example, you cannot deploy an SSIS Project using the ISPAC file – number 2 – unless and until the SSIS Catalog Folder for that SSIS Catalog Project exists – script number 1.

Double-clicking the files in the order listed will create the Test Catalog Folder (script 1) on the target SSIS Catalog, deploy the DILMSample SSIS Project using the Integration Services Deployment Wizard GUI (you will need to select the Test Catalog Folder as the target Catalog Folder during deployment), create the envConnection1 Catalog Environment and its Catalog Environment Variable(s), create a Reference between the DILMSample SSIS Project and envConnection1, and create another Reference with Reference Mapping(s) between the SimplePackage.dtsx SSIS Package and the envConnection1 Catalog Environment.

Why two reference files? Inside the SSIS Catalog References for SSIS Catalog Projects and SSIS Packages are distinct artifacts.

7.3.1  Creating a Catalog Folder

The scripts (and ISPAC) are idempotent, or re-executable as shown in Figure 7-9:


Figure 7-9. Idempotent Transact-SQL Scripts

If the Test Catalog Folder does not exist, the script creates it. If the Test Catalog Folder exists, the script outputs a message informing the executor of this fact.

The output messages are written to be copied and pasted into a ticket’s Notes field before the ticket is closed. The messages provide detailed information about how and when the script was generated, by whom, when and where the script was executed, and by whom; as shown in Figure 7-10:


Figure 7-10. Script Output Messages

Refreshing the SSIS Catalog node in SSMS Object Explorer will reveal the Test Catalog Folder has been created:


Figure 7-11. Refreshing the SSIS Catalog Node

Figure 7-12 shows the results of the Refresh – the Test Catalog Folder has been created by the script.


Figure 7-12. The Test Catalog Folder Lives!

We began this section stating, “The scripts (and ISPAC) are idempotent…” What happens if we re-execute the script we just used to create the Test Catalog Folder? Let’s return to SSMS and try it. The results are shown in Figure 7-13:


Figure 7-13. Re-executing the Catalog Folder Script

Please note the message returned: “Test folder already exists.” The Test Catalog Folder was not harmed by re-executing the script. The script simply checks for the existence of the Catalog Folder and creates it if it does not exist. If the Catalog Folder exists, the script returns a message. The Transact-SQL that performs this operation is shown in Figure 7-14:


Figure 7-14. Transact-SQL that First Checks for Catalog Folder Existence

If we refresh SSIS Catalog Compare Catalog 2 we now see the Test Catalog Folder as shown in Figure 7-15:


Figure 7-15. The Test Catalog Folder Created

7.3.2  Deploying the SSIS Project

To deploy the SSIS Project, double-click (or right-click and click Open) the ISPAC file as shown in Figure 7-16:


Figure 7-16. Opening the ISPAC File

The Integration Services Deployment Wizard starts – the same wizard used to deploy SSIS Projects from SQL Server Data Tools (SSDT). Stepping through the wizard, our first stop is the Select Source page shown in Figure 7-17:


Figure 7-17. The Integration Services Deployment Wizard Select Source Page

This page surfaces a lot of functionality. The Deployment Model dropdown is used to select Project or Package Deployment. Beginning with SSIS 2016, operators have the option of deploying the entire SSIS Project – which was the only option available in SSIS 2012 and 2014 – or deploying a single SSIS Package. Operators may also select the type of SSIS Project source. There are two options available: a Project deployment file or an Integration Services Catalog. Since we started this exercise by opening an ISPAC file, the Project deployment file option is selected for us.

If we select the Integration Services Catalog option, the Select Source page presents Catalog Project settings which may be configured for an SSIS Catalog Project deployed to a different SSIS Catalog as shown in Figure 7-18:


Figure 7-18. Deploying from One SSIS Catalog to Another

The remainder of the Integration Services Deployment Wizard pages and process are the same as those covered in the Deploy to the SSIS Catalog chapter. On the Select Destination page, enter the name of the SQL Server instance that hosts the target SSIS Catalog. You can then browse to the newly-created Catalog Folder as shown in Figure 7-19:


Figure 7-19. Browsing to the Test Catalog Folder Recently Created

Step through the remainder of the Integration Services Deployment Wizard until the SSIS Project has been deployed, as shown in Figure 7-20:


Figure 7-20. A Successful Deployment

Refresh SSIS Catalog Compare to see the SSIS Catalog Project now exists as shown in Figure 7-21:


Figure 7-21. Viewing the DILMSample SSIS Catalog Project

As with the Catalog Folder script, ISPAC files are re-executable. When an ISPAC file is re-executed, a new version of the SSIS Project is deployed to the target SSIS Catalog.

7.3.3  Deploying the Literal Overrides and Catalog Environment

SSIS Catalog Literal Overrides and Environment scripts are similar because both contain values – either Literal Override values or values for Catalog Environment Variables. Value parameters reside at the top of the Transact-SQL script generated and may be modified by the operator prior to deployment to the target SSIS Catalog as shown in Figure 7-22:


Figure 7-22. Editing an SSIS Catalog Environment Script Prior to Execution

As with the SSIS Catalog Folder script, the Catalog Environment script provides rich feedback messages as shown in Figure 7-23:


Figure 7-23. Messages from the Catalog Environment Script Execution

Please note the Catalog Environment script first checks for the existence on the target Catalog Folder and provides feedback in message on its existence. If the target folder does not exist the Catalog Environment script will fail. Catalog Environment Variables are created in the Catalog Environment script.

7.3.4  Deploying Project and Package References

Project Reference scripts check for the existence of the Catalog Folder, Catalog Project, and Catalog Environment as shown in Figure 7-24:


Figure 7-24. The Project Reference Script Checks for Catalog Folder, Project, and Environment

When executed for the first time the Project Reference script returns messages similar to those shown in Figure 7-25:


Figure 7-25. Project Reference Script Feedback

If the Project Reference script is re-executed, the messages reveal the Project Reference already exists as shown in Figure 7-26:


Figure 7-26. Re-executing the Project Reference Script

The Package Reference script is most complex among the scripts generated by SSIS Catalog Compare. As shown in Figure 7-27, the Package Reference script includes checks for Catalog Folder, Project, Environment, and Reference:


Figure 7-27. Existence Checks in the Package Reference Script

The Package Reference script always clears the existing value in the Reference Mapping before setting it.

As one might imagine, there are several messages returned from execution of the Package Reference script as shown in Figure 7-28:


Figure 7-28. Message Returned from Package Reference Script Execution

7.3.5  Testing with SSIS Catalog Compare

Return to SSIS Catalog Compare and right-click the Compare button to “Refresh Both TreeViews and Compare” as shown in Figure 7-29:


Figure 7-29. Refreshing Both TreeViews and Comparing

What’s this? Didn’t we just walk through deploying all these scripted artifacts to our target SSIS Catalog? Why doesn’t Figure 7-30 show everything matching? Please remember we updated the value of the ConnectionString Catalog Environment Variable before we executed the Catalog Environment script. Figure 7-30 shows us this difference exists between the Test Catalog Folders in our SSIS Catalog instances:


Figure 7-30. Different Data Source Values

Now what? These values are, after all, supposed to be different. It’s bad if they’re the same, in fact. Fear not. Click ViewàOptions as shown in Figure 7-31:


Figure 7-31. Opening SSIS Catalog Compare Options

You can ignore Catalog Environment Variable Values by checking the checkbox shown in Figure 7-32:


Figure 7-32. SSIS Catalog Compare Options

Once Catalog Environment Variable Values are ignored, a re-compare operation shows the Test Catalog Folders in our Catalogs matches. Expanding the envConnection1 Catalog Environment to view the values of the ConnectionString Catalog Environment Variable shows the values have not changed – they’re still different – merely ignored as shown in Figure 7-33:


Figure 7-33. Catalog Environment Variable Values, Ignored

SSIS Catalog Compare may be purchased at dilmsuite.com/ssis-catalog-compare.

Not discussed in this post is CatCompare, the CLI (Command-Line Interface) for SSIS Catalog Compare. Learn more at DILM Suite.

:{>

Andy Leonard

andyleonard.blog

Christian, husband, dad, grandpa, Data Philosopher, Data Engineer, Azure Data Factory, SSIS guy, and farmer. I was cloud before cloud was cool. :{>

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.