A few days ago, I (along with some smart people) were asked on Twitter how to manage the size of the SSIS Catalog. Since I’d heard this question a couple times recently, I decided it would probably make a good topic for a blog post. Welcome to the blog post!
The gentleman who asked the question was kind enough to supply detailed metrics:
This distribution of event and operation messages, context, and statistics is not uncommon in busy enterprise SSIS Catalogs.
One Solution: Manage the SSIS Catalog File Size
One way to manage SSIS Catalog file size is to limit the number Retention Period days. The default for this property is 365. I warned enterprises to change this default in a post titled Updated: SSIS Catalog Logging and Reporting in the Enterprise. If you truly need a year’s worth of log data in your enterprise, I recommend you change this to some other number – like 369.
Many enterprises do not need to keep SSIS Catalog logs around for that amount of time, though. Let’s say you want to change this value to 15 days. Begin by right-clicking the SSIS Catalog node – named SSISDB – in SSMS Object Explorer’s Integration Services Catalogs node, and then clicking Properties:
As I mentioned, the default value for the Retention Period (says) property is 365:
WAIT: Walk It Down
This is where things can get tricksy. You may think the next step is “set the value to 15.” You could but – at this volume of data things will get… messy… overnight. I can hear you thinking, “Andy, why will things get… messy… overnight?”
I’m glad you asked.
The SSIS Catalog database – also named SSISDB – uses cascading DELETEs to enforce referential integrity. In his post titled Adventures In Foreign Keys 3: Why Cascading Deletes Perform Slowly, Erik Darling (Blog, About) aka, “Nice post, Brent!” blogged at the Brent Ozar (@BrentO | Blog | DataDriven interview) website. In the post, Erik explains some complications that arise out of cascading DELETEs.
The short answer applied to the SSIS Catalog? Your enterprise SSIS performance will likely be impacted if you make the move from retaining 365 days to fewer days of log data.
How To Walk Retention Period Down
Begin by changing Retention Period from 365 to 330, as shown above. Click the OK button to save the setting and then execute the following T-SQL query:
exec SSISDB.internal.cleanup_server_retention_window
I can hear some of you thinking, “Why execute this stored procedure, Andy?” Once again, I am glad you asked.
Step with me into Andy’s Time Machine while we venture back to when you or someone else created your enterprise SSIS Catalog…
In the Beginning…
If, back when you or someone else created the SSIS Catalog on this instance, you / they checked the “Enable automatic execution of Integration Services stored procedure at SQL Server startup” checkbox, this query is executed as part of a nightly SQL Agent job named SSIS Server Maintenance Job:
I recommend checking this checkbox when creating the SSIS Catalog. The stored procedures called by this job manage a lot of bad things that can happen when the SQL Server service ends unexpectedly – especially if the service stops while an SSIS package is running.
The SQL Agent job named SSIS Server Maintenance Job contains two steps – one for managing the logs and another for managing project versions:
If we dive into the job step named SSIS Server Operation Records Maintenance, we see a check for permission to execute the stored procedure followed by a call to the stored procedure itself:
One of the operations performed by the SSISDB.internal.cleanup_server_retention_window stored procedure is deleting log data that is older than the SSIS Catalog Retention Period property specifies.
BUT… this job is scheduled (by default – you can change it) to execute each night at midnight local time:
What else might be running around midnight? On your enterprise SQL Server? In Production?? Only you can answer that question. If SSIS packages are executing and the SSIS Catalog is attempting to store log messages at this time, SSIS performance can be impacted.
Back to the Future, er, Present…
Executing the SSISDB.internal.cleanup_server_retention_window stored procedure applies your updated setting to the SSIS Catalog. Once the procedure completes, you should find less records in the SSIS Catalog log tables.
Continue walking down the Retention Period setting and executing the SSISDB.internal.cleanup_server_retention_window stored procedure until you reach your desired goal of retained days or SSIS Catalog log record count:
Please keep in mind: the records are gone forever… unless you have backed up the SSISDB database.
You should treat SSISDB like every other production enterprise database!
A Couple Other Options
Another option is to take advantage of two free offerings by Tim Mitchell and Your Truly:
Tim’s (excellent – and free) script circumvents the default SSIS Catalog’s cascading deletes behavior by deleting records older than the configured SSIS Catalog Retention Period setting, starting with the tables at the bottom of the foreign key hierarchy.
SSIS Catalog Archive creates a separate database to hold the same data Tim’s Clean-up script deletes and includes an SSIS solution to read from the SSIS Catalog and load the archive database. The archive database enforces referential integrity that is identical to the RI found in the SSISDB database’s log-related tables.
Conclusion
Resetting the SSIS Catalog’s Retention Period (days) property is one way to manage SSISDB size.
At Enterprise Data & Analytics, we grok the SSIS Catalog.
We’re the authors of DILM Suite – a collection of utilities and applications (most are free and some are open source!) designed to make your life easier as you manage your enterprise data integration lifecycle – including SSIS Catalog Compare and SSIS Frameworks. Heck, we even wrote a book about it!
At Enterprise Data & Analytics, we are here to help.™
Contact us today!
Comments