The SSIS Catalog, first introduced with the release of SQL Server 2012 Integration Services (SSIS 2012), is a big step forward in Data Integration Lifecycle Management (DILM) with SQL Server Integration Services (SSIS). Like all software, the SSIS Catalog performs some tasks admirably and other tasks… not so admirably. In this post, I will share some thoughts on SSIS Catalog logging.
My favorite SSIS Catalog feature is the integrated logging. It Just Works. I configure a default Logging Level when I create an SSIS Catalog (or later) and I don’t have to think about logging unless or until I want to change the Logging Level for some reason. And I can make that change on a per-package-execution basis, even. SSIS Catalog logging is fast and requires little or no work on my part to maintain. Other logging defaults configured at the time of the Catalog creation (or later) include the Retention Period and Clean Logs Periodically properties:
I can run my data integration enterprise with these defaults. And I sometimes do.
The SSIS Catalog will log some events in an order that is different from the order in which the events occurred in real time. In my experience, it’s possible to see events that occur within a few milliseconds logged out of order. I’ve explained this to a few auditors and, to date, none have disqualified a system because of this behavior.
The way the SSIS Catalog cleans these records (in 2012 and 2014, at least) can be troublesome. Short version: the maintenance process executes daily (by default) and relies on cascading deletes that span a few levels in the logging hierarchy. I won’t bore you with the remainder of the technical details. I will just tell you that if you have lots of execution and event metadata stored in the SSIS Catalog log tables for a particular day, the log cleaning process can take a long time and consume lots of server resources.
This is potentially Ugly, but it is at least Bad. In order for someone to view the SSIS Catalog reports that ship with the solution (in SSIS 2012 and 2014), the following must be true:
The user must have SQL Server Management Studio (SSMS) installed.
The user must be a member of the ssis_admin or sysadmin roles (…in 2012 and 2014. Starting in 2016 there’s a new role named “ssis_logreader” that does what it says.).
The reports are excellent:
I have no issue with Database Administrators (DBAs), Database Developers, or Data Integration Developers using SSMS and the SSIS Catalog reports. I worry some about Operations people and project stakeholders needing SSMS and elevated permissions just so they can access operational or historical enterprise data integration information. Am I saying I do not want Operations people and project stakeholders to see the data contained in the SSIS Catalog reports? No, I most certainly am not. My concern is that someone will accidentally delete a mission-critical SSIS package from a Production SSIS Catalog. (Note: DBAs, Database Developers, and Data Integration Developers are just as at risk of accidentally deleting something as Operations people or project stakeholders.)
You see, if you are a member of either the ssis_admin or sysadmin role, you have permission to delete SSIS packages. You also have permission to do a whole lot of other things that, frankly, you shouldn’t need to do.
The defaults for SSIS Catalog logging properties are shown in the screenshot above. The default logging level is Basic. I like that setting. I also like cleaning the logs periodically.
I cringe a little at the default retention period. Why? Experience. I’ve been an enterprise data integration architect for years. Doing that job will teach you to think about stuff normal people don’t consider. Here’s why I cringe at a 365-day default retention period: What happens 1 Jan? Most enterprises run mission-critical, regulation-required, end-of-year reports on or around 1 Jan. These processes may only execute once per year (but they’re tested each time a change is deployed, right? Don’t get me started…). They can be big and complex processes and they usually execute across a larger domain of data than nightly, weekly, monthly, or quarterly processing.
What does that mean for the SSIS Catalog logs? More messages!
Wait, it gets worse.
Where are your support people on or around 1 Jan? Sitting at their desks watching these processes run to completion? Gosh, I hope not. Let’s say you manage to make it through that first SSIS-Catalog-managed year-end processing cycle without calling in too many people. Good. Fast-forward one year – or 365 days, as the Logging Retention Period property of the SSIS Catalog likes to call it. The default logging retention period also kicks in on or around 1 Jan (you think you have a 25% chance of hitting a leap year, but Murphy is way better at statistics than you or me…), right about the time your extra long data integration processing is trying to run. That smell coming from your server? That’s smoke.
I can hear you thinking, “Great. Now what, Andy?” I’m glad you asked! Ask yourself, “Self, do we really need a year of historical enterprise data integration logging?”
If the answer is “Yes, we need at least a year of log data,” then I recommend you increase the Retention Period (days) property to some number greater than 365 – maybe 380, provided fifteen days will get your enterprise beyond the year-end process. The goal is to have the log maintenance from last year’s year-end processing occur sometime after this year’s year-end processing.
If the answer is “No, we don’t need a year a log data,” I recommend you set the Retention Period (days) property to the number of days you need to retain.
If you’re reading this and you have an SSIS Catalog in your enterprise, then you also have SQL Server Reporting Services (SSRS). You can write your own reports to execute against the logs and data integration metadata stored in the SSISDB database. SSRS reports can be deployed to run in a web browser. That makes them read-only from a user perspective.
To be fair, the SSIS Catalog reports built into SSMS are also read-only. But SSMS is anything but read-only. Good DBAs will read this and (rightfully) think, “Well, enterprise security should define who has permissions to read and write and make changes to data in SSMS.” I concur, and that’s almost correct. The SSIS Catalog node in the SSMS Object Browser is different. Permissions checks are actually hard-coded into some of the SSIS Catalog Views, even. In order to view the reports, they must be a member of either ssis_admin or sysadmin.
To also be fair, the user still must be a member of either ssis_admin or sysadmin to view data via SSIS Catalog Views from an SSRS solution deployed so that it runs in a web browser. Two thoughts:
It’s a web browser and the reports are (hopefully) only executing SELECT statements.
If you design your own reporting solution in SSRS, you can define how you access the data. You do not need to use the SSIS Catalog-supplied views with their hard-coded role checks, you can write your own views or stored procedures.
I and others have built custom SSRS reporting solutions that read the same SSIS Catalog execution logs and metadata. At Enterprise Data & Analytics, we are here to help™. Check out the Catalog Reports (free and open source!) project at DILM Suite.
The SSIS Catalog contains a great set of Data Integration Lifecycle Management tools, reports, and functionality. The logging and reporting is very good, but there are things of which you should be aware.
I am here to help:
DevOps and Data Integration with SSIS (at IT/Dev Connections in Las Vegas, NV 15 Sep 2015)
Designing an SSIS Framework (at IT/Dev Connections in Las Vegas, NV 16 Sep 2015)
Using Biml to Automate SSIS Design Patterns (at the PASS Summit 2015 in Seattle, WA 27-30 Oct 2015)
Hacking the SSIS Catalog (at the PASS Summit 2015 in Seattle, WA 27-30 Oct 2015)