HowTo: Install SSIS Framework Community Edition

SSIS Framework Community Edition is available from DILM Suite (Data Integration Lifecycle Management Suite). SSIS Framework CE is not only free, it is open source.

To install SSIS Framework Community Edition, visit DILM Suite and click SSIS Framework Community Edition:

When the SSIS Framework Community Edition page opens, click the first link:

This link will take you to GitHub where you may download the code:

Unzip the file and open the PDF titled SSIS Framework Community Edition Documentation and User Guide:

The documentation includes instructions for creating an SSIS Catalog on-premises – or an instance of Azure-SSIS in Azure Data Factory:

The Documentation and User Guide next walks you through the process of upgrading the SSIS packages to the latest version of SSIS, followed by instructions for deploying the Framework project:

The next step is to execute T-SQL scripts that configure the project you just deployed and also create the SSIS Framework objects:

The remainder of the document walks you through testing and customizing the SSIS Framework for your own needs. A customizing primer is included, in which I demonstrate how to extend Framework functionality.

Learn More

Click here to learn about enterprise SSIS Framework editions.
Click here to learn about SSIS Framework Browser, which is free.

The Recording for Loading Medical Data with SSIS is Available

Kent Bradshaw and I had a blast delivering Loading Medical Data with SSIS earlier today! If you missed the webinar, the recording is available along with, perhaps more importantly, coupon codes to save on upcoming Enterprise Data & Analytics Training.

Enjoy the video!

We demonstrated a handful of (free!) DILM Suite (Data Integration Lifecycle Management) utilities:

Join us next week for another free webinar: Enterprise SSIS Execution!

Free Webinar – Enterprise SSIS Execution

Join Kent Bradshaw and me as we present (another) free Enterprise Data & Analytics webinar Tuesday, 23 Apr 2019 at 12:00 PM EDT: Enterprise SSIS Execution.

Abstract

SQL Server Integration Services (SSIS) is a powerful enterprise data integration tool that ships free with Microsoft SQL Server. Join Andy Leonard – Microsoft Data Platform MVP, author, blogger, and Chief Data Engineer at Enterprise Data & Analytics – and Kent Bradshaw – Database Administrator, Developer, and Data Scientist at Enterprise Data & Analytics – as they demonstrate several ways to execute enterprise SSIS.

Join this webinar and learn how to execute SSIS from:

  • SSDT (SQL Server Data Tools)
  • the Command Prompt
  • the SSIS Catalog
  • a metadata-driven SSIS Framework

Register today!

:{>

Regarding SSIS Frameworks, Part 1 – Execution

Why Use a Framework?

The first answer is, “Maybe you don’t need or want to use a framework.” If your enterprise data integration consists of a small number of SSIS packages, a framework could be an extra layer of hassle (metadata management) that, frankly, you can live without. We will unpack this in a minute…

“You Don’t Need a Framework”

I know some really smart SSIS developers and consultants for whom this is the answer.
I know some less-experienced SSIS developers and consultants for whom this is the answer.
Some smart and less-experienced SSIS developers and consultants may change their minds once they gain at-scale experience and encounter some of the problems a framework solves.
Some will not.

If that last paragraph rubbed you the wrong way, I ask you to read the next one before closing this post:

One thing to consider: If you work with other data integration platforms – such as DataStage or Informatica – you will note these platforms include framework functionality built-in. Did the developers of these platforms include a bunch of unnecessary overhead in their products? No. They built in framework functionality because framework functionality is a solution for common data integration issues encountered at enterprise scale.

If your data integration consultant tells you that you do not need a framework, one of two things is true:
1. They are correct, you do not need a framework; or
2. They have not yet encountered the problems a framework solves, issues that only arise when one architects a data integration solution at scale.

– Andy, circa 2018

Data Integration Framework: Defined

A data integration framework manages three things:

  1. Execution
  2. Configuration
  3. Logging

This post focuses on…

Execution

If you read the paragraph above and thought, “I don’t need a framework for SSIS. I have a small number of SSIS packages in my enterprise,” I promised we would unpack that thought. You may have a small number of packages because you built one or more monolith(s). A monolith is one large package containing all the logic required to perform a data integration operation – such as staging from sources.

The monolith shown above is from a (free!) webinar Kent Bradshaw and I deliver 17 Apr 2019. It’s called Loading Medical Data with SSIS. We refactor this monolith into four smaller packages – one for each Sequence Container – and add a (Batch) Controller package to execute them in order. I can hear some of you thinking…

“Why Refactor, Andy?”

I’m glad you asked! Despite the fact that its name contains the name of a popular relational database engine (SQL Server), SQL Server Integration Services is a software development platform. If you search for software development best practices, you will find something called Separation of Concerns near the top of everyone’s list.

One component of separation of concerns is decoupling chunks of code into smaller modules of encapsulated functionality. Applied to SSIS, this means Monoliths must die:

A slide from the Expert SSIS training…

If your SSIS package has a dozen Data Flow Tasks and one fails, you have to dig through the logs – a little, not a lot; but it’s at 2:00 AM – to figure out what failed and why. You can cut down the “what failed” part by building SSIS packages that contain a single Data Flow Task per package.

If you took that advice, you are now the proud owner of a bunch of SSIS packages. How do you manage execution?

Solutions

There are a number of solutions. You could:

  1. Daisy-chain package execution by using an Execute Package Task at the end of each SSIS package Control Flow that starts the next SSIS package.
  2. Create a Batch Controller SSIS package that uses Execute Package Tasks to execute each package in the desired order and degree of parallelism.
  3. Delegate execution management to a scheduling solution (SQL Agent, etc.).
  4. Use an SSIS Framework.
  5. Some combination of the above.
  6. None of the above (there are other options…).

Daisy-Chain

Daisy-chaining package execution has some benefits:

  • Easy to interject a new SSIS package into the workflow: simply add the new package and update the preceding package’s Execute Package Task.

Daisy-chaining package execution has some drawbacks:

  • Adding a new package to daisy-chained solutions almost always requires deployment of two SSIS packages – the package before the new SSIS package (with a reconfigured Execute Package Task – or an update to the ) along with the new SSIS package. The exception is a new first package. A new last package would also require the “old last package” be updated.

Batch Controller

Using a Batch Controller package has some benefits:

  • Relatively easy to interject a new SSIS package into the workflow. As with daisy-chain, add the new package and modify the Controller package by adding a new Execute Package Task to kick off the new package when desired.

Batch-controlling package execution has some drawbacks:

  • Adding a new package to a batch-controlled solution always requires deployment of two SSIS packages – the new SSIS package and the updated Controller SSIS package.
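
The deployment cost of the two approaches above can be sketched in a few lines of Python. This is a toy model, not SSIS code: the package names and both helper functions are hypothetical, and it only counts which packages would need to be deployed when a new package joins the workflow.

```python
# Hypothetical model: which packages must be deployed when a new
# package is inserted into an existing workflow.

def daisy_chain_redeploys(chain, new_package, position):
    """Packages to deploy when new_package is inserted at position in a daisy chain.

    Each package's final Execute Package Task starts the next package, so the
    predecessor (if there is one) must be edited and redeployed as well.
    """
    to_deploy = [new_package]
    if position > 0:
        # The predecessor's Execute Package Task now points at new_package
        to_deploy.append(chain[position - 1])
    return to_deploy


def batch_controller_redeploys(controller, new_package):
    """The controller owns every Execute Package Task, so it always changes."""
    return [new_package, controller]


chain = ["StageA.dtsx", "StageB.dtsx", "StageC.dtsx"]  # hypothetical packages
middle = daisy_chain_redeploys(chain, "StageNew.dtsx", 2)
first = daisy_chain_redeploys(chain, "StageNew.dtsx", 0)
last = daisy_chain_redeploys(chain, "StageNew.dtsx", len(chain))
controlled = batch_controller_redeploys("Controller.dtsx", "StageNew.dtsx")
```

In this model only a new first package in a daisy chain deploys alone; every other insertion touches a second package.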

Scheduler Delegation

Depending on the scheduling utility in use, adding a package to the workflow can be really simple or horribly complex. I’ve seen both and I’ve also seen methods of automation that mitigate horribly-complex schedulers.

Use a Framework

I like metadata-driven SSIS frameworks because they’re metadata-driven. Why’s metadata-driven so important to me? To the production DBA or Operations people monitoring the systems in the middle of the night, SSIS package execution is just another batch process using server resources. Some DBAs and operations people comprehend SSIS really well, some do not. We can make life easier for both by surfacing as much metadata and operational logging – ETL instrumentation – as possible.

Well-architected metadata-driven frameworks reduce enterprise innovation friction by:

  • Reducing maintenance overhead
  • Batched execution, discrete IDs
  • Packages may live anywhere

Less Overhead

Adding an SSIS package to a metadata-driven framework is a relatively simple two-step process:

  1. Deploy the SSIS package (or project).
  2. Just add metadata.

A nice bonus? Metadata stored in tables can be readily available to both production DBAs and Operations personnel… or anyone, really, with permission to view said data.
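
Here is a minimal sketch of what "just add metadata" can look like, using an in-memory SQLite table as a stand-in for framework metadata. The table, column, application, and package names are all assumptions made for illustration; they are not the SSIS Framework Community Edition schema.

```python
import sqlite3

# Hypothetical metadata table: one row per package in an application
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE application_packages (
    application_name TEXT NOT NULL,
    folder_name      TEXT NOT NULL,
    project_name     TEXT NOT NULL,
    package_name     TEXT NOT NULL,
    execution_order  INTEGER NOT NULL
);
INSERT INTO application_packages VALUES
    ('LoadMedicalData', 'Staging', 'StageProject', 'StageProviders.dtsx', 10),
    ('LoadMedicalData', 'Staging', 'StageProject', 'StagePatients.dtsx', 20);
""")


def packages_for(application_name):
    """Return the application's packages in the order they would execute."""
    rows = conn.execute(
        "SELECT package_name FROM application_packages "
        "WHERE application_name = ? ORDER BY execution_order",
        (application_name,),
    )
    return [row[0] for row in rows]


before = packages_for("LoadMedicalData")

# Step 2, "just add metadata": one row slots an already-deployed package
# between the existing two. No other package is edited or redeployed.
conn.execute(
    "INSERT INTO application_packages VALUES (?, ?, ?, ?, ?)",
    ("LoadMedicalData", "Staging", "StageProject", "StageClaims.dtsx", 15),
)
after = packages_for("LoadMedicalData")
```

Because the workflow lives in a table, the same query that drives execution can also answer "what runs, and in what order?" for anyone with permission to read it.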

Batched Execution with Discrete IDs

An SSIS Catalog-integrated framework can overcome one of my pet peeves with using Batch Controllers. If you call packages using the Parent-Child design pattern implemented with the Execute Package Task, each child execution shares the same Execution / Operation ID with the parent package. While it’s mostly not a big deal, I feel the “All Executions” report is… misleading.

Using a Catalog-integrated framework gives me an Execution / Operation ID for each package executed – the parent and each child.
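
The difference can be illustrated with a small simulation. This is not a query against the SSIS Catalog; the ID counter, package names, and both functions are hypothetical and exist only to contrast the two patterns.

```python
from itertools import count


def parent_child_run(children, next_id):
    """Execute Package Task pattern: children run inside the parent's execution."""
    parent_id = next(next_id)
    ids = {"Controller.dtsx": parent_id}
    for package in children:
        ids[package] = parent_id  # every child reports the parent's ID
    return ids


def catalog_integrated_run(children, next_id):
    """Framework pattern: the parent and each child start their own execution."""
    ids = {"Parent.dtsx": next(next_id)}
    for package in children:
        ids[package] = next(next_id)
    return ids


children = ["StagePatients.dtsx", "StageProviders.dtsx"]  # hypothetical packages
shared = parent_child_run(children, count(1))
discrete = catalog_integrated_run(children, count(1))
```

With shared IDs, a report keyed on execution ID shows one row for the whole batch; with discrete IDs, each package appears on its own.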

“Dude, Where’s My Package?”

Ever try to configure an Execute Package Task to execute a package in another SSIS project? Or in another Catalog folder? You cannot.* By default, the Execute Package Task in a Project Deployment Model SSIS project (also the default) cannot “see” SSIS packages that reside in other SSIS projects or which are deployed to other SSIS Catalog Folders.

“Why do I care, Andy?”

Excellent question. Another benefit of separation of concerns is it promotes code reuse. Imagine I have a package named ArchiveFile.dtsx that, you know, archives flat files once I’m done loading them. Suppose I want to use that highly-parameterized SSIS package in several orchestrations? Sure, I can Add-Existing-Package my way right out of this corner. Until…

What happens when I want to modify the packages? Or find a bug? This is way messier than simply being able to modify / fix the package, test it, and deploy it to a single location in Production where a bajillion workflows access it. Isn’t it?

It is.

Messy stinks. Code reuse is awesome. A metadata-driven framework can access SSIS packages that are deployed to any SSIS project in any SSIS Catalog folder on an instance. Again, it’s just metadata.
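
To make "it's just metadata" concrete, here is a hedged sketch that turns a folder / project / package triple into T-SQL calling the SSIS Catalog's catalog.create_execution and catalog.start_execution stored procedures. The helper function and the metadata rows are hypothetical, and this is not necessarily how SSIS Framework Community Edition starts packages internally; it only shows that the location of a package can be data rather than a design-time setting.

```python
def build_execution_sql(folder_name, project_name, package_name):
    """Return T-SQL that starts one package through the SSIS Catalog."""

    def quote(value):
        # Double any single quotes so the value is a valid T-SQL string literal
        return "N'" + value.replace("'", "''") + "'"

    return (
        "DECLARE @execution_id BIGINT;\n"
        "EXEC SSISDB.catalog.create_execution\n"
        f"    @folder_name = {quote(folder_name)},\n"
        f"    @project_name = {quote(project_name)},\n"
        f"    @package_name = {quote(package_name)},\n"
        "    @execution_id = @execution_id OUTPUT;\n"
        "EXEC SSISDB.catalog.start_execution @execution_id;"
    )


# Hypothetical metadata rows: packages live in different folders and projects
metadata = [
    ("Staging", "StageProject", "StagePatients.dtsx"),
    ("Utilities", "FileProject", "ArchiveFile.dtsx"),
]
statements = [build_execution_sql(*row) for row in metadata]
```

The sketch only builds text. Real code would send the statements through a database driver and would normally pass the names as parameters instead of concatenating them.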

*I know a couple ways to “jack up” an Execute Package Task and make it call SSIS Packages that reside in other SSIS Projects or in other SSIS Catalog Folders. I think this is such a bad idea for so many reasons, I’m not even going to share how to do it. If you are here…

… Just use a framework.

SSIS Framework Community Edition

At DILM Suite, Kent Bradshaw and I give away an SSIS Framework that manages execution – for free! I kid you not. SSIS Framework Community Edition is not only free, it’s also open source.

Mean Time to Identify Failure

While managing a team of 40 ETL developers, I wanted to track lots of metrics. Some of the things I wanted to track were technical, like SSIS package execution times. Some metrics were people-centric. 

Andy’s First Rule of Statistics states:

You can use statistics for anything about people, except people.

Andy – circa 2005

It was important to me to track how long it took the on-call person to identify the problem. I didn’t use the information to beat on-call people over the head. I used the information to measure the results of several experiments for displaying metadata about the failure.

Reports For The Win

You may be as shocked by this as I was; reports helped a lot more than I anticipated. Before I deployed the reports, the Mean Time to Identify Failure was tracking just under 30 minutes. After deploying the reports, the Mean Time to Identify Failure fell to 5 minutes.

As I said, I was shocked. There were mitigating circumstances. The on-call team members were getting more familiar with the information SSIS produces when it logs an error. They were gaining experience, seeing similar errors more than once.

I accounted for growing familiarity by narrowing the time window I examined. The least-impressive metrics put the reduction at 18 minutes to 5 minutes.
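
For a sense of scale, the arithmetic behind those two comparisons is sketched below. The only assumption is rounding "just under 30 minutes" to 30; the function name is mine.

```python
def percent_reduction(before_minutes, after_minutes):
    """Percentage drop from before_minutes to after_minutes."""
    return (before_minutes - after_minutes) / before_minutes * 100


# Full window: roughly 30 minutes down to 5 minutes
full_window = percent_reduction(30, 5)

# Narrowed window (least-impressive case): 18 minutes down to 5 minutes
narrow_window = percent_reduction(18, 5)
```

Even the conservative comparison works out to a reduction of roughly 72 percent, against roughly 83 percent for the full window.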

Pictures…

Before I built and deployed the dashboard for SSIS Application Instances (like the one pictured at the top of this post), on-call people would query custom-logging tables we built to monitor enterprise data integration. The queries to return Application Instance log data were stored where everyone could reach them. In fact, I used the same queries as sources for this report.

A funny thing happened when I deployed the reports. Each week, one or more on-call people would ping me and tell me how much they liked the reports. Even though the data was the same, the presentation was different. A picture with a little color goes a long way.

The image at the beginning of this section – the SSIS Framework Task Instance Report – is displayed when a user clicks the Failed link shown in the initial screenshot. This design received the most comments from the on-call team members. The most common comment was, “I click the Failed link and it takes me to details about the failure.” The reports were passing The 2:00 AM Test.

SSIS Framework Applications

If you’ve read this far and wondered, “Andy, what’s an SSIS Application?” An SSIS Application is a construct I came up with to describe a collection of SSIS Packages configured to execute in a specific order. An application is a way to group SSIS packages for execution. You can get a sense of how our frameworks work – especially the application execution functionality – by checking out the SSIS Framework Community Edition at DILM Suite (DILM == Data Integration Lifecycle Management).

An Application Instance is an instance of execution of an SSIS Application. An Application Instance is made up of Package Instances. The relationship between applications and packages appears straightforward: an application is a collection of packages; parent-child; one-to-many. But it’s not quite that simple. Our SSIS Frameworks facilitate patterns that execute the same package multiple times, sometimes in parallel! We can also create packages that perform utility functions – such as ArchiveFile.dtsx – and call them from multiple applications. When you do the Boolean algebra, the relationship between applications and packages is many-to-many.
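
The many-to-many relationship is easy to see in a small sketch. The application and package names below are hypothetical (only ArchiveFile.dtsx comes from the text above), and the list of pairs stands in for whatever bridge table a framework might use.

```python
from collections import defaultdict

# Hypothetical (application, package) pairs. A package may appear in several
# applications, and more than once within a single application.
application_packages = [
    ("LoadMedicalData", "StagePatients.dtsx"),
    ("LoadMedicalData", "ArchiveFile.dtsx"),
    ("LoadSalesData", "StageOrders.dtsx"),
    ("LoadSalesData", "ArchiveFile.dtsx"),
    ("LoadSalesData", "ArchiveFile.dtsx"),  # same package, executed twice
]

packages_by_app = defaultdict(list)  # one application -> many packages
apps_by_package = defaultdict(set)   # one package -> many applications
for app, package in application_packages:
    packages_by_app[app].append(package)
    apps_by_package[package].add(app)
```

Reading the pairs in either direction yields "many", which is why a simple parent-child model falls short.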

Our SSIS Frameworks are SSIS Catalog-integrated. They even work with the SSIS Integration Runtime that’s part of Azure Data Factory, Azure-SSIS. 

Dashboards… Evolved

While the Reporting Services dashboard was neat when it was released back in the day, the cool kids now play with Power BI. At DILM Suite you will also find a free – albeit basic – Power BI dashboard that surfaces many of the same metrics using even better reporting technology. The Basic SSIS Catalog Dashboard in Power BI is free at DILM Suite.

I’ve not yet collected Mean Time to Identify Failure metrics using the Basic SSIS Catalog Dashboard in Power BI. Perhaps you can be the first.

Enjoy!

Free Stuff for People Who Give Back: Announcing 2019 Scholarships

Do you work for a charity or non-profit organization? Submit your application today.

I haven’t advertised this in the past and… I’m not sure why: I donate licenses for SSIS Catalog Compare and (non-free) SSIS Framework Editions – and subscriptions to Biml Academy and SSIS Academy – and Enterprise Data & Analytics Training – to individuals who work for charities and non-profit organizations. I am honored to announce our 2019 Scholarships.

I was inspired to make this public after reading this post over at Brent Ozar Unlimited.

Free Stuff for Charities and Non-Profit Organizations

  

Free Training and Discounted Consulting Services

In addition to donating free licenses to our software and online training sites, Enterprise Data & Analytics offers a discounted rate to charities and non-profits for consulting services and free access to Enterprise Data & Analytics Training.

We are here to help.™ How may we serve you? Contact us today and let us know!

Using SSIS Framework Community Edition Webinar 20 Sep

Join me 20 Sep 2018 at noon ET for a free webinar titled Using SSIS Framework Community Edition!

Abstract

SSIS Framework Community Edition is free and open source. You may know you can use SSIS Framework Community Edition to execute a collection of SSIS packages using a call to a single stored procedure passing a single parameter. But did you know you can also use it to execute a collection of SSIS packages in Azure Data Factory SSIS Integration Runtime? You can!

In this free webinar, Andy discusses and demonstrates SSIS Framework Community Edition – on-premises and in the cloud.

Join SSIS author, BimlHero, consultant, trainer, and blogger Andy Leonard at noon EDT Thursday 20 Sep 2018 as he demonstrates using SSIS Framework Community Edition on-premises and in the cloud.

I hope to see you there!

Register today.

:{>