The next delivery is 01-02 Apr 2019, 9:00 AM – 4:30 PM ET.
Data integration is the foundation of data science, business intelligence, and enterprise data warehousing. This instructor-led training class is specifically designed for SQL Server Integration Services (SSIS) professionals responsible for developing, deploying, and managing data integration at enterprise scale.
You will learn to improve data integration with SSIS by:
Building faster data integration.
Making data integration execution more manageable.
Building data integration faster.
SSIS Design Patterns for Performance – how to build SSIS packages that execute and load data faster by tuning SSIS data flows and implementing performance patterns.
SSIS Deployment, Configuration, Execution, and Monitoring – the “Ops” part of DevOps with SSIS using out-of-the-box tools and the latest utilities.
Automation – how to use Business Intelligence Markup Language (Biml) to improve SSIS quality and reduce development time.
The next delivery is 04 Mar 2019, 9:00 AM – 4:30 PM ET.
Azure Data Factory, or ADF, is an Azure PaaS (Platform-as-a-Service) that provides hybrid data integration at global scale. Use ADF to build fully managed ETL in the cloud – including SSIS. Join Andy Leonard – author, blogger, and Chief Data Engineer at Enterprise Data & Analytics – as he demonstrates practical Azure Data Factory use cases.
In this course, you’ll learn:
The essentials of ADF
Developing, testing, scheduling, monitoring, and managing ADF pipelines
Lifting and shifting SSIS to ADF SSIS Integration Runtime (Azure-SSIS)
ADF design patterns
Data Integration Lifecycle Management (DILM) for the cloud and hybrid data integration scenarios
My Twitter DMs are open @AndyLeonard. Ask me a question about SSIS, ADF, or Biml. I may answer it with a blog post, a video, or both.
Today’s Andy Answer is an answer to an SSIS question received via @AndyLeonard direct message:
Using SSDT, I have 3 variables: one for the file path, one for date string, and one for file name. I’m trying to create the flat file destination with a variable so the date will be correct each day the file is generated. I have the right expressions for each variable, however, when the file is created it only shows the file name variable value. I configured the flat file connection manager’s ConnectionString property with an expression that concatenates these three variables. Can you help?
Yes. Yes I can help. I Am Here To Help.™
An Example to Demonstrate
Create a demo project:
Rename the SSIS package, if you’d like:
Add new SSIS Variables to the SSIS Package.
I added four variables to start:
FlatFilePath [String] – defaulted to a path on my server.
FlatFileName [String] – a hard-coded base name for my file.
FlatFileExtension [String] – to hold my file extension (csv in this case).
FlatFileDateString [String] – the date formatted as I desire:
Formatting the FlatFileDateString SSIS variable is tricksy. I used the following SSIS Expression:
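The exact expression isn't reproduced here, but the tricksy part is zero-padding the month and day. A commonly used SSIS expression for a YYYYMMDD string is quoted in the comment below (an assumption — your desired format may differ), with the same logic sketched in Python:

```python
from datetime import date

# A common SSIS expression for a YYYYMMDD date string casts and pads the
# date parts, along the lines of:
#   (DT_WSTR, 4) YEAR(GETDATE())
#   + RIGHT("0" + (DT_WSTR, 2) MONTH(GETDATE()), 2)
#   + RIGHT("0" + (DT_WSTR, 2) DAY(GETDATE()), 2)
#
# The same logic in Python:
def flat_file_date_string(d: date) -> str:
    """Format a date as YYYYMMDD, zero-padding month and day."""
    return f"{d.year:04d}{d.month:02d}{d.day:02d}"

print(flat_file_date_string(date(2019, 3, 4)))  # 20190304
```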
Add a Data Flow Task. In the Data Flow Task, add an OLE DB Source adapter. Configure the source adapter to read some data from some sample database:
Create a test sample file in the directory you specified in the FlatFilePath SSIS Variable Value:
Populate the sample file with column names and one sample data row. Enclose the column names and data values in double-quotes:
Create a Flat File Connection Manager. Configure the Text Qualifier as double-quotes and make sure the “Column names in the first data row” checkbox is checked:
Verify Flat File Connection Manager configuration by viewing the Columns page:
Add a Flat File Destination adapter to the Data Flow Task and configure it to use the Flat File Connection Manager. Ensure the “Overwrite data in the file” checkbox is checked to make the SSIS package re-executable (within the same day, at least):
Select the Flat File Connection Manager and view its Properties. Click the ellipsis for the Expressions property to open the Property Expressions Editor for the Flat File Connection Manager. From the Property dropdown, select ConnectionString:
Drag the FlatFileFullPath variable to the Expression textbox. Click the Evaluate Expression button to examine and verify the value of the FlatFileFullPath SSIS variable:
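For context, FlatFileFullPath is a fifth SSIS variable whose own expression (not shown above) concatenates the path, name, date string, and extension variables. The sketch below illustrates that concatenation logic — the separator and underscore are assumptions, and the example values are made up:

```python
def flat_file_full_path(path: str, name: str, date_string: str, ext: str) -> str:
    # Mirrors an SSIS expression along the lines of:
    #   @[User::FlatFilePath] + "\\" + @[User::FlatFileName] + "_"
    #   + @[User::FlatFileDateString] + "." + @[User::FlatFileExtension]
    return f"{path}\\{name}_{date_string}.{ext}"

# Hypothetical values for illustration:
print(flat_file_full_path("E:\\Data", "DailyExtract", "20190304", "csv"))
# E:\Data\DailyExtract_20190304.csv
```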
When you close the Expression Builder, the Property Expressions Editor should appear similar to:
Once the ConnectionString Expression override is configured, set the DelayValidation property for the Flat File Connection Manager to True. This setting configures the connection manager to skip validation, which is helpful if the file does not (yet) exist:
A test execution in the SSIS debugger proves we’ve accomplished our goal:
I recommend (at least) one change to this package: Move the FlatFilePath SSIS Variable to a Package Parameter. Why? One may desire to change the target directory of the output file at some point and a parameter value is easy to override at runtime:
After converting the SSIS variable to a package parameter, update the Expression for the FlatFileFullPath SSIS variable to reflect the change:
Eugene has authored three Pluralsight courses covering Power BI, DAX, Power Pivot, and SSAS. His experience includes loading ERP solutions, Power Reports, ETL, QlikView, and SQL Server development.
I am sometimes asked, “Andy, how do I start a technical blog?” I demonstrate one way here – a way that has served me well with search engines:
1. Describe how the error occurred.
2. Copy the error message and post it as text.
3. Describe the solution.
I am helping a large enterprise client migrate their data integration from another data integration platform to Azure-SSIS. It’s a two-step process:
Step 1: Convert the existing data integration operations to SSIS and test the conversion by deploying to an on-premises SSIS Catalog.
Step 2: Lift and shift the SSIS Catalog configurations from on-premises to Azure-SSIS (we will be using SSIS Catalog Compare for this part).
An SSIS Error
In the midst of designing SSIS solutions we encountered an interesting SSIS error:
Failed to start project
Exception deserializing the package “Operation is not valid due to the current state of the object.”. (Microsoft.DataTransformationServices.VsIntegration)
As a professional consultant who has been blogging about SSIS for 12 years and authored and co-authored a dozen books related to Microsoft data technologies, my first response was:
I bingled (credit for the term, “Bingle”: Ed Danger Watson) and found nothing helpful. I was beginning to think something unfortunate had happened to the client’s SQL Server Data Tools SSIS template when a crazy idea popped into my noggin. I said to myself:
Self? Could it be the odd-looking name of the Project Connection Manager?
Andy – circa 2019
The Odd-Looking Name of the Project Connection Manager
When you create a new SSIS Connection Manager, SSIS automagically names it for you. Awesome, right? Most of the time. But sometimes the default name created for a package connection manager doesn’t play nice with Project Connection naming requirements.
Don’t believe me? Try this:
Create a new SSIS package. Add a package connection manager aimed at a named instance of SQL Server, such as “vDemo\Dev”:
Next, (attempt to) promote the package connection manager to a project connection manager:
Note the error:
The error message in this case reads: Failed to convert the package connection manager ‘vDemo\Dev.WeatherData’ to a project connection manager because ‘vDemo\Dev.WeatherData’ is not a valid stream name in the project. Rename the package connection manager and do the conversion again. Object name ‘vDemo\Dev.WeatherData.conmgr’ is not valid. Name must contain only alphanumeric characters or underscores ‘_’.
The backslash – “\” – in the connection manager name – which works just fine for the Package Connection Manager – blocks promotion of the package connection manager to a Project Connection Manager.
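Per the error text, the name must contain only alphanumeric characters or underscores. The sketch below is a rough pre-check of a candidate name — an approximation for illustration, not the exact validation SSIS performs (it treats dots conservatively, since SSIS appends the .conmgr extension itself):

```python
import re

# Rough approximation of the naming rule quoted in the error message:
# only alphanumeric characters or underscores in the base name.
VALID_STREAM_NAME = re.compile(r"^\w+$", re.ASCII)

def is_valid_project_cm_name(name: str) -> bool:
    return bool(VALID_STREAM_NAME.match(name))

print(is_valid_project_cm_name(r"vDemo\Dev.WeatherData"))  # False - backslash (and dot) rejected
print(is_valid_project_cm_name("vDemo_Dev_WeatherData"))   # True
```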
Back to the problem at hand…
My client’s Project Connection Manager name included square brackets (it’s an Oracle thing…). Renaming the project connection manager – simply removing the square brackets – cleared the error:
The package now executes in the SSIS debugger:
SSIS is a complex and flexible software development platform. I learn new things with each and every consulting project. Why did I think of the connection manager name? In testing SSIS Catalog Compare version 4, I tried out a bunch of naming combinations. I was shocked to learn some characters are permitted (did you know you can create an SSIS Catalog Folder name that contains an apostrophe??).
Last August, I had a blast at the Data Platform Summit 2018 in Bengaluru, India. One highlight of the trip was meeting and conversing with the Azure Data Factory SSIS Integration Runtime Product/Program Manager, Sandy Winarko.
Sandy and I swapped coder stories and talked about data integration design patterns and I learned a few new design patterns sitting in his session. This post is about one of the patterns I learned from Sandy.
Before I move on, though, you should really check out the Data Platform Summit! It is an awesome show – Asia’s largest Microsoft data conference.
Stop In The Name Of… Money
I love Azure! I love the technology and the velocity of development. The cloud is changing the world and Azure is changing the cloud – almost every day. One cool thing about Azure is cost savings. A lot of services are free or nearly free. That. is. Awesome! But some services are not free. (That’s not a bad thing!)
Azure Data Factory’s SSIS Integration Runtime (aka Azure-SSIS IR) is not free. It costs money even when it’s just sitting there switched on with no SSIS packages executing. Depending on the size of the Azure-SSIS IR, one can rack up some serious charges for what amounts to, well, no work.
That’s why I was excited to learn about this pattern from Sandy. The pattern is made up of three parts: start the Azure-SSIS IR, execute the SSIS packages, and stop the IR.
The article walks one through setting up an instance of Azure-SSIS. It then proceeds to describe how to build the start-execute-stop pattern Sandy described to me (and everyone who attended his presentation) in Bengaluru. This pattern has evolved since Sandy first described it; the first version used Azure Runbooks and Webhooks. I like this version much better. You can still use Runbooks, and there’s more information in the How to start and stop Azure-SSIS Integration Runtime on a schedule article.
Before You Start
I’m going to share the part that gave me trouble first – it was security. In the article, Sandy and company write in Step 5 of the Create Your Pipelines section:
Assign the managed identity for your ADF a Contributor role to itself, so Web activities in its pipelines can call REST API to start/stop Azure-SSIS IRs provisioned in it.
I read that and proceeded to assign myself to the Contributor role of Azure Data Factory. That didn’t work. Not at all. And that’s not what the instructions said. The instructions said to “Assign the managed identity for your ADF a Contributor role to itself…”
After I figured this out – with awesome and patient help from Sandy himself (thanks Sandy!) – I was in business.
I created a separate pipeline to stop Azure-SSIS. I run it every night at 11:00 PM ET because if I’m awake at that time (I usually rise around 4:00 AM ET), I am not being productive on my computer (if I am on it at all…). I do this in case I forget and leave Azure-SSIS running after a presentation, class, or client demo.
Following the instructions, I created a new pipeline and added a Web Activity. I configured the Web Activity according to the instructions. Mine looked like this:
The tricksy part (if there is one) is configuring the URL. There’s a shortcut for that. Click on the Monitor page of the ADF portal:
Click the Integration Runtimes tab at the top of the page, and then click the “View Integration Runtime Detail” link under the Actions column to bring up the snappy new mini-dashboard for Azure-SSIS:
Regardless of the state of Azure-SSIS, click the status link to open a functional pane that contains the Resource ID and a Copy button:
This copies a lot of what you will need to set up the call to stop Azure-SSIS (and also to start it):
When you copy the Resource ID, you get all those missing values filled in for you. How neat is that? Prefix that with “https://management.azure.com” and add the suffix “/stop?api-version=2018-06-01”, and you are golden.
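Putting that together, the sketch below builds the stop URL (and, by symmetry, the start URL) from a copied Resource ID. The resource ID shown is a made-up placeholder in the shape the portal copies — substitute your own values:

```python
# Placeholder Resource ID for an Azure-SSIS Integration Runtime
# (all bracketed values are yours to fill in):
resource_id = (
    "/subscriptions/<subscription-id>"
    "/resourceGroups/<resource-group>"
    "/providers/Microsoft.DataFactory/factories/<data-factory-name>"
    "/integrationRuntimes/<integration-runtime-name>"
)

prefix = "https://management.azure.com"
api = "api-version=2018-06-01"

stop_url = f"{prefix}{resource_id}/stop?{api}"
start_url = f"{prefix}{resource_id}/start?{api}"  # the start call is symmetric

print(stop_url)
```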
Finish following the instructions in the Create Your Pipelines section, and then take it out for a test drive.
Things begin to pick up here partly because you’ve done a lot more work than you realize already, and partly because supper’s ready and I’ve been blogging for the past couple hours.
(For you, Dear Reader! I do it all for You!) And for me. In case I forget…
Click the ellipsis beside the pipeline and then click Clone to make a fresh copy of your pipeline (it still has that new pipeline smell!)
I name my new clone Start Run Stop.
The Stop AndyDemoSSIS Web Activity is in there already and just begging to be cloned.
Look how easy it is to clone an activity in ADF. I mean, could they make this any easier? I think not.
Once it’s cloned, it’s not much work to edit the configuration of the cloned Web Activity to start the AndyDemo Azure-SSIS instance:
While managing a team of 40 ETL developers, I wanted to track lots of metrics. Some of the things I wanted to track were technical, like SSIS package execution times. Some metrics were people-centric.
Andy’s First Rule of Statistics states:
You can use statistics for anything about people, except people.
Andy – circa 2005
It was important to me to track how long it took the on-call person to identify the problem. I didn’t use the information to beat on-call people over the head. I used the information to measure the results of several experiments for displaying metadata about the failure.
Reports For The Win
You may be as shocked by this as I was: reports helped a lot more than I anticipated. Before I deployed the reports, the mean time to identify failure was tracking just under 30 minutes. After deploying the reports, it fell to 5 minutes.
As I said, I was shocked. There were mitigating circumstances. The on-call team members were getting more familiar with the information SSIS produces when it logs an error. They were gaining experience, seeing similar errors more than once.
I accounted for growing familiarity by narrowing the time window I examined. Even the least-impressive metrics showed the mean time to identify failure falling from 18 minutes to 5 minutes.
Before I built and deployed the dashboard for SSIS Application Instances (like the one pictured at the top of this post), on-call people would query custom-logging tables we built to monitor enterprise data integration. The queries to return Application Instance log data were stored where everyone could reach them. In fact, I used the same queries as sources for this report.
A funny thing happened when I deployed the reports. Each week, one or more on-call people would ping me and tell me how much they liked the reports. Even though the data was the same, the presentation was different. A picture with a little color goes a long way.
The image at the beginning of this section – the SSIS Framework Task Instance Report – is displayed when a user clicks the Failed link shown in the initial screenshot. This design drew the most comments from the on-call team members. The most common comment was, “I click the Failed link and it takes me to details about the failure.” The reports were passing The 2:00 AM Test.
SSIS Framework Applications
If you’ve read this far and wondered, “Andy, what’s an SSIS Application?” – an SSIS Application is a construct I came up with to describe a collection of SSIS packages configured to execute in a specific order. An application is a way to group SSIS packages for execution. You can get a sense of how our frameworks work – especially the application execution functionality – by checking out the SSIS Framework Community Edition at DILM Suite (DILM == Data Integration Lifecycle Management).
An Application Instance is an instance of execution of an SSIS Application. An Application Instance is made up of Package Instances. The relationship between applications and packages appears straightforward: an application is a collection of packages; parent-child; one-to-many. But it’s not quite that simple. Our SSIS Frameworks facilitate patterns that execute the same package multiple times, sometimes in parallel! We can also create packages that perform utility functions – such as ArchiveFile.dtsx – and call them from multiple applications. When you do the Boolean algebra, the relationship between applications and packages is many-to-many.
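The many-to-many relationship is easier to see with a concrete example. The application and package names below (other than ArchiveFile.dtsx, mentioned above) are hypothetical:

```python
# Two hypothetical SSIS Applications, each an ordered list of package
# executions. ArchiveFile.dtsx is a utility package called by both
# applications - which is what makes the relationship many-to-many.
applications = {
    "LoadWeather": ["StageWeather.dtsx", "LoadWeather.dtsx", "ArchiveFile.dtsx"],
    "LoadSales":   ["StageSales.dtsx", "LoadSales.dtsx", "ArchiveFile.dtsx"],
}

# Invert the mapping to list, per package, the applications that call it:
apps_by_package = {}
for app, packages in applications.items():
    for package in packages:
        apps_by_package.setdefault(package, []).append(app)

print(apps_by_package["ArchiveFile.dtsx"])  # ['LoadWeather', 'LoadSales']
```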
Our SSIS Frameworks are SSIS Catalog-integrated. They even work with the SSIS Integration Runtime that’s part of Azure Data Factory, Azure-SSIS.
While the Reporting Services dashboard was neat when it was released back in the day, the cool kids now play with Power BI. At DILM Suite you will also find a free – albeit basic – Power BI dashboard that surfaces many of the same metrics using even better reporting technology. The Basic SSIS Catalog Dashboard in Power BI is free at DILM Suite.
Last night I presented Moving Data with Azure Data Factory to a packed house at the Richmond Azure User Group. The crowd was engaged and asked awesome questions. I had a blast!
I rehearsed many of my demos before the presentation and even updated my slides. One thing I did not do was rehearse configuring an Execute SSIS Package activity. Why? I’ve built these demos live a bajillion times. I know what I’m doing…
When I clicked on the Settings tab I found new stuff – options I’d not previously encountered. Thankfully, I immediately recognized the purpose of these new configuration options – and I also immediately liked them. The demo worked and no one was the wiser (except the handful of folks with whom I shared this story after the meeting).
New Dropdowns for Folder, Project and Package
The new interface sports new dropdowns for selecting the Catalog Folder, Project, and Package to execute. I like this – it’s slick. I had deployed a project to my Azure-SSIS instance between the time I started this part of the demo and the time I wanted to configure the Execute SSIS Package activity. During deployment I created a new Catalog Folder which was not initially listed in the Folder dropdown. Clicking the Refresh button remedied this, though, and I was able to complete configuration rapidly.
I cannot recall if the Connection Managers and Property Overrides tabs were part of the previous Execute SSIS Package activity interface. I don’t think so, but I could be wrong about that. Update: I verified these are new tabs by looking at screenshots from my June 2018 post titled ADF Execute SSIS Package Activity. The previous version had an Advanced tab.
The SSIS package configuration tabs are SSIS Parameters, Connection Managers, and Property Overrides.
When your Azure-SSIS instance is running, you may use these tabs to update Parameter, Connection Manager Property, and SSIS Package Property values:
It’s possible to configure the Execute SSIS Package activity when your Azure-SSIS Integration Runtime is not running, but you don’t get the nice dropdown pre-population and have to revert to the previous method of configuring the full SSIS Catalog path to the package you desire to execute.
SSIS Catalog Browser To The Rescue!
If you find yourself in this predicament and would rather configure the Execute SSIS Package activity without waiting 20-30 minutes for the Azure-SSIS instance to spin up, you can use SSIS Catalog Browser – a free utility from DILM Suite – to connect to your Azure-SSIS instance:
SSIS Catalog Browser displays the Catalog path for an SSIS package (or Catalog Environment) when you select the artifact in the unified Catalog surface. Copy the package’s Catalog path displayed in the Status area and paste the value into the Package Path textbox in ADF:
Make sure the Manual Entries checkbox is checked.
I like the warnings. Feedback is a good thing.
Once Azure-SSIS Is Running
When your Azure-SSIS instance is up and running, you may configure the path to your SSIS package using the dropdowns:
You can even configure the path to a Catalog Environment:
I believe there are at least two lessons to take away from my experience:
When presenting on Microsoft Azure topics, always check your demos to make certain nothing has changed; and
Microsoft Azure is evolving at a rapid rate – especially Azure Data Factory!
I made three short-ish (~2 minutes each) videos demonstrating some of the capabilities of the product. I really like the SCCPAC files and SCCPAC Deployment Wizard, which are only available in the Enterprise Edition.
Use SCCPAC Deployment Wizard to Lift and Shift SSIS Projects and Configurations Metadata to Azure-SSIS
Use SCCPAC Deployment Wizard to Manage SSIS DevOps
Use SSIS Catalog Compare Enterprise Edition and SCCPAC Deployment Wizard to Deploy from the QA to Production Catalog in Real-Time
I enjoy math. I noticed a pattern while learning math; perhaps you experienced something similar. I found arithmetic an exercise in memory. I have a good memory (well, I had a good memory…), so memorizing a bunch of rules was no big deal.
When I learned algebra, arithmetic made more sense. In addition to the memorized rules, I saw why the rules existed. I understood the rules better as a result.
This pattern held all through my math education. I understood algebra better once I learned geometry. I understood geometry better once I learned trigonometry. I understood trigonometry better once I learned single-variable calculus.
An Axiom (for me)
I notice a similar pattern applies to my career (or careers, as the case may be). I’ve served in many roles:
Soldier (part-time in the Virginia Army National Guard)
Manufacturing automation integrator
The similar pattern manifests itself in this manner: I’ve enjoyed the position – and more success in the position – when I had a reason to do the work; some reason other than a paycheck. In some cases, I had multiple “why’s” beyond remuneration. For example, I joined the Virginia Army National Guard to learn electronics and serve my country – especially to protect everyone’s right to free speech guaranteed by the First Amendment. I may not agree with what people say, but I was (and still am) willing to fight and die to preserve the right of US citizens to say whatever they want.
As a result, I enjoyed serving in the National Guard (for the most part). I learned more. I learned better, I think, because I enjoyed serving.
Entrepreneurship can be challenging. I believe one needs a “why” – or perhaps several “why’s” to remain an entrepreneur. The “why” cannot simply be money. Money isn’t inconsequential, mind you, but I believe the best “why’s” are less tangible.
Passion plays a major role for me. When business isn’t going well or when business is going too well, a couple intangible “why’s” – passions for both entrepreneurship and the kind of work I am blessed to do – inspire me to keep a steady hand on the tiller.
Also, entrepreneurship affords more and different ways to serve people. Am I saying one must be an entrepreneur to serve others? Nope. Flexibility with time, though, facilitates opportunities that may not otherwise be possible.
What is Your “Why?”
That’s the question this month: Why do you do what you do?
What is Intelligent Data Integration? SSIS (SQL Server Integration Services) packages developed using tried and true design patterns, built to participate in a DevOps enterprise practicing DILM (Data Integration Lifecycle Management), produced using Biml (Business Intelligence Markup Language) and executed using an SSIS Framework.
Attend a day of training focused on intelligent data integration delivered by an experienced SSIS consultant who has also led an enterprise team of several ETL developers during multiple projects that spanned 2.5 years (and delivered).
Attendees will learn:
a holistic approach to data integration design.
a methodology for enterprise data integration that spans development through operational support.
how automation changes everything – including data integration with SSIS.
SSIS Design Patterns
Executing SSIS in the Enterprise
Custom SSIS Execution Frameworks
DevOps and SSIS
Biml, Biml Frameworks, and Tools
Prerequisites: Familiarity with SQL Server Integration Services (SSIS).
Continental Breakfast, Lunch and Afternoon Snacks are provided.