Last August, I had a blast at the Data Platform Summit 2018 in Bengaluru, India. One highlight of the trip was meeting and conversing with the Azure Data Factory SSIS Integration Runtime Product/Program Manager, Sandy Winarko.
Sandy and I swapped coder stories and talked about data integration design patterns and I learned a few new design patterns sitting in his session. This post is about one of the patterns I learned from Sandy.
Before I move on, though, you should really check out the Data Platform Summit! It is an awesome show – Asia’s largest Microsoft data conference.
Stop In The Name Of… Money
I love Azure! I love the technology and the velocity of development. The cloud is changing the world and Azure is changing the cloud – almost every day. One cool thing about Azure is cost savings. A lot services are free or nearly free. That. is. Awesome! But some services are not free. (That’s not a bad thing!)
Azure Data Factory’s SSIS Integration Runtime (aka Azure-SSIS IR) is not free. It costs money even when it’s just sitting there switched on with no SSIS packages executing. Depending on the size of the Azure-SSIS IR, one can wrack up some serious charges for what amounts to, well, no work.
That’s why I was excited to learn about this pattern from Sandy. The pattern is made up of three parts:
- Start Azure-SSIS
- Execute an SSIS package
- Stop Azure-SSIS
Sandy, Nick Schonning, Douglas Laudenschlager, and Sreedhar Pelluru authored a post at Microsoft Docs titled How to start and stop Azure-SSIS Integration Runtime on a schedule. When I tried to follow the instructions I encountered one issue. Now, you may read that last sentence and think, “Andy, don’t you have a problem following instructions anyway?” Yes, but that’s beside the point… mostly.
Their article walks one through setting up an instance of Azure-SSIS. It then proceeds to describe how to build the start-execute-stop pattern Sandy described to me (and everyone who attended his presentation) in Bengaluru. Now this pattern has evolved since Sandy first described it. The first version used Azure Runbooks and Webhooks. I like this version much better. You can still use Runbooks and there’s more information in the How to start and stop Azure-SSIS Integration Runtime on a schedule article.
Before You Start
I’m going to share the part that gave me trouble first – it was security. In the article, Sandy and company write in Step 5 of the Create Your Pipelines section:
Assign the managed identity for your ADF a Contributor role to itself, so Web activities in its pipelines can call REST API to start/stop Azure-SSIS IRs provisioned in it.
I read that and proceeded to assign myself to the Contributor role of Azure Data Factory. That didn’t work. Not at all. And that’s not what the instructions said. The instructions said to “Assign the managed identity for your ADF a Contributor role to itself…”
What’s a Managed Identity?
Well, I searched for the answer and found this spiffy article titled Azure Data Factory service identity that not only explained what an ADF Managed Identity was, it also told me where to find it.
See, when I browsed over to assign the ADF Contributor Role, all I saw in the available list of people / accounts I could assign was li’l ol’ me:
So that’s who I assigned to the Contributor Role: me. As I said, that did not work.
The article titled Azure Data Factory service identity took me right to the location of the ADF Service Identity Application ID, which was the value I needed:
After I figured this out – with awesome and patient help from Sandy himself (thanks Sandy!) – I was in business.
Stop It!
I created a separate pipeline to stop Azure-SSIS. I run it every night at 11:00 PM ET because if I’m awake at that time (I usually rise around 4:00 AM ET), I am not being productive on my computer (if I am on it at all…). I do this in case I forget and leave Azure-SSIS running after a presentation, class, or client demo.
Following the instructions, I created a new pipeline and added a Web Activity. I configured the Web Activity according to the instructions. Mine looked like this:
The tricksy part (if there is one) is configuring the URL. There’s a shortcut for that. Click on the Monitor page of the ADF portal:
Click the Integration Runtimes tab at the top of the page, and then click the “View Integration Runtime Detail” link under the Actions column to bring up the snappy new mini-dashboard for Azure-SSIS:
Regardless of the state of Azure-SSIS, click the status link to open a functional pane that contains the Resource ID and a Copy button:
This copies a lot of what you will need to set up the call to stop Azure-SSIS (and also to start it):
/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.DataFactory/factories/{factoryName}/integrationRuntimes/{integrationRuntimeName}
When you copy the Resource ID, you get all those missing values filled in for you. How neat is that? Prefix that with “https://management.azure.com” and add the suffix “/stop?api-version=2018-06-01”, and you are golden.
Finish following the instructions in the Create Your Pipelines section, and then take it out for a test drive.
Clone
Things begin to pick up here partly because you’ve done a lot more work than you realize already, and partly because supper’s ready and I’ve been blogging for the past couple hours.
(For you, Dear Reader! I do it all for You!) And for me. In case I forget…
Click the ellipsis beside the pipeline and then click Clone to make a fresh copy of your pipeline (it still has that new pipeline smell!)
I name my new clone Start Run Stop.
The Stop AndyDemoSSIS Web Activity is in there already and just begging to be cloned.
Look how easy it is to clone an activity in ADF. I mean, could they make this any easier? I think not.
Once its cloned, it’s not much work to edit the configuration of the cloned Web Activity to start the AndyDemo Azure-SSIS instance:
Boom. That just happened.
“Run-Time!” (Spoken in the Voice of the Geico Camel)
Next up? The Execute SSIS Package Activity:
I’m not going to write much about how to configure the Execute SSIS Package Activity. You can learn more in one of these blog posts where I’ve already written about it. Supper’s waiting.
Conclusion
Thanks to Sandy and the team at Microsoft for the excellent documentation. This is a very cool Azure Data factory Design Pattern!
Enjoy!
very nice article! this method seems much easier than using automation/runbook. i assume you can do something similar to scale a sql db?
Thanks again sir. Exactly what I needed.