Presenting Moving Data with Azure Data Factory at SQL Saturday Charlotte!

I am honored to present Moving Data with Azure Data Factory at SQL Saturday 806 in Charlotte, NC 20 Oct 2018.

This is the first time I am delivering this session. It still has that new presentation smell!


Azure Data Factory – ADF – is a cloud data engineering solution. ADF version 2 sports a snappy web GUI (graphical user interface) and supports the SSIS Integration Runtime (IR) – or “SSIS in the Cloud.”

Attend this session to learn:
– How to build a “native ADF” pipeline;
– How to lift and shift SSIS to the Azure Data Factory Integration Runtime; and
– ADF Design Patterns to execute and monitor pipelines and packages.

I hope to see you there!


On Data Engineering in 2018

A few years ago I had a conversation with Scott Currie, the CEO of Varigence and inventor of Biml. Scott is one of the smartest people I know and I know a lot of very smart people. Whenever I have the opportunity to communicate with the very smart people I know, I often ask them – based upon their knowledge of me and what I do and any plans I have shared – for advice on stuff I should focus upon moving forward. Scott’s monosyllabic reply to this question?


Over the course of the past couple years I’ve shared – on this blog, even – some of my experiences learning more about Apache Spark. While I expected a little resistance, I was a little surprised by the… intensity… of some of the private messages and emails I received. As I replied to the intense people (and there were only a few), I could stop learning stuff right now, today, and continue working with SSIS for the next decade or so. I believe the same goes for anyone working with SSIS today.

I’m not going to stop learning stuff.


I love SSIS. But I don’t use SSIS because I love it. I love SSIS because it solves a particularly difficult piece of the story of enterprise data: data integration and data engineering. I feel a similar way about writing (and I’ve blogged recently about writing). I write because I like to write, not for clicks or branding or any of the other benefits I glean from writing. I consider those benefits cool, but side-effects of my desire to write.

The same goes for SSIS: I love SSIS because it solves a problem that I want to solve. I hope that makes sense.

Why Spark?

Two reasons:

  1. Spark is taking a more prominent role in SQL Server 2019.
  2. Spark is the engine beneath Azure Data Factory Data Flows.

How can you learn more about Spark? There are bunches of videos out there on YouTube. YouTube is crowd-sourced and free, which makes it awesome. The quality of YouTube training is crowd-sourced and free, which is a challenge.

I learned a lot from edX. I’m a fan of how edX approaches MOOC (online education). They offer lots of courses for free, which means you can grow your knowledge by investing only time. It’s tough to beat $free. If you want, you can pay for a certificate which you can then add to your online resume or LinkedIn profile like I added this one.


My advice? Spend some time learning. Pick your favorite learning platform and jump in. If you have SQL Server and / or SSIS experience, you already know a lot about the problem enterprises are trying to solve with data engineering tools. That’s a good place to be, but not required.

Keep learning!


Free Stuff for People Who Give Back: Announcing 2019 Scholarships

I haven’t advertised this in the past and… I’m not sure why: I donate licenses for SSIS Catalog Compare and (non-free) SSIS Framework Editions – and subscriptions to Biml Academy and SSIS Academy – to individuals who work for charities and non-profit organizations. I am honored to announce our 2019 Scholarships.

I was inspired to make this public after reading this post over at Brent Ozar Unlimited.

Free Stuff for Charities and Non-Profit Organizations


Do you work for a charity or non-profit organization? Submit your application today.

Discounted Consulting Services

In addition to donating free licenses to our software and online training sites, Enterprise Data & Analytics offers a discounted rate to charities and non-profits for consulting services.

We are here to help.™ How may we serve you? Contact us today and let us know!

Using SSIS Framework Community Edition Webinar 20 Sep

Join me 20 Sep 2018 at noon ET for a free webinar titled Using SSIS Framework Community Edition!


SSIS Framework Community Edition is free and open source. You may know you can use SSIS Framework Community Edition to execute a collection of SSIS packages using a call to a single stored procedure passing a single parameter. But did you know you can also use it to execute a collection of SSIS packages in Azure Data Factory SSIS Integration Runtime? You can!

In this free webinar, Andy discusses and demonstrates SSIS Framework Community Edition – on-premises and in the cloud.

Join SSIS author, BimlHero, consultant, trainer, and blogger Andy Leonard at noon EDT Thursday 20 Sep 2018 as he demonstrates using Biml to make an on-premises copy of an Azure SQL DB.

I hope to see you there!

Register today.


Introducing Azure Data Factory Design Patterns

I was honored to write an article titled Introducing Azure Data Factory Design Patterns featured in this month’s PASS Insights newsletter!


The article covers a couple execution patterns:

  1. Execute Child Pipeline
  2. Execute Child SSIS Package

I demonstrate a cool SSIS Catalog Browser feature that helps ADF developers configure the Execute SSIS Package activity.

To see it in action, download SSIS Catalog Browser – it’s one of the free utilities available at DILM Suite. Connect to the instance of Azure SQL DB that hosts an Azure Data Factory SSIS Integration Runtime Catalog, select the SSIS Package you desire to execute using the Execute SSIS Package activity, and then copy the Catalog Path from the Catalog Browser status message:

Paste that value into the Package Path property of the Execute SSIS Package activity:

You can rinse and repeat – Catalog Browser surfaces Environment paths as well:

Enjoy the article!

If you have any questions about Azure Data Factory – or need help getting started – please reach out!

Learn more:
Attend my full-day pre-conference session titled Intelligent Data Integration at the PASS Summit 2018 on 5 Nov 2018.
Check out this 1-day course on Fundamentals of Azure Data Factory delivered in cooperation with Brent Ozar Unlimited 10 Dec 2018!

AndyWeather Internet of Things (IoT) is a site I’ve maintained for about 10 years now. I use the site and related hardware, software, and services to test concepts and perform experiments.

I then apply my experience in delivering Internet of Things (IoT) solutions for Enterprise Data & Analytics customers and for SSIS and Biml training, such as my upcoming course titled Fundamentals of Azure Data Factory delivered in cooperation with Brent Ozar Unlimited.

It all started when GoDaddy created a DMZ for SQL Server databases. I found this functionality in 2008 and asked myself, “Self, how might we use this?”

Since That Time…

There have been two major iterations of AndyWeather. I use weather data collected during the first iteration for training purposes at SSIS Academy and when delivering training to Enterprise Data & Analytics customers.

AndyWeather v2

The setup of the second iteration is fairly straightforward:

  1. The Acurite Weather Station consists of an instrument pack plus a base station. The instruments collect weather measurements and transmit them to the base station.
  2. The base station is connected to an older e-Machine running Windows 7 Ultimate (32-bit) on 2GB RAM.
  3. An Acurite application interfaces with the base station and the application stores data locally in a single CSV file.
  4. I wrote a very simple C# console application named “abt” (an acronym for “Azure Blob Transfer”) to transfer the CSV file to Azure Blob Storage.
  5. An Azure Data Factory pipeline loads an Azure SQL DB staging table.
  6. The AndyWeather website reads the latest weather data from the Azure SQL DB staging table.
  7. I wrote another very simple C# application named “awt” (an acronym for “AndyWeather Tweets”) that tweets updates to the @AndyWeather twitter account.

Acurite Weather Station

The latest iteration began in early 2018 when I purchased an updated package of instruments and a new base station made by Acurite. So far, I like this station a lot. It was less expensive than the previous station and appears more rugged (again, so far – time will tell).

I recently relocated the weather station to improve connectivity between the instruments and the base station. I recorded a Data Driven *DataPoint* about it:

(Pay no attention to the exploding pecans in the background…)

The e-Machine

I intentionally use an under-powered PC for the server. Why? I want to learn how the base station – and then everything downstream of the base station – responds to busy server conditions. This is Engineering 101 stuff and I’ve learned a lot:

I love this old machine!

Acurite Application

The Acurite people maintain an application for communicating with base stations:


The PC Connect application allows me to configure how and when weather data is collected from the base station – which collects measurements from the instruments. The application lets me configure the units-of-measure and file location – and I can even share my weather data with Weather Underground. How cool is that?

The Azure Blob Transfer Console Application

The Azure Blob Transfer (abt) application is a very simple console application written in C#. It picks up the CSV file containing weather data stored by the Acurite PC Connect application and writes the file to an Azure Blob Storage container:


The CSV file in Azure Blob Storage is overwritten each time abt successfully executes. You can download a copy of the abt solution here.

Azure Data Factory Pipeline

An Azure Data Factory (ADF) pipeline first uses a Stored Procedure activity to call a stored procedure that truncates a staging table in an Azure SQL DB. A Copy Data activity then copies the weather data from the CSV file in Azure Blob Storage to that staging table:

At the time of this writing, ADF version 2 is current.

You can download the ARM template for the pipeline here.
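
The truncate-then-copy pattern described above can be sketched locally. This is only an illustration, not the pipeline itself: it assumes Python's built-in sqlite3 and csv modules as stand-ins for Azure SQL DB and the Copy Data activity, and the table name (stage_weather), the column names, and the sample CSV rows are all hypothetical because the real Acurite file layout is not shown here.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content; the real weather file layout is an assumption.
CSV_TEXT = (
    "timestamp,temperature_f,humidity_pct\n"
    "2018-09-15 12:00:00,78.4,61\n"
    "2018-09-15 12:05:00,78.9,60\n"
)


def reload_staging(conn, csv_text):
    """Empty the staging table, then copy every CSV row into it."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    with conn:  # commit both steps together (a choice made for this sketch)
        # SQLite has no TRUNCATE TABLE, so DELETE stands in for it here.
        conn.execute("DELETE FROM stage_weather")
        conn.executemany(
            "INSERT INTO stage_weather "
            "VALUES (:timestamp, :temperature_f, :humidity_pct)",
            rows,
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stage_weather "
    "(timestamp TEXT, temperature_f REAL, humidity_pct INTEGER)"
)

first = reload_staging(conn, CSV_TEXT)
second = reload_staging(conn, CSV_TEXT)  # a second run replaces rows, it does not append
row_count = conn.execute("SELECT COUNT(*) FROM stage_weather").fetchone()[0]
```

Because the staging table is emptied before each load, running the load twice leaves the same rows rather than duplicates, which matches a source file that is overwritten on every transfer.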

The AndyWeather Website

The AndyWeather website has been around since the days of the first iteration of AndyWeather – the one that stored data in a SQL Server instance hosted at GoDaddy’s DMZ. It’s fairly straightforward code, which helps it perform fairly well on desktops and mobile devices:

The biggest performance hit comes from executing the stored procedure against an Azure SQL DB, which can sometimes take 5-10 seconds to complete.

The AndyWeather Tweets Console Application

I snagged some C# code and a TwitterAPI class from a project named TweetSharp to help build the awt console application:

You can download a copy of the awt solution here.

The @AndyWeather Twitter Account

It makes me happy every time I see a tweet from @AndyWeather:

I tell people, “It’s just a dumb little app,” but I really had fun building it. I learned a bunch, too!


The AndyWeather IoT solution uses hybrid technology – on-premises instruments and servers, combined with cloud services – to deliver weather data to a website and Twitter account. It’s accessible from social media and the web from desktops and mobile devices.

Just so you know, this isn’t everything I’ve built using the AndyWeather instruments. There’s a bunch more – some of which is still in the experimental phase. I’ll share more as time permits. But I want you all to know, I consider Azure a great big cyber-playground!


Announcing the Fundamentals of Azure Data Factory Course!

I am excited to announce a brand new course (it still has that new course smell) from Brent Ozar Unlimited and honored to deliver it! This one-day, live, online course is titled Fundamentals of Azure Data Factory and it’s designed to introduce you to Azure Data Factory (ADF).

There will be demos.
Live demos.
Lots of live demos!


Azure Data Factory, or ADF, is an Azure PaaS (Platform-as-a-Service) that provides hybrid data integration at global scale. Use ADF to build fully managed ETL in the cloud – including SSIS. Join Andy Leonard – author, blogger, and Chief Data Engineer at Enterprise Data & Analytics – as he demonstrates practical Azure Data Factory use cases.

In this course, you’ll learn:

  • The essentials of Azure Data Factory (ADF)
  • Developing, testing, scheduling, monitoring, and managing ADF pipelines
  • Lifting and shifting SSIS to ADF SSIS Integration Runtime (Azure-SSIS)
  • ADF design patterns
  • Data Integration Lifecycle Management (DILM) for the cloud and hybrid data integration scenarios

To know if you’re ready for this class, look for “yes” answers to these questions:

  • Do you want to learn more about cloud data integration in Azure Data Factory?
  • Is your enterprise planning to migrate its data, databases, data warehouse(s), or some of them, to the cloud?
  • Do you currently use SSIS?

The next delivery is scheduled for 10 Dec 2018. Register today!

I hope to see you there.


T-SQL Tuesday #106: Regarding Triggers

In my experience, people either love or hate triggers in Transact-SQL. As a consultant, I get to see some interesting solutions involving triggers. In one case, I saw several thousand lines of T-SQL in a single trigger. My first thought was, “I bet there’s a different way to do that.” I was right. The people who’d written that trigger were unaware there was another way to solve the problem.

I see that a lot.

Software and database developers do their level best to deliver the assigned task using the tools available plus their experience with the tools. I don’t see malice in bad code; I see an opportunity to mature.

Granted, when I’m called in as a consultant I get paid by the hour. That’s not the case for many folks.

One ETL-Related Use For Triggers

Change detection is a sub-science of data engineering / data integration / ETL (Extract, Transform, and Load). One way to detect changes in a source is to track the last-updated datetime for the rows loaded by an ETL load process.

The next time the load process executes, your code grabs that value from the previous execution and executes a query that returns all rows inserted into or updated in the source since that time.
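
A minimal sketch of that incremental query, assuming Python's built-in sqlite3 module as a stand-in for SQL Server. The customer table, its columns, the sample rows, and the stored watermark value are all hypothetical and exist only for illustration.

```python
import sqlite3


def rows_changed_since(conn, watermark):
    """Return source rows inserted or updated after the previous load's watermark."""
    return conn.execute(
        "SELECT id, name, last_updated FROM customer "
        "WHERE last_updated > ? ORDER BY id",
        (watermark,),
    ).fetchall()


# Hypothetical source table; timestamps are ISO-style text so they sort correctly.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, last_updated TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [
        (1, "Alice", "2018-09-01 08:00:00"),
        (2, "Bob", "2018-09-10 09:30:00"),
        (3, "Carol", "2018-09-11 14:15:00"),
    ],
)

# Value saved by the previous execution of the load process (assumed).
previous_watermark = "2018-09-05 00:00:00"
changed = rows_changed_since(conn, previous_watermark)

# The highest last_updated value just loaded becomes the next execution's watermark.
next_watermark = max(row[2] for row in changed) if changed else previous_watermark
```

Only the rows touched after the saved value come back, and the sketch is only as reliable as the last_updated column itself.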

Simple, right? Not so fast.

Where to Manage Last-Updated

How the value of the last-updated datetime is managed is crucial. I’ve had arguments (er, discussions) with many data professionals about the best way to manage these values. My answer? Triggers.

“Why Triggers, Andy?”

I’m glad you asked. That is an excellent question!

Imagine the following scenario: Something unforeseen occurs with the data in the database. In our example, assume it’s something small – maybe impacting a single row of data. The most cost-effective manner to manage the update?

Manual update.

How is the last-updated value managed? It could be managed in a stored procedure. It could be managed in dynamic T-SQL hard-coded inside a class library for the application. If so – and given a data professional is now manually writing an UPDATE T-SQL statement to correct the issue – there is an opportunity to forget to update the last-updated column.

But if the last-updated column is managed by a trigger, it no longer matters whether the data professional performing the update remembers or forgets: the trigger will fire and set the last-updated column to the desired value.

Change detection works and all is right with the world.
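
The idea can be sketched with a small runnable example. It assumes SQLite (through Python's built-in sqlite3 module) rather than SQL Server, so the trigger syntax below is SQLite's and not T-SQL; the customer table, the trigger name, and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY,
        name TEXT,
        last_updated TEXT
    );

    -- Fires whenever an UPDATE sets name; the person writing the UPDATE
    -- never has to mention last_updated.
    CREATE TRIGGER customer_set_last_updated
    AFTER UPDATE OF name ON customer
    BEGIN
        UPDATE customer
        SET last_updated = strftime('%Y-%m-%d %H:%M:%f', 'now')
        WHERE id = NEW.id;
    END;
    """
)

# A row that was last loaded some time ago.
conn.execute("INSERT INTO customer VALUES (1, 'Alice', '2018-09-01 08:00:00.000')")

# The manual fix: the UPDATE statement "forgets" the last_updated column.
conn.execute("UPDATE customer SET name = 'Alicia' WHERE id = 1")

name, last_updated = conn.execute(
    "SELECT name, last_updated FROM customer WHERE id = 1"
).fetchone()
```

After the manual update, last_updated holds the current UTC time even though the statement never set it. In T-SQL the equivalent would typically be an AFTER INSERT, UPDATE trigger that joins the table to the inserted pseudo-table; that translation is not shown here.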

Troubleshooting Triggers

Troubleshooting triggers is… tough. There are some neat ways to troubleshoot built right into tools like SQL Server Management Studio (which is free). For example, in SSMS you can Debug T-SQL:


I like triggers for the things triggers do best, but triggers can definitely be misused. Choose wisely.