Achievement Unlocked: Certified in Implementing Real-Time Analytics with Azure HDInsight

I learned a lot completing Implementing Real-Time Analytics with Azure HDInsight. It’s the most fun Microsoft Virtual Academy / edX course I’ve taken to date!

In completing this course, I finished the requirements for the Microsoft Azure HDInsight Big Data Analyst program. I completed Implementing Predictive Analytics with Spark in Azure HDInsight and Processing Big Data with Hadoop in Azure HDInsight late last year.

What’s Next?

I started the Microsoft Professional Program in Data Science early last year, but I’m only a few courses into it. Anything worth starting is worth finishing.

I really got a lot out of the Big Data Analyst XSeries program. I’m eyeing the Microsoft Professional Program in Big Data. I’ve already completed some of the courses as part of the Data Science program and the Big Data XSeries program. Some that I haven’t completed look very interesting – like Processing Big Data with Azure Data Lake Analytics, Orchestrating Big Data with Azure Data Factory, and Analyzing Big Data with Microsoft R Server.

It’s time to update LinkedIn


Writing about Apache Storm Stirs Up a Storm

A storm is coming…

I’ve been spending some time learning other data integration and analytics open-source-y stuff. Perhaps you’ve noticed recent blog posts like:

If you read MSDN Magazine, you may have noticed a familiar name at the end of Frank La Vigne’s (blog | @Tableteer) new column, Artificially Intelligent – especially the October, November, December (2017), and January 2018 articles – I’ve been helping Frank with some technical editing for his column.

Maybe you noticed this and wondered, “Is Andy bailing on SSIS?” The answer to that question is:


Goodness no! If you only knew what I have on the drawing board and in the pipeline, you’d understand that me abandoning SSIS is nothing to be concerned about. (I don’t blame you for not knowing – I’ve not shared more about these endeavors… yet!)

I’ve repeatedly shared that I enjoy learning. I’ve advised readers of this blog and those who follow me on social media to continually educate themselves. I’ve equally warned folks: If you don’t like lifelong learning, technology is not for you and you should go into another field.

I like learning!

I’ve invested some time and money in more formal training via Microsoft Virtual Academy and edX (although you can take all the courses I’ve taken for free… you have to pay only if you want a certificate).

Learning a New Language

Frank and I chat often. In a recent chat I shared my excitement at discovering patterns and frameworks in Apache Storm and Spark – frameworks and patterns that are remarkably similar to frameworks and patterns I’ve developed for SSIS. Why was I excited? It’s validating to me when smart people – people smart enough to build open source technology platforms like Spark and Storm – include the same functionality I’ve built for SSIS in their platforms and solutions. I’ve debated smart SSIS developers on occasion who believe frameworks in general, or specific frameworks, are not necessary. I’m not good at these debates because I can rarely remember the reason something is a good or bad idea; I only remember that the idea is good or bad. I think my brain works this way because I’m an engineer and more interested in solving problems for customers than winning an argument… but I digress.

Frank shared an anecdote from his days of learning Deutsch. Paraphrasing:

I had a hard time remembering when to use ‘who’ and when to use ‘whom’ in English grammar until I learned German. Now I get it.

I have developed patterns and utilities (such as DILM Suite) to address architectural concepts I identified as unclear or less-clear or even missing from SSIS. While it’s validating (to me) to find this same functionality built into open source platforms, the biggest thrill remains learning new stuff. As Frank and I discuss in an upcoming Data Driven show, I bring a bunch of context when learning about open source data integration solutions – and Frank brings a bunch of context when learning about the software architecture that underlies Data Science.

Context Matters

Why is context so important? Context is what we mean most of the time when we use the word “experience.”

Just as Frank’s German lessons helped him with English grammar, my open source data integration training is helping me articulate arguments for frameworks and patterns – in SSIS, even. I’ve gained additional depth in my career as a result of cross-training in other data integration platforms. I’m better at SSIS because I’ve learned Spark and Storm. Bonus: I’m getting lots of experience in the Azure Data Engineering platforms.

But – please trust me – I’m nowhere near leaving SSIS! In fact, I have a more lucid understanding of where SSIS excels when compared to other data integration platforms.

You’re still stuck with me writing about SSIS. I hope that’s ok.



Apache Storm Stream Processing in Azure HDInsight

I’m learning more about Apache Storm, which is used for stream processing for near-real-time analytics in Azure HDInsight. I grabbed the image above from an AWS presentation on SlideShare (link).

I got all giggly inside as I learned about Storm architecture. Why? Because I built much of this same functionality as part of the DILM Suite, and we implement many of these same SSIS Design Patterns (SSIS Framework Editions, Controllers, Application Restartability, etc.) when delivering data integration awesomeness to our Enterprise Data & Analytics consulting customers.

It’s kinda cool to see similar architecture, patterns, and functionality in a completely separate toolset and environment. I find it… validating.

It just goes to show you, architecture is architecture, regardless of the platform or tools.


Achievement Unlocked: Certified in Implementing Predictive Analytics with Spark in Azure HDInsight

I learned a lot taking the course Implementing Predictive Analytics with Spark in Azure HDInsight!

I’m slowly working my way through the Data Science Certification offered by Microsoft Virtual Academy along with edX. This course is not part of the official curriculum, but I’ve been wanting to learn more about Spark so I took it.

Many of the courses at Microsoft Virtual Academy are free or have free options (like auditing the class). Check it out!


Presenting A Day of Intelligent Data Integration in NYC 18 May!

I’m honored to present A Day of Intelligent Data Integration – a SQL Saturday NYC precon – 18 May!

What is Intelligent Data Integration? SSIS packages developed using tried and true design patterns, built
to participate in a DevOps enterprise practicing DILM, produced using Biml and executed using an SSIS

Attend a day of training focused on intelligent data integration delivered by an experienced SSIS consultant who has also led an enterprise team of several ETL developers during multiple projects that spanned 2.5 years. And delivered.

Attendees will learn:

– a holistic approach to data integration design.
– a methodology for enterprise data integration that spans development through operational support.
– how automation changes everything. Including data integration with SSIS.

Topics include:
1. SSIS Design Patterns
2. Executing SSIS in the Enterprise
3. Custom SSIS Execution Frameworks
4. DevOps and SSIS
5. Biml, Biml Frameworks, and Tools

Register today!