Apache Storm Stream Processing in Azure HDInsight

I’m learning more about Apache Storm, which is used for stream processing for near-real-time analytics in Azure HDInsight. I grabbed the image above from an AWS presentation on SlideShare (link). I got all giggly inside as I learned about Storm architecture. Why? Because I built much of this same functionality as part of the DILM …

Expert SSIS Training: SSIS Design Patterns for Performance

I focus on three topics in the Expert SSIS course I deliver live, online in cooperation with Brent Ozar Unlimited: SSIS Design Patterns for Performance; SSIS Deployment, Configuration, Execution, and Monitoring; and Automation. What do I cover in SSIS Design Patterns for Performance? I cover Data Flow Internals, chat about throughput – the OBM (One Big Metric) …

Why I Built DILM Suite, by Andy Leonard

The following is Chapter 6: Catalog Browser from my latest book titled Data Integration Life Cycle Management with SSIS: A Short Introduction by Example: I was honored to be a Microsoft SQL Server MVP for five years (2007-2012). One cool thing about being a Microsoft MVP was access to the internal developer teams. Everyone could …

ETL Instrumentation: Logging SSIS Variable Values

During the December 2017 delivery of Expert SSIS, I was asked if there is an SSIS Catalog Logging Mode that will display the value of variables. I responded that I wasn’t aware of a logging level that accomplishes this, but then – as sometimes happens – I could not let it go. It is an excellent …

Coming Soon to SSIS Catalog Compare: Values Everywhere

I’ve been testing a new feature in SSIS Catalog Compare’s catalog browser. I call it “values everywhere.” What do I mean by values everywhere? In the browser shown above, please note the reference mapping of the ConnectionString property for the package connection manager in SimplePackage.dtsx. Each Reference is listed as a child node of the reference …

Website Ch-ch-changes…

Inspired by Frank La Vigne’s (blog | @Tableteer) snappy update to the Data Driven home page, I decided to spruce up a couple / three of my websites including this one, andyleonard.blog. I added the sidebar shown on the right of the site. It contains a more-readily-available search box and links to other websites I maintain – …

SSIS and Visual Studio Configurations

I got a great question from a student in the December delivery of Expert SSIS. The student asked, “Why wouldn’t you use Environments in Visual Studio (Dev, Test, and Prod), and deploy accordingly the mapped project parameters and package parameters?” I’ve looked into using SQL Server Data Tools (SSDT, or Visual Studio) configurations in the …

On Data Frameworks…

You may not realize this, but Apache Spark is a framework. Spark is a cluster-computing engine that manages parallel executions extremely well. Spark enables other technologies including Java, Scala, Python, R, and graph processing. Spark stitches together previously disparate functionality into a cohesive, syntactically similar set of commands. Spark’s architecture is library-driven and includes the following libraries: Spark SQL …