While managing a team of 40 ETL developers, I wanted to track lots of metrics. Some of the things I wanted to track were technical, like SSIS package execution times. Some metrics were people-centric.
Andy’s First Rule of Statistics states:
You can use statistics for anything about people, except people.
Andy – circa 2005
It was important to me to track how long it took the on-call person to identify the problem. I didn’t use the information to beat on-call people over the head. I used the information to measure the results of several experiments for displaying metadata about the failure.
Reports For The Win
You may be as shocked by this as I was; reports helped a lot more than I anticipated. Before I deployed the reports the Mean Time to Identify Failure was tracking just under 30 minutes. After deploying the reports, the mean time to identify failure fell to 5 minutes.
As I said, I was shocked. There were mitigating circumstances. The on-call team members were getting more familiar with the information SSIS produces when it logs an error. They were gaining experience, seeing similar errors more than once.
I accounted for growing familiarity by narrowing the time window I examined. The least-impressive metrics put the reduction at 18 minutes to 5 minutes.
Before I built and deployed the dashboard for SSIS Application Instances (like the one pictured at the top of this post), on-call people would query custom-logging tables we built to monitor enterprise data integration. The queries to return Application Instance log data were stored where everyone could reach them. In fact, I used the same queries as sources for this report.
A funny thing happened when I deployed the reports. Each week, one or more on-call people would ping me and tell me how much they liked the reports. Even though the data was the same, the presentation was different. A picture with a little color goes a long way.
The image at the beginning of this section – the SSIS Framework Task Instance Report – is displayed when a user clicks the Failed link shown in the initial screenshot. This design received he most comment by the on-call team members. The most common comment was, “I click the Failed link and it takes me to details about the failure.” The reports were passing The 2:00 AM Test.
SSIS Framework Applications
If you’ve read this far and wondered, “Andy, what’s an SSIS Application?” An SSIS Application is a construct I came up with to describe a collection of SSIS Packages configured to execute in a specific order. An application is a way to group SSIS packages for execution. You can get a sense of how our frameworks work – especially the application execution functionality – by checking out the SSIS Framework Community Edition at DILM Suite (DILM == Data Integration Lifecycle Management).
An Application Instance is an instance of execution of an SSIS Application. An Application Instance is made up of Package Instances. the relationship between applications and packages appears straightforward: an application is a collection of packages; parent-child; one-to-many. But it’s not quite that simple. Our SSIS Frameworks facilitate patterns that execute the same package multiple times, sometimes in parallel! We can also create packages that perform utility functions – such as ArchiveFile.dtsx – and call it from multiple applications. When you do the Boolean algebra, the relationship between applications and packages is many-to-many.
Our SSIS Frameworks are SSIS Catalog-integrated. They even work with the SSIS Integration Runtime that’s part of Azure Data Factory, Azure-SSIS.
While the Reporting Services dashboard was neat when it was released back in the day, the cool kids now play with Power BI. At DILM Suite you will also find a free – albeit basic – Power BI dashboard that surfaces many of the same metrics using even better reporting technology. The Basic SSIS Catalog Dashboard in Power BI is free at DILM Suite.
What is Intelligent Data Integration? SSIS (SQL Server Integration Services) packages developed using tried and true design patterns, built to participate in a DevOps enterprise practicing DILM (Data Integration Lifecycle Management), produced using Biml (Business Intelligence Markup Language) and executed using an SSIS Framework.
Attend a day of training focused on intelligent data integration delivered by an experienced SSIS consultant who has also led an enterprise team of several ETL developers during multiple projects that spanned 2.5 years (and delivered).
Attendees will learn:
a holistic approach to data integration design.
a methodology for enterprise data integration that spans development through operational support.
how automation changes everything. Including data integration with SSIS.
SSIS Design Patterns
Executing SSIS in the Enterprise
Custom SSIS Execution Frameworks
DevOps and SSIS
Biml, Biml Frameworks, and Tools
Prerequisites: Familiarity with SQL Server Integration Services (SSIS).
Continental Breakfast, Lunch and Afternoon Snacks are provided.
Shannon is a data engineer, data scientist, BimlHero (though not listed on the page at the time of this writing), and shepherd of the Biml Interrogator open source project. If you use Biml to generate SSIS projects that load flat files, you need Biml Interrogator.
Shannon, Kent Bradshaw, and I are also co-authoring a book tentatively titled Frameworks. (Confession: Kent and Shannon are mostly done… I am slacking…)
Shannon brings a metric ton of experience to serve our awesome clients. He has years of experience in data analytics, serving recently in the role of enterprise Lead Data Scientist. Shannon’s experience spans supply chain management, manufacturing, finance, and insurance.
In addition to his impressive data skills, Shannon is an accomplished .Net developer with enterprise senior developer experience (check out Biml Interrogator for a sample of his coding prowess).
Shannon is a regular speaker at SQL Saturday events, presenting on topics that include Business Intelligence, Biml, and data integration automation. He is a gifted engineer with experience in framework design, data integration patterns, and Azure who possesses a knack for automation. Shannon is an avid “tinkerer” who enjoys learning. He has experience implementing Azure Machine Learning and applying AI to predictive analytics using sources classified Big Data. Shannon is also a practitioner of data integration DevOps with SSIS. In other words, he fits right in with our team here at Enterprise Data & Analytics!
I am a data guy with a passion for partnering with clients to solve their database and technology issues. Over my career, I’ve played all the roles: database developer, administrator, business intelligence developer, and architect, and now consultant. I’m the guy you call in when you have the impossible problem and everyone tells you it cannot be solved. I automate solutions in order to free your current staff to do the higher value tasks. I bring solutions outside of traditional relational database solutions, in order to find the shortest path between you and your goals.
As an accomplished Microsoft SQL data professional, recognized BimlHero, and practicing Data Scientist, I’m the resource you need to extract the most value from your data.
I’m humbled and thankful and excited to watch Enterprise Data & Analytics continue to (quietly) grow – adding cool people (another announcement is forthcoming) and service offerings like Data Concierge. It’s very cool to watch!
Welcome Shannon! I’m honored to work with you, my brother and friend.
I occasionally (rarely) read reviews at Amazon of books I’ve written. If I learn of a complaint regarding the book I often try to help. If a reader experiences difficulty with demos, I often offer to help, and sometimes I meet with readers to work through some difficulty related to the book (as I offered here). About half the time, there’s a problem with the way the book explains an example or the sample code; the other half the time the reader does not understand what is written.
I own both cases. As a writer it’s my job to provide good examples that are as easy to understand as possible. Sometimes I fail.
In very rare instances, I feel the review misrepresents the contents of a book – enough to justify clarification. This review afforded one such opportunity. None of what I share below is new to regular readers of my blog. I chose to respond in a direct manner because I know and respect the author of the review, and believe he will receive my feedback in the manner in which it was intended – a manner (helpful feedback) very similar to the manner in which I believe his review was intended (also helpful feedback).
Thank you for sharing your thoughts in this review.
I maintain successful technology solutions are a combination of three factors:
The problem we are trying to solve;
The technology used to solve the problem; and
The developer attempting to use a technology to solve the problem.
I believe any technologist, given enough time and what my mother refers to as “gumption” (a bias for action combined with a will – and the stubbornness, er… tenacity – to succeed), can solve any problem with any technology.
I find your statement, “It doesn’t really teach BIML but rather teaches a specific optional framework for those who want to do everything with BIML rather than increase productivity for repetitive patterns” inaccurate. Did you read the entire book? I ask that question because I most often encounter readers who state the precise opposite of the opinion expressed in this review (some of your fellow reviewers express the opposite opinion here, even). I disagree with your statement, but I understand your opinion in light of other opinions expressed to me by readers during my past decade+ of writing several books. I share one example in this blog post. The short version: we often get more – or different – things from reading than we realize.
There is a section in The Biml Book – made up of 3 of the 20 chapters and appendices – that proposes a basic metadata-driven Biml framework which is the basis of a metadata-driven Biml framework in production in several large enterprises. I wrote those 3 chapters and the basic metadata-driven Biml framework in question. Varigence has also produced a metadata-driven Biml framework called BimlFlex – based upon similar principles – which has been deployed at myriad enterprises. The enterprise architects at these organizations have been able to improve code quality and increase productivity, all while decreasing the amount of time required to bring data-related solutions to market. They do not share your opinion, although they share at least some of the problems (you mentioned ETL) you are trying to solve.
Am I saying Biml – or the framework about which I wrote or even BimlFlex – is the end-all-be-all for every SSIS development effort? Goodness no! In fact, you will find disclaimers included in my writings on Biml and SSIS frameworks. I know because I wrote the disclaimers.
Misalignment on any of the three factors for successful technology solutions – the problem, the technology, and/or the developer – can lead to impedance mismatches in application and implementation. For example, Biml is not a good solution for some classes of ETL or data integration (points 1 and 2). And learning Biml takes about 40 hours of tenacious commitment (which goes to point 3). This is all coupled with a simple fact: data integration is hard. SSIS does a good job as a generic, provider-driven solution – but most find SSIS challenging to learn and non-intuitive (I did when I first started using it). Does that mean SSIS is not worth the effort to learn? Goodness, no! It does mean complaints about simplifying the learning process are to be expected and somewhat rhetorical.
Architects disagree. We have varying amounts of experience. We have different kinds of experience. Some architects have worked only as lone-wolf consultants or as members of single-person teams. Others have worked only as leaders of small teams of ETL developers. Rarer still are enterprise architects who are cross-disciplined and experienced as both lone-wolf consultants and managers of large teams in independent consulting firms and large enterprises. Less-experienced architects sometimes believe they have solved “all the things” when they have merely solved “one of the things.” Does this make them bad people? Goodness, no. It makes them less-experienced, though, and helps identify them as such. Did the “one thing” need solving? Goodness, yes. But enterprise architecture is part science and part art. Understanding the value of a solution one does not prefer – or the value of a solution that does not apply to the problem one is trying to solve – lands squarely in the art portion of the gig.
Regarding the science portion of the gig: engineers are qualified to qualitatively pronounce any solution is “over-engineered.” Non-engineers are not qualified to make any such determination, in my opinion.
By way of example: I recently returned from the PASS Summit. Although I did not attend the session, I know an SSIS architect delivered a session in which he made dismissing statements regarding any and all metadata-driven frameworks related to ETL with SSIS. If I didn’t attend his session, how do I know about the content of the presentation? A number of people in attendance approached me after the session to share their opinion that the architect, though he shared some useful information, does not appreciate the problems faced by most enterprise architects – especially enterprise architects who lead teams of ETL developers.
My advice to all enterprise architects and would-be enterprise architects is: Don’t be that guy.
Instead, be the enterprise architect who continues to examine a solution until underlying constraints and drivers and reasons the solution was designed the way it was designed are understood. Realize and recognize the difference between “That’s wrong,” “I disagree,” and “I prefer a different solution.” One size does not fit all.
Finally, some of the demos and Biml platforms were not working at the time you wrote this review. Many have been addressed since that time. In the future, updates to SSIS, Biml, and ETL best practices will invalidate what this team of authors wrote during the first half of 2017. As such, the statements, “And much of the online information is outdated and no longer works. It’s a shame someone hasn’t found a way to simplify the learning process.” is a tautology that defines progress. Learning is part of the job of a technologist. I encourage us all – myself most of all – to continue learning.
I wrote a book about practicing Data Integration Lifecycle Management with SSIS. In the book I explore tools I built to support DevOps with SSIS. Collectively, I call them the Data Integration Lifecycle Management Suite (DILM Suite). Many utilities at DILM Suite are free and some are open-source:
Software development is hard. It takes time, yes. But more than that, software development takes patience and thought and blood and sweat and love and tears.
My friends at Varigence recently released an update to their Business Intelligence Markup Language (Biml) products. If you’re into business intelligence or data science, integration, or engineering, you should check out Biml.
The release took longer than some would have liked. Varigence didn’t provide regular updates on progress. Some became… antsy.
I understand. Really, I do. As a BimlHero I get just a little more access behind the curtain compared to the average bear. Would I like to know more? Yep. Does it bother me when I don’t hear more? Nope. Why?
Software Development is Hard
I know how difficult it is to develop software because I decided back in the early 20-teens that I wanted to develop some software. (And I did it! Check out DILM Suite!) In the early 20-teens, I encountered… resistance… to the idea. Make no mistake, the resistance was well-founded and may ultimately prove to have been correct. But resistance didn’t do anything to curb my beliefs that:
Software should always participate in a lifecycle that is managed, preferably by a process akin to DevOps;
All software is tested. Some intentionally; and
SSIS development is software development.
The SSIS team at Microsoft has given us some incredible out-of-the-box functionality. I love the SSIS Catalog! It’s a great enterprise framework for managing data engineering execution, logging, and externalization (configuration). I believe that strongly-enough to have included similar statements in my last book: Data Integration Life Cycle Management with SSIS:
I can hear you thinking, “If you’re convinced the SSIS Catalog is so awesome, Andy, why did you build DILM Suite?” That’s a fair question. I actually answer this question in the book in chapter 6 titled Catalog Browser. You don’t have to buy the book to learn my answer; I published Chapter 6 here on this blog in a post with the obscure title, Why I Built DILM Suite, by Andy Leonard.
Was I Right?
I don’t know. Time will tell.
There have been thousands of downloads since I built DILM Suite. I view the number of downloads as indicative of interest. Does everyone who downloads a product – especially a free product – use that product? Goodness no. Does everyone who uses SSIS or the SSIS Catalog need to download DILM Suite components? Goodness no.
If you’re trying to practice lifecycle management (or DevOps) with SSIS, though, DILM Suite can help.
100 Dumb Little Things
Software development is a lot like being a parent in that it consists of getting 100 dumb little things right. Are the dumb little things important? Some are, some are not, and some are vital. Does anyone get all 100 dumb little things right in parenting? in software development? No and no.
At the end of the day, every day in fact, I am extremely proud of what I’ve built.
SSIS Catalog Compare is the first product I’ve ever attempted to develop. Perhaps that shows. My competition certainly thinks so and has made much hay out of this fact. Do I shy away from telling folks because my competition uses it against me?
Nope. At the end of the day, every day in fact, I am extremely proud of what I’ve built. I get regular feedback from customers sharing how much the product helps them manage SSIS in their enterprise. The feedback greatly overshadows the… statements… of the competition. (Sidebar: I sometimes wonder how my competition sleeps at night…)
Getting software right is all about getting everything right including the best wording for feedback and error messages (like that shown at the top of this post). Getting everything right is almost impossible, and certainly cost-prohibitive, but it should absolutely be the goal of any software development endeavor.
“Great Andy, But What If I Need To Load In Parallel Using A Framework?”
Enter the SSIS Design Pattern named the Controller Pattern. A Controller is an SSIS package that executes other SSIS packages. Controllers can be serial or parallel, or combinations of both. I’ve seen (great and talented) SSIS architects design frameworks around specific Controllers – building a Controller for each collection of loaders related to a subject area.
There’s nothing wrong with those solutions.
SSIS Framework Community Edition ships with a generic metadata-driven serial controller named Parent.dtsx which is found in the Framework SSIS solution.
Specific Controller Pattern Design
A specific Controller can appear as shown here (click to enlarge):
This controller achieves parallel execution. One answer to the question, “How do I execute packages in parallel?” is to build a specific controller like this one.
Simple and straightforward, uses out-of-the-box Execute Package Task
“All Executions” Catalog Report is… misleading…
“How is the All Executions Report misleading, Andy?”
I’m glad you asked. If you build and deploy a project such as SerialControllers SSIS project shown here – and then execute the SerialController.dtsx package – the All Executions reports only a single package execution: SerialController.dtsx (click to enlarge):
We see one and only one execution listed in the All Executions report. If we click on the link to view the Overview report we see each package listed individually:
The All Executions report accurately reflects an important aspect of the execution of the SerialController.dtsx SSIS package. The execution of this package – and the packages called by SerialController.dtsx – share the same Execution ID value. This is not necessarily a bad thing, but it is something of which to be aware.
Specific Controller Design in SSIS Framework Community Edition
A specific Controller built using SSIS Framework Community Edition can appear as shown here:
This controller uses Execute SQL Tasks instead of Execute Package Tasks. The T-SQL in the Execute SQL Tasks calls a stored procedure named custom.execute_catalog_package that is part of SSIS Framework Community Edition.
One answer to the question, “How do I execute packages in parallel using SSIS Framework Community Edition?” is to build a Controller.
The SSIS Catalog All Executions report is accurate
The All Executions Report is no longer misleading. If you build and deploy a project such as SerialControllersInFrameworkCE SSIS project shown here – and then execute the SerialControllerInFrameworkCE.dtsx package – the All Executions reports each package execution (click to enlarge):
We now see one execution listed in the All Executions report for each package. As before, All Executions accurately reflects an important aspect of the execution of the SerialControllerInFrameworkCE.dtsx SSIS package: The execution of the Controller and each Childpackage now have distinct Execution ID values.
When using specific Controllers with an SSIS Framework it’s common to create a single-package SSIS Application that simply starts the Controller, and then let the Controller package call the child packages. Parent.dtsx in SSIS Framework Community Edition is a generic metadata-driven Controller, but it doesn’t mind executing specific Controllers one bit!
Once Upon A Time…
Not too long ago, Kent Bradshaw and I endeavored to add automated parallel package execution to our Commercial and Enterprise SSIS Frameworks. We achieved our goal, but the solution added so much complexity to the Framework and its associated metadata that we opted to not market the implemented solution.
Why? Here are some reasons:
Starting SSIS packages in parallel is very easy to accomplish in the SSIS Catalog. The SYNCHRONIZED execution parameter is False by default. That means we could build a controller package similar to the SerialControllerInFrameworkCE.dtsx SSIS package – with precedence constraints between each Execute SQL Task, even – and the SSIS Catalog would start the packages in rapid succession. In some scenarios – such as the scenario discussed in this post (from which the current post was derived) – this then becomes a race condition engine.
A Race Condition Engine?
Yes. Because controlling only when packages start is not enough to effectively manage race conditions. To mitigate the race condition described in this post I need to make sure the dimension loaders complete before starting the fact loaders. A (much simplified) Controller for such a process could appear as shown here (click to enlarge):
I grabbed this screenshot after the dimension loader and some (vague) pre-operations process have completed in parallel but while the fact loader is still executing. Please note the combination of the Sequence Container and precedence constraint which ensure the fact loader doesnot start executing until the dimension loader execution is complete. The sequence container creates a “virtual step” whereby all tasks within the container must complete before the sequence container evaluates the precedence constraint. Since each task inside this container starts an SSIS package (and since the SYNCHRONIZED execution parameter is set to True by default in SSIS Framework Community Edition), nothing downstream of this container can begin executing until everything inside the container has completed executing. This is how we avoid the race condition scenario described earlier.
How does one automate this process in a framework?
It’s not simple.
The method Kent and I devised was to create and operate upon metadata used to define and configure a “virtual step.” In SSIS Framework Community Edition the Application Packages table is where we store the Execution Order attribute. We reasoned if two Application Package entities shared the same value for Execution Order, then the associated package(s) (I’m leaving out some complexity in the design here, but imagine executing the same package in parallel with itself…) compose a virtual step.
In a virtual step packages would start together, execute, and not proceed to the next virtual step – which could be another serial package execution or another collection of packages executing in parallel in yet another virtual step – until all packages in the current virtual step had completed execution. Here, again, I gloss over even more complexity regarding fault tolerance. Kent and I added metadata to configure whether a virtual step should fail if an individual package execution failed.
This was but one of our designs (we tried three). We learned managing execution dependency in a framework is not trivial. We opted instead to share the Controller pattern.
We Learned Something Else
While visiting a client who had deployed the Controller Pattern, we noticed something. The client used a plotter to print large copies of Controller control flows and post them on the walls outside his cubicle.
When we saw this we got it.
The tasks in the Controller’s control flow were well-named. They were, one could say, self-documenting. By posting updated versions of the Controller control flows whenever the design changed, the data engineer was informing his colleagues of changes to the process.
He didn’t need to explain what had changed. It was obvious to anyone stopping by his cubicle for a few minutes. Briliant!
In this post I wrote about some data integration theory. I also answered a question I regularly receive about performing parallel loads using SSIS Framework Community Edition. I finally covered some of the challenges of automating a solution to manage parallel execution of SSIS packages in a Framework.
Note: much of this material was shared earlier in this post. I broke the Controller Pattern part out in this post because the other post was really too long.
You might need an SSIS framework. “How can I tell if I need an SSIS framework, Andy?” I’m glad you asked.
Does your enterprise:
Execute lots of SQL Server Integration Services (SSIS) packages?
Execute SSIS packages several times per day?
Execute “SSIS in the Cloud” using the Azure Data Factory version 2 Integration Runtime (ADFv2 IR)?
Require configuration options not available in off-the-shelf solutions?
How an SSIS Framework Can Help
One SSIS best practice is to develop small, unit-of-work packages. You can think of them as data engineering functions. Design SSIS packages with the fewest number of Data Flow Tasks, optimally one.
There’s a saying in engineering (and life): “There’s no free lunch.” Applied to data engineering with SSIS, if you apply the best practice of creating small, unit-of-work packages, you end up with a bunch of SSIS packages. How do you manage executing all these packages? An SSIS framework.
An SSIS framework manages package execution, configuration, and logging.
SSIS Framework Community Edition
The SSIS Framework Community Edition is part of the DILM (Data Integration Lifecycle Management) Suite. SSIS Framework Community Edition groups the execution of several SSIS packages into SSIS Applications, which are a collection of SSIS packages configured to execute in a specific order.
“Can’t I just use Execute Package Tasks for that, Andy?”
Yes. And no. When deploying to the SSIS Catalog, the Execute Package Task can be used to execute any package as long as that package exists in the same project. What if you have a utility package – say a package that archives flat files after you’ve loaded them – that you want to store in a single SSIS Catalog folder and project but call from different processes (or SSIS applications)? SSIS Framework Community Edition can execute that package as part of an SSIS application.
SSIS Framework Community Edition is Catalog-Integrated
SSIS Framework Community Edition is integrated into the SSIS Catalog. When packages execute as part of an SSIS application, operational metadata and execution information is sent to the SSIS Catalog’s tables. You can view operational metrics and metadata using the catalog reports solution built into SQL Server Management Studio (SSMS)…
…or you could view SSIS execution logs and operational metadata using Catalog Reports – a free and open-source SQL Server Reporting Services (SSRS) solution from DILM Suite.
SSIS Framework Community Edition is Free. And Open-Source.
SSIS Framework Community Edition is free and open-source. In fact, the documentation walks you through building your own SSIS framework – it teaches you how you would design your own SSIS framework.
SSIS Framework Community Edition is Customize-able
Customization is one of the coolest features of open-source software. If you need some unique functionality, you have the source code and can code it up yourself!
If you don’t have time to code your own unique functionality, Enterprise Data & Analytics can help. It’s possible SSIS Framework Commercial or Enterprise Edition already has the functionality you seek. Compare editions to learn more.
SSIS Framework Community Edition is Cloud-Ready
I can hear you thinking, “Wait. It’s free. It’s open-source. And it runs in the cloud?” Yep, yep, and yep!
We Can Help
At Enterprise Data & Analytics, we’ve been building data integration frameworks for over 15 years. I wrote a book about Data Integration Lifecycle Management (DILM):
We built the DILM Suite– a collection of utilities and solutions, many of which are free (and some even open-source!):
Truth be told, I enjoy writing. I think my love of writing stems from a love of learning. I’m just trying to share the joy!
It’s a huge honor to write. I’m humbled and thrilled (all at the same time – a feeling I describe as “all giggly inside”) whenever I’m attending an event like the PASS Summit or a SQL Saturday and someone tells me they enjoy what I’ve written or learned something from something I’ve written.
Mostly I’ve co-authored books but a few I’ve written by myself. I owe all my knowledge to those who shared their knowledge with me. I am merely your scribe.
I’m going to be a little bold in this post and suggest if you are developing for SQL Server, the screenshot to the left and above shows something that is, well, wrong. I can hear you thinking,
“What’s Wrong With That Output, Andy?”
I’m glad you asked. I will answer with a question: What just happened? What code or command or process or… whatever… just “completed successfully”? Silly question, isn’t it? There’s no way to tell what just happened simply by looking at that output.
And that’s the point of this post:
You don’t know what just happened.
Sit Back and Let Grandpa Andy Tell You a Story
I was managing a large team of ETL developers and serving as the ETL Architect for a large enterprise developing enterprise data engineering solutions for two similar clients. Things were winding down and we were in an interesting state with one client – somewhere between User Acceptance Testing (UAT) and Production. I guess you could call that state PrUAT, but I digress…
The optics got… tricksy… with the client in PrUAT. Vendors were not receiving pay due to the state of our solution. The vendors (rightfully) complained. One of them called the news media and they showed up to report on the situation. Politicians became involved. To call the situation “messy” was accurate but did not convey the internal pressure on our teams to find and fix the issue – in addition to fixing all the other issues.
There were fires everywhere. In this case, one of the fires had caught fire.
Things. Were. Ugly.
My boss called and said, “Andy, can you fix this issue?” I replied, “Yes.” Why? Because it was my job to fix issues. Fixing issues and solving problems is still my job (it’s probably your job too…). I found and corrected the root cause in Dev. As ETL Architect, I exercised my authority to make a judgment call, promoted the code to Test, tested it, documented the test results, created a ticket, and packaged things up for deployment to PrUAT by the PrUAT DBAs.
Because this particular fire was on fire, I also followed up by calling Geoff, the PrUAT DBA I suspected would be assigned this ticket. Geoff was busy (this is important, don’t forget this part…) working on another fire-on-fire, and told me he couldn’t get to this right now.
But this had to be done. Right now.
I thanked Geoff and hung up the phone. I then made another judgment call and exercised yet more of my ETL Architect authority. I assigned the PrUAT ticket to myself, logged into PrUAT, executed the patch, copied the output of the execution to the Notes field of the ticket (as we’d trained all DBAs and Release Management people to do), and then manually verified the patch was, in fact, deployed to PrUAT.
I closed the ticket and called my boss. “Done. And verified,” I said. My boss replied, “Good,” and hung up. He passed the good news up the chain.
A funny thing happened the next morning. And by “funny,” I mean no-fun-at-all. My boss called and asked, “Andy? I thought you said the patch was was deployed to PrUAT.” I was a little stunned, grappling with the implications of the accusation. He continued, “The process failed again last night and vendor checks were – again – not cut.” I finally stammered, “Let me check on it and get back to you.”
I could ramble here. But let me cut to the chase. Remember Geoff was busy? He was working a corrupt PrUAT database issue. How do you think he solved it? Did you guess restore from backup? You are correct, if so. When did Geoff restore from backup? Sometime after I applied the patch. What happened to my patch code? It was overwritten by the restore.
I re-opened the ticket and assigned it to Geoff. Being less-busy now, Geoff executed the code, copied the output into the Notes field of the ticket (as we’d trained all DBAs and Release Management people to do), and then closed the ticket. The next night, the process executed successfully and the vendor checks were cut.
“How’d You Save Your Job, Andy?”
That is an excellent question because I should have been fired. I’m almost certain the possibility crossed the mind of my boss and his bosses. I know I would have fired me. The answer?
Output. Documented output, to be more precise.
You see, the output we’d trained all DBAs and Release Management people to copy and paste into the Notes field of the ticket before closing the ticket included enough information to verify that both Geoff and I had deployed code with similar output. It also contained date and time metadata about the deployment, which is why I was not canned.
Compare the screenshot at the top of this post to the one below (click to enlarge).
This T-SQL produces lots of output. That’s great. Sort of.
“There’s no free lunch” is a saying that conveys everything good thing (like lunch) costs something (“no free”). And that’s true – especially in software development. Software design is largely an exercise in balancing between mutually exclusive and competing requirements and demands.
If it was easy anyone could do it.
It’s not easy. It takes experienced developers years to develop (double entendre intended) the skills required to design software – and even more years of varied experience to build the skills required to be a good software architect.
The good news: the output is awesome. The bad news: the output is a lot of typing.
“So Why, Andy? Why Do All The Typing?”
That’s not DevOps. That’s wishful thinking.
You’ve probably heard of technical debt. This is the opposite of technical debt; this is a technical investment.
Technical investments are time and energy spent early (or earlier) in the software development lifecycle that produce technical dividends later in the software development lifecycle. (Time and energy invested earlier in the project lifecycle always costs less than investing later in the project lifecycle. I need to write more about this…) What are some examples of technical dividends? Well, not-firing-the-ETL-Architect-for-doing-his-job leaps to mind.
This isn’t the only technical dividend, though. Knowing that the code was deployed is important to the DevOps process. Instrumented code is verifiable code – whether the instrumentation supports deployment or execution. Consider the option: believing the code has been executed.
That’s not DevOps. That’s wishful thinking.
Measuring Technical Dividends
Measuring technical dividends directly is difficult but possible. It’s akin to asking the question, “How much downtime did we avoid by having good processes in place?” The answer to that question is hard to capture. You can get some sense of it by tracking the mean time to identify a fault, though – as measured by the difference between the time someone begins working the issue and the time when they identify the root cause.
Good instrumentation reduces mean time to identify a fault. Knowing is better than guessing or believing. The extra typing required to produce good output is worth it.
In this age of automation, good output may not require extra typing. Good output may simply require another investment – one of money traded for time. There are several good tools available from vendors that surface awesome reports regarding the state of enterprise software, databases, and data. DevOps tools are maturing and supporting enterprises willing to invest the time and energy required to implement them.
One such tool is SSIS Catalog Compare which generated the second screenshot. (Full disclosure: I built SSIS Catalog Compare.)
SSIS Catalog Compare generates scripts and ISPAC files from one SSIS Catalog that are ready to be deployed to another SSIS Catalog. Treeview controls display Catalog artifacts, surfacing everything related to an SSIS Catalog project without the need to right-click and open additional windows. (You can get this functionality free by downloading SSIS Catalog Browser. Did I mention it’s free?)
In addition, SSIS Catalog Compare compares the contents of two SSIS Catalogs – like QA and PrUAT, for example. Can one compare catalogs using other methods? Yes. None are as easy, fast, or complete as SSIS Catalog Compare.
For a limited time you can get SSIS Catalog Compare for 60% off. Click here, buy SSIS Catalog Compare – the Bundle, the GUI, or CatCompare (the command-line interface) – and enter “andysblog” without the double-quotes as the coupon code at checkout.
Whether you use a tool to generate scripts or not, it’s a good idea to make the technical investment of instrumenting your code – T-SQL or other. Good instrumentation saves time and money and allows enterprises to scale by freeing-up people to do more important work.
The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.