Q: “Is It Possible to Get Software Development Right the First Time?”

photo credit: https://www.nasa.gov/centers/kennedy/shuttleoperations/orbiters/discovery-info.html

A: “Almost.”

Q: “Why Almost?”

I’m glad you asked. There’s an awesome, albeit somewhat dated, article titled They Write the Right Stuff. The article is about implementing a process to manage software development. In the article, the following metrics are mentioned:

  • 3 Previous Builds:
    • 420,000 lines of code
    • 1 error each
  • 11 Previous Builds:
    • 17 total errors

Let’s break down those numbers.

Analyzing the 3 Previous Builds

Each of the three previous builds experienced 1 error and contained 420,000 lines of code. The math says the errors per lines of code is roughly 0.00000238 or an error rate of 0.000238%. In engineering, we call that close.

Analyzing the Last 11 Builds

If we apply the same math to the previous 11 builds, the math [17 / (11*420,000)] says errors per lines of code is roughly 0.00000368 or an error rate of 0.000368%. Higher, yes. But still close.
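Both error rates above can be reproduced in a few lines of Python. This is a minimal sketch: the function name errors_per_line is made up for illustration, and it assumes (as the calculation above does) that every build contained the same 420,000 lines of code.

```python
LINES_PER_BUILD = 420_000  # lines of code per build, as quoted from the article


def errors_per_line(total_errors: int, builds: int,
                    lines_per_build: int = LINES_PER_BUILD) -> float:
    """Return errors per line of code across a number of builds."""
    return total_errors / (builds * lines_per_build)


# Last 3 builds: 1 error each, so 3 errors across 3 builds.
last_three = errors_per_line(total_errors=3, builds=3)
# Last 11 builds: 17 errors in total.
last_eleven = errors_per_line(total_errors=17, builds=11)

print(f"3 builds:  {last_three:.8f} errors per line ({last_three:.6%})")
print(f"11 builds: {last_eleven:.8f} errors per line ({last_eleven:.6%})")
```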

OBM (One Big Metric)

I’m a fan of identifying OBM, or One Big Metric, for comparisons. My favorite OBM story is about whether one should use premium or regular gasoline in a vehicle. Some measure miles-per-gallon, or mpg. I care more about cost-per-mile, or cpm. By measuring, I found premium gasoline had a lower cpm than regular gasoline for my old Jeep pickup (note: there’s more to the calculation than I mention here. If you’d like to correct me, feel free to do so in the Comments section. Warning: You are arguing with an engineer…).
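In its simplest form, cost-per-mile is the price of a gallon divided by miles-per-gallon. Here is a minimal sketch of that comparison. The function name is hypothetical, and the prices and mileage figures are made-up placeholders, not measurements from the Jeep; substitute your own numbers.

```python
def cost_per_mile(price_per_gallon: float, miles_per_gallon: float) -> float:
    """Fuel cost of driving one mile, considering only fuel price and mileage."""
    if miles_per_gallon <= 0:
        raise ValueError("miles_per_gallon must be positive")
    return price_per_gallon / miles_per_gallon


# Hypothetical numbers, for illustration only.
regular = cost_per_mile(price_per_gallon=3.00, miles_per_gallon=15.0)
premium = cost_per_mile(price_per_gallon=3.30, miles_per_gallon=17.0)

cheaper = "premium" if premium < regular else "regular"
print(f"regular: ${regular:.3f}/mile, premium: ${premium:.3f}/mile")
print(f"{cheaper} has the lower cost-per-mile with these inputs")
```

The point of the sketch is that a higher price per gallon can still win on cpm if the mileage gain is large enough.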

Picking OBM is fraught with danger. In fact, danger surrounds the entire process of statistical analysis. You can find a cool summary of data mistakes here.

As in my Jeep pickup example, I usually choose an economic OBM.

OBM for Almost-Perfect Software Development

In the article, near the bottom, they mention the budget: $35,000,000 per year. 

$35,000,000 / 420,000 lines of code == $83.33 per line of code

Math Homework

According to my OBM, how much does your software cost? To find out, divide your annual software development budget by the lines of code currently supported by your team.

If your number is close to $83 per line of code, you can expect an error rate that’s a small fraction of 1%.

If your number is lower than $83 per line of code (and I’m going to guess your number is way lower than $83 per line of code), then you should expect a higher error rate.
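The homework calculation is a single division. Here is a minimal sketch using the budget and line count quoted above; the function name cost_per_line is made up for illustration.

```python
def cost_per_line(annual_budget: float, lines_of_code: int) -> float:
    """Annual software development budget divided by supported lines of code."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return annual_budget / lines_of_code


# Figures quoted from the article: $35,000,000 per year, 420,000 lines of code.
shuttle = cost_per_line(annual_budget=35_000_000, lines_of_code=420_000)
print(f"${shuttle:.2f} per line of code")
```

Swap in your own annual budget and line count to get your team's number.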

Conclusion

Q: Is it possible to reduce the error rate of software development? 
A: Yes. But it’s expensive.

Fundamentals of Azure Data Factory


This past week I delivered Fundamentals of Azure Data Factory twice: once for Brent Ozar Unlimited and once for a client as a private training delivery.

The course introduces Azure Data Factory and covers:

  • Azure Data Factory Essentials – including provisioning an ADF instance
  • Azure Data Factory Pipelines – in which I introduce pipeline activities and source control
  • Azure Data Factory SSIS Integration Runtime (or Azure-SSIS) – how it works plus lifting and shifting SSIS from your enterprise to Azure-SSIS
  • Azure Data Factory Execution Design Patterns
  • Azure Data Factory and Data Integration Lifecycle Management – beyond source control

The course is designed to get someone up and running on ADF.

Does this training course sound interesting? Ping me and let me know how I may serve you!

:{>
“My name is Andy and I consult.”

SSIS, Docker, and Windows Containers, Part 0 – Getting Started

To begin tinkering with SSIS in containers, you first need to install Docker. There are some prerequisites. I will not exhaust the prerequisites here. I strongly suggest you familiarize yourself with the requirements for Docker prior to attempting to install the software.

Since I use a PC (a Lenovo P51S) running Windows 10 Pro, I chose to use Docker for Windows.

I Needed a Hypervisor

I chose to run Docker for Windows with Hyper-V:

One reason I upgraded to Windows 10 was to work with containers. I read posts and articles that stated I could use Docker for Windows with VirtualBox, and I had been using VirtualBox for a long time. When I started using VirtualBox, it was the only hypervisor that was:

  1. Free; and
  2. Supported 64-bit guests.

I decided to switch to Hyper-V, though, and rebuilt my demo and development virtual machines in Hyper-V.

IWOMM; YMMV (It works on my machine; your mileage may vary…)  😉

Choose a Container OS Platform

Once Docker is installed you need to decide whether to work with Linux or Windows containers, but you can switch anytime:

One nice thing about switching is it’s fairly painless, as one may glean from the message that displays when I click “Switch to Linux containers…”:

Docker for Windows with Windows Containers

The cool kids are using Linux containers, especially the cool SQL Server kids. I’ve tinkered a little with SQL Server on Linux. I’m interested in SSIS, though, so I’ve been focusing on running Windows containers in Docker for Windows.

Getting Started

Containers are like lightweight virtual machines. They typically consume less disk space. Containers spin up relatively quickly. A developer can configure a container, persist it in some state, and then turn it off. It’s faster and lighter than working with a VM. There are other benefits that we will get to later in this series.

After installing Docker, your next step is pulling an image.

Pulling an Image

I can hear you thinking, “Umm, Andy… what’s an image?” I’m glad you asked. From the Docker Glossary:

An Image is an ordered collection of root filesystem changes and the corresponding execution parameters for use within a container runtime. An image typically contains a union of layered filesystems stacked on top of each other. An image does not have state and it never changes.

I can now hear you thinking, “Umm… what?” I think of an image as a pre-configured container. What’s configured in the pre-configuration? Well, the operating system and other software I may want to use.

For example, the hello-world image is relatively small and will test your installation of Docker. In the animated gif below, I show how I:

  • Search for images named “hello-world”
  • Locate the name of an image labeled “Official” and named “hello-world”
  • Pull the “hello-world” image (sped up – my internet is not that fast out here in Farmville…)
  • Re-pull the “hello-world” image to show how things look when the image is up to date.
  • Run the image

(click to open, enlarged, in a new tab)

As stated earlier, the hello-world image is a test.

Searching, Pulling, and Starting a Windows Container with SQL Server Installed

To search for, pull, and start a Windows container with SQL Server installed and configured, execute the following Docker commands:

  1. docker search mssql
  2. docker pull microsoft/mssql-server-windows-developer
  3. docker run -d -p 1433:1433 -e sa_password=$up3r$3cr3t -e ACCEPT_EULA=Y --name mySqlServerContainer microsoft/mssql-server-windows-developer

If all goes well, you will see the container id – a hexadecimal number such as:

2d06a9a756b7b75bb0e388173bdd6925ba712e8843848610b5f5276a69f4bf19

Command Line Switches

#3 above has a lot of switches on that command line. Here’s what they mean:

  • -d == detach, which tells docker to run the container in the background and print the container id
  • -p == publish list, which publishes a container’s port(s) to the host
    • I use -p to map the container’s port 1433 (SQL Server uses port 1433) to my laptop’s port 1433
  • -e == env list, which sets a container’s environment variable(s) to a value (or values).
    • I use -e to set the sa_password and ACCEPT_EULA environment variables
  • --name == assigns a name to the container
  • The final argument is the name of the image
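If you would rather script container start-up than type the command, the same switches can be assembled programmatically. The sketch below builds the argument list in Python. The helper name build_docker_run_args is hypothetical. Passing the arguments as a list, with no shell involved, is my own choice here so that characters such as $ in the password are handed to Docker untouched.

```python
from typing import List


def build_docker_run_args(name: str, image: str, sa_password: str,
                          port: int = 1433) -> List[str]:
    """Assemble the 'docker run' command described above as an argument list."""
    return [
        "docker", "run",
        "-d",                                 # detach: run in the background, print the container id
        "-p", f"{port}:{port}",               # publish the container's port to the same host port
        "-e", f"sa_password={sa_password}",   # environment variable: the sa password
        "-e", "ACCEPT_EULA=Y",                # environment variable: accept the EULA
        "--name", name,                       # assign a name to the container
        image,                                # final argument: the name of the image
    ]


args = build_docker_run_args(
    name="mySqlServerContainer",
    image="microsoft/mssql-server-windows-developer",
    sa_password="$up3r$3cr3t",
)
print(" ".join(args))

# To actually start the container (requires Docker running Windows containers):
# import subprocess
# result = subprocess.run(args, capture_output=True, text=True, check=True)
# print(result.stdout.strip())  # the container id
```

The call that starts the container is left commented out so the sketch can be run safely on a machine without Docker.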

By the way, you can get help on any command in docker by typing docker <command> --help. To complete the list above, I typed:

docker run --help

Interacting with Your Newly-Started Container

If you saw that long hex number, your SQL Server Windows container started. It’s a brand new container – it still has that new container smell.

There are a few ways to interact with this container. Let’s look at one, PowerShell.

Connect to PowerShell in mySqlServerContainer using the following command:

docker exec -it mySqlServerContainer powershell

Once connected, you can use PowerShell to execute all sorts of commands, such as:

dir
(or ls):

Cool? I think so too.

Conclusion

I’m going to stop here because this post is long enough and this is enough to get you started using SQL Server in a Windows container. That was my goal today.

Enjoy!

Packaging SSIS Catalog Deployments

I love the SSIS Catalog. It’s an elegant piece of data integration engineering and I cannot say enough positive things about it. Packaging SSIS Catalog deployments can be tricky, though.

The SSIS Catalog is a framework. Frameworks manage execution, configuration, and logging; and the SSIS Catalog handles each task with grace. Like I said, I love it!

But…

(You knew there was a “but…” coming, didn’t you?)

A Tale of Two Audiences

There are two audiences for the SSIS Catalog, two groups of consumers:

  1. Administrators
  2. Developers
  3. Release Managers

I listed three. Because…

Administrators

(click to enlarge)

SSIS is often administered by database administrators, or DBAs. I admire DBAs. It’s often a thankless job – more like a collection of jobs all rolled into one (and subsequently paid as if it’s one job…).

I believe the SSIS Catalog interface presented to DBAs in SQL Server Management Studio is sufficient.

My complaint is the SSIS administrator has to expand a handful of nodes in Object Explorer and then right-click to open the SSIS project configuration window and then double-click each referenced SSIS Catalog environment to determine which value is configured for use when an SSIS package is executed.

Click the screenshot above to see what I mean. Configuring SSIS Catalog deployments in SSMS is challenging. I find it… clunky. Once I understood all the windows, what they meant and how to configure an SSIS package and project deployed to the SSIS Catalog, this made sense. But – in my opinion – this interface works against comprehension.

Does this interface work, though? It certainly does. When I teach people how to use the SSIS Catalog, I show them how to use the Object Explorer interface provided in SSMS.

(click to enlarge)

I don’t stop there, however, because I built one solution to the problem. I call my solution SSIS Catalog Browser. If you click to enlarge this image you will note I am viewing the configuration of the same parameter displayed in the SSMS image above. I find this interface cleaner.

Do administrators still need to understand how to configure SSIS Catalog deployments and SSIS packages and projects deployed to the SSIS Catalog? You bet. There is no substitute for understanding. SSIS Catalog Browser surfaces the same metadata displayed in the SSMS Object Explorer. The only difference is Catalog Browser is easier to navigate – in my opinion.

Best of all, Catalog Browser is free.

Developers and Release Managers

SSIS developers and release managers (DevOps release teams) need more functionality. As I wrote in DILM Tiers for the SSIS Enterprise, an enterprise should have a minimum of four Data Integration Lifecycle Management (DILM) tiers to manage enterprise data integration with SSIS. Those tiers need to be:

  1. Development – an environment where SSIS developers build SSIS packages and projects. SSIS developers need permission / rights / roles to utterly destroy the database instances in Dev. If the SSIS developers lack this ability, you have “an environment named Development but not a Development environment.” There is a difference.
  2. Test or Integration – an environment where SSIS developers have permission / rights / roles to deploy, configure, execute, and view logs related to SSIS packages and projects.
  3. UAT or QA (User Acceptance Testing or Quality Assurance or any environment other than Production, Test, or Development) – an environment that mimics Production in security, permission / rights / roles. Developers may (or may not) have read access to logs, source, and destination data. SSIS administrators (DBAs or DevOps / Release teams) own this environment. The person performing the deployment to Production should perform the deployment to UAT / QA / Whatever because I do not want the Production deployment to be the very first time this person deploys and configures this SSIS project.
  4. Production.

I architect data integration environments in this manner to support DILM (Data Integration Lifecycle Management) with SSIS, as I wrote in Data Integration Lifecycle Management with SSIS.

Viewing the contents of an SSIS Catalog is not enough functionality to manage releases. Why, then, do I include developers? Because…

SSIS developers create the initial SSIS Catalog deployments in the DILM DevOps cycle.

I cannot overemphasize this point. Developers need an environment where they are free to fail to build SSIS. They aren’t free to succeed, in fact, unless and until they are free to fail.

Have you ever heard a developer state, “It works on my machine.”? Do you know why it works on their machine? Defaults. They coded it up using default values. The defaults have to work or the developer will not pass the code along to the next tier.

How does an SSIS developer know they’ve forgotten to parameterize values?
How do they figure this out?

It’s impossible to test for missing parameters in the Development environment.

The answer is: SSIS developers must deploy the SSIS project to another environment – an environment separate and distinct from the Development environment – to test for missing parameterization.

To review: SSIS developers need a Development environment (not merely an environment named Dev) and they need a different environment to which they can deploy, configure, execute, and monitor logs.

Error Elimination

Having the SSIS developers build and script the SSIS Catalog deployments eliminates 80% of deployment configuration errors (according to Pareto…).

Having SSIS administrators practice deployment to UAT / QA / Whatever eliminates 80% of the remaining errors.

Math tells me an enterprise practicing DILM in this manner will be left with roughly 4% of its original deployment configuration errors (20% of the remaining 20%). (Want to knock that down to 0.8%? Add another tier.)
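The arithmetic behind those percentages is repeated multiplication by 20%. Here is a minimal sketch, assuming (as the Pareto reasoning above does) that each practice step removes 80% of whatever errors remain. The function name is made up for illustration.

```python
ELIMINATED_PER_STEP = 0.80  # assumption from the text: each step removes 80% of remaining errors


def remaining_error_fraction(steps: int,
                             eliminated_per_step: float = ELIMINATED_PER_STEP) -> float:
    """Fraction of the original deployment errors left after a number of steps."""
    return (1.0 - eliminated_per_step) ** steps


for steps in (1, 2, 3):
    fraction = remaining_error_fraction(steps)
    print(f"{steps} step(s): {fraction:.1%} of the original errors remain")
```

One step is developers scripting the deployment, two adds the UAT / QA practice deployment, and three adds another tier.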

Packaging SSIS Deployment

I will not go into functionality missing from the SSMS Object Explorer Integration Services Catalogs node (nor the underlying .NET Framework assemblies). I will simply state that some functionality that I believe should be there is not there.

(click to enlarge)

I don’t stop there, however, because I built one solution to the problem. I call my solution SSIS Catalog Compare. If you click the image to enlarge it, you will see a treeview that surfaces SSIS Catalog artifacts in the same way as SSIS Catalog Browser (they share a codebase). You will also see the results of a comparison operation, and the user initiating the packaging of an SSIS folder deployment by scripting the folder and its contents.

The result is a file system folder that contains T-SQL script files and ISPAC files for each SSIS Catalog configuration artifact:

  • Folder
  • SSIS Project
  • Environment
  • Reference
  • Configured Literals

You can use SSIS Catalog Compare to lift and shift SSIS Catalog objects from any environment to any other environment – or from any DILM tier to any other DILM tier – provided you have proper access to said environments and tiers.

This includes lifting and shifting SSIS to the Azure Data Factory SSIS Integration Runtime, also known as Azure-SSIS.

Zip up the contents of this file system folder, attach it to a ticket, and let your enterprise DevOps process work for data integration.

Conclusion

The SSIS Catalog is a framework, and a pretty elegant framework at that. Some pieces are clunky and other pieces are missing.

DILM Suite helps.

PASS Summit 2018 Session Feedback

I was honored to participate in three presentations at the PASS Summit 2018. I delivered a full-day pre-conference session titled Intelligent Data Integration and another session titled Faster SSIS. I also participated on a panel titled BI & Data Visualization.

A great big THANK YOU to everyone who attended my presentations!

As I’ve shared (numerous times) in the past, I share my ratings not to boast but to let less experienced and would-be presenters peek behind the curtain.

The ratings were on a scale of 1 (bad) to 5 (good).

Intelligent Data Integration Precon

I had 107 attendees and received feedback from 33 attendees.

Ratings

Rate the value of the session content: 4.45
How useful and relevant is the session content to your job/career? 4.24
How well did the session’s track, audience, title, abstract, and level align with what was presented? 4.42
Rate the speaker’s knowledge of the subject matter: 4.79
Rate the overall presentation and delivery of the session content: 4.45
Rate the balance of educational content versus that of sales, marketing, and promotional subject matter: 4.55

These are ok numbers. Not as good as Brent’s or Cathrine’s, but ok. I’m ok with these numbers but I want to do a better job connecting with the audience.

Selected Comments About the Presentation

Most of the comments were positive. I really appreciate the positive comments. I included all the non-positive comments below. Some of the negative comments were also instructional. Others… need instruction.

“I learned tips and tricks that justified the cost of the conference.”
“Andy was an excellent speaker. Working through demos was integral in giving me exactly what I need to implement a solution.”
“Great knowledge and very humble in the way he talks.”
“Andy is awesome, his ego does not get in the way during the presentation, overall great.”
“Andy is a wonderful speaker and I was able to easily understand the topics covered and feel confident that I can use the lessons learned in my profession.”
“Excellent delivery of material. Always delivers humor along with the content. Picked up tips that would have taken a very long time to learn on my own. Would love to see another pre-conference session on ETL by Andy (whether SSIS or ADF).”

“We spent way too much time in Biml. ”
“Andy knows his subject but while the Biml intro was interesting the examples of SSIS patterns were too simplistic, too ‘Hello world!’. Should have had at least one example were data source and destination were on different servers.”
“Continual reference speaker made to how session content just ‘hobbying’/not production conscious, alluding to the ‘real’ stuff being something other than what was being presented. ‘I have a stand in the exhibitors arena’!!! The serious stuff!!! Cheapened the experience. Although pre-con by no means cheap! This inflection, re-iterated by the speaker was disappointing. Felt a superiority complex existed!
Unsettling.”

My Thoughts

I didn’t get much feedback on the Biml portion of the precon. Of the two comments that mention Biml, one was negative and another positive. I’m not sure if that’s because I covered Biml during the first part of the precon (and evaluations were filled out at the end of the precon) or if I did a worse job on Biml than the other topics. To me, it looks like a wash.

The last comment is… I don’t know. I think someone was having a bad day or I wasn’t their cup of tea or I just rubbed them the wrong way. I actually took 60 seconds to explain that PASS exhorts presenters to steer clear of sales-y talk in presentations. In the past, I’ve been vocal in my disagreement with PASS about these policies. While I believe I am right, I signed a contract to abide by the rules.

That said, the rules permit sponsors and exhibitors to share information about their for-pay products during presentations. In the 60 seconds, I shared my belief that someone – either the attendee or their company – had paid good money for the attendee to attend this precon. But I did not want to spend any of our time together selling them anything. I shared some of the free tools available from DILM Suite. I mentioned I sell other tools and solutions and that I was an exhibitor. If they wanted to learn more about products I sell, they could drop by the booth or email me.

A bunch of attendees did just that! Nick and I had some great conversations with people at our booth. I’m pretty sure we will be helping some folks with training and consulting, and that we sold some licenses to DILM Suite products.

I strive to cover the topics in as much depth as possible. In order to do that, I demonstrate “chunks” of functionality and patterns using demos that are not Production-ready. I believe more than one day is required to delve into data integration with SSIS with Production-ready code. I’ve led such training – and continue to deliver beginner and advanced SSIS training privately and, occasionally, publicly. If you’re interested in learning SSIS from the ground up, I have a five-day course named From Zero To SSIS. I’d be honored to deliver this training for you or your team. Email me. If you are an experienced SSIS developer, I recommend attending a delivery of Expert SSIS presented in cooperation with Brent Ozar Unlimited.

To this attendee who I do not know (evaluations are anonymous when sent to speakers) and any other attendees who agreed with this attendee, I apologize and ask your forgiveness for coming across as superior and cheapening your experience.

Faster SSIS

I had 203 attendees and received feedback from 63 attendees.

Ratings

Rate the value of the session content: 4.54
How useful and relevant is the session content to your job/career? 4.44
How well did the session’s track, audience, title, abstract, and level align with what was presented? 4.60
Rate the speaker’s knowledge of the subject matter: 4.83
Rate the overall presentation and delivery of the session content: 4.63
Rate the balance of educational content versus that of sales, marketing, and promotional subject matter: 4.83

These are better numbers. They trend about the same, though: I scored higher on speaker’s knowledge of the subject matter in both this session and the precon. My lowest score in both presentations is the usefulness of the session content for job and career. From this I glean I need to work on content.
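The comparison can be checked directly against the ratings listed in this post. This is a small sketch; the short question labels are my own abbreviations of the six evaluation questions. Note that in Faster SSIS the knowledge score ties with the balance score at 4.83.

```python
# Short labels (my abbreviations) for the six evaluation questions, in the order listed.
QUESTIONS = ["value", "usefulness", "alignment", "knowledge", "delivery", "balance"]

# Ratings as listed in this post.
ratings = {
    "Intelligent Data Integration precon": [4.45, 4.24, 4.42, 4.79, 4.45, 4.55],
    "Faster SSIS": [4.54, 4.44, 4.60, 4.83, 4.63, 4.83],
}

summary = {}
for session, scores in ratings.items():
    by_question = dict(zip(QUESTIONS, scores))
    top_score = max(scores)
    summary[session] = {
        "lowest": min(by_question, key=by_question.get),
        # Keep every question that reaches the top score, so ties are visible.
        "highest": [q for q, s in by_question.items() if s == top_score],
    }
    print(session, summary[session])
```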

Selected Comments About the Presentation

As before, most of the comments were positive and I included all the less-than-positive comments below:

“I now know that a hashbyte match for an etl will work. Ive wanted to try this but my clients use other methods to load data. Thank you very much.”
“Andy is a good man and we are lucky to have him in our SQL Family.”
“Very easy to sit through! Great content and answers.”
“Andy is always entertaining. I learned quite a bit from this session and I plan to implement the things he spoke about.”
“Great session, love your humor. Good and easy to follow demos. Good job!”
“Andy was incredibly entertaining and the content was very good.”
“Great session. He’s on my list when I come back.”
“More sessions should have interpretive dances.”

“Thanks for the talk. Liked the time for questions. However I would have preferred more examples.”
“Excellent presentation skills. Very interesting and entertaining. There were way too many questions about their own specific issues that would have been better if handled after the session. It felt as if we rushed through content because there were so many questions. I’m definitely going to look into Andy’s available content because I found his approach interesting and different and would like to see more tricks.”
“Ok. Hard to understand some concepts.”
“Good instructor, but deep subject matter (hard to follow).”
“I should have read session description more carefully. It was beyond my skills. My own fault.”

My Thoughts

I’m going to start at the bottom of the negative comments here and say: “God bless you sir or ma’am,” to the person who had the courage to share that the level of the session was beyond their skills. I had this session pegged as Level 300. I will increase that level going forward to 400 (minimum) based on feedback about the level. It wasn’t just this person; I own the “hard to follow” and “hard to understand” comments. I will correct that moving forward. I will also add a disclaimer about the pace of this presentation in response to the “rushed through content” comment – which is another reason to bump the level.

Regarding the interpretive dance comment, well… you had to be there. 😉

BI & Data Visualization Panel

This was a panel that included Mico Yuk, Melissa Coates, Meagan Longoria, Ryan Wade, and me. We had 111 attendees and received feedback from 20 attendees.

Ratings

Rate the value of the session content: 3.75
How useful and relevant is the session content to your job/career? 3.90
How well did the session’s track, audience, title, abstract, and level align with what was presented? 3.90
Rate the speaker’s knowledge of the subject matter: 4.40
Rate the overall presentation and delivery of the session content: 3.90
Rate the balance of educational content versus that of sales, marketing, and promotional subject matter: 4.50

These numbers are not good but the comments help us understand why.

Selected Comments About the Presentation

The comments were evenly split between positive and less-than-positive:

“Fantastic and especially loved the diversity of the panel. Thank you!”
“I valued the information about Data maturity – getting customers from data > information > knowledge.”
“This was great. There was a lot of interesting discussion. Mico was a great host.”
“Mico was a fantastic moderator and the panel (for the most part) were experts in their field.”

“Mico was great at moving the conversations along and keeping it on point…very well done. I guess I was just disappointed that I didn’t come away with more “definitive” material. I’ll admit, I didn’t know what to expect.”
“Mico interrupted the panelists with her own comments too much and didn’t manage the room well. The panelists were fantastic.”
“This session need a Plan B for when the audience did not contribute many (or any) questions. Not a worthwhile session for me.”
“The panel could have been a little better balanced. They all(?) were consultants, it would have been nice to have someone on the panel who is just working for a company.”

My Thoughts

It’s tough to tell towards whom the comments were aimed (except the comments that mentioned Mico which were evenly split positive and less-than-positive).

I’ll share that it’s tough to get high ratings from a panel presentation. Part of the reason is it’s technically impossible to prepare in advance for the questions from the audience. I believe that’s also an advantage for some attendees.

Of the 20 people who supplied evaluations, 4 ranked the panel session below 3.0. You don’t have to be good at statistics to realize those scores are going to tank the averages.
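To see how a handful of low scores pulls down an average, here is a small sketch. The individual scores are invented purely for illustration (they are not the actual evaluation data): 16 ratings of 4.5 and 4 ratings of 2.0, matching only the 20-evaluation, 4-low-score shape described above.

```python
from statistics import mean

# Hypothetical scores, for illustration only; not the real evaluations.
typical_scores = [4.5] * 16
low_scores = [2.0] * 4

without_low = mean(typical_scores)
with_low = mean(typical_scores + low_scores)

print(f"average without the low scores: {without_low:.2f}")
print(f"average with the low scores:    {with_low:.2f}")
print(f"drop caused by 4 of 20 evaluations: {without_low - with_low:.2f}")
```

With these made-up inputs, one fifth of the evaluations moves the average by half a point.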

Conclusion

If you’re going to present, you should understand the dynamics of evaluations.

I shared with a friend – who happens to be a very good speaker and received some disappointing comments on his evaluations – that, with experience, we know when we’ve delivered as good a presentation as we can and when we’ve fallen short of that goal. For me:

I know I left it all on the field in my presentations at PASS Summit 2018. I did my very best. I practiced and the practice paid off. I planned and the planning helped – especially with timing demos in both the precon and Faster SSIS.

I’ve written about evaluations in the past – especially for free events – see Being a More-Aware Free Technical Event Attendee in which I describe a game – called “There’s Always One” – that I play with evaluations.

Because I know I delivered these presentations well, I’m happy with these numbers. Will I strive to do better? Will I incorporate this feedback into future deliveries of the precon and Faster SSIS? Goodness yes! Overall, though, I’m happy to have had the honor of delivering a precon solo at the PASS Summit and that I was selected to deliver a session and participate on a panel.

Frank (@Tableteer) and I just recorded a show in which I mention both this post and the game called “There’s Always One.” Check it out, along with other podcast recordings at DataDriven.tv!

:{>

Biml, Book Reviews, and Metadata-Driven Frameworks

I occasionally (rarely) read reviews at Amazon of books I’ve written. If I learn of a complaint regarding the book I often try to help. If a reader experiences difficulty with demos, I often offer to help, and sometimes I meet with readers to work through some difficulty related to the book (as I offered here). About half the time, there’s a problem with the way the book explains an example or the sample code; the other half of the time, the reader does not understand what is written.

I own both cases. As a writer it’s my job to provide good examples that are as easy to understand as possible. Sometimes I fail.

In very rare instances, I feel the review misrepresents the contents of a book –  enough to justify clarification. This review afforded one such opportunity. None of what I share below is new to regular readers of my blog. I chose to respond in a direct manner because I know and respect the author of the review, and believe he will receive my feedback in the manner in which it was intended – a manner (helpful feedback) very similar to the manner in which I believe his review was intended (also helpful feedback).

Enjoy.

My Reply to This Review of The Biml Book:

Thank you for sharing your thoughts in this review.

I maintain successful technology solutions are a combination of three factors:

  1. The problem we are trying to solve;
  2. The technology used to solve the problem; and
  3. The developer attempting to use a technology to solve the problem.

I believe any technologist, given enough time and what my mother refers to as “gumption” (a bias for action combined with a will – and the stubbornness, er… tenacity – to succeed), can solve any problem with any technology.

I find your statement, “It doesn’t really teach BIML but rather teaches a specific optional framework for those who want to do everything with BIML rather than increase productivity for repetitive patterns” inaccurate. Did you read the entire book? I ask that question because I most often encounter readers who state the precise opposite of the opinion expressed in this review (some of your fellow reviewers express the opposite opinion here, even). I disagree with your statement, but I understand your opinion in light of other opinions expressed to me by readers during my past decade+ of writing several books. I share one example in this blog post. The short version: we often get more – or different – things from reading than we realize.

There is a section in The Biml Book – made up of 3 of the 20 chapters and appendices – that proposes a basic metadata-driven Biml framework which is the basis of a metadata-driven Biml framework in production in several large enterprises. I wrote those 3 chapters and the basic metadata-driven Biml framework in question. Varigence has also produced a metadata-driven Biml framework called BimlFlex – based upon similar principles – which has been deployed at myriad enterprises. The enterprise architects at these organizations have been able to improve code quality and increase productivity, all while decreasing the amount of time required to bring data-related solutions to market. They do not share your opinion, although they share at least some of the problems (you mentioned ETL) you are trying to solve.

Am I saying Biml – or the framework about which I wrote or even BimlFlex – is the end-all-be-all for every SSIS development effort? Goodness no! In fact, you will find disclaimers included in my writings on Biml and SSIS frameworks. I know because I wrote the disclaimers.

Misalignment on any of the three factors for successful technology solutions – the problem, the technology, and/or the developer – can lead to impedance mismatches in application and implementation. For example, Biml is not a good solution for some classes of ETL or data integration (points 1 and 2). And learning Biml takes about 40 hours of tenacious commitment (which goes to point 3). This is all coupled with a simple fact: data integration is hard. SSIS does a good job as a generic, provider-driven solution – but most find SSIS challenging to learn and non-intuitive (I did when I first started using it). Does that mean SSIS is not worth the effort to learn? Goodness, no! It does mean complaints about simplifying the learning process are to be expected and somewhat rhetorical.

Architects disagree. We have varying amounts of experience. We have different kinds of experience. Some architects have worked only as lone-wolf consultants or as members of single-person teams. Others have worked only as leaders of small teams of ETL developers. Rarer still are enterprise architects who are cross-disciplined and experienced as both lone-wolf consultants and managers of large teams in independent consulting firms and large enterprises. Less-experienced architects sometimes believe they have solved “all the things” when they have merely solved “one of the things.” Does this make them bad people? Goodness, no. It makes them less-experienced, though, and helps identify them as such. Did the “one thing” need solving? Goodness, yes. But enterprise architecture is part science and part art. Understanding the value of a solution one does not prefer – or the value of a solution that does not apply to the problem one is trying to solve – lands squarely in the art portion of the gig.

Regarding the science portion of the gig: engineers are qualified to pronounce, qualitatively, that a solution is “over-engineered.” Non-engineers are not qualified to make any such determination, in my opinion.

By way of example: I recently returned from the PASS Summit. Although I did not attend the session, I know an SSIS architect delivered a session in which he made dismissive statements regarding any and all metadata-driven frameworks related to ETL with SSIS. If I didn’t attend his session, how do I know about the content of the presentation? A number of people in attendance approached me after the session to share their opinion that the architect, though he shared some useful information, does not appreciate the problems faced by most enterprise architects – especially enterprise architects who lead teams of ETL developers.

My advice to all enterprise architects and would-be enterprise architects is: Don’t be that guy.

Instead, be the enterprise architect who continues to examine a solution until underlying constraints and drivers and reasons the solution was designed the way it was designed are understood. Realize and recognize the difference between “That’s wrong,” “I disagree,” and “I prefer a different solution.” One size does not fit all.

Finally, some of the demos and Biml platforms were not working at the time you wrote this review. Many of those issues have been addressed since that time. In the future, updates to SSIS, Biml, and ETL best practices will invalidate what this team of authors wrote during the first half of 2017. As such, the statement, “And much of the online information is outdated and no longer works. It’s a shame someone hasn’t found a way to simplify the learning process.” is a tautology that defines progress. Learning is part of the job of a technologist. I encourage us all – myself most of all – to continue learning.

Peace.

Catalog Browser v0.7.8.0

I’ve been making smaller, more incremental changes to SSIS Catalog Browser – a free utility from the Data Integration Lifecycle Management suite (DILM Suite).

You can use SSIS Catalog Browser to view SSIS Catalog contents on a unified surface. Catalog Browser works with SSIS Catalogs on-premises and with Azure Data Factory SSIS Integration Runtime (also called Azure SSIS). It’s pretty cool and the price ($0 USD) is right!

The latest change is a version check that offers to send you to the page to download an update. You will find this change starting with version 0.7.7.0. Version 0.7.8.0 includes a slightly better-formatted version-check message. As I said, smaller, more incremental changes.
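
For the curious, a version check along these lines can be sketched in a few lines of Python. This is a minimal sketch and not the Catalog Browser source: the function names parse_version and update_available are hypothetical, and I am assuming four-part, dot-separated numeric versions like the ones mentioned above.

```python
def parse_version(text):
    """Turn a dotted version string such as '0.7.8.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def update_available(installed, latest):
    """Return True when 'latest' is a newer version than 'installed'."""
    # Comparing tuples of ints avoids the string-comparison trap where
    # '0.7.10.0' would sort before '0.7.9.0'.
    return parse_version(latest) > parse_version(installed)


if __name__ == "__main__":
    if update_available("0.7.7.0", "0.7.8.0"):
        print("A newer version is available. Visit the download page?")
```

Comparing the parts as numbers rather than as text is the one design choice that matters here; everything else (where the latest version number comes from, how the message is displayed) is left out on purpose.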

Enjoy!

Two (or More) Kinds of Developers

I’ve made statements about “two kinds of developers” for years. These statements are false inasmuch as all generalizations are false. The statements are not designed to be truisms. They are designed to make people think.

Last week – while presenting a full-day pre-conference session at the PASS Summit 2018 and again when delivering a session about Faster SSIS – I repeated the sentiment shown at the top of this post:

There are two kinds of developers:
1) Those who use source control; and
2) Those who will.

I follow up with: “Because if you do not use source control, you will lose code one day and it will break your heart.” Audience members laugh and the point is made.

More Than Two

There are myriad types of developers. And that’s a good thing. Why? Because there are myriad problems developers face in the wild and those problems need to be solved.

There are no one-size-fits-all solutions. If you attended the PASS Summit last week you likely saw some really cool demos. I know I did. And you may have thought – or even been told – that this, this right here is the answer for which you’ve searched your entire career.

There’s a possibility that the people selling you on whatever-this-is are absolutely correct.
There’s a greater possibility that they are less than absolutely correct.

I write this not to disparage anyone’s solution (or my own solutions). Promise.
I write this to dissuade the disparaging of anyone else’s solution (because that also happens).

The Bottom Line

Goldilocks. The bottom line is we all want the Goldilocks solution. We want the just-right solution that maximizes efficiency and minimizes complexity.

That’s just hard.

I can hear you thinking, “Why is maximizing efficiency and minimizing complexity hard, Andy?” I’m glad you asked. Solutions are a moving target, part art and part science, and the only way to learn where and when to draw the art-science line is experience.

A Moving Target

Maximizing efficiency and minimizing complexity is hard because it’s not at all the bottom line; it’s a line in the middle – a balancing of mutually-exclusive demands on your time, expertise, and energy.

Plus, it shifts.

Everything scales. Things scale either up and / or out or they scale down and / or in. In general (generality warning!), down and in is “bad” and up and out is “good.”

Experience Matters

Experienced architects understand subtle nuances: the art part of the art / science of enterprise software. When you speak with an experienced architect, you may often hear her say, “It depends.” Good architects will finish the sentence and share at least some of the things upon which “it depends.” Less-experienced architects will present naked scalars and polarized advice.

Naked Scalars

Naked scalars are numeric values in a vacuum. They are unsupported because most are unsupportable. In other words, they are lies. Now, “lies” is a pretty harsh word. I prefer an engineering definition for the word “truth” that sounds an awful lot like the oath witnesses are asked to swear in US courts:

“Do you promise to tell the truth, the whole truth, and nothing but the truth, so help you God?”

This oath covers falsehoods that are shared, yes; but it also covers omissions of the truth.

Examples of naked scalars:

  • “97% of engineers believe ____.”
  • “10% of the people I know have practiced ____ successfully.”

Polarized Advice

Polarized advice can be a special case of naked scalars: advice focused on 0% and 100%. Polarized advice may or may not include scalars (naked or otherwise).

Examples of polarized advice:

  • “I’ve never seen a good use case for ____.”
  • “You should always ____.”

Are naked scalars and polarized advice always bad and wrong? Nope. That would be a generality (and we covered generalities already, did we not?).

Managing the Risk of Inexperience

What exactly is a consultant communicating when they engage in naked scalars or polarized advice?
They are signalling a lack of experience.
They are, in effect, stating, “I do not have experience with ____.”

How do you manage the risk of inexperience?
You hire people – architects, especially – who understand there are good reasons systems are designed as they are. They will say things like, “I’m not sure why this was designed this way,” and mean it. It’s not a criticism; it’s an admission of curiosity. Trust me on this: You want curious consultants. They are more likely to identify a solution that solves the problem you are trying to solve in a way that doesn’t create new problems. Returning to the good reasons systems are designed as they are…

  1. Are (or were) the good reasons, well, good? Sometimes.
  2. Do the good reasons scale? Sometimes.
  3. Do the good reasons stand the test of time? Sometimes.

Good architects discern the baby from the bath water. Their experience separates good architects from the crowd. Not-as-good architects are less flexible, less willing to learn, and loath to admit mistakes.

Let’s face facts, though: All architects and developers know what they know and don’t know what they don’t know. Better architects recognize these uncomfortable truths and mitigate them.

One way to mitigate inexperience – the best way, in my opinion – is to work with others.

The Story Of Us

At Enterprise Data & Analytics, our consultants and architects work together as a team. Our diverse group is a strength, bringing perspective to bear on the problems you are trying to solve. Our experience levels vary, the software and tools with which we work vary, and our demographics vary. As owner, I am honored to lead a team from diverse cultural – as well as diverse technical – backgrounds.

I didn’t set out to build a cultural- / age- / gender-diverse team. I set out to find the best people – to do what Jim Collins describes as “getting the right people on the bus.”

I found, though, that focusing on getting the right people on the bus had the side-effect of building a cultural- / age- / gender-diverse team.

As an added bonus, people of different genders approach problem-solving differently. People of different ethnicities pick up on stuff – especially cultural stuff, including enterprise culture – that people of other cultures miss.

EDNA’s diversity is a strength that emerged unintentionally, but emerged nonetheless. As Chief Data Engineer, it’s very cool to watch our less-experienced consultants growing into more-experienced consultants and architects, while at the same time watching our people interact and perform as a team – each member catching stuff and contributing ideas because of their unique perspectives.

Cost Value

I can hear some of you thinking, “We’re on a budget here. Don’t good architects cost more than less-than-good architects, Andy?” I feel you. The answer is, “No. Good architects cost less than less-than-good architects.”

I can prove it. Because math. (Read that post for more information…)

It’s often accurate that good architects cost more per hour than less-than-good architects. Do you know why good architects charge more per hour?

Because they are worth more per hour.

(Generality!)

But consider this: “Time will tell” is a tried and true statement. Like good wine, the likelihood a generality is accurate improves with age. If enterprises continue to hire an organization – like Enterprise Data & Analytics or any other firm – to help them solve the problems they are trying to solve, then the folks shouting them down may be doing so in an effort to compete. Competition is fine, but I never hire anyone who talks bad about other clients or the competition. Why? They’ve demonstrated the capacity to talk bad about me at some later date.

Conclusion

I love our team!
I love our expertise!
I love our diversity!
I love that we always deliver value!

Contact me to learn more.

:{>

Want a Free #DILM Book? See Me at the #PASSsummit 2018!

I’ll be at the PASS Summit 2018 next week. I’m delivering a full-day precon Monday, presenting Faster SSIS and participating in the BI & Data Visualization Panel Wednesday, and Enterprise Data & Analytics is exhibiting Wednesday through Friday.

“You Mentioned a Free Book…”

Oh. Yeah. That.

I will be giving away free copies of my latest book: Data Integration Life Cycle Management with SSIS: A Short Introduction by Example starting Wednesday! Would you like to score a free copy? You’ll have to catch me carrying them in the Washington State Convention Center.

They are free while supplies last.

Where Can You Find Andy at the PASS Summit 2018?

Precon!

Monday 5 Nov 2018, I’m delivering a full-day pre-conference session titled Intelligent Data Integration with SSIS. I’m going to cover everything listed at that link but there is an update about my precon content:

There will be Azure Data Factory content and demos!

Why this addition? Two reasons:

  1. My presentation titled Faster SSIS was selected for Wednesday, 7 Nov 2018 at 1:30 PM PT in room 612. I usually include the three Faster SSIS demos in my precon. This time, you can just view the Faster SSIS session to see those demos. I’ve added some Azure Data Factory to the precon in place of these three demos!
  2. I am participating in a panel discussion titled BI & Data Visualization Panel Wednesday, 7 Nov 2018 at 3:15 PM PT in room TCC Yakima 1.

Enterprise Data & Analytics is Exhibiting!

That’s right, you can find me in the Exhibition Hall! Enterprise Data & Analytics is exhibiting at the PASS Summit 2018!

If you are attending the PASS Summit 2018, please catch me for a free book while supplies last. If I’m out of physical copies, come see me anyway… we will work something out. Promise!

:{>