A few years ago I had a conversation with Scott Currie, the CEO of Varigence and inventor of Biml. Scott is one of the smartest people I know and I know a lot of very smart people. Whenever I have the opportunity to communicate with the very smart people I know, I often ask them – based upon their knowledge of me and what I do and any plans I have shared – for advice on stuff I should focus upon moving forward. Scott’s monosyllabic reply to this question?
“Spark”
Over the course of the past couple of years I’ve shared – on this blog, even – some of my experiences learning more about Apache Spark. While I expected a little resistance, I was a little surprised by the… intensity… of some of the private messages and emails I received. As I replied to the intense people (and there were only a few), I could stop learning stuff right now, today, and continue working with SSIS for the next decade or so. I believe the same goes for anyone working with SSIS today.
I’m not going to stop learning stuff.
Why SSIS?
I love SSIS. But I don’t use SSIS because I love it. I love SSIS because it solves a particularly difficult piece of the story of enterprise data: data integration and data engineering. I feel a similar way about writing (and I’ve blogged recently about writing). I write because I like to write, not for clicks or branding or any of the other benefits I glean from writing. I consider those benefits cool, but side-effects of my desire to write.
The same goes for SSIS: I love SSIS because it solves a problem that I want to solve. I hope that makes sense.
Why Spark?
Two reasons:
- Spark is taking a more prominent role in SQL Server 2019.
- Spark is the engine beneath Azure Data Factory Data Flows.
How can you learn more about Spark? There are bunches of videos out there on YouTube. YouTube is crowd-sourced and free, which makes it awesome. The quality of YouTube training is crowd-sourced and free, which is a challenge.
I learned a lot from edX. I’m a fan of how edX approaches MOOCs (massive open online courses). They offer lots of courses for free, which means you can grow your knowledge by investing only time. It’s tough to beat $free. If you want, you can pay for a certificate, which you can then add to your online resume or LinkedIn profile like I added this one.
Conclusion
My advice? Spend some time learning. Pick your favorite learning platform and jump in. If you have SQL Server and/or SSIS experience, you already know a lot about the problem enterprises are trying to solve with data engineering tools. That’s a good place to be, but not required.
Keep learning!
:{>