Hi, I’m Sebastian, a software engineer based in Vienna, Austria, where I run software engineering and machine learning R&D at a startup. My main focus is on deep learning for computer vision and designing data-intensive software systems.
What’s This Blog About?
The core topics are machine learning and artificial intelligence. The blog was also named one of the top 50 machine learning blogs of 2021 by Feedspot.
Machine Learning Theory
I am a big believer in understanding problems from first principles. You can do lots of mind-blowing things with high-level programming frameworks. In fact, as a practitioner in an enterprise context, you can probably get away without understanding topics like vector calculus or matrix decomposition while still doing a good job. However, if you don’t understand how the solutions encapsulated by those frameworks operate under the hood, you are essentially limited to textbook recipes, and your freedom to implement custom solutions shrinks accordingly. Furthermore, in my experience, understanding the mathematical foundations helps a lot in seeing why a particular algorithm is appropriate for a certain problem. So learning the math should also make you a better practitioner. Therefore, I discuss a lot of foundational topics, such as the mathematics behind machine learning solutions and how algorithms work on a theoretical level.
Building Machine Learning Systems
Building machine learning systems is about much more than TensorFlow or scikit-learn. In most cases, you need to process and clean lots of data, build a solid software system, and deploy everything. I spent the first years of my career in IT, developing native apps and enterprise applications on Amazon Web Services. Accordingly, software engineering fundamentals and deployment scenarios feature prominently in my discussions of practical applications.
Why Did I Start This Blog?
On most projects, I receive clear instructions from clients who know exactly what they want. My fellow programmers and I then retreat into a mountain hut where we spend months coding a software system. Then we return the complete system to the customer, who is happy and never asks for changes ever again. The system runs smoothly from day one, and we seamlessly move on to the next project… of course, this whole story is pure fantasy.
The last time I had clear instructions and spent days coding in solitary confinement was during my dissertation for my computer science degree.
As a developer and entrepreneur, I find that most projects nowadays entail a colorful mix of responsibilities: designing software systems and machine learning solutions, communicating solutions to clients, managing remote developers and designers, writing patent applications, pitching to investors, and explaining technical challenges to business folks.
Almost everything involves communication. And most of the communication happens with non-technical people.
Being able to clearly communicate complex technical issues to other people without overwhelming them with irrelevant and arcane details is perhaps the most important skill of a modern engineer.
But that’s bloody hard…
How do you explain a neural network’s learning mechanism to someone who thinks a matrix is something you need to wake up from by consuming a red pill?
How do you explain inheritance to someone who associates it with receiving a pile of money from their late grandparents?
When I have difficulties explaining these things, it is by no means the fault of science fiction movies and lucky million-dollar heirs :). It is my failure to articulate them in clear and simple terms without technical jargon. As a saying often attributed to the physicist Richard Feynman goes, if you can’t explain something in simple terms, you don’t understand it.
This blog is my attempt to explain technical stuff in simple terms, which will hopefully endow me with better “explaining skills.”
I hope that my explanations will be useful to you and many others.