My Database Systems Journey I
I have been interested in database systems since the first time I used Postgres at university. Maybe it was because I was good at writing SQL, or because of the satisfaction of manipulating data and getting information out of it. Either way, every path in my career has led me to become a data engineer, so I spend most of my days at work building data platforms and manipulating data. Pretty cool for a data nerd, right? Well, it turns out that my interest in database systems has changed: I don’t only want to use them. I also want to fully understand their internals and be able to build them.
The Motivation
Database systems are complex, and it is often said that writing a complete one takes at least 10 years of experience working on the subject. For most of my career I have been looking for a specific subject to which I could dedicate my full potential and feel satisfied by what I achieve along the way. I believe that query engines in database systems have the perfect balance between complexity, interesting challenges, community, and market, so that is where I started my studies. Let’s see how it goes.
The Open Source
As I was thinking about how to approach my learning, one thing was clear to me: I wanted to learn from the people building database-like systems in the open. My first goal was to contribute to an open source project that was directly or indirectly related to database systems. I found Delta Lake, a relatively new but mature project open-sourced by Databricks. The community is amazing, so I opened my first PR and it was merged :). After that, I needed to keep looking for issues where I could contribute and meet cool people.
After my first contribution I got more motivated and kept looking for more issues or new projects, and I stumbled across a project called Mack. It is managed by the great Matthew Powers (aka MrPowers) and offers a variety of helper functions that make it easy to perform common Delta Lake operations. I asked Matthew if there was a similar project written in Scala where I could contribute, and to my surprise, there was one called Jodie, so I started to contribute there as well.
Because Jodie is a small and new project, I have been able to make major contributions, which has helped me get more involved in the Delta community, build new projects, and acquire knowledge that I will later use for more contributions related to database systems. So for me, it is a win-win strategy as part of my journey.
What’s Coming
On this blog I will be writing about the open source projects I have been contributing to (Jodie, Diane, hio, and Delta) and about my journey into database systems.