Recent Posts

Diane - All tables - Data Profiler

7 minute read

Diane is an open-source Hive helper library for Apache Spark users. The library currently supports Parquet and Delta tables and provides a set of capabilitie...

Jodie - Append Without Duplication

3 minute read

Jodie is an open-source library that offers a variety of helper functions that make it easy to perform common Delta Lake operations using Apache Spark and sc...

My Database Systems Journey I

2 minute read

I have been interested in database systems since the first time I used Postgres at university. Maybe it was because I was good at writing SQL or because of t...