My day job involves wrangling a lot of data very fast.
Lately I've heard a lot of people raving about technologies like DuckDB,
(Geo)Parquet, and Apache Arrow.
But despite being an "ear...
Over the past two weeks, I've been focused on optimizing some data pipelines.
I inherited some old ones which seemed especially slow,
and I finally hit a limit where an overhaul made sense.
The pipeli...
Today I had a rather peculiar need: searching TIGER for features
that match specific attributes.
These files are not CSV or JSON, but rather ESRI Shapefiles.
Shapefiles are a binary format which...