Stuff I Learned during Hanukkah of Data 2023

polars

144 26,043 10.0 Rust

Dataframes powered by a multithreaded, vectorized query engine, written in Rust

That turned out to be related to pola-rs/polars#11912, and this linked comment provided a deceptively simple solution - use PARSE_DECLTYPES when creating the connection:
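The snippet that followed this excerpt is not reproduced on this page. As a minimal sketch of what the `PARSE_DECLTYPES` flag does, using only Python's standard `sqlite3` module: the `orders` table, its `ordered` column, and the `read_order_date` helper are hypothetical names chosen for illustration, not taken from the original post. The converter is registered explicitly because the module's built-in date converters are deprecated as of Python 3.12.

```python
import datetime
import sqlite3

# Explicit converter for columns declared as DATE. sqlite3 passes the raw
# bytes of the stored value to the converter.
sqlite3.register_converter(
    "date", lambda raw: datetime.date.fromisoformat(raw.decode())
)


def read_order_date(detect_types: int):
    """Create a throwaway in-memory table (hypothetical schema) and read one value back."""
    conn = sqlite3.connect(":memory:", detect_types=detect_types)
    try:
        conn.execute("CREATE TABLE orders (id INTEGER, ordered DATE)")
        conn.execute("INSERT INTO orders VALUES (1, '2023-12-07')")
        (value,) = conn.execute("SELECT ordered FROM orders").fetchone()
        return value
    finally:
        conn.close()


# Without the flag the value comes back as plain text; with it, the declared
# column type (DATE) selects the registered converter.
as_text = read_order_date(0)
as_date = read_order_date(sqlite3.PARSE_DECLTYPES)
print(type(as_text).__name__, type(as_date).__name__)  # prints: str date
```

A connection opened this way can then be handed to whatever library runs the query, so date columns presumably arrive already typed rather than as strings.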

Pandas

393 41,923 10.0 Python

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

Last year I worked through the challenges using VisiData, Datasette, and Pandas. I walked through my thought process and solutions in a series of posts.

nbdime

7 2,595 8.7 TypeScript

Tools for diffing and merging of Jupyter notebooks.

I remember hearing about nbdime and thinking it sounded useful, but I've never really needed it since I rarely use Jupyter in the first place. But then I made some changes to my Hanukkah of Data 2023 notebook to work with the follow-up "speed run" challenge (a new dataset and slightly tweaked clues), and the native Git diff was too noisy to be useful. nbdime came to the rescue! Here are the changes I had to make for days 2 and 3 during the speed run:
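The diff output itself is not reproduced on this page. As a rough illustration of why a plain Git diff of a notebook is noisy, the sketch below compares only the cell sources of two notebooks and ignores outputs and execution counts. It is not how nbdime works internally (nbdime ships its own `nbdiff` command for this); `cell_sources`, `source_diff`, `make_notebook`, and the sample cell contents are all hypothetical.

```python
import difflib
import json


def cell_sources(notebook_json: str) -> list[str]:
    """Flatten a notebook's cell sources into lines, skipping outputs and counters."""
    notebook = json.loads(notebook_json)
    lines = []
    for index, cell in enumerate(notebook.get("cells", [])):
        source = cell.get("source", "")
        if isinstance(source, list):  # nbformat stores source as a list of lines
            source = "".join(source)
        lines.append(f"# cell {index} ({cell.get('cell_type')})")
        lines.extend(source.splitlines())
    return lines


def source_diff(old_json: str, new_json: str) -> list[str]:
    """Unified diff of cell sources only."""
    return list(
        difflib.unified_diff(
            cell_sources(old_json), cell_sources(new_json), "old", "new", lineterm=""
        )
    )


def make_notebook(source: str, execution_count: int, output_text: str) -> str:
    """Build a tiny one-cell notebook (made-up content) as a JSON string."""
    cell = {
        "cell_type": "code",
        "execution_count": execution_count,
        "metadata": {},
        "source": [source],
        "outputs": [{"output_type": "stream", "name": "stdout", "text": [output_text]}],
    }
    return json.dumps({"cells": [cell], "metadata": {}, "nbformat": 4, "nbformat_minor": 5})


old = make_notebook("year = 2022\n", 3, "2022\n")
new = make_notebook("year = 2023\n", 7, "2023\n")
changes = source_diff(old, new)
print("\n".join(changes))
```

A raw text diff of the same two files would also flag the changed `execution_count` and output text, which is the noise the excerpt describes.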

datasette

187 8,934 9.3 Python

An open source multi-tool for exploring and publishing data

Last year I worked through the challenges using VisiData, Datasette, and Pandas. I walked through my thought process and solutions in a series of posts.

advent-of-code-jq

232 203 7.8 jq

Solving Advent of Code with jq

Hanukkah of Data is a series of data-themed puzzles that you solve to work your way through a holiday-themed story built on a fictional dataset. I think of it as "Advent of Code meets SQL Murder Mystery".

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives, so a higher number indicates a more popular project.

Related posts

Polars: alternativa ao Pandas (Polars: an alternative to Pandas)
2 projects | /r/datasciencebr | 13 Jun 2023

Benchmarking for Pandas and Polars Using CSV and Parquet File
5 projects | /r/Python | 15 May 2023

Replacing Pandas with Polars. A Practical Guide
4 projects | news.ycombinator.com | 22 Jan 2023

Hanukkah of Data 2022 - Puzzle 2
2 projects | dev.to | 30 Dec 2022

High-performance Python
4 projects | /r/Python | 15 Jun 2022

This page summarizes the projects mentioned and recommended in the original post on dev.to.
Tags: Python, Science and Data analysis, Sqlite, dataframe-library, jupyterlab-extension
Post date: 18 Dec 2023
