Top 14 Python duckdb Projects
- splink: Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends
- portable-data-stack-dagster: A portable Datamart and Business Intelligence suite built with Docker, Dagster, dbt, DuckDB, PostgreSQL and Superset
- airflow-elt-blueprint: A self-contained, ready-to-run Airflow ELT project. Can be run locally or within Codespaces.
Recommend checking out https://github.com/tobymao/sqlglot if you are interested in this capability for other SQL dialects.
Tools like this are helpful for:
- Rendering SQL in a consistent way, e.g. for snapshot testing
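The snapshot-testing idea mentioned above could look roughly like the sketch below. It assumes sqlglot is installed and uses its `transpile` function with `pretty=True` to re-emit a query in a canonical layout. The helper names `render` and `matches_snapshot`, the table `t`, and its columns are hypothetical, chosen only for illustration. If sqlglot is not available, the sketch falls back to a naive whitespace-collapsing stand-in, which is far weaker than a real SQL formatter.

```python
"""Snapshot-style check that a SQL string renders to a stable canonical form."""

try:
    import sqlglot  # third-party; assumed installed via `pip install sqlglot`

    def render(sql: str) -> str:
        # Parse the text as DuckDB SQL and re-emit it pretty-printed
        # in the same dialect, so layout differences disappear.
        return sqlglot.transpile(sql, read="duckdb", write="duckdb", pretty=True)[0]

except ImportError:

    def render(sql: str) -> str:
        # Naive stand-in (an assumption, not sqlglot behaviour):
        # collapse runs of whitespace so the sketch still runs.
        return " ".join(sql.split())


def matches_snapshot(sql: str, snapshot: str) -> bool:
    """Return True when `sql` renders to the same text as the stored snapshot."""
    return render(sql) == snapshot


# Two spellings of the same hypothetical query that differ only in layout.
compact = "SELECT a, b FROM t WHERE x = 1"
spread_out = "SELECT a,\n       b\nFROM t\n   WHERE x = 1"

# A test would normally store this text in a file on its first run.
snapshot = render(compact)

same = matches_snapshot(spread_out, snapshot)                      # layout-only change
changed = matches_snapshot("SELECT a FROM t WHERE x = 1", snapshot)  # real change
```

Because both spellings are rendered before comparison, a test built this way only fails when the query itself changes, not when someone reflows it.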
Project mention: Show HN: Hashquery, a Python library for defining reusable analysis | news.ycombinator.com | 2024-04-23
I really don't understand the appeal of dbt vs a proper programming language. The templating approach leads to massive spaghetti. I look forward to trying out something like Ibis [0]
[0]: https://ibis-project.org/
Project mention: Splink: Fast, accurate, scalable probabilistic data linkage | news.ycombinator.com | 2024-03-13
Project mention: Show HN: JupySQL – a SQL client for Jupyter (ipython-SQL successor) | news.ycombinator.com | 2023-12-06
Hey, HN community!
We're stoked to launch JupySQL today! JupySQL is an open-source library that brings a modern SQL experience to Jupyter. JupySQL is compatible with all major databases, such as Snowflake, Redshift, PostgreSQL, MySQL, MariaDB, DuckDB, SQL Server, Clickhouse, Trino, and more!
To get started, check out our tutorial: https://jupysql.ploomber.io/en/latest/quick-start.html
SQL is the de facto language for data analysis; however, analysis often requires a mix of SQL and Python. JupySQL bridges this gap, allowing users to execute SQL queries seamlessly in Jupyter and continue their analysis in Python. Add %%sql to the top of your cell and start writing SQL.
Here are some of JupySQL's main features:
- Syntax highlighting
Project mention: quack-reduce: duckdb as a stateless query engine over a data lake | news.ycombinator.com | 2024-01-27
Project mention: Show HN: Snowflake Data Quality Checks in Python | news.ycombinator.com | 2024-02-11
Project mention: IceDB v2 – An in-process Parquet merge engine to build dirt-cheap OLAP | news.ycombinator.com | 2023-06-17
Python duckdb related posts
- quack-reduce: duckdb as a stateless query engine over a data lake
- JupySQL: Connecting to a SQL database from Jupyter
- GitHub - ploomber/jupysql: Better SQL in Jupyter. 📊
- SQL CTE's in Jupyter notebooks, DuckDB integration and more
- TL;DR incorporate SQL functionality within Jupyter, access to modern data processing DBs (like DuckDB), polars and data exploration through plotting easier with JupySQL.
- IceDB v2 – An in-process Parquet merge engine to build dirt-cheap OLAP
- A full-featured SQL client for Jupyter
-
A note from our sponsor - WorkOS
workos.com | 25 Apr 2024
Index
What are some of the best open-source duckdb projects in Python? This list will help you:
# | Project | Stars
---|---|---
1 | sqlglot | 5,441
2 | ibis | 4,074
3 | ingestr | 2,308
4 | splink | 1,086
5 | dbt-duckdb | 719
6 | jupysql | 598
7 | inline-sql | 412
8 | quack-reduce | 116
9 | cuallee | 105
10 | portable-data-stack-dagster | 111
11 | talksheet | 95
12 | icedb | 91
13 | airflow-elt-blueprint | 42
14 | bigdataeng | 0