ibis vs duckdb

ibis	Project	duckdb
23	Mentions	52
4,208	Stars	16,749
10.9%	Growth	11.6%
10.0	Activity	10.0
3 days ago	Latest Commit	1 day ago
Python	Language	C++
Apache License 2.0	License	MIT License

The number of mentions indicates the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
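
The metric definitions above can be sketched in code. The page does not publish its formulas, so this is only an illustration under stated assumptions: growth is taken as the relative month-over-month change in stars, and activity is modeled as a recency-weighted commit count (exponential decay with an assumed 30-day half-life) mapped onto a 0-10 scale by percentile rank. All function names and parameters are hypothetical.

```python
from bisect import bisect_left


def star_growth(stars_prev, stars_now):
    """Month-over-month growth in stars, as a percentage."""
    if stars_prev <= 0:
        raise ValueError("stars_prev must be positive")
    return 100.0 * (stars_now - stars_prev) / stars_prev


def weighted_commits(commit_ages_days, half_life_days=30.0):
    """Recency-weighted commit count.

    Assumption: a commit's weight halves every `half_life_days`,
    so recent commits count more than older ones.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_score(raw, all_raw):
    """Map a raw score to 0-10 by the share of projects scoring strictly lower.

    Assumption: a score of 9.0 means 90% of tracked projects score lower,
    i.e. the project is in the top 10%.
    """
    ranked = sorted(all_raw)
    below = bisect_left(ranked, raw)
    return round(10.0 * below / len(ranked), 1)


print(star_growth(100, 110))              # 10.0
print(weighted_commits([0, 30]))          # 1.5
print(activity_score(10, range(1, 11)))   # 9.0
```

Under this model, a project whose raw score beats nine of ten tracked projects lands at 9.0, which matches the "top 10%" reading given above.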

ibis

Posts with mentions or reviews of ibis. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-03-17.

Show HN: Hashquery, a Python library for defining reusable analysis
1 project | news.ycombinator.com | 23 Apr 2024

I really don't understand the appeal of dbt vs a proper programming language. The templating approach leads to massive spaghetti. I look forward to trying out something like Ibis [0]
0: https://ibis-project.org/
This Week In Python
5 projects | dev.to | 17 Mar 2024

ibis – portable Python dataframe library
Ibis: The portable Python dataframe library
1 project | news.ycombinator.com | 13 Mar 2024

1 project | news.ycombinator.com | 22 Feb 2024
FLaNK Stack 26 February 2024
50 projects | dev.to | 26 Feb 2024
Quarto
5 projects | news.ycombinator.com | 14 Feb 2024

The main benefit is that you get a Python (or R, Julia or Rust) interpreter. So you can evaluate code. A good example of the value of this is the Ibis docs which use Quarto: https://ibis-project.org/
Polars – A bird's eye view of Polars
4 projects | news.ycombinator.com | 13 Feb 2024

I've found polars quite intuitive, though for Python, I lean more towards [ibis](https://ibis-project.org/). The interface is nearly identical, but ibis has the benefit of building SQL queries before pulling any actual data (like dbplyr), whereas polars requires the data to be in memory (at least for RDBs, though correct me if I'm wrong).
This, to me, seems like a good argument for only using ibis, but I'm happy to be convinced otherwise.
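
The idea in the comment above, composing a query first and touching the data only when results are requested, can be illustrated without Ibis itself. This sketch is not Ibis's API: `LazyQuery` and its methods are hypothetical names, and the standard-library `sqlite3` module stands in for a real backend.

```python
import sqlite3


class LazyQuery:
    """Hypothetical deferred query: methods only build SQL text.

    Nothing is sent to the database until execute() is called.
    """

    def __init__(self, table, columns=("*",), conditions=(), params=()):
        self.table = table
        self.columns = tuple(columns)
        self.conditions = tuple(conditions)
        self.params = tuple(params)

    def select(self, *columns):
        return LazyQuery(self.table, columns, self.conditions, self.params)

    def filter(self, condition, *params):
        # `condition` is a SQL fragment using ? placeholders for values.
        return LazyQuery(self.table, self.columns,
                         self.conditions + (condition,),
                         self.params + params)

    def to_sql(self):
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(self.conditions)
        return sql

    def execute(self, con):
        return con.execute(self.to_sql(), self.params).fetchall()


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (name TEXT, amount INTEGER)")
con.executemany("INSERT INTO events VALUES (?, ?)",
                [("a", 5), ("b", 50), ("c", 500)])

# Building the query does not read any rows.
query = LazyQuery("events").filter("amount > ?", 10).select("name")
print(query.to_sql())   # SELECT name FROM events WHERE amount > ?

# Only this call makes the database do any work.
rows = query.execute(con)
print(rows)
```

Because the filter runs inside the database, only the matching rows are ever transferred to Python.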
Ibis – Universal Interface for Data Wrangling
1 project | news.ycombinator.com | 13 Feb 2024
Vanna.ai: Chat with your SQL database
13 projects | news.ycombinator.com | 14 Jan 2024

Please add Ibis Birdbrain https://ibis-project.github.io/ibis-birdbrain/ to the list. Birdbrain is an AI-powered data bot, built on Ibis and Marvin, supporting more than 18 database backends.
See https://github.com/ibis-project/ibis and https://ibis-project.org for more details.
Ibis
1 project | news.ycombinator.com | 10 Jan 2024

duckdb

Posts with mentions or reviews of duckdb. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-11-06.

🪄 DuckDB sql hack : get things SORTED w/ constraint CHECK
1 project | dev.to | 4 Apr 2024
DuckDB: Move to push-based execution model (2021)
1 project | news.ycombinator.com | 15 Mar 2024
DuckDB performance improvements with the latest release
8 projects | news.ycombinator.com | 6 Nov 2023

I'm not sure if the fix is reassuring or not: https://github.com/duckdb/duckdb/pull/9411/files
Building a Distributed Data Warehouse Without Data Lakes
3 projects | news.ycombinator.com | 2 Nov 2023

It's an interesting question!
The problem is that the data is spread everywhere - no choice about that. So with that in mind, how do you query that data? Today, the idea is that you HAVE to put it into a central location. With tools like Bacalhau[1] and DuckDB [2], you no longer have to - a single query can be sharded amongst all your data - EFFECTIVELY giving you a lot of what you want from a data lake.
It's not a replacement, but if you can do a few of these items WITHOUT moving the data, you will be able to see really significant cost and time savings.
[1] https://github.com/bacalhau-project/bacalhau
[2] https://github.com/duckdb/duckdb
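
The approach described in the comment above, running one query where each piece of data lives and combining only the small partial results, can be sketched as follows. This does not use the Bacalhau or DuckDB APIs: in-memory `sqlite3` databases stand in for data at separate locations, and `make_shard` and `sharded_totals` are hypothetical helper names.

```python
import sqlite3


def make_shard(rows):
    """Create an in-memory database standing in for data at one location."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    con.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return con


def sharded_totals(shards):
    """Run the same aggregate on every shard and merge the partial sums."""
    totals = {}
    for con in shards:
        # Each shard aggregates locally; only per-region sums leave it.
        partial = con.execute(
            "SELECT region, SUM(amount) FROM sales GROUP BY region")
        for region, amount in partial:
            totals[region] = totals.get(region, 0) + amount
    return totals


shards = [
    make_shard([("eu", 10), ("us", 20)]),
    make_shard([("eu", 5), ("apac", 7)]),
]
totals = sharded_totals(shards)
print(totals)
```

This works directly for aggregates such as sums and counts that can be merged from partial results; other queries may need a different combine step.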
DuckDB 0.9.0
3 projects | news.ycombinator.com | 26 Sep 2023
Push or Pull, is this a question?
2 projects | dev.to | 9 Aug 2023

[4] Switch to Push-Based Execution Model by Mytherin · Pull Request #2393 · duckdb/duckdb (github.com)
Show HN: Hydra 1.0 – open-source column-oriented Postgres
12 projects | news.ycombinator.com | 3 Aug 2023

It depends on your query, obviously.
In general, I did very deep benchmarking of pg, clickhouse and duckdb, and I sure didn't make stupid mistakes like this: https://news.ycombinator.com/item?id=36990831
My dataset has 50B rows and 2 TB of data, and I think columnar DBs are very overhyped. I chose pg because:
- pg performance is acceptable, maybe 2-3x slower than clickhouse and duckdb on some queries, if pg is configured correctly and runs on compressed storage
- clickhouse and duckdb start falling apart very fast because they specialize in a very narrow type of query: https://github.com/ClickHouse/ClickHouse/issues/47520 https://github.com/ClickHouse/ClickHouse/issues/47521 https://github.com/duckdb/duckdb/discussions/6696
🦆 Effortless Data Quality w/duckdb on GitHub ♾️
3 projects | dev.to | 25 Jul 2023

This action installs duckdb with the version provided in input.
Using SQL inside Python pipelines with Duckdb, Glaredb (and others?)
6 projects | /r/dataengineering | 30 Jun 2023

Duckdb: https://github.com/duckdb/duckdb - seems pretty popular, been keeping an eye on this for close to a year now.
CSV or Parquet File Format
3 projects | /r/Python | 1 Jun 2023

The Parquet-Go library is very complex, and I have not yet managed to use it, so I asked whether DuckDB can provide an API: https://github.com/duckdb/duckdb/issues/7776

What are some alternatives?

When comparing ibis and duckdb you can also consider the following projects:

snowflake-connector-python - Snowflake Connector for Python

ClickHouse - ClickHouse® is a free analytics DBMS for big data

PySpark-Boilerplate - A boilerplate for writing PySpark Jobs

sqlite-worker - A simple, and persistent, SQLite database for Web and Workers.

Apache Impala - Apache Impala

datasette - An open source multi-tool for exploring and publishing data

pangres - SQL upsert using pandas DataFrames for PostgreSQL, SQLite and MySQL with extra features

octosql - OctoSQL is a query tool that allows you to join, analyse and transform data from multiple databases and file formats using SQL.

sqlite_scanner - DuckDB extension to read and write to SQLite databases

metabase-clickhouse-driver - ClickHouse database driver for the Metabase business intelligence front-end

katacoda

datafusion - Apache DataFusion SQL Query Engine

Compare ibis vs duckdb and see what their differences are.
