duckdb
toydb
| duckdb | toydb |
---|---|---|
Mentions | 52 | 16 |
Stars | 16,576 | 5,886 |
Growth | 10.7% | - |
Activity | 10.0 | 8.8 |
Last commit | 5 days ago | 13 days ago |
Language | C++ | Rust |
License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
duckdb
- 🪄 DuckDB sql hack : get things SORTED w/ constraint CHECK
- DuckDB: Move to push-based execution model (2021)
-
DuckDB performance improvements with the latest release
I'm not sure if the fix is reassuring or not: https://github.com/duckdb/duckdb/pull/9411/files
-
Building a Distributed Data Warehouse Without Data Lakes
It's an interesting question!
The problem is that the data is spread everywhere - no choice about that. So with that in mind, how do you query that data? Today, the idea is that you HAVE to put it into a central location. With tools like Bacalhau[1] and DuckDB [2], you no longer have to - a single query can be sharded amongst all your data - EFFECTIVELY giving you a lot of what you want from a data lake.
It's not a replacement, but if you can do a few of these items WITHOUT moving the data, you will be able to see really significant cost and time savings.
[1] https://github.com/bacalhau-project/bacalhau
[2] https://github.com/duckdb/duckdb
- DuckDB 0.9.0
-
Push or Pull, is this a question?
[4] Switch to Push-Based Execution Model by Mytherin · Pull Request #2393 · duckdb/duckdb (github.com)
-
Show HN: Hydra 1.0 – open-source column-oriented Postgres
It depends on your query, obviously.
In general, I did very deep benchmarking of pg, clickhouse and duckdb, and I sure didn't make stupid mistakes like this: https://news.ycombinator.com/item?id=36990831
My dataset has 50B rows and 2 TB of data. I think columnar DBs are very overhyped, and I chose pg because:
- pg performance is acceptable, maybe 2-3x slower than clickhouse and duckdb on some queries, if pg is configured correctly and runs on compressed storage
- clickhouse and duckdb start falling apart very fast because they specialize in a very narrow type of query: https://github.com/ClickHouse/ClickHouse/issues/47520 https://github.com/ClickHouse/ClickHouse/issues/47521 https://github.com/duckdb/duckdb/discussions/6696
-
🦆 Effortless Data Quality w/duckdb on GitHub ♾️
This action installs duckdb with the version provided in input.
-
Using SQL inside Python pipelines with Duckdb, Glaredb (and others?)
Duckdb: https://github.com/duckdb/duckdb - seems pretty popular, been keeping an eye on this for close to a year now.
-
CSV or Parquet File Format
The Parquet-Go library is very complex, and I have not yet succeeded in using it. So I asked whether DuckDB can provide an API: https://github.com/duckdb/duckdb/issues/7776
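The "Building a Distributed Data Warehouse Without Data Lakes" comment above describes sharding a single query across data that stays where it is. A minimal sketch of that idea follows, under loud assumptions: Python's built-in `sqlite3` stands in for DuckDB, in-memory databases stand in for remote sites, and the `events` table, `make_site`, and `sharded_totals` are hypothetical names invented for illustration. It does not use Bacalhau or any DuckDB API; it only shows the general pattern of running the same aggregate at each site and merging the small partial results.

```python
import sqlite3


def make_site(rows):
    """Create an in-memory database standing in for one site's local data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
    return conn


# The same query is sent to every site; only per-region partial
# aggregates come back, never the raw rows.
PARTIAL_QUERY = "SELECT region, COUNT(*), SUM(amount) FROM events GROUP BY region"


def sharded_totals(sites):
    """Run PARTIAL_QUERY on each site and merge the partial aggregates."""
    totals = {}
    for conn in sites:
        for region, count, total in conn.execute(PARTIAL_QUERY):
            prev_count, prev_total = totals.get(region, (0, 0.0))
            totals[region] = (prev_count + count, prev_total + total)
    return totals


# Two hypothetical sites, each holding its own slice of the data.
sites = [
    make_site([("eu", 10.0), ("us", 5.0)]),
    make_site([("eu", 2.5), ("apac", 7.0)]),
]
result = sharded_totals(sites)
print(result)
```

COUNT and SUM merge cleanly because partial results can simply be added; an average would have to be derived from the merged count and sum rather than by averaging per-site averages.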
toydb
-
ToyDB: A Rust learning adventure, fun open-source project, and database learning resource for the community
This is great, but you might want to consider a different name. There's already a Rust project called ToyDB, and it's a distributed database with a Raft log, SQL, disk persistence, ACID transactions, etc. It's under active development (though the developer now works at Cockroach Labs), and has 5K stars on GitHub, so I think they have the right to the name.
- What would you rewrite in Rust?
-
Any ideas for resume
Build something you’d like to learn about. Things I’ve considered replicating: A distributed database (see https://github.com/erikgrinaker/toydb), an interpreter (crafting interpreters is a good book), a Ray tracer (http://raytracerchallenge.com/), an RPC compiler and framework, a simpler neural network framework ( https://github.com/pjreddie/darknet)…
-
Which software do you think would be essential for the RISC-V to be successful?
Hilariously, I was trying out ToyDB on the Lichee-RV recently. While it does compile and run the five-node example setup (and memory usage is surprisingly low, which is a plus considering the 0.5GB of RAM), performance is three orders of magnitude lower than on a desktop x86 PC. Some of that is due to just having a single core run 5 nodes, some is due to the lower clock speed and slower memory, and some is due to slower storage (SD card). I don't think that explains everything, so I may investigate that later.
-
Learning Rust You Need a Cognitive Frame
toydb
-
Database Development
Well I think if you could replicate this https://github.com/erikgrinaker/toydb anybody would hire you.
- SimpleDB: A Basic RDBMS Built from Scratch
- Ask HN: What are some good rust code to read to learn the language?
- Distributed SQL database in Rust, written as a learning project
- ToyDB: Distributed SQL Database in Rust
What are some alternatives?
ClickHouse - ClickHouse® is a free analytics DBMS for big data
surrealdb - A scalable, distributed, collaborative, document-graph database, for the realtime web
sqlite-worker - A simple, and persistent, SQLite database for Web and Workers.
prql - PRQL is a modern language for transforming data — a simple, powerful, pipelined SQL replacement
datasette - An open source multi-tool for exploring and publishing data
bustub - The BusTub Relational Database Management System (Educational)
octosql - OctoSQL is a query tool that allows you to join, analyse and transform data from multiple databases and file formats using SQL.
duckdb-rs - Ergonomic bindings to duckdb for Rust
metabase-clickhouse-driver - ClickHouse database driver for the Metabase business intelligence front-end
talent-plan - open source training courses about distributed database and distributed systems
datafusion - Apache DataFusion SQL Query Engine
sled - the champagne of beta embedded databases