Friendlier SQL with DuckDB

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

Our great sponsors
  • SonarLint - Deliver Cleaner and Safer Code - Right in Your IDE of Choice!
  • JetBrains - Developer Ecosystem Survey 2022
  • Scout APM - Less time debugging, more time building
  • duckdb

    DuckDB is an in-process SQL OLAP Database Management System

    Good questions! You are correct that it depends. We do have indexes to help with point queries, but they are not going to be quite as fast as SQLite because DuckDB stores data in a columnar format. (Soon they will be persistent - see comments above!) That columnar format is really great for scanning many items, but not optimal for grabbing all of a single row.

    With DuckDB, bulk inserts are your friend and are actually super fast.

    Definitely let us know what you find! Just open up a discussion on Github if you'd like to share what you find out: https://github.com/duckdb/duckdb/discussions

  • sqlitescanner

    DuckDB extension to read SQLite databases

    Excellent question! I'll jump in - I am a part of the DuckDB team though, so if other users have thoughts it would be great to get other perspectives as well.

    First things first - we really like quite a lot about the SQLite approach. DuckDB is similarly easy to install and is built without dependencies, just like SQLite. It also runs in the same process as your application just like SQLite does. SQLite is excellent as a transactional database - lots of very specific inserts, updates, and deletes (called OLTP workloads). DuckDB can also read directly out of SQLite files as well, so you can mix and match them! (https://github.com/duckdblabs/sqlitescanner)

    DuckDB is much faster than SQLite when doing analytical queries (OLAP) like when calculating summaries or trends over time, or joining large tables together. It can use all of your CPU cores for sometimes ~100x speedup over SQLite.

    DuckDB also has some enhancements with respect to data transfer in and out of it. It can natively read Pandas, R, and Julia dataframes, and can read parquet files directly also (meaning without inserting first!).

    Does that help? Happy to add more details!

  • SonarLint

    Deliver Cleaner and Safer Code - Right in Your IDE of Choice!. SonarLint is a free and open source IDE extension that identifies and catches bugs and vulnerabilities as you code, directly in the IDE. Install from your favorite IDE marketplace today.

  • db-benchmark

    reproducible benchmark of database-like ops

    Hi, good to hear that you guys care about testing. One thing apart from the Github issues that led me to believe it might not be super stable yet was the benchmark results on https://h2oai.github.io/db-benchmark/ which make it look like it couldn't handle the 50GB case due to a out of memory error. I see that the benchmark and the used versions are about a year old so maybe things changed a lot since then. Can you chime in regarding the current story of running bigger DBs like 1TB on a machine with just 32GB or so RAM? Especially regardung data mutations and DDL queries. Thanks!

  • go-duckdb

    go-duckdb provides a database/sql driver for the DuckDB database engine.

    Here is a solved Github Issue related to CGO for the Go bindings! If you have another issue, please feel free to post it on their Github page!

    https://github.com/marcboeker/go-duckdb/issues/4

  • budibase

    Budibase is an open-source low code platform that helps you build internal tools in minutes 🚀

  • postgresscanner

    Interesting thought! I have not tried this yet so I only have a guess as an answer. Could you export the data as SQL statements and then run those statements on DuckDB? That may be easier to set up, but may take longer to run...

    DuckDB also has the ability to read Postgres data directly, and there is a Postgres FDW that can read from DuckDB!

    https://github.com/duckdblabs/postgresscanner

    https://github.com/alitrack/duckdb_fdw

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts