reproducible benchmark of database-like ops (by h2oai)

Db-benchmark Alternatives

Similar projects and alternatives to db-benchmark

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a better db-benchmark alternative or higher similarity.


Reviews and mentions

Posts with mentions or reviews of db-benchmark. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-10-12.
  • Data cleaning/ analysis 100-200 million rows of data. Is this doable in R, or is there another program I should try instead?
    reddit.com/r/rstats | 2021-10-12
    Yes, data.table can handle this. But your limiting factor might be RAM. This benchmark shows that data.table can load a billion rows (9 columns) into RAM faster than other solutions (Source). They run their benchmark on a machine with 50 GB of RAM.
  • [S] I want to introduce C++ DataFrame
    Might you consider PR'ing your library into this benchmark: https://github.com/h2oai/db-benchmark? I'm sure it would make for a useful comparison and also raise the profile of your work.
    The link you missed in the README is to this page: https://h2oai.github.io/db-benchmark/
    Wow, looking at this benchmark posted elsewhere in the thread I'm quite impressed by how fast Julia's DataFrames are considering it's a high-level language. Not as fast as Polars in Python though!
  • Scikit-Learn Version 1.0
    news.ycombinator.com | 2021-09-14
    data.table is faster to write and faster to run.


  • Polars 0.16.0 is out!
    reddit.com/r/rust | 2021-09-14
    In this particular case the reason is that it is the most performant DataFrame API in the H2O benchmark, so IMO it is justified.
  • Tips for a beginner in data science
    reddit.com/r/brdev | 2021-09-03
  • Apache Arrow Datafusion 5.0.0 release
    news.ycombinator.com | 2021-08-24
    > - Is it possible to handle data larger than fits into RAM?

    Not at the moment, but the community has plans to add support for disk spill.

    > - Any benchmark? like: https://h2oai.github.io/db-benchmark/ ( see 50GB + Join -> "timeout" | "out of memory" )

    One of the committers, Daniel, is working on an h2oai db-benchmark PR for DataFusion :)

    news.ycombinator.com | 2021-08-24
    There is a PR from me for db-benchmark. For the group-by benchmarks, on my machine, it is currently somewhat slower than the fastest (Polars).


    We also support running TPC-H benchmarks. The queries we can run already finish faster than on Spark. We are planning to do more benchmarking and optimization in the future.

  • R vs Stata
    reddit.com/r/RStudio | 2021-08-15
    If you want to start being called in-depth, I would recommend following this format: https://h2oai.github.io/db-benchmark
  • Nushell 0.34 released - the first release with dataframe support
    reddit.com/r/rust | 2021-07-14
    Congrats team and great work @elferherrera! Note that this is backed by Polars and Arrow, and is as fast as it gets. :)
  • R, I love you.
  • Thoughts on Julia Programming Language
    Some benchmarks for dataframes: https://h2oai.github.io/db-benchmark/.
  • JavaScript and the next decade of data programming
    news.ycombinator.com | 2021-06-04
    So much mention of dplyr and pandas being slow, and yet not a single mention of data.table.


  • Julia 1.6 addresses latency issues
    news.ycombinator.com | 2021-05-25
    For a non-numerical benchmark where Julia does really well, you should also check out https://h2oai.github.io/db-benchmark/. There's a bunch of work to be done to improve DataFrames.jl more, but it's already one of the fastest tools for what it does (despite having only reached version 1.0 a few weeks ago).


Basic db-benchmark repo stats
20 days ago

h2oai/db-benchmark is an open source project licensed under the Mozilla Public License 2.0, which is an OSI-approved license.
