hashtable-benchmarks
zestginx
Metric | hashtable-benchmarks | zestginx
---|---|---
Mentions | 8 | 2
Stars | 29 | 33
Growth | - | -
Activity | 4.7 | 0.0
Last commit | 5 months ago | almost 2 years ago
Language | Java | C
License | Apache License 2.0 | -
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
hashtable-benchmarks
-
Building a faster hash table for high performance SQL joins
Since the blog post mentioned a PR to replace linear probing with Robin Hood, I just wanted to mention that I found bidirectional linear probing to outperform Robin Hood across the board in my Java integer set benchmarks:
https://github.com/senderista/hashtable-benchmarks/blob/mast...
https://github.com/senderista/hashtable-benchmarks/wiki/64-b...
-
Ask HN: Who wants to be hired? (December 2023)
https://homes.cs.washington.edu/~magda/papers/wang-cidr17.pd...
I'm most interested in developing high-performance database engines in low-level languages, but open to any challenging systems programming project. I've been working in C++ for the last 3 years, but have written nontrivial projects in Rust and Java as well (e.g., https://github.com/senderista/rotated-array-set, https://github.com/senderista/hashtable-benchmarks). I would enjoy using Rust or Zig on a new project, but I consider the project itself to be much more important than the language it's written in. I am not interested in cryptocurrency, adtech, or fintech projects.
-
Factor is faster than Zig
Thanks for the details on your benchmarks. I would like to extend BLP to a more generic setting sometime; as I said, I think any trick used with RH would also work with BLP. I just used an integer set because that's all I needed for my use case, and it made it easy to implement several different approaches for benchmarking. As you note, it favors use cases where the hash function is cheap (or invertible) and elements are cheap to move around.
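To illustrate what "invertible" can mean for an integer hash, here is a minimal sketch of a 64-bit mixer whose every step can be undone, so a table could in principle store the hashed value and still recover the original key. The class name, method names, and the multiplier are assumptions chosen for illustration; this is not the hash function used in hashtable-benchmarks.

```java
class InvertibleHash {
    // Any odd 64-bit multiplier is invertible modulo 2^64 (this constant is an arbitrary choice).
    static final long C = 0x9E3779B97F4A7C15L;
    static final long C_INV = inverse(C);

    // Modular inverse of an odd number mod 2^64 via Newton iteration.
    // Starting from inv = c gives 3 correct bits; each step doubles that.
    static long inverse(long c) {
        long inv = c;
        for (int i = 0; i < 5; i++) {
            inv *= 2 - c * inv;
        }
        return inv;
    }

    // x ^ (x >>> 32) is its own inverse, and multiplying by an odd constant is a bijection.
    static long hash(long x) {
        x ^= x >>> 32;
        x *= C;
        x ^= x >>> 32;
        return x;
    }

    // Applies the inverse steps in reverse order.
    static long unhash(long h) {
        h ^= h >>> 32;
        h *= C_INV;
        h ^= h >>> 32;
        return h;
    }

    public static void main(String[] args) {
        long key = 123456789L;
        System.out.println(unhash(hash(key)) == key); // prints "true"
    }
}
```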
About your question on load factors: no, the benchmarks are measuring exactly what they claim to be. The hash table constructor divides max data size by load factor to get the table size (https://github.com/senderista/hashtable-benchmarks/blob/mast...), and the benchmark code instantiates each hash table for exactly the measured data set size and load factor (https://github.com/senderista/hashtable-benchmarks/blob/mast...).
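The sizing rule described above (table size = maximum data size divided by load factor) can be sketched as follows. The class and method names are hypothetical, and rounding up is an assumption; the linked constructor may round or align the size differently.

```java
class TableSizing {
    // Hypothetical helper: number of slots needed so that maxEntries
    // elements fill the table to at most the given load factor.
    static int tableSize(int maxEntries, double loadFactor) {
        if (maxEntries < 0 || loadFactor <= 0.0 || loadFactor > 1.0) {
            throw new IllegalArgumentException("invalid size or load factor");
        }
        return (int) Math.ceil(maxEntries / loadFactor);
    }

    public static void main(String[] args) {
        // One million elements at load factor 0.5 need two million slots.
        System.out.println(tableSize(1_000_000, 0.5)); // prints 2000000
    }
}
```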
I can't explain the peaks around 1M in many of the plots; I didn't investigate them at the time and I don't have time now. It could be a JVM artifact, but I did try to use JMH "best practices", and there's no dynamic memory allocation or GC happening during the benchmark at all. It would be interesting to port these tables to Rust and repeat the measurements with Criterion. For more informative graphs I might try a log-linear approach: divide the intervals between the logarithmically spaced data sizes into a fixed number of subintervals (say 4).
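One way to read the log-linear idea is sketched below: start from geometrically spaced data sizes and split each gap into a fixed number of equal-width steps. The class name, method name, and parameters are hypothetical, and this is not code from the benchmark suite.

```java
import java.util.ArrayList;
import java.util.List;

class LogLinearSizes {
    // Generates data sizes that are geometrically spaced at the coarse level
    // (first, first*base, first*base^2, ...) and linearly spaced within each gap.
    static List<Long> sizes(long first, long base, int intervals, int subintervals) {
        List<Long> out = new ArrayList<>();
        long lo = first;
        for (int d = 0; d < intervals; d++) {
            long hi = lo * base;
            for (int i = 0; i < subintervals; i++) {
                // i-th of `subintervals` equal steps from lo toward hi
                out.add(lo + (hi - lo) * i / subintervals);
            }
            lo = hi;
        }
        out.add(lo); // include the final endpoint
        return out;
    }

    public static void main(String[] args) {
        // Two decades starting at 1000, each split into 4 subintervals.
        System.out.println(sizes(1000, 10, 2, 4));
        // prints [1000, 3250, 5500, 7750, 10000, 32500, 55000, 77500, 100000]
    }
}
```

With 4 subintervals per gap, a feature such as the peaks near 1M would be sampled by several nearby sizes instead of a single point.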
-
Inside boost::unordered_flat_map
I think "bidirectional linear probing" is an underrated approach (and much simpler): https://github.com/senderista/hashtable-benchmarks/blob/master/src/main/java/set/int64/BLPLongHashSet.java
-
A fast & densely stored hashmap and hashset based on robin-hood backward shift deletion
I will probably never get around to porting my bidirectional linear probing integer hash set from Java to C++, but I hope someone can try adapting BLP to general C++ hashmaps and hashsets, because it significantly outperforms Robin Hood in my benchmarks.
-
Ask HN: Who wants to be hired? (March 2022)
https://homes.cs.washington.edu/~magda/papers/wang-cidr17.pd...
I'm most interested in developing high-performance database engines in low-level languages, but open to any challenging systems programming project. I've been working in C++ for the last 2 years, but have written nontrivial projects in Rust and Java as well (e.g., https://github.com/senderista/rotated-array-set, https://github.com/senderista/hashtable-benchmarks). I would enjoy using Rust or Zig on a new project, but I consider the project itself to be much more important than the language it's written in. I am not interested in cryptocurrency, adtech, or fintech projects.
zestginx
-
Ask HN: Who wants to be hired? (March 2022)
Location: London, United Kingdom
Remote: Yes
Willing to relocate: Yes [within reason]
Technologies: C#, Python, System Administration, Linux System Administration, JS, VM Management, Cloudflare Workers, whilst eager to learn other technologies.
Résumé/CV: https://www.linkedin.com/in/dneiroukh
Email: dnairoukh at thezest dot dev
I'm Diab Neiroukh, a student taking a year out and looking for part-time or full-time work during that year; I'm also happy to continue any full-time role part-time once my studies resume [provided the hours are reasonable]. I study a joint major in Computing Science and Physics (BSc).
I have at least three years of experience managing Linux servers, and once set up a Windows network for a local private school. I know C#, JS, and Python [the latter more so] but am adaptable to other languages (see my work in C on Android/Linux kernels, and brief Java usage on other Android projects) and would be open to work as a Junior Developer/Software Engineer. I also maintain Zestginx [1], a fork of NGINX that aims to improve the web server's performance [and security].
I've also dabbled in many areas outside of the above (from audio signal processing to compilers and assembly to graphic design), a few of which are visible on my GitHub [2] and GitHub Gists [3]. This often lets me offer a unique view on topics using knowledge from other "departments".
[1] - https://github.com/ZestProjects/zestginx
[2] - https://github.com/lazerl0rd
[3] - https://gist.github.com/lazerl0rd
-
Ask HN: Who wants to be hired? (February 2022)
I'm Diab Neiroukh, a student taking a year out and looking for part-time or full-time work during that year; I'm also happy to continue any full-time role part-time once my studies resume [provided the hours are reasonable]. I study a joint major in Computing Science and Physics (BSc).
I have at least three years of experience managing Linux servers, and once set up a Windows network for a local private school. I know C#, JS, and Python [the latter more so] but am adaptable to other languages (see my work in C on Android/Linux kernels, and brief Java usage on other Android projects) and would be open to work as a Junior Developer/Software Engineer. I also maintain Zestginx [1], a fork of NGINX that aims to improve the web server's performance [and security].
I've also dabbled in many areas outside of the above (from audio signal processing to compilers and assembly to graphic design), a few of which are visible on my GitHub [2] and GitHub Gists [3]. This often lets me offer a unique view on topics using knowledge from other "departments".
[1] - https://github.com/ZestProjects/zestginx
What are some alternatives?
unordered_dense - A fast & densely stored hashmap and hashset based on robin-hood backward shift deletion
boden - Purely native C++ cross-platform GUI framework for Android and iOS development. https://www.boden.io
myria - Myria is a scalable Analytics-as-a-Service platform based on relational algebra.
nafeez.xyz - ⚡ My personal website.
js2scheme
rapid-react - A light weight interactive CLI Automation Tool 🛠️ for rapid scaffolding of tailored React apps with Create React App under the hood.
flat_hash_map - A very fast hashtable
robin-hood-hashing - Fast & memory efficient hashtable based on robin hood hashing for C++11/14/17/20
G3root
resume