Metric | xxHash | Seastar
---|---|---
Mentions | 28 | 25
Stars | 8,500 | 8,018
Growth | - | 0.8%
Activity | 8.3 | 9.7
Last commit | 4 days ago | 6 days ago
Language | C | C++
License | GNU General Public License v3.0 or later | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
xxHash
- The One Billion Row Challenge in CUDA: from 17 minutes to 17 seconds
> GPU Hash Table?
How bad would performance have suffered if you sha256'd the lines to build the map? I'm going to guess "badly"?
Maybe something like this in CUDA: https://github.com/Cyan4973/xxHash ?
- ETag and HTTP Caching
- Day 64: Implementing a basic Bloom Filter Using Java BitSet API
Examples of fast, simple hashes that are independent enough include murmur, xxHash, the Fowler–Noll–Vo hash function, and many others
- Closed-addressing hashtables implementation
- NIST Retires SHA-1 Cryptographic Algorithm
If you're only using the hash for non-cryptographic applications, there are much faster hashes: https://github.com/Cyan4973/xxHash
- Does the checksum algorithm crc32c-intel support AMD Ryzen series 3000 or newer?
I found a benchmark result for the AMD Ryzen 5950X
- [Study Project] A memory-optimized JSON data structure
But what's the catch, you're thinking? Well, it is a bit slower than its counterparts when it comes to deserializing (and marginally faster for serializing). To achieve a smaller footprint, it uses a few tricks, notably a custom hash table to deduplicate strings. This comes at a cost of course (even when featuring xxHash to speed things up), but keeps the slowdown reasonable (I think).
- What do you typically use for non-cryptographic hash functions?
Non-cryptographic hashes have collisions. For example, suppose content like "abcdefg" hashes to "123"; with a weak hash algorithm, other content like "abcdefZ" can also hash to "123", which means the hash function fails to be a unique fingerprint of that content. BLAKE3, for example, can do 6-7Gb/s, which makes it both fast and secure. If your requirements accept collisions at a defined error rate, I would advise you to take a look at XXH3 if you need a very snappy hash algorithm, which can run at the pace of RAM access (30GB/s+). But again, run tests on the particular hardware you are targeting; the AES-hardware-accelerated MeowHash may serve you better.
- C++ gonna die😥
- rsync, article 3: How does rsync work?
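The Bloom filter post above notes that fast, reasonably independent non-cryptographic hashes are enough for that data structure. As a minimal sketch of the idea (in Python rather than the Java `BitSet` the post uses), the following derives the filter's bit positions from two FNV-1a hashes via double hashing. The `BloomFilter` class, the `seed` parameter, and the seed constant are assumptions made for illustration; none of them come from xxHash or from the linked posts, and a production filter would more likely use xxHash or murmur.

```python
FNV64_OFFSET = 0xCBF29CE484222325  # standard 64-bit FNV offset basis
FNV64_PRIME = 0x100000001B3        # standard 64-bit FNV prime
MASK64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes, seed: int = 0) -> int:
    """64-bit FNV-1a. XORing `seed` into the offset basis is an
    illustrative tweak to get a second hash, not part of standard FNV."""
    h = FNV64_OFFSET ^ seed
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & MASK64
    return h


class BloomFilter:
    """Hypothetical minimal Bloom filter backed by a bytearray."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        data = item.encode("utf-8")
        h1 = fnv1a_64(data)
        # Force the step to be odd so it is never zero.
        h2 = fnv1a_64(data, seed=0x9E3779B97F4A7C15) | 1
        for i in range(self.num_hashes):
            # Double hashing: position_i = h1 + i * h2 (mod num_bits).
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos >> 3] & (1 << (pos & 7))
            for pos in self._positions(item)
        )
```

After `bf = BloomFilter(1024, 3)` and `bf.add("xxHash")`, `bf.might_contain("xxHash")` is always `True`. A lookup for an item that was never added usually returns `False`, but it can return `True` at a false-positive rate that depends on the filter size and the number of hashes.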
Seastar
- I want to share my latest hobby project, dbeel: A distributed thread-per-core nosql db written in rust
I used glommio as the async executor (instead of something like tokio), and it is wonderful. For people wondering whether it's "good enough" or to use C++ and seastar (as I have thought about a lot before starting this project), take the leap of faith, it's fast - both in terms of run time and to code.
- How much reason is there to be multi-threaded in the k8s environment
b) It's now proven (e.g. by Seastar and Glommio) that the fastest way to run a multi-threaded application is to have one instance with one thread pinned per CPU core, and then to have fibers/lightweight threads on top handling all of the asynchronous code. Your approach of lots of instances is the slowest, since there will be a ton of unnecessary thread context switching.
- Are You Sure You Want to Use MMAP in Your Database Management System?
The most common example is DPDK [1]. It's a framework for building bespoke networking stacks that are usable from userspace, without involving the kernel.
You'll find DPDK mentioned a lot in the networking/HPC/data center literature. An example of a backend framework that uses DPDK is the seastar framework [2]. Also, I recently stumbled upon a paper for efficient RPC networks in data centers [3].
If you want to learn more, the p99 conference by ScyllaDB has tons of speakers talking about some interesting challenges.
[1] https://www.dpdk.org/
[2] https://github.com/scylladb/seastar
[3] https://github.com/erpc-io/eRPC
- Why does Actix-web's handler not require Send?
I assume Tokio itself (see e.g. monoio or glommio), but also Seastar for C++.
- What is DPDK library in C and how to learn it?
https://core.dpdk.org/supported/ lists supported NICs. You're best off just reading material from the DPDK website to figure out roughly what it is. It is used for a lot of different goals. For most web C++ stuff it's mainly used because you can avoid round trips of data passing through the kernel and can reference network data without tons of copying. For an example, check out the Seastar framework, https://seastar.io/, which is under the hood of ScyllaDB.
- How Numberly Replaced Kafka with a Rust-Based ScyllaDB Shard-Aware Application
As this is a Kafka sub, this may be a good opportunity to mention that Redpanda is based on the same framework (seastar) as Scylla. The idea of sharding work to CPU cores turns out to apply very well to the Kafka data model, too!
- What are some C++ projects with high quality code that I can read through?
Seastar, which is a thread-per-core runtime written by the Scylla devs that's used in both Redpanda and Scylla as the underlying runtime. https://github.com/scylladb/seastar
- Abstraction Is Expensive
ScyllaDB is, ironically, maybe one of the worst examples the author could have come up with for "abstraction" in the article.
If folks aren't familiar with their work/internal tech, go check out some of their repos like Seastar. They have some of the most talented systems programmers on the planet writing thin veneers over kernel and hardware APIs to squeeze out every ounce of performance.
https://github.com/scylladb/seastar
I know it's beside the point, but I just had to share because I thought that was funny
- Modern JVM Multithreading • Paweł Jurczenko • Devoxx Poland 2021
I’ve seen frameworks for C++ (https://seastar.io/) and Rust (https://github.com/actix/actix) which support what you’re describing out of the box.
- Who is using C++ for web development?
If you're interested in scaling and asynchronous programming in C++, I highly recommend you investigate the Seastar application framework. You wouldn't build a web service with Seastar directly; rather, you would build the infrastructure on top of which the web service runs. https://github.com/scylladb/seastar
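Several of the comments above describe the thread-per-core idea of sharding work across CPU cores so that each core owns its own slice of the data. The routing step can be sketched as follows. This is only an illustration of key-to-shard mapping in Python and is not Seastar's API: `shard_for` and `ShardedStore` are hypothetical names, `zlib.crc32` stands in for whatever hash a real system would use, and a real thread-per-core runtime would additionally pin one thread to each core and pass messages between shards instead of sharing one process-wide list.

```python
import zlib
from typing import Optional


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index. The same key always lands on the
    same shard, so only that shard ever touches the key's data."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


class ShardedStore:
    """Hypothetical key-value store split into per-shard dictionaries.

    In a thread-per-core design each dictionary would be owned by
    exactly one pinned thread, which removes the need for locks.
    """

    def __init__(self, num_shards: int) -> None:
        self.shards = [{} for _ in range(num_shards)]

    def put(self, key: str, value: str) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str) -> Optional[str]:
        return self.shards[shard_for(key, len(self.shards))].get(key)
```

A caller would typically size the store to the machine, for example `ShardedStore(os.cpu_count() or 1)`, so that the number of shards matches the number of cores.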
What are some alternatives?
BLAKE3 - the official Rust and C implementations of the BLAKE3 cryptographic hash function
Folly - An open-source C++ library developed and used at Facebook.
meow_hash - Official version of the Meow hash, an extremely fast level 1 hash
glommio - Glommio is a thread-per-core crate that makes writing highly parallel asynchronous applications in a thread-per-core architecture easier for rustaceans.
xxh - 🚀 Bring your favorite shell wherever you go through the ssh. Xonsh shell, fish, zsh, osquery and so on.
Boost.Asio - Asio C++ Library
blake3 - An AVX-512 accelerated implementation of the BLAKE3 cryptographic hash function
Boost - Super-project for modularized Boost
smhasher - Hash function quality and speed tests
ffead-cpp - Framework for Enterprise Application Development in c++, HTTP1/HTTP2/HTTP3 compliant, Supports multiple server backends
swift-crypto - Open-source implementation of a substantial portion of the API of Apple CryptoKit suitable for use on Linux platforms.
Qt - Qt Base (Core, Gui, Widgets, Network, ...)