|26 days ago||10 days ago|
|MIT License||Apache License 2.0|
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Factor is faster than Zig
11 projects | news.ycombinator.com | 10 Nov 2023
In my example the table stores the hash codes themselves instead of the keys (because the hash function is invertible)
Oh, I see, right. If determining the home bucket is trivial, then the back-shifting method is great. The issue is just that it’s not as much of a general-purpose solution as it may initially seem.
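To make the back-shifting point concrete, here is a minimal sketch of backward-shift deletion in a linear-probing table. It is not taken from any of the libraries discussed; the class name `LinearProbeSet`, the fixed capacity, and the trivial `key % capacity` home-bucket function are all assumptions chosen so that recomputing an entry's home bucket during deletion is cheap.

```python
EMPTY = None


class LinearProbeSet:
    """Fixed-capacity set of non-negative ints using linear probing.

    Hypothetical illustration only: deletion closes the gap by shifting
    later entries backward instead of leaving a tombstone.
    """

    def __init__(self, capacity=8):
        self.slots = [EMPTY] * capacity

    def _home(self, key):
        # Assumed-trivial home bucket; back-shifting must recompute this
        # for every entry it considers moving.
        return key % len(self.slots)

    def insert(self, key):
        n = len(self.slots)
        i = self._home(key)
        for _ in range(n):
            if self.slots[i] is EMPTY or self.slots[i] == key:
                self.slots[i] = key
                return
            i = (i + 1) % n
        raise RuntimeError("table full")

    def contains(self, key):
        n = len(self.slots)
        i = self._home(key)
        for _ in range(n):
            if self.slots[i] is EMPTY:
                return False
            if self.slots[i] == key:
                return True
            i = (i + 1) % n
        return False

    def erase(self, key):
        n = len(self.slots)
        i = self._home(key)
        for _ in range(n):
            if self.slots[i] is EMPTY:
                return False
            if self.slots[i] == key:
                break
            i = (i + 1) % n
        else:
            return False
        hole = i
        j = (hole + 1) % n
        for _ in range(n - 1):
            if self.slots[j] is EMPTY:
                break
            home = self._home(self.slots[j])
            # Move the entry at j into the hole only if the hole still lies
            # on its probe path, i.e. at or after its home bucket.
            if (j - home) % n >= (j - hole) % n:
                self.slots[hole] = self.slots[j]
                hole = j
            j = (j + 1) % n
        self.slots[hole] = EMPTY
        return True
```

When the home bucket is expensive to recompute (for example, when keys are long strings and hash codes are not stored), each step of that shifting loop pays for a full hash, which is the limitation the comment describes.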
“With a different algorithm (Robin Hood or bidirectional linear probing), the load factor can be kept well over 90% with good performance, as the benchmarks in the same repo demonstrate.”
I’ve seen the 90% claim made several times in the literature on Robin Hood hash tables. In my experience, the claim is a bit exaggerated, although I suppose it depends on what our idea of “good performance” is. See these benchmarks, which again go up to a maximum load factor of 0.95 (although Boost and Absl forcibly grow/rehash at 0.85-0.9):
Tsl, Martinus, and CC are all Robin Hood tables (https://github.com/Tessil/robin-map, https://github.com/martinus/robin-hood-hashing, and https://github.com/JacksonAllan/CC, respectively). Absl and Boost are the well-known SIMD-based hash tables. Khash (https://github.com/attractivechaos/klib/blob/master/khash.h) is, I think, an ordinary open-addressing table using quadratic probing. Fastmap is a new, yet-to-be-published design that is fundamentally similar to bytell (https://www.youtube.com/watch?v=M2fKMP47slQ) but also incorporates some aspects of the aforementioned SIMD maps (it caches a 4-bit fragment of the hash code to avoid most key comparisons).
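The idea of caching a small hash fragment can be sketched as follows. This is not the Fastmap design (which the comment says is unpublished) nor the Absl or Boost layout; the class `FragmentMap`, the multiplicative hash constant, and the counters are assumptions used purely for illustration. Each slot stores the top 4 bits of the hash, and a full key comparison happens only when those bits match.

```python
MASK64 = (1 << 64) - 1


class FragmentMap:
    """Hypothetical linear-probing map for int keys with a 4-bit hash fragment per slot."""

    def __init__(self, capacity=16):
        self.keys = [None] * capacity
        self.values = [None] * capacity
        self.frags = [0] * capacity
        self.slots_probed = 0      # occupied or empty slots visited
        self.key_comparisons = 0   # full key comparisons actually performed

    @staticmethod
    def _hash(key):
        # Simple multiplicative mixing so the high bits vary for small ints
        # (an assumption for this sketch, not any library's hash function).
        return (key * 0x9E3779B97F4A7C15) & MASK64

    def _find(self, key):
        h = self._hash(key)
        frag = h >> 60             # top 4 bits serve as the cached fragment
        n = len(self.keys)
        i = h % n
        for _ in range(n):
            self.slots_probed += 1
            if self.keys[i] is None:
                return i, frag, False
            if self.frags[i] == frag:
                # Only now is the (potentially expensive) key compare done.
                self.key_comparisons += 1
                if self.keys[i] == key:
                    return i, frag, True
            i = (i + 1) % n
        return -1, frag, False     # table full and key absent

    def put(self, key, value):
        i, frag, _ = self._find(key)
        if i < 0:
            raise RuntimeError("table full")
        self.keys[i], self.values[i], self.frags[i] = key, value, frag

    def get(self, key, default=None):
        i, _, found = self._find(key)
        return self.values[i] if found else default
```

With 4 bits, a non-matching occupied slot should be rejected without a key comparison roughly 15 times out of 16 under a well-mixed hash, which matters most when keys are costly to compare.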
As you can see, all the Robin Hood maps spike upwards dramatically as the load factor gets high, becoming as much as 5-6 times slower at 0.95 vs 0.5 in one of the benchmarks (uint64_t key, 256-bit struct value: Total time to erase 1000 existing elements with N elements in map). Only the SIMD maps (with Boost being the better performer) and Fastmap appear mostly immune to load factor in all benchmarks, although the SIMD maps do - I believe - use tombstones for deletion.
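As a rough guide to why cost climbs so steeply near full occupancy, the classic textbook approximations (Knuth) for plain linear probing under uniform hashing can be evaluated directly. These are not measurements of any library above, and Robin Hood tables mainly reduce the variance of probe lengths rather than the average; the function names are hypothetical.

```python
def expected_probes_hit(load):
    """Approximate probes for a successful lookup at the given load factor."""
    return 0.5 * (1.0 + 1.0 / (1.0 - load))


def expected_probes_miss(load):
    """Approximate probes for an unsuccessful lookup (or an insertion)."""
    return 0.5 * (1.0 + 1.0 / (1.0 - load) ** 2)


if __name__ == "__main__":
    for load in (0.5, 0.75, 0.9, 0.95):
        hit = expected_probes_hit(load)
        miss = expected_probes_miss(load)
        print(f"load={load:.2f}  hit={hit:6.1f}  miss={miss:6.1f}")
```

Under these formulas a successful lookup goes from 1.5 probes at a load factor of 0.5 to about 10.5 at 0.95, and an unsuccessful one from 2.5 to about 200.5, so a several-fold slowdown at 0.95 is unsurprising for designs that do not bound or filter probes.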
I’ve only read briefly about bi-directional linear probing – never experimented with it.
A simple hash table in C
7 projects | news.ycombinator.com | 13 Jun 2023
So what's the best data structures and algorithms library for C?
8 projects | /r/C_Programming | 15 Mar 2023
It could be that the cost of the function calls, either direct or via a pointer, is drowned out by the cost of the one or more cache misses inevitably incurred with every hash table lookup. But I don't want to say too much before I've finished my benchmarking project and published the results. So let me just caution against laser-focusing on whether the comparator and hash function are/can be inlined. For example, stb_ds uses a hardcoded hash function that presumably gets inlined, but my benchmarking (again, I'll publish it here in the coming weeks) shows it to be generally a poor performer (in comparison to not just CC, the current version of which doesn't necessarily inline those functions, but also STC, khash, and the C++ Robin Hood hash tables I tested).
Generic dynamic array in 60 lines of C
4 projects | news.ycombinator.com | 28 Feb 2023
Not an entirely uncommon idea. I've written one.
There's also a well-known one here, in klib: https://github.com/attractivechaos/klib/blob/master/kvec.h
C_dictionary: A simple dynamically typed and sized hashmap in C - feedback welcome
10 projects | /r/C_Programming | 23 Jan 2023
11 projects | /r/cpp | 18 Nov 2022
The New Ghostscript PDF Interpreter
4 projects | news.ycombinator.com | 31 Jul 2022
Code reuse is achievable by (mis)using the preprocessor. It is possible to build a somewhat usable API, even for intrusive data structures (e.g. the Linux kernel and klib).
I do agree that generics are required for modern programming, but for some, the cost of complexity of modern languages (compared to C) and the importance of compatibility seem to outweigh the benefits.
2 projects | /r/C_Programming | 10 Jul 2022
boost::unordered map is a new king of data structures
10 projects | /r/cpp | 30 Jun 2022
Unordered hash map shootout
CMAP = https://github.com/tylov/STC
KMAP = https://github.com/attractivechaos/klib
PMAP = https://github.com/greg7mdp/parallel-hashmap
FMAP = https://github.com/skarupke/flat_hash_map
RMAP = https://github.com/martinus/robin-hood-hashing
HMAP = https://github.com/Tessil/hopscotch-map
TMAP = https://github.com/Tessil/robin-map
UMAP = std::unordered_map

Usage: shootout [n-million=40 key-bits=25]

Random keys are in range [0, 2^25). Seed = 1656617916:

T1: Insert/update random keys:
KMAP: time: 1.949, size: 15064129, buckets: 33554432, sum: 165525449561381
CMAP: time: 1.649, size: 15064129, buckets: 22145833, sum: 165525449561381
PMAP: time: 2.434, size: 15064129, buckets: 33554431, sum: 165525449561381
FMAP: time: 2.112, size: 15064129, buckets: 33554432, sum: 165525449561381
RMAP: time: 1.708, size: 15064129, buckets: 33554431, sum: 165525449561381
HMAP: time: 2.054, size: 15064129, buckets: 33554432, sum: 165525449561381
TMAP: time: 1.645, size: 15064129, buckets: 33554432, sum: 165525449561381
UMAP: time: 6.313, size: 15064129, buckets: 31160981, sum: 165525449561381

T2: Insert sequential keys, then remove them in same order:
KMAP: time: 1.173, size: 0, buckets: 33554432, erased 20000000
CMAP: time: 1.651, size: 0, buckets: 33218751, erased 20000000
PMAP: time: 3.840, size: 0, buckets: 33554431, erased 20000000
FMAP: time: 1.722, size: 0, buckets: 33554432, erased 20000000
RMAP: time: 2.359, size: 0, buckets: 33554431, erased 20000000
HMAP: time: 0.849, size: 0, buckets: 33554432, erased 20000000
TMAP: time: 0.660, size: 0, buckets: 33554432, erased 20000000
UMAP: time: 2.138, size: 0, buckets: 31160981, erased 20000000

T3: Remove random keys:
KMAP: time: 1.973, size: 0, buckets: 33554432, erased 23367671
CMAP: time: 2.020, size: 0, buckets: 33218751, erased 23367671
PMAP: time: 2.940, size: 0, buckets: 33554431, erased 23367671
FMAP: time: 1.147, size: 0, buckets: 33554432, erased 23367671
RMAP: time: 1.941, size: 0, buckets: 33554431, erased 23367671
HMAP: time: 1.135, size: 0, buckets: 33554432, erased 23367671
TMAP: time: 1.064, size: 0, buckets: 33554432, erased 23367671
UMAP: time: 5.632, size: 0, buckets: 31160981, erased 23367671

T4: Iterate random keys:
KMAP: time: 0.748, size: 23367671, buckets: 33554432, repeats: 8, sum: 4465059465719680
CMAP: time: 0.627, size: 23367671, buckets: 33218751, repeats: 8, sum: 4465059465719680
PMAP: time: 0.680, size: 23367671, buckets: 33554431, repeats: 8, sum: 4465059465719680
FMAP: time: 0.735, size: 23367671, buckets: 33554432, repeats: 8, sum: 4465059465719680
RMAP: time: 0.464, size: 23367671, buckets: 33554431, repeats: 8, sum: 4465059465719680
HMAP: time: 0.719, size: 23367671, buckets: 33554432, repeats: 8, sum: 4465059465719680
TMAP: time: 0.662, size: 23367671, buckets: 33554432, repeats: 8, sum: 4465059465719680
UMAP: time: 6.168, size: 23367671, buckets: 31160981, repeats: 8, sum: 4465059465719680

T5: Lookup random keys:
KMAP: time: 0.943, size: 23367671, buckets: 33554432, lookups: 34235332, found: 29040438
CMAP: time: 0.863, size: 23367671, buckets: 33218751, lookups: 34235332, found: 29040438
PMAP: time: 1.635, size: 23367671, buckets: 33554431, lookups: 34235332, found: 29040438
FMAP: time: 0.969, size: 23367671, buckets: 33554432, lookups: 34235332, found: 29040438
RMAP: time: 1.705, size: 23367671, buckets: 33554431, lookups: 34235332, found: 29040438
HMAP: time: 0.712, size: 23367671, buckets: 33554432, lookups: 34235332, found: 29040438
TMAP: time: 0.584, size: 23367671, buckets: 33554432, lookups: 34235332, found: 29040438
UMAP: time: 1.974, size: 23367671, buckets: 31160981, lookups: 34235332, found: 29040438
C++ containers but in C
8 projects | /r/C_Programming | 8 Mar 2022
How to build an Ionic Barcode Scanner with Capacitor
2 projects | /r/ionic | 8 May 2023
The biggest difference between the two plugins is the SDK used to recognise the barcodes. The Capacitor Community Barcode Scanner plugin currently uses the ZXing decoder and the Capacitor ML Kit Barcode Scanning plugin uses the ML Kit from Google.
The Basics of how QR codes work
4 projects | /r/programming | 18 Nov 2022
Tool Request: Investigate QR Codes in emails
2 projects | /r/cybersecurity | 10 Aug 2022
Guest WiFi using a QR code
6 projects | news.ycombinator.com | 12 Jul 2022
How to decode a QR-code image in (preferably pure) Python?
4 projects | /r/codehunter | 9 Apr 2022
PyXing (website here) is supposedly a Python port of the popular Java ZXing library, but the initial and only commit is 6 years old and the project has no readme or documentation whatsoever.
Java Command-Line, GUI and Web Apps for Scanning Barcode and QR Code
2 projects | dev.to | 24 Mar 2022
In the previous article, we discussed how to construct a command-line barcode and QR code scanning application using Java and Dynamsoft Barcode Reader. In this article, we will create fancier applications, such as a desktop GUI application and a web application. In addition, we will import the ZXing SDK to make a comparison with Dynamsoft Barcode Reader.
Scalable Processing of Swiss PDF Documents using 2D Barcodes on AWS
5 projects | dev.to | 5 Nov 2021
Extract the raw barcode from candidate regions: The extraction of barcodes from the candidate regions is being performed with zxing, an open source library which supports many variations of 1D and 2D barcodes.
Mechanical sympathy for QR codes: making NSW check-in better
4 projects | news.ycombinator.com | 12 Oct 2021
Since most government software is written in Java, the QR codes on PDF check-in posters are probably being generated with ZXing.
Generating a QR code with only ARM Assembly
5 projects | news.ycombinator.com | 8 Sep 2021
zxing has an online decoder -- https://zxing.org/w/decode.jspx -- but I've seen it fail so often that I'm wondering whether it's very "strict" / doesn't account for image distortion that much?
Or does the web site use hardcoded settings that are more tweakable in the library?
Huh, interesting -- zxing is in "Maintenance Mode Only": https://github.com/zxing/zxing
I'd be suuuuper interested to know: what's the state of the art in QR decoding, in terms of decoding speed, robustness to various image distortions, and (which I've always dreamed of) the ability to partially decode a QR code, even if the error correction fails for some parts of it, or even totally? i.e. a very very verbose debugging/decoding mode?
What are some alternatives?
ZBar - Clone of the mercurial repository http://zbar.hg.sourceforge.net:8000/hgroot/zbar/zbar
mlkit - A collection of sample apps to demonstrate how to use Google's ML Kit APIs on Android and iOS
C++ Format - A modern formatting library
Thumbnailator - Thumbnailator - a thumbnail generation library for Java
Tess4J - Java JNA wrapper for Tesseract OCR API
Code Scanner - Code scanner library for Android, based on ZXing
PHP CPP - Library to build PHP extensions with C++
Picasso - A powerful image downloading and caching library for Android
stb - stb single-file public domain libraries for C/C++
TwelveMonkeys - TwelveMonkeys ImageIO: Additional plug-ins and extensions for Java's ImageIO
Imgscalr - Simple Java image-scaling library implementing Chris Campbell's incremental scaling algorithm as well as Java2D's "best-practices" image-scaling techniques.