wordcount
raikv
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
wordcount
-
Performance comparison: counting words in Python, Go, C++, C, AWK, Forth, and Rust
Another similar experiment, and this one includes Java :)
-
Performance comparison: counting words in Python, Go, C++, C, Awk, Forth, Rust
There is a great similar project here [1]. Java's performance there is very impressive.
[1] https://github.com/juditacs/wordcount
raikv
-
New x86 micro-op vulnerability breaks all known Spectre defenses
I have a graph for this:
https://github.com/raitechnology/raikv/blob/master/graph/mt_...
The CPU in this case is a Threadripper 3970X: 32 cores, 64 SMT threads.
My experience is this: when the L3 cache is effective, hiding memory latency with prefetch works well across SMT threads. If a hashtable load requires a chain walk, the SMT latency hiding is less effective, because the calculated prefetch location is not where the key is actually found. As the load increased, I couldn't get prefetching multiple slots to be as effective as prefetching a single slot.
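To make the prefetch point concrete, here is a minimal sketch of a batched lookup that prefetches each key's home slot before probing. It is not raikv's code: the `Table` layout, the `mix` hash, and `lookup_batch` are hypothetical names invented for illustration, and `__builtin_prefetch` is a GCC/Clang builtin. When a key sits in its home slot, the prefetched cache line is the one that gets read; when the probe has to walk past it, the later slots may not have been prefetched, which is the case described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using u64 = std::uint64_t;

// Hypothetical open-addressing table with linear probing (not raikv's layout).
struct Slot {
    u64 key;    // 0 marks an empty slot in this sketch
    u64 value;
};

struct BatchResult {
    std::size_t hits;    // keys found
    std::size_t probes;  // total slots inspected
};

struct Table {
    std::vector<Slot> slots;  // value-initialized, so every slot starts empty

    explicit Table(std::size_t capacity) : slots(capacity) {
        // Capacity must be a power of two so masking can replace modulo.
        assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
    }

    // A simple 64-bit mixer, chosen for illustration only.
    static u64 mix(u64 k) {
        k ^= k >> 33;
        k *= 0xff51afd7ed558ccdULL;
        k ^= k >> 33;
        return k;
    }

    std::size_t home(u64 key) const { return mix(key) & (slots.size() - 1); }

    // Ask the CPU to start loading the key's home slot (read, high locality).
    void prefetch(u64 key) const { __builtin_prefetch(&slots[home(key)], 0, 3); }

    bool insert(u64 key, u64 value) {
        assert(key != 0);
        const std::size_t mask = slots.size() - 1;
        std::size_t i = home(key);
        for (std::size_t n = 0; n < slots.size(); ++n) {
            if (slots[i].key == 0 || slots[i].key == key) {
                slots[i] = Slot{key, value};
                return true;
            }
            i = (i + 1) & mask;  // walk to the next slot
        }
        return false;  // table is full
    }

    // probes reports how many slots were inspected; 1 means a home-slot hit.
    bool find(u64 key, u64 &value, std::size_t &probes) const {
        const std::size_t mask = slots.size() - 1;
        std::size_t i = home(key);
        for (probes = 1; probes <= slots.size(); ++probes) {
            if (slots[i].key == key) { value = slots[i].value; return true; }
            if (slots[i].key == 0) return false;
            i = (i + 1) & mask;  // beyond the prefetched slot
        }
        return false;
    }
};

// Phase 1 issues every prefetch; phase 2 probes while the loads are in flight.
BatchResult lookup_batch(const Table &t, const std::vector<u64> &keys) {
    for (u64 k : keys) t.prefetch(k);
    BatchResult r{0, 0};
    for (u64 k : keys) {
        u64 v = 0;
        std::size_t p = 0;
        if (t.find(k, v, p)) ++r.hits;
        r.probes += p;
    }
    return r;
}
```

In this sketch, `probes / keys.size()` gives a rough measure of how often a lookup walks past the slot that was prefetched.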
-
Performance comparison: counting words in Python, Go, C++, C, Awk, Forth, Rust
Amusingly, I've done a multi-threaded version of the word counting program in order to test a shm kv store. I needed a benchmark that created a lot of cross-thread concurrent accesses to keys, and I found a blog about this test. My version has a serious constraint though: you have to create a shared memory map with enough space to hold all of the keys beforehand, as it doesn't resize the shm kv map as it runs.
This is the source for it:
https://github.com/raitechnology/raikv/blob/master/test/ctes...
The speedup of the multi-threaded version over the single-threaded version is roughly linear. The single-threaded version uses 2 threads, one to read stdin and one to hash the keys; the 16-threaded version uses one thread to read and 16 to hash.
$ time ctest -t 1 < ~/data/enwiki-p10p30303
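The one-reader, N-hasher layout described above can be sketched roughly as follows. This is not the linked test program: it swaps the shared-memory kv store for a hypothetical lock-striped `std::unordered_map` (`StripedCounts`), reads on the calling thread instead of a dedicated reader thread, and uses an unbounded batch queue. All names here are invented for illustration.

```cpp
#include <array>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <istream>
#include <mutex>
#include <queue>
#include <sstream>
#include <string>
#include <thread>
#include <unordered_map>
#include <utility>
#include <vector>

using Counts = std::unordered_map<std::string, std::size_t>;

// Hypothetical stand-in for a shared kv store: one mutex per stripe, so
// threads touching different stripes do not block each other.
class StripedCounts {
    static constexpr std::size_t kStripes = 64;
    struct Stripe { std::mutex mu; Counts map; };
    std::array<Stripe, kStripes> stripes_;
public:
    void add(const std::string &word) {
        Stripe &s = stripes_[std::hash<std::string>{}(word) % kStripes];
        std::lock_guard<std::mutex> lock(s.mu);
        ++s.map[word];
    }
    Counts merged() {
        Counts out;
        for (Stripe &s : stripes_) {
            std::lock_guard<std::mutex> lock(s.mu);
            for (const auto &kv : s.map) out[kv.first] += kv.second;
        }
        return out;
    }
};

// The caller reads words and queues them in batches; `workers` threads hash.
Counts count_words(std::istream &in, unsigned workers) {
    assert(workers > 0);
    StripedCounts counts;
    std::queue<std::vector<std::string>> batches;
    std::mutex mu;
    std::condition_variable cv;
    bool done = false;

    auto worker = [&] {
        for (;;) {
            std::vector<std::string> batch;
            {
                std::unique_lock<std::mutex> lock(mu);
                cv.wait(lock, [&] { return done || !batches.empty(); });
                if (batches.empty()) return;  // reader finished, queue drained
                batch = std::move(batches.front());
                batches.pop();
            }
            for (const std::string &w : batch) counts.add(w);
        }
    };
    std::vector<std::thread> pool;
    for (unsigned i = 0; i < workers; ++i) pool.emplace_back(worker);

    std::vector<std::string> batch;
    auto flush = [&] {
        if (batch.empty()) return;
        {
            std::lock_guard<std::mutex> lock(mu);
            batches.push(std::move(batch));
        }
        cv.notify_one();
        batch.clear();  // reset the moved-from vector to a known empty state
    };
    std::string word;
    while (in >> word) {
        batch.push_back(word);
        if (batch.size() == 1024) flush();
    }
    flush();
    {
        std::lock_guard<std::mutex> lock(mu);
        done = true;
    }
    cv.notify_all();
    for (std::thread &t : pool) t.join();
    return counts.merged();
}

// Convenience wrapper for trying the sketch on an in-memory string.
Counts count_words_in(const std::string &text, unsigned workers) {
    std::istringstream in(text);
    return count_words(in, workers);
}
```

Calling `count_words(std::cin, 16)` would mirror the one-reader, 16-hasher split; batching words keeps the queue lock from becoming the bottleneck. On most toolchains this needs thread support enabled at build time (for example `-pthread` with GCC or Clang).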
What are some alternatives?
countwords - Playing with counting word frequencies (and performance) in various languages.
adix - An Adaptive Index Library for Nim
KindleClippingsTranslator - Word reader (Czytacz slowek)
llfio - P1031 low level file i/o and filesystem library for the C++ standard
parallel-hashmap - A family of header-only, very fast and memory-friendly hashmap and btree containers.
generate-random-numbers
abseil-cpp - Abseil Common Libraries (C++)
word_frequency_nim - The word frequency program, written in simple nim.
CPython - The Python programming language