KindleClippingsTranslator
raikv
Metric | KindleClippingsTranslator | raikv
---|---|---
Mentions | 1 | 2
Stars | 1 | 7
Growth | - | -
Activity | 0.0 | 7.3
Last commit | almost 10 years ago | 3 months ago
Language | Python | C++
License | - | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
KindleClippingsTranslator
- Performance comparison: counting words in Python, Go, C++, C, Awk, Forth, Rust
Sure, here it is [0]. This one parses Kindle clippings, but it's quite similar (except that it doesn't sort all the words; I usually used some service to convert mobi to txt before feeding the file to the script). I should probably clean it up and switch the implementation to English, but I don't use it anymore.
[0] https://github.com/Machiaweliczny/KindleClippingsTranslator/...
raikv
- New x86 micro-op vulnerability breaks all known Spectre defenses
I have a graph for this:
https://github.com/raitechnology/raikv/blob/master/graph/mt_...
The CPU in this case is a Threadripper 3970X with 32 cores and 64 SMT threads.
My experience is this: when the L3 cache is effective, hiding memory latency with prefetch works well across SMT threads. If the hashtable load requires a chain walk, the SMT latency hiding is less effective because the calculated prefetch location is not the actual hit. As the load increased, I couldn't get prefetching multiple slots to be as effective as prefetching a single slot.
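The prefetch pattern described above can be sketched roughly as follows. This is a minimal illustration, not raikv's code: the `Table` and `Slot` types, the mixer, and the batch size of 8 are all assumptions made for the example, and `__builtin_prefetch` is a GCC/Clang builtin. Each batch first computes the home slot of every key and issues one prefetch per key, then probes. When a key sits in its home slot the prefetched line is the hit; when linear probing has to walk further, the prefetch helps less.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical open-addressing table; NOT raikv's actual layout.
struct Slot {
    std::uint64_t key = 0;    // 0 marks an empty slot in this sketch
    std::uint64_t value = 0;
};

struct Table {
    std::vector<Slot> slots;  // capacity must be a power of two

    explicit Table(std::size_t capacity_pow2) : slots(capacity_pow2) {}

    // Simple 64-bit mixer (splitmix64-style finalizer).
    static std::uint64_t mix(std::uint64_t k) {
        k ^= k >> 30; k *= 0xbf58476d1ce4e5b9ULL;
        k ^= k >> 27; k *= 0x94d049bb133111ebULL;
        k ^= k >> 31;
        return k;
    }

    // Index of the first slot probed for this key.
    std::size_t home(std::uint64_t key) const {
        return mix(key) & (slots.size() - 1);
    }

    // Linear-probing insert; returns false when the table is full (no resize).
    bool insert(std::uint64_t key, std::uint64_t value) {
        std::size_t i = home(key);
        for (std::size_t n = 0; n < slots.size(); ++n) {
            if (slots[i].key == 0 || slots[i].key == key) {
                slots[i].key = key;
                slots[i].value = value;
                return true;
            }
            i = (i + 1) & (slots.size() - 1);
        }
        return false;
    }

    // Probe starting at slot i; any step past i is the "chain walk".
    const Slot* find_from(std::size_t i, std::uint64_t key) const {
        for (std::size_t n = 0; n < slots.size(); ++n) {
            if (slots[i].key == key) return &slots[i];
            if (slots[i].key == 0) return nullptr;
            i = (i + 1) & (slots.size() - 1);
        }
        return nullptr;
    }
};

// Looks up keys in batches: prefetch every home slot first, then probe.
// Returns the sum of the values found; missing keys contribute nothing.
std::uint64_t sum_values(const Table& t, const std::vector<std::uint64_t>& keys) {
    constexpr std::size_t kBatch = 8;  // illustrative batch size
    std::size_t homes[kBatch];
    std::uint64_t sum = 0;
    for (std::size_t base = 0; base < keys.size(); base += kBatch) {
        const std::size_t n = std::min(kBatch, keys.size() - base);
        for (std::size_t j = 0; j < n; ++j) {
            homes[j] = t.home(keys[base + j]);
            __builtin_prefetch(&t.slots[homes[j]]);  // single-slot prefetch
        }
        for (std::size_t j = 0; j < n; ++j) {
            if (const Slot* s = t.find_from(homes[j], keys[base + j]))
                sum += s->value;
        }
    }
    return sum;
}
```

Only the home slot is prefetched here, which matches the single-slot case in the comment; prefetching several consecutive slots would be a small change to the first inner loop.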
- Performance comparison: counting words in Python, Go, C++, C, Awk, Forth, Rust
Amusingly, I've done a multi-threaded version of the word counting program in order to test a shm kv store. I needed a benchmark that created a lot of concurrent cross-thread accesses to keys, and I found a blog post about this test. My version has serious constraints, though: you have to create a shared memory map with enough space to hold all of the keys beforehand, because it doesn't resize the shm kv map as it runs.
This is the source for it:
https://github.com/raitechnology/raikv/blob/master/test/ctes...
The speedup of the multi-threaded version over the single-threaded version is roughly linear. The single-threaded version uses 2 threads, one to read stdin and one to hash the keys; the 16-threaded version uses one thread to read and 16 to hash.
$ time ctest -t 1 < ~/data/enwiki-p10p30303
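The one-reader, N-hasher layout described above might look roughly like the sketch below. It is an illustration under stated assumptions, not the linked `ctest` source: `ShardedCounts`, `LineQueue`, and `count_words` are hypothetical names, and an in-process map split into mutex-protected shards stands in for the shared-memory kv store. The calling thread plays the reader and pushes lines into a queue; each worker pops lines, splits them on whitespace, and increments the shared counts.

```cpp
#include <array>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <istream>
#include <mutex>
#include <queue>
#include <sstream>
#include <string>
#include <thread>
#include <unordered_map>
#include <utility>
#include <vector>

using Counts = std::unordered_map<std::string, std::uint64_t>;

// Hypothetical stand-in for a shared kv store: a map split into locked shards.
class ShardedCounts {
public:
    void increment(const std::string& word) {
        Shard& s = shards_[std::hash<std::string>{}(word) % kShards];
        std::lock_guard<std::mutex> lock(s.mu);
        ++s.map[word];
    }
    // Call only after all workers have joined; each word lives in one shard.
    Counts merged() const {
        Counts out;
        for (const Shard& s : shards_) out.insert(s.map.begin(), s.map.end());
        return out;
    }
private:
    static constexpr std::size_t kShards = 64;
    struct Shard { std::mutex mu; Counts map; };
    std::array<Shard, kShards> shards_;
};

// Blocking queue that carries lines from the reader to the workers.
class LineQueue {
public:
    void push(std::string line) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            q_.push(std::move(line));
        }
        cv_.notify_one();
    }
    void close() {
        {
            std::lock_guard<std::mutex> lock(mu_);
            closed_ = true;
        }
        cv_.notify_all();
    }
    // Returns false once the queue is closed and fully drained.
    bool pop(std::string& out) {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return closed_ || !q_.empty(); });
        if (q_.empty()) return false;
        out = std::move(q_.front());
        q_.pop();
        return true;
    }
private:
    std::mutex mu_;
    std::condition_variable cv_;
    std::queue<std::string> q_;
    bool closed_ = false;
};

// One reader (the calling thread) feeds n_workers counting threads.
Counts count_words(std::istream& in, unsigned n_workers) {
    if (n_workers == 0) n_workers = 1;  // at least one consumer is required
    ShardedCounts counts;
    LineQueue queue;
    std::vector<std::thread> workers;
    for (unsigned i = 0; i < n_workers; ++i) {
        workers.emplace_back([&counts, &queue] {
            std::string line, word;
            while (queue.pop(line)) {
                std::istringstream words(line);
                while (words >> word) counts.increment(word);
            }
        });
    }
    std::string line;
    while (std::getline(in, line)) queue.push(line);
    queue.close();
    for (std::thread& w : workers) w.join();
    return counts.merged();
}
```

Calling `count_words(std::cin, 16)` would mirror the 16-hasher configuration mentioned above. The per-line queue and shard locks are chosen for clarity rather than speed, so this sketch should not be expected to reproduce the scaling reported in the comment. With GCC or Clang, build with `-pthread`.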
What are some alternatives?
adix - An Adaptive Index Library for Nim
countwords - Playing with counting word frequencies (and performance) in various languages.
wordcount - Counting words in different programming languages.
generate-random-numbers
word_frequency_nim - The word frequency program, written in simple nim.
CPython - The Python programming language