| | countwords | RAMCloud |
|---|---|---|
| Mentions | 5 | 1 |
| Stars | 4 | 481 |
| Growth | - | 0.4% |
| Activity | 2.6 | 0.0 |
| Last commit | 6 months ago | over 4 years ago |
| Language | Rust | C++ |
| License | MIT License | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
countwords
- Are there benchmark results of current Forth implementations (interpreted & compiled)?
- Open any file as bytes
See an example: https://github.com/kimono-koans/countwords/blob/master/rust/fast-simple/main.rs
- I/O is no longer the bottleneck
This is truly 1978 all over again: no flame graphs, no hardware counters, no bottleneck analysis. Using these 'optimizations' for job interviews is questionable at best.
[1] https://benhoyt.com/writings/count-words/
- Correct name for word matching problem
This might actually be interesting to you: https://benhoyt.com/writings/count-words/
- Performance comparison: counting words in Python, C/C++, Awk, Rust, and more
In case anyone is interested, I wrote an optimized but much simpler Rust implementation just today[0], which is faster than the optimized implementation on my machine. No indexing into arrays of bytes, etc., no "code golf" measures.
Looks like idiomatic Rust, which I think is interesting. Shows there is more than one way to skin a cat.
[0]: https://github.com/kimono-koans/countwords/blob/master/rust/...
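For readers who want a feel for the problem these posts discuss, here is a minimal sketch of a word-frequency counter in plain, iterator-style Rust that reads its input as bytes. It is not the linked `fast-simple` implementation. The function names (`count_words`, `sorted_by_frequency`, `count_words_in_file`) are hypothetical, and the rules it applies (split on ASCII whitespace, lowercase ASCII letters, sort by descending count) are assumptions based on the usual statement of the exercise.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::Path;

/// Count how often each whitespace-separated word occurs, ignoring ASCII case.
/// Works on raw bytes, so the input does not have to be valid UTF-8.
/// (Assumed rules: split on ASCII whitespace, lowercase ASCII letters only.)
fn count_words(input: &[u8]) -> HashMap<Vec<u8>, usize> {
    let mut counts: HashMap<Vec<u8>, usize> = HashMap::new();
    for word in input
        .split(|b| b.is_ascii_whitespace())
        .filter(|w| !w.is_empty())
    {
        *counts.entry(word.to_ascii_lowercase()).or_insert(0) += 1;
    }
    counts
}

/// Sort by descending count, breaking ties by word so the order is deterministic.
fn sorted_by_frequency(counts: HashMap<Vec<u8>, usize>) -> Vec<(Vec<u8>, usize)> {
    let mut pairs: Vec<_> = counts.into_iter().collect();
    pairs.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    pairs
}

/// Read a whole file into memory as bytes and return its word frequencies.
fn count_words_in_file(path: &Path) -> io::Result<Vec<(Vec<u8>, usize)>> {
    let bytes = fs::read(path)?;
    Ok(sorted_by_frequency(count_words(&bytes)))
}
```

Calling `count_words_in_file(Path::new("input.txt"))` from a `main` function and printing each pair with `String::from_utf8_lossy` would give a complete program. This version allocates a new `Vec<u8>` for every word, which keeps the code short but is unlikely to be the fastest approach.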
RAMCloud
- I/O is no longer the bottleneck
On a related note, John Ousterhout (in the RAMCloud project) was basically betting that the latency of accessing RAM on another computer over a fast local network would eventually become competitive with local RAM access.
https://ramcloud.atlassian.net/wiki/spaces/RAM/overview
What are some alternatives?
gccontent-benchmark - Benchmarking different languages for a simple bioinformatics task (Counting the GC fraction of DNA in a FASTA file)
huniq - Filter out duplicates on the command line. Replacement for `sort | uniq` optimized for speed (10x faster) when sorting is not needed.
countwords - Playing with counting word frequencies (and performance) in various languages.
adix - An Adaptive Index Library for Nim
Killed by Google - Part guillotine, part graveyard for Google's doomed apps, services, and hardware.
robin-hood-hashing - Fast & memory efficient hashtable based on robin hood hashing for C++11/14/17/20
napkin-math - Techniques and numbers for estimating system's performance from first-principles
simdjson - Parsing gigabytes of JSON per second : used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks
share-file-systems - Use a Windows/OSX like GUI in the browser to share files cross OS privately. No cloud, no server, no third party.