Playing with counting word frequencies (and performance) in various languages. (by benhoyt)

Countwords Alternatives

Similar projects and alternatives to countwords

NOTE: The number of mentions on this list counts mentions on common posts plus user-suggested alternatives. Hence, a higher number means a better countwords alternative or higher similarity.

countwords reviews and mentions

Posts with mentions or reviews of countwords. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-11-15.
  • How fast is really ASP.NET Core?
    4 projects | /r/programming | 15 Nov 2022
    "dang, I didn't know that was 50x faster than the idiomatic way" or "hey, I didn't know that this implementation in the stdlib prioritized this over that and made this so slow, that's interesting" -- e.g., there are some kinda neat language details to be found in something like Ben Hoyt's community word count benchmarks repo and 'simple' vs 'optimal' code:
  • Correct name for word matching problem
    2 projects | /r/algorithms | 13 Oct 2022
    It benchmarks programs that count the total number of unique words in some input. It's not exactly equivalent to your problem, but it's similarish. All of the programs used some kind of hash map for lookups, but I contributed a program that used a trie. Interestingly, its performance in my experience varies depending on the CPU. On my old CPU (i7-6900K) it was a little slower, but on my new CPU (i9-12900KS) it was faster.
  • Performance comparison: counting words in Python, C/C++, Awk, Rust, and more
    12 projects | | 24 Jul 2022
    Are you looking at the "simple" or the "optimized" versions? For the optimized, yes, the Go one is very similar to the C. For the simple, idiomatic version, the Go version [1] is much simpler than the C one [2]: 40 very straightforward LoC vs 93 rather more complex ones, including pointer arithmetic, tricky manual memory management, and so on.


    12 projects | | 24 Jul 2022
    I don't think the performance is due to startup time at all. I actually cloned the repo, ran the benchmark, and found that Swift's execution time scales drastically with the size of the input.

    The benchmark tests each executable by piping in the full King James Bible duplicated 10 times[1] (each copy is 4.13 MB[2]). When I ran it using just a single copy of the input text, the execution time dropped to 58-59 milliseconds, but when I ran the benchmark without modifications it jumped up to over 4 seconds. A hello world script, for comparison, runs in about 13 milliseconds. The Swift team actually boasts about its quick startup time on the official website [3].




    12 projects | | 24 Jul 2022
    Re: the Rust performance implementation, I was able to get ~25% better performance by rewriting the for loops as iterators and by using a buffered writer, which seems crazy but it's true.[0] I chalked it up to some crazy ILP/SIMD tricks the compiler is doing.

    I even submitted a PR, but Ben was tired of maintaining the project and decided to archive it (which is fair enough!).


    12 projects | | 24 Jul 2022
    Why not read the source code? :-)

    I wrote comments explaining things:

  • The difference between Go and Rust
    6 projects | /r/programming | 28 Sep 2021
    And yet Go was faster than Rust in a simple app that counts words:
  • How to Rapidly Improve at Any Programming Language
    8 projects | | 18 Sep 2021
    > but the performance profiles & characteristics that we must know about in order to make a choice on which tool to use. And it shouldn't be that each user has to figure it out on their own, dig into PR's or whatever.

    That's an interesting take – I like the idea of a catalog of standard tasks with implementations in several languages as well as their performance characteristics. I suppose Rosetta Code gets the ball rolling with this, but it's missing some performance metrics. It reminds me of Ben Hoyt's piece on counting unique words in the KJV Bible in different languages.

  • Faster string keyed maps in Go
    2 projects | /r/golang | 22 Jul 2021
    This article shows that map lookups can be optimized by using the (unintuitive) pattern:
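
Several of the posts above contrast "simple" and "optimized" word-count programs. For readers unfamiliar with the task, here is a minimal Python sketch of what a simple version might look like. It is not taken from the countwords repo: the function name `count_words` is hypothetical, and the rules it applies (lowercase everything, split on whitespace, list the most frequent words first) are assumptions made for illustration.

```python
import sys
from collections import Counter


def count_words(lines):
    """Return (word, count) pairs, most frequent first.

    Hypothetical helper for illustration; not code from the countwords repo.
    """
    counts = Counter()
    for line in lines:
        # Assumption: words are whitespace-separated and case-insensitive.
        counts.update(line.lower().split())
    # most_common() sorts by descending count; ties keep first-seen order.
    return counts.most_common()


if __name__ == "__main__":
    # Read text from standard input and print one "word count" pair per line.
    for word, count in count_words(sys.stdin):
        print(word, count)
```

Saved as a script, this could be run by piping a text file into it on standard input. It relies on a hash map (`Counter` is a `dict` subclass), which is the approach one comment above says most of the benchmarked programs take.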


Basic countwords repo stats
almost 2 years ago