Rebuilding the most popular spellchecker. Part 1

This page summarizes the projects mentioned and recommended in the original post on dev.to

  • WeCantSpell.Hunspell

    A port of Hunspell v1 for .NET and .NET Standard

    Note that there are also a few "pragmatic" ports of Hunspell to other languages (for use in environments where a C++ dependency is undesirable), namely WeCantSpell.Hunspell in C# and nspell in JS (very incomplete); the aforementioned nuspell can also be considered a "port" (from legacy C++ to modern C++).

  • JamSpell

    Modern spell checking library - accurate, fast, multi-language

    Obviously, there are open-source spellcheckers other than Hunspell. One of the older ones is GNU aspell (which was at one point superseded by Hunspell, but still holds its ground in English suggestion quality); there are also novel approaches, like SymSpell, which claims to be "1 million times faster", and the ML-based JamSpell, which claims to be much more accurate.

  • hunspell

    The most popular spellchecking library.

    Currently, Hunspell is maintained on GitHub (the repo has only around 1k stars, would you believe it?). Maintenance does not seem to be easy, judging by the number of open issues and PRs and by the timeline of the latest commits: at the time of writing (Jan 2021), the last commit to master was in May 2020, and the last release was 1.7 in Dec 2018. Hunspell's codebase is mostly "old-school" C++. It is slowly being modernized and has very few comments; there are thousands of two-branch ifs to handle non-Unicode and Unicode text separately. There is also an attempt to rewrite Hunspell from scratch in modern C++, which at some point was developed under the hunspell GitHub organization. It is now independent and called nuspell (and, while not yet supporting all of the Hunspell features, has already "achieved" version 4.2.0).

  • spylls

    Pure Python spell-checker, (almost) full port of Hunspell

    Currently, Spylls has ≈1.5k lines of library code in 14 files. It conforms (with some reservations) to all of Hunspell's integration tests. Each test is a set of files consisting of a test dictionary plus lists of which words should be considered good, which should be considered bad, and what should be suggested instead of the bad words; there are 127 such sets to pass. The code contains 2 thousand comment lines that thoroughly explain every detail of the algorithm and are rendered on the Spylls documentation site; note that besides docstrings at the beginning of each class and method, there are also inline comments in the code, which is why the documentation site uses a custom theme with an inline "Show code" feature.
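
    The shape of such a test set can be illustrated with a small runner. This is only a sketch, not how Spylls or Hunspell actually run their suites. It assumes each set shares a base name and stores its word lists, one word per line, in files with `.good` and `.wrong` extensions (the layout used in Hunspell's test directory), and that all files are UTF-8, which is not true of every real test. The names `read_words` and `run_test_set` are hypothetical, and checking of suggestions is left out.

```python
from pathlib import Path
from typing import Callable, Dict, List


def read_words(path: Path, encoding: str = "utf-8") -> List[str]:
    """Return the non-empty lines of a word-list file, or [] if it is absent."""
    if not path.exists():
        return []
    text = path.read_text(encoding=encoding)
    return [line.strip() for line in text.splitlines() if line.strip()]


def run_test_set(base: Path, lookup: Callable[[str], bool]) -> Dict[str, List[str]]:
    """Run one test set against a spellchecker.

    `base` is the test path without extension (e.g. tests/affixes);
    `lookup` is any callable that returns True for a correctly spelled word.
    The result lists the words on which the checker disagreed with the test.
    """
    good_file = base.parent / (base.name + ".good")    # words that must be accepted
    wrong_file = base.parent / (base.name + ".wrong")  # words that must be rejected

    failures: Dict[str, List[str]] = {"good": [], "wrong": []}
    for word in read_words(good_file):
        if not lookup(word):
            failures["good"].append(word)
    for word in read_words(wrong_file):
        if lookup(word):
            failures["wrong"].append(word)
    return failures
```

    Any checker can be plugged in through `lookup`. With Spylls, for instance, something like `Dictionary.from_files(str(base)).lookup` should fit, going by the usage shown in its README; a set passes when both failure lists come back empty.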
