hunspell
JamSpell
 | hunspell | JamSpell
---|---|---
Mentions | 15 | 3
Stars | 1,683 | 529
Growth | 2.1% | -
Activity | 3.6 | 0.0
Last commit | 14 days ago | about 1 year ago
Language | C++ | C++
License | GNU Lesser General Public License v3.0 only | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
hunspell
- hunspell version?
- Text Editor that supports spelling and grammar checking.
I prefer and use Hunspell.
- spell-check selected text?
One can integrate Hunspell, which is what major browsers use (for example, when typing in text areas). It is very simple and is written in C++.
- Documentation on writing a spell checker
- MindForger 1.53.0 brings Kanban and Eisenhower Matrix on tags, spell check, CSV with OHE tags export and µ terminal
Hunspell-based spell check
- ISuckAtSpelling.nvim: A NeoVim plugin that auto-corrects spelling mistakes in various natural and programming languages!
Excellent questions. Here are some dictionaries: https://github.com/wooorm/dictionaries. The original dataset is here: https://github.com/hunspell/hunspell#dictionaries
Since you are already using GPLv3, why not reuse the Hunspell dictionaries/wordbooks? https://github.com/hunspell/hunspell
- Rebuilding the spellchecker, pt.4: Introduction to suggest algorithm
Those questions are open ones, and even the way they can be answered is unclear. Intuitively, Hunspell's suggestions are quite decent; otherwise, it wouldn't be the most widespread spellchecker, after all. A fair number of "unhappy customers" can be easily found, too, in Hunspell's repo issues. At the same time, one should distinguish between different reasons for the sub-par suggestion quality. It might be due to the algorithm itself, or due to the source data quality: the literal absence of the desired suggestion in the dictionary, or a lack of aff-file settings that could've guided Hunspell to find it.
- Rebuilding the most popular spellchecker. Part 1
Currently, Hunspell is maintained on GitHub (the repo has only around 1k stars, would you believe it?). It seems that maintenance is not that easy if you weigh the number of open issues and PRs against the timeline of the latest commits: at the time of writing (Jan 2021), the last commit to master was in May 2020, and the last release was 1.7 in Dec 2018. Hunspell's codebase is mostly "old-school" C++. It is being slowly modernized and it has very few comments; there are thousands of two-branch ifs to handle non-Unicode and Unicode text separately. There is also an attempt to rewrite Hunspell from scratch in modern C++, which at some point was developed under the hunspell GitHub organization. Now it is independent and called nuspell (and, while not yet supporting all of the Hunspell features, has already "achieved" version 4.2.0).
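To make the "source data quality" point from the suggest-algorithm excerpt above more concrete, here is a heavily simplified sketch of how a `.dic` word list and `.aff` suffix rules together define the set of words a Hunspell-style checker can ever accept or suggest. The `AFF` and `DIC` contents are made-up toy data, and `parse_suffix_rules` and `expand` are hypothetical helpers, not part of Hunspell; the sketch assumes single-character flags and handles only `SFX` rules.

```python
import re

# Toy affix data (hypothetical): flag S adds a plural suffix.
AFF = """\
SFX S Y 2
SFX S 0 s [^y]
SFX S y ies y
"""

# Toy dictionary (hypothetical): the first line is the entry count.
DIC = """\
2
word/S
city/S
"""

def parse_suffix_rules(aff_text):
    """Return {flag: [(strip, add, condition), ...]} for SFX rule lines.

    Header lines such as "SFX S Y 2" have four fields and are skipped.
    """
    rules = {}
    for line in aff_text.splitlines():
        parts = line.split()
        if len(parts) == 5 and parts[0] == "SFX":
            _, flag, strip, add, condition = parts
            rules.setdefault(flag, []).append((strip, add, condition))
    return rules

def expand(dic_text, rules):
    """Return every surface form derivable from the toy dictionary."""
    forms = set()
    for entry in dic_text.splitlines()[1:]:  # skip the count line
        stem, _, flags = entry.partition("/")
        forms.add(stem)
        for flag in flags:  # assumption: one character per flag
            for strip, add, condition in rules.get(flag, []):
                if not re.search(condition + "$", stem):
                    continue  # rule does not apply to this stem
                base = stem if strip == "0" else stem[: -len(strip)]
                forms.add(base + add)
    return forms

forms = expand(DIC, parse_suffix_rules(AFF))
print(sorted(forms))
```

In this toy setup "cities" is reachable only because the second rule exists, and a word such as "towns" can never be suggested, however good the suggestion algorithm is, because its stem is absent from the dictionary.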
JamSpell
- Rebuilding the spellchecker, pt.4: Introduction to suggest algorithm
There is, for example, a curious evaluation table provided by a modern ML-based spellchecker, JamSpell. According to it, JamSpell is awesome, while Hunspell is a mere 0.03% better than a dummy ("fix nothing") spellchecker... Which doesn't ring true, somehow!
- Rebuilding the spellchecker, pt.3: Lookup–compounds and solutions
That's a huge topic, which I am planning to cover towards the end of the article series (please like and subscribe), but in short: yes, my opinion is that spellchecking is actually a "machine learning problem in disguise", and most existing dictionaries are more a roundabout way of storing something-not-unlike-models than analytical data.
But an ML approach raises the question of data availability. What good will your "deep learning OSS spellchecker" do if there aren't good (and open) models for it that cover as many languages as the existing Hunspell dictionaries do? And what if adding a bunch of new words requires laborious model retraining? It is not unsolvable, but non-trivial.
I believe all the giants have something like this inside (I don't think spelling correction in the Google search bar is handled with Hunspell, right?), but it is much harder to do as an open tool, ready for embedding into other software.
There are notable attempts, though: JamSpell for one (https://github.com/bakwc/JamSpell), which has open "free" models and more precise commercial ones; the source code is open (maybe also only for using "simplistic" models, haven't dug deeper).
- Rebuilding the most popular spellchecker. Part 1
Obviously, there are open-source spellcheckers other than Hunspell. GNU aspell (that at one point was superseded by Hunspell, but still holds its ground in English suggestion quality), to name one of the older ones; but also there are novel approaches, like SymSpell, claiming to be "1 million times faster" or ML-based JamSpell, claiming to be much more accurate.
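The comparison against a dummy "fix nothing" spellchecker mentioned in the evaluation excerpt above can be sketched as follows. This is only an illustration of what such a baseline measures, not JamSpell's actual benchmark: the `PAIRS` sample, the `accuracy` helper, and `toy_corrector` are all hypothetical.

```python
def accuracy(corrector, pairs):
    """Share of tokens for which corrector(observed) equals the expected word."""
    hits = sum(1 for observed, expected in pairs if corrector(observed) == expected)
    return hits / len(pairs)

def fix_nothing(word):
    """Dummy baseline: return every word unchanged."""
    return word

# Hypothetical correction table standing in for a real spellchecker.
TOY_FIXES = {"brwon": "brown"}

def toy_corrector(word):
    """Fix only the misspellings listed in TOY_FIXES."""
    return TOY_FIXES.get(word, word)

# Hypothetical sample of (observed, expected) tokens; most are already correct.
PAIRS = [
    ("the", "the"),
    ("quick", "quick"),
    ("brwon", "brown"),
    ("fox", "fox"),
    ("jumsp", "jumps"),
]

baseline = accuracy(fix_nothing, PAIRS)
toy = accuracy(toy_corrector, PAIRS)
print(f"fix nothing: {baseline:.2f}, toy corrector: {toy:.2f}")
```

Because most tokens in running text are already spelled correctly, the baseline starts high, so a metric of this kind leaves little room for a large absolute gap and says nothing on its own about how good the suggestions for the misspelled words are.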
What are some alternatives?
SymSpell - SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
nuspell - 🖋️ Fast and safe spellchecking C++ library
SymSpell - A JavaScript implementation of the Symmetric Delete spelling correction algorithm.
spellsitter.nvim - Treesitter powered spellchecker
WeCantSpell.Hunspell - A port of Hunspell v1 for .NET and .NET Standard
cspell - A Spell Checker for Code!
vim-abolish - abolish.vim: easily search for, substitute, and abbreviate multiple variants of a word
vim-litecorrect - Lightweight auto-correction for Vim
abbrev-man.nvim - 🍍 A NeoVim plugin for managing vim abbreviations.
dictionaries - Hunspell dictionaries in UTF-8
goSpellcheck - A terrible spell checker in Go.
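For readers unfamiliar with the Symmetric Delete technique named in the SymSpell entries above, here is a minimal sketch of the idea: index every dictionary word under all strings obtained by deleting up to `max_distance` characters, apply the same deletions to the query, and verify the matches with a true edit distance. It is not SymSpell's implementation; the function names are hypothetical, and plain Levenshtein distance is an assumption made here for brevity.

```python
from collections import defaultdict

def deletes(word, max_distance):
    """All strings reachable by deleting up to max_distance characters."""
    results = {word}
    frontier = {word}
    for _ in range(max_distance):
        frontier = {w[:i] + w[i + 1:] for w in frontier for i in range(len(w))}
        results |= frontier
    return results

def build_index(words, max_distance=1):
    """Map each delete-variant to the dictionary words that produce it."""
    index = defaultdict(set)
    for word in words:
        for variant in deletes(word, max_distance):
            index[variant].add(word)
    return index

def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,                # deletion
                               current[j - 1] + 1,             # insertion
                               previous[j - 1] + (ca != cb)))  # substitution
        previous = current
    return previous[-1]

def lookup(term, index, max_distance=1):
    """Dictionary words within max_distance edits of term."""
    found = set()
    for variant in deletes(term, max_distance):
        found |= index.get(variant, set())
    # Shared delete-variants can also match words that are further away,
    # so each candidate is verified with the real edit distance.
    return {w for w in found if levenshtein(term, w) <= max_distance}

index = build_index(["hello", "help", "world"])
print(sorted(lookup("helo", index)))  # ['hello', 'help']
```

The design trade-off is memory for speed: the index stores many delete-variants per word, but a lookup only generates deletions of the query instead of trying every possible insertion and substitution.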