hunspell
SymSpell
Metric | hunspell | SymSpell
---|---|---
Mentions | 15 | 12
Stars | 1,683 | 2,767
Growth | 2.1% | -
Activity | 3.6 | 4.1
Last commit | 14 days ago | 11 months ago
Language | C++ | C#
License | GNU Lesser General Public License v3.0 only | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
hunspell
- hunspell version?
-
Text Editor that supports spelling and grammar checking.
I prefer and use hunspell.
-
spell-check selected text?
One can integrate Hunspell, which is what major browsers such as Firefox and Chrome use (for example, when typing in text areas). It is very simple and written in C++.
- Documentation on writing a spell checker
-
MindForger 1.53.0 brings Kanban and Eisenhower Matrix on tags, spell check, CSV with OHE tags export and µ terminal
Hunspell-based spell check
-
ISuckAtSpelling.nvim: A NeoVim plugin that auto-corrects spelling mistakes in various natural and programming languages!
Excellent questions. Here are some: https://github.com/wooorm/dictionaries. The original dataset is here: https://github.com/hunspell/hunspell#dictionaries
Since you are already using GPLv3, why not reuse the hunspell dictionaries/wordbooks? https://github.com/hunspell/hunspell
-
Rebuilding the spellchecker, pt.4: Introduction to suggest algorithm
Those questions are open ones—and even the way they can be answered is unclear. Intuitively, Hunspell's suggestions are quite decent—otherwise, it wouldn't be the most widespread spellchecker, after all. A fair amount of "unhappy customers" can be easily found, too, in hunspell's repo issues. At the same time, one should distinguish between different reasons for the sub-par suggestion quality. It might be due to the algorithm itself, or due to the source data quality: the literal absence of the desired suggestion in the dictionary, or lack of aff-file settings that could've guided Hunspell to finding it.
-
Rebuilding the most popular spellchecker. Part 1
Currently, Hunspell is maintained on GitHub (the repo has only around 1k stars, would you believe it?). Maintenance does not seem easy if you weigh the number of open issues and PRs against the timeline of the latest commits: at the time of writing (Jan 2021), the last commit to master was in May 2020, and the last release was 1.7 in Dec 2018. Hunspell's codebase is mostly "old-school" C++. It is being slowly modernized and has very few comments; there are thousands of two-branch ifs to handle non-Unicode and Unicode text separately. There is also an attempt to rewrite Hunspell from scratch in modern C++, which at some point was developed under the hunspell GitHub organization. Now it is independent and called nuspell (and, while not yet supporting all of the Hunspell features, it has already "achieved" version 4.2.0).
SymSpell
-
Hacker News top posts: Mar 6, 2022
SymSpell: 1M times faster spelling correction (6 comments)
- SymSpell: 1M times faster spelling correction
-
Typo correction using NLP
SymSpell
-
Fuzzy Name Matching in Postgres
I'm glad to see these built into Postgres, as these are the basics of fuzzy string matching.
A quantum leap would be to integrate an implementation of the symmetric delete algorithm, such as https://github.com/wolfgarbe/SymSpell
Soundex and Phonex can yield too many false negatives outside of phonetically English names. Levenshtein/Jaro-Winkler aren't indexable solutions themselves, so they require N^2 comparisons. SymSpell conceptually combines these two into an indexed string-distance solution. It has the usual index issue of being designed for many reads, few writes.
-
Rebuilding the spellchecker, pt.4: Introduction to suggest algorithm
Some of the modern approaches to spellchecking still take this road: for example, the SymSpell algorithm (claiming to be "1 million times faster") is at its core just a brilliant idea for a novel storage format for a flat word list, one that allows the calculation of edit distance to be optimized significantly.
-
Rebuilding the spellchecker, pt.3: Lookup–compounds and solutions
https://github.com/wolfgarbe/SymSpell lists 5 JS implementations (plus a Rust one that compiles to WebAssembly)
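The excerpts above describe SymSpell as an indexed string-distance lookup built on the symmetric delete idea. A minimal Python sketch of that idea follows; it is not the SymSpell library's code or API. The function names (`deletes`, `build_index`, `lookup`) are hypothetical, and plain Levenshtein distance is used as a simplifying assumption for the final verification step. The sketch precomputes every variant of each dictionary word with up to `max_distance` characters deleted, does the same for the query, and only computes edit distance for words that share a variant.

```python
from collections import defaultdict
from itertools import combinations


def deletes(word, max_distance):
    """Return the word plus every string made by deleting up to max_distance characters."""
    variants = {word}
    for k in range(1, min(max_distance, len(word)) + 1):
        for keep in combinations(range(len(word)), len(word) - k):
            variants.add("".join(word[i] for i in keep))
    return variants


def build_index(words, max_distance=2):
    """Map each delete-variant to the dictionary words that produce it (hypothetical helper)."""
    index = defaultdict(set)
    for word in words:
        for variant in deletes(word, max_distance):
            index[variant].add(word)
    return index


def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def lookup(index, term, max_distance=2):
    """Return sorted (distance, word) pairs within max_distance of term.

    max_distance should not exceed the value the index was built with.
    """
    candidates = set()
    for variant in deletes(term, max_distance):
        candidates |= index.get(variant, set())
    # Shared delete-variants only nominate candidates; verify each one.
    scored = ((levenshtein(term, c), c) for c in candidates)
    return sorted((d, c) for d, c in scored if d <= max_distance)


index = build_index(["hello", "help", "world", "word"], max_distance=1)
print(lookup(index, "helo", max_distance=1))  # [(1, 'hello'), (1, 'help')]
```

The trade-off matches the "many reads, few writes" remark in the excerpt: the index is larger than the word list and costs time to build, but a lookup touches only a handful of hash buckets instead of comparing the query against every word.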
What are some alternatives?
JamSpell - Modern spell checking library - accurate, fast, multi-language
nuspell - 🖋️ Fast and safe spellchecking C++ library
spellsitter.nvim - Treesitter powered spellchecker
WeCantSpell.Hunspell - A port of Hunspell v1 for .NET and .NET Standard
cspell - A Spell Checker for Code!
vim-abolish - abolish.vim: easily search for, substitute, and abbreviate multiple variants of a word
vim-litecorrect - Lightweight auto-correction for Vim
nnsplit - Semantic text segmentation. For sentence boundary detection, compound splitting and more.
languagetool - Style and Grammar Checker for 25+ Languages
SymSpell - A JavaScript implementation of the Symmetric Delete spelling correction algorithm.
abbrev-man.nvim - 🍍 A NeoVim plugin for managing vim abbreviations.