SymSpell
SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
There is, for example, a curious evaluation table provided by JamSpell, a modern ML-based spellchecker. According to it, JamSpell is awesome, while Hunspell is a mere 0.03% better than a dummy ("fix nothing") spellchecker... which doesn't ring true, somehow!
Those questions are open ones, and even the way they can be answered is unclear. Intuitively, Hunspell's suggestions are quite decent; otherwise, it wouldn't be the most widespread spellchecker, after all. A fair number of "unhappy customers" can easily be found, too, in the issues of Hunspell's repository. At the same time, one should distinguish between different reasons for sub-par suggestion quality. It might be due to the algorithm itself, or due to the quality of the source data: the literal absence of the desired suggestion from the dictionary, or a lack of aff-file settings that could have guided Hunspell to it.
Some of the modern approaches to spellchecking still take this road: for example, the SymSpell algorithm (claiming to be "1 million times faster") is at its core just a brilliant idea for a novel storage format for a flat word list, which allows the calculation of edit distance to be optimized significantly.
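To make the "storage format" idea more concrete, here is a minimal sketch of a symmetric-delete index in Python. It is not SymSpell's actual code: the function names (`deletes`, `build_index`, `lookup`) are made up for illustration, and plain Levenshtein distance is used for the final check, which is an assumption made for brevity. The point is only that the word list is stored under every variant obtained by deleting characters, so a lookup has to generate deletions of the query and verify a handful of candidates instead of comparing against the whole dictionary.

```python
from collections import defaultdict


def deletes(word, max_distance):
    """Return the word itself plus every string obtained by deleting
    up to max_distance characters from it."""
    results = {word}
    frontier = {word}
    for _ in range(max_distance):
        next_frontier = set()
        for w in frontier:
            for i in range(len(w)):
                next_frontier.add(w[:i] + w[i + 1:])
        results |= next_frontier
        frontier = next_frontier
    return results


def build_index(words, max_distance=1):
    """Map every delete-variant to the dictionary words that produce it."""
    index = defaultdict(set)
    for word in words:
        for variant in deletes(word, max_distance):
            index[variant].add(word)
    return index


def edit_distance(a, b):
    """Plain Levenshtein distance (an assumption of this sketch)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def lookup(index, query, max_distance=1):
    """Collect candidates that share a delete-variant with the query,
    then keep those within max_distance, sorted by distance."""
    candidates = set()
    for variant in deletes(query, max_distance):
        candidates |= index.get(variant, set())
    scored = ((edit_distance(query, c), c) for c in candidates)
    return sorted((d, c) for d, c in scored if d <= max_distance)


index = build_index(["spell", "spelling", "shell", "smell"])
print(lookup(index, "spel"))
print(lookup(index, "spell"))
```

Only deletions are ever generated, for dictionary words and for the query alike, which is why the index stays small compared with enumerating all insertions and substitutions. The edit distance is computed just for the few candidates that share a variant with the query.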