suggest vs SymSpell

suggest

An mmap-persistent implementation of Wolf Garbe's SymSpell spell-checking algorithm in Nim (by c-blake)

SymSpell

SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm (by wolfgarbe)

Levenshtein fuzzy-search approximate-string-matching edit-distance Spellcheck spell-check levenshtein-distance damerau-levenshtein Spelling fuzzy-matching word-segmentation chinese-text-segmentation chinese-word-segmentation text-segmentation spelling-correction Symspell

seekstorm.com

Project	suggest	SymSpell
Mentions	2	16
Stars	14	3,043
Growth	-	-
Activity	3.6	5.8
Latest Commit	10 months ago	about 1 month ago
Language	Nim	C#
License	ISC License	MIT License

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

suggest

Posts with mentions or reviews of suggest. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-03-05.

Self Hosted SaaS Alternatives
17 projects | news.ycombinator.com | 5 Mar 2023

You are welcome. Thanks are too rarely offered. :-)
You may also be interested in word stemming (such as that used by the snowball stemmer in https://github.com/c-blake/nimsearch) or other NLP techniques. I don't know how internationalized/multi-lingual that stuff is, but conceptually you might want "series of stemmed words" to be the content fragments of interest.
Similarity scores have many applications. Weights on graph of cancelled downloads ranked by size might be one. :)
Of course, for your specific "truncation" problem, you might also be able to just do an edit distance against the much smaller filenames and compare data prefixes in files or use a SHA256 of a content-based first slice. ( There are edit distance algos in Nim in https://github.com/c-blake/cligen/blob/master/cligen/textUt.... as well as in https://github.com/c-blake/suggest ).
Or, you could do a little program like ndup/sh/ndup to create a "mirrored file tree" of such content-based slices then you could use any true duplicate-file finder (like https://github.com/c-blake/bu/blob/main/dups.nim) on the little signature system to identify duplicates and go from path suffixes in those clusters back to the main filesystem. Of course, a single KV store within one or two files would be more efficient than thousands of tiny files. There are many possibilities.
SymSpell: 1M times faster spelling correction
8 projects | news.ycombinator.com | 6 Mar 2022

As jamra correctly points out, the entry point to this (which gets a lot of traction on HN) is indeed attacking a strawman tutorial-written-on-an-airplane algorithm. So, the 1M speed-up is majorly over-hyped.
That said, the technique is not wholly without merit, but it does carry certain "risk-reward" trade-offs related to latency in the memory/storage system because of SymSpell's reliance upon large hash tables. For details see https://github.com/c-blake/suggest

SymSpell

Posts with mentions or reviews of SymSpell. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-08-30.

Should you combine edit distance "spell check" algorithms with phonetic matching algorithms for robust keyword finding?
1 project | /r/AskComputerScience | 7 Nov 2023

The SymSpell algorithm uses deletions to determine the edit distance of the input query word compared to a dictionary of correctly spelled words. The Double Metaphone algorithm (or other phonetic algorithms) converts the words to phonetic versions (phonetic "hashes", basically), and you then search based on the input phonetic hash matching the dictionary of phonetic hashes.
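The deletion idea in the post above can be illustrated with a minimal Python sketch. This is not the code of SymSpell or suggest; the function names (`deletes`, `build_index`, `candidates`) and the tiny word list are hypothetical, chosen only for illustration. It assumes a plain in-memory dict as the hash table and returns unranked candidates, which a real implementation would still verify with a true edit-distance check and rank by frequency.

```python
def deletes(word, max_distance):
    """Return the word plus every string reachable by deleting
    up to max_distance characters from it."""
    results = {word}
    frontier = {word}
    for _ in range(max_distance):
        # Delete one more character from every string in the frontier.
        frontier = {w[:i] + w[i + 1:] for w in frontier for i in range(len(w))}
        results |= frontier
    return results


def build_index(dictionary, max_distance=1):
    """Map each delete-variant to the dictionary terms that produce it."""
    index = {}
    for term in dictionary:
        for variant in deletes(term, max_distance):
            index.setdefault(variant, set()).add(term)
    return index


def candidates(query, index, max_distance=1):
    """Look up the delete-variants of the query in the precomputed index.
    The result may include false positives, so it is only a candidate set."""
    found = set()
    for variant in deletes(query, max_distance):
        found |= index.get(variant, set())
    return found


# Hypothetical example dictionary.
index = build_index(["hello", "help", "world"])
print(candidates("helo", index))
```

Because both the dictionary terms and the query are reduced by deletions only, an insertion, deletion, or substitution in the query still meets a dictionary term at a shared variant. The cost is the size of the index, which is the hash-table trade-off mentioned elsewhere on this page.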
Show HN: I automated 1/2 of my typing
11 projects | news.ycombinator.com | 30 Aug 2023
Learn more about spell checkers
2 projects | /r/nlp_knowledge_sharing | 18 Mar 2023

Books:
a. "Speech and Language Processing" by Daniel Jurafsky and James H. Martin (3rd Edition) - This book covers various aspects of natural language processing, including a section on spelling correction that provides a comprehensive introduction to the topic.
b. "Foundations of Statistical Natural Language Processing" by Christopher D. Manning and Hinrich Schütze - This book provides an overview of statistical approaches in NLP, including a chapter on spelling correction.
Articles:
a. "How to Write a Spelling Corrector" by Peter Norvig - This article demonstrates the development of a simple spelling corrector using statistical algorithms. It's a great starting point for understanding the basics of spell checkers. (Link: https://norvig.com/spell-correct.html)
b. "The Design of a Proofreading Software Service" by Michael D. Garris and James L. Blue - This article presents the design and implementation of a spelling correction system that can be integrated into various applications. (Link: https://www.nist.gov/system/files/documents/itl/iad/89403123.pdf)
c. "A Fast and Flexible Spellchecker" by Atkinson, K. (2006) - This article details the design of a spell checker that uses a combination of rule-based and statistical approaches for improved performance. (Link: https://aspell.net/0.60.6.1/aspell-0.60.6.1.pdf)
Online Resources:
a. The Natural Language Toolkit (NLTK) - This is a popular Python library for natural language processing. It includes a spell checker module and various examples of how to use it. (Link: https://www.nltk.org/)
b. SymSpell - This is an open-source spell checking library that uses a Symmetric Delete spelling correction algorithm for high performance and accuracy. The GitHub repository includes a detailed description of the algorithm and examples of how to use it. (Link: https://github.com/wolfgarbe/SymSpell)
These resources should provide a solid foundation for understanding the design, algorithms, and usage of spell checkers. Happy learning!
Turn the spellchecker into autocorrection software
2 projects | /r/learnprogramming | 13 Feb 2023

Can this github.com/wolfgarbe/SymSpell or this github.com/ruby/did_you_mean or any of these github.com/topics/spell-check?o=desc&s=forks spellcheckers be used as autocorrection software?
Help with deep learning project "autocorrection"
1 project | /r/deeplearning | 15 Jan 2023

Do you absolutely need to use deep learning? There are tons of way faster autocorrect implementations that use Levenshtein distances and non-DL techniques such as SymSpell or Norvig’s algorithm. DL is both expensive and requires tons of data to train on; I would stay away from that unless you’re doing it for your own enrichment or a school project.
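For readers unfamiliar with the Levenshtein distance mentioned in the post above, here is a minimal Python sketch of the standard two-row dynamic-programming formulation. The function name `levenshtein` is an illustrative choice, not an API from any of the projects listed on this page, and the sketch counts only insertions, deletions, and substitutions (no transpositions, unlike Damerau-Levenshtein).

```python
def levenshtein(a, b):
    """Edit distance between a and b using insertions, deletions,
    and substitutions, each with cost 1."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
```

A naive spell checker would call this against every dictionary word; approaches such as SymSpell aim to shrink the candidate set first so that far fewer distance computations are needed.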
Spellcheck and Levenshtein distance
1 project | /r/MLQuestions | 15 Nov 2022

This library claims to be orders of magnitude faster: https://github.com/wolfgarbe/SymSpell
Auto correct/Auto complete feature
1 project | /r/AskComputerScience | 27 Jun 2022

If you want to do both at the same time (prefix search, allowing for misspellings), you can use a trie, but rather than just putting all your words in it, you can put everything in the "deletion neighborhood" of each word (that is, each possible variant of each word that has one character deleted), in an approach sort of like what's described here. Fair warning, though, that this gets a little hairy, and you'll have to decide how to weight prefix matches vs. misspellings in your rankings.
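One way to read the suggestion above is sketched below in Python. This is only an interpretation under stated assumptions: the names `TrieNode`, `single_deletes`, `build_trie`, and `fuzzy_prefix_search` are hypothetical, the deletion neighborhood is limited to one deleted character, and every node stores the set of original words passing through it, which is simple but memory-hungry. Ranking prefix matches against misspellings, the "hairy" part the post warns about, is left out.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        # Original dictionary words that have a variant passing through this node.
        self.words = set()


def single_deletes(s):
    """The string itself plus every variant with exactly one character deleted."""
    return {s} | {s[:i] + s[i + 1:] for i in range(len(s))}


def build_trie(words):
    """Insert each word and its one-deletion neighborhood into a trie."""
    root = TrieNode()
    for word in words:
        for variant in single_deletes(word):
            node = root
            for ch in variant:
                node = node.children.setdefault(ch, TrieNode())
                node.words.add(word)
    return root


def fuzzy_prefix_search(root, typed):
    """Return words for which the typed text, or a one-deletion variant of it,
    is a prefix of the word or of one of its one-deletion variants."""
    matches = set()
    for variant in single_deletes(typed):
        node = root
        for ch in variant:
            node = node.children.get(ch)
            if node is None:
                break
        else:
            # The whole variant was found as a path in the trie.
            matches |= node.words
    return matches


# Hypothetical example dictionary.
root = build_trie(["hello", "help", "world"])
print(fuzzy_prefix_search(root, "hel"))
print(fuzzy_prefix_search(root, "wrl"))
```

Short inputs match very broadly with this scheme, so a practical version would likely require a minimum typed length or score exact prefix hits above deletion-variant hits.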
SymSpell: 1M times faster spelling correction
1 project | /r/hackernews | 6 Mar 2022

8 projects | news.ycombinator.com | 6 Mar 2022
Hacker News top posts: Mar 6, 2022
3 projects | /r/hackerdigest | 6 Mar 2022

SymSpell: 1M times faster spelling correction (6 comments)

What are some alternatives?

When comparing suggest and SymSpell you can also consider the following projects:

nimsearch - A nascent tutorial/intro to search engine ideas in Nim

JamSpell - Modern spell checking library - accurate, fast, multi-language

abydos - Abydos NLP/IR library for Python

hunspell - The most popular spellchecking library.

ordiri

wtpsplit - Code for Where's the Point? Self-Supervised Multilingual Punctuation-Agnostic Sentence Segmentation

jsymspell - Java 8+ zero-dependency port of SymSpell: 1 million times faster through Symmetric Delete spelling correction algorithm

languagetool - Style and Grammar Checker for 25+ Languages

home-ops - Wife approved HomeOps driven by Kubernetes and GitOps using Flux

SymSpell - A JavaScript implementation of the Symmetric Delete spelling correction algorithm.

core - OPNsense GUI, API and systems backend

NLP-progress - Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
