suggest vs abydos

suggest

An mmap-persistent implementation of Wolf Garbe's SymSpell spell-checking algorithm in Nim (by c-blake)

abydos

Abydos NLP/IR library for Python (by chrislit)

phonetic-algorithms, distance-metric, string-metrics, Python, NLP, Natural Language Processing, Machine Learning, fuzzy-matching, soundex, Levenshtein

Project          suggest          abydos
Mentions         2                1
Stars            14               174
Growth           -                -
Activity         3.6              0.0
Latest Commit    10 months ago    over 1 year ago
Language         Nim              Python
License          ISC License      GNU General Public License v3.0 only

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

suggest

Posts with mentions or reviews of suggest. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-03-05.

Self Hosted SaaS Alternatives
17 projects | news.ycombinator.com | 5 Mar 2023

You are welcome. Thanks are too rarely offered. :-)
You may also be interested in word stemming (such as the snowball stemmer used in https://github.com/c-blake/nimsearch) or other NLP techniques. I don't know how internationalized/multi-lingual that stuff is, but conceptually you might want "series of stemmed words" to be the content fragments of interest.
Similarity scores have many applications. Weights on a graph of cancelled downloads ranked by size might be one. :)
Of course, for your specific "truncation" problem, you might also be able to just do an edit distance against the much smaller filenames and compare data prefixes in files or use a SHA256 of a content-based first slice. (There are edit distance algos in Nim in https://github.com/c-blake/cligen/blob/master/cligen/textUt.... as well as in https://github.com/c-blake/suggest.)
Or, you could write a little program like ndup/sh/ndup to create a "mirrored file tree" of such content-based slices. Then you could use any true duplicate-file finder (like https://github.com/c-blake/bu/blob/main/dups.nim) on the little signature system to identify duplicates and go from path suffixes in those clusters back to the main filesystem. Of course, a single KV store within one or two files would be more efficient than thousands of tiny files. There are many possibilities.
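
The filename-distance plus content-prefix idea from the comment above can be sketched roughly as follows. This is an illustrative Python sketch, not code from suggest, cligen, or ndup: the function names, the 4096-byte slice length, and the name-distance threshold of 5 are all assumptions chosen for the example.

```python
import hashlib


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]


def likely_truncation(name_a: str, data_a: bytes,
                      name_b: str, data_b: bytes,
                      max_name_dist: int = 5,    # assumed threshold
                      slice_len: int = 4096) -> bool:  # assumed slice size
    """Guess whether one file is a truncated copy of the other.

    Two cheap checks: the filenames are within a small edit distance,
    and the SHA-256 of a shared leading slice of the contents matches.
    """
    if levenshtein(name_a, name_b) > max_name_dist:
        return False
    # Hash only as many bytes as both files actually have.
    n = min(len(data_a), len(data_b), slice_len)
    if n == 0:
        return False  # an empty file matches anything, so treat it as no evidence
    sig_a = hashlib.sha256(data_a[:n]).hexdigest()
    sig_b = hashlib.sha256(data_b[:n]).hexdigest()
    return sig_a == sig_b
```

In practice the contents would be read from disk (only the first slice is needed), and the signatures could be stored once per file rather than recomputed per pair.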

SymSpell: 1M times faster spelling correction
8 projects | news.ycombinator.com | 6 Mar 2022

As jamra correctly points out, the entry point to this (which gets a lot of traction on HN) is indeed attacking a strawman tutorial-written-on-an-airplane algorithm. So, the 1M speed-up is majorly over-hyped.
That said, the technique is not wholly without merit, but does carry certain "risk-reward" trade-offs related to latency in the memory/storage system because of SymSpell's reliance upon large hash tables. For details see https://github.com/c-blake/suggest
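
To make the "large hash tables" point concrete, here is a minimal Python sketch of the candidate-generation step of the Symmetric Delete idea. It is not the suggest or SymSpell implementation; the function names and the in-memory dict are assumptions for illustration. Every dictionary word is indexed under all of its delete variants, and a query is looked up by its own delete variants. The result is only a candidate set: a real spell checker would still verify each candidate with a true edit distance and rank by word frequency.

```python
from collections import defaultdict
from itertools import combinations


def deletes(word: str, max_dist: int) -> set:
    """All strings obtained by deleting up to max_dist characters from word."""
    out = {word}
    for k in range(1, min(max_dist, len(word)) + 1):
        # Choose which positions to keep; order is preserved by combinations().
        for keep in combinations(range(len(word)), len(word) - k):
            out.add("".join(word[i] for i in keep))
    return out


def build_index(vocab, max_dist: int = 2) -> dict:
    """Map every delete variant to the dictionary words that produce it."""
    index = defaultdict(set)
    for w in vocab:
        for d in deletes(w, max_dist):
            index[d].add(w)
    return index


def candidates(query: str, index: dict, max_dist: int = 2) -> set:
    """Dictionary words sharing at least one delete variant with the query."""
    found = set()
    for d in deletes(query, max_dist):
        found |= index.get(d, set())
    return found
```

Even in this toy form the index holds many more keys than there are dictionary words, which is where the memory and lookup-latency trade-offs mentioned above come from.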

abydos

Posts with mentions or reviews of abydos. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-03-06.

SymSpell: 1M times faster spelling correction
8 projects | news.ycombinator.com | 6 Mar 2022

There's a pretty cool Python library with a huge number of these if you want to experiment (GPLv3): https://github.com/chrislit/abydos

What are some alternatives?

When comparing suggest and abydos you can also consider the following projects:

nimsearch - A nascent tutorial/intro to search engine ideas in Nim

SymSpell - SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm

ordiri

KeenWrite - Free, open-source, cross-platform desktop Markdown text editor with live preview, string interpolation, and math.

jsymspell - Java 8+ zero-dependency port of SymSpell: 1 million times faster through Symmetric Delete spelling correction algorithm

pythainlp - Thai Natural Language Processing in Python.

home-ops - Wife approved HomeOps driven by Kubernetes and GitOps using Flux

fuzzy-item-matching - Use machine learning and the Databricks Lakehouse Platform for product matching that can be used by marketplaces and suppliers for various purposes. Resolve differences between product definitions and descriptions and determine which items are likely pairs and which are distinct across disparate data sets.

core - OPNsense GUI, API and systems backend

coolify - An open-source & self-hostable Heroku / Netlify / Vercel alternative.

symspell - Haskell implementation of the SymSpell spelling correction algorithm
