abydos vs suggest

abydos

Abydos NLP/IR library for Python (by chrislit)

suggest

An mmap-persistent implementation of Wolf Garbe's SymSpell spell-checking algorithm in Nim (by c-blake)

Project	abydos	suggest
Mentions	1	2
Stars	167	14
Growth	-	-
Activity	0.0	3.6
Latest Commit	over 1 year ago	10 months ago
Language	Python	Nim
License	GNU General Public License v3.0 only	ISC License

Mentions: the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars: the number of stars that a project has on GitHub.
Growth: month-over-month growth in stars.
Activity: a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones. For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects that we are tracking.

abydos

Posts with mentions or reviews of abydos. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-03-06.

SymSpell: 1M times faster spelling correction
8 projects | news.ycombinator.com | 6 Mar 2022

There's a pretty cool Python library with a huge number of these if you want to experiment (GPLv3): https://github.com/chrislit/abydos

suggest

Posts with mentions or reviews of suggest. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-03-05.

Self Hosted SaaS Alternatives
17 projects | news.ycombinator.com | 5 Mar 2023

You are welcome. Thanks are too rarely offered. :-)
You may also be interested in word stemming (such as that used by the snowball stemmer in https://github.com/c-blake/nimsearch ) or other NLP techniques. I don't know how internationalized/multi-lingual that stuff is, but conceptually you might want "series of stemmed words" to be the content fragments of interest.
Similarity scores have many applications. Weights on a graph of cancelled downloads ranked by size might be one. :)
Of course, for your specific "truncation" problem, you might also be able to just do an edit distance against the much smaller filenames and compare data prefixes in files, or use a SHA256 of a content-based first slice. ( There are edit distance algos in Nim in https://github.com/c-blake/cligen/blob/master/cligen/textUt.... as well as in https://github.com/c-blake/suggest ).
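The filename-plus-prefix idea in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from cligen or suggest: the function names (edit_distance, prefix_digest, likely_truncated_copy), the 64 KiB slice size, and the name-distance threshold of 3 are all hypothetical choices, and plain Levenshtein distance stands in for whichever edit distance those Nim modules implement.

```python
import hashlib
import os


def edit_distance(a, b):
    """Plain Levenshtein distance, computed with two rolling rows."""
    if len(a) < len(b):
        a, b = b, a  # keep the rolling row as short as possible
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def prefix_digest(path, n_bytes):
    """SHA-256 hex digest of the first n_bytes bytes of a file."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read(n_bytes)).hexdigest()


def likely_truncated_copy(path_a, path_b, max_name_dist=3, slice_bytes=65536):
    """Heuristic (assumed thresholds): similar names and identical leading data."""
    name_dist = edit_distance(os.path.basename(path_a),
                              os.path.basename(path_b))
    if name_dist > max_name_dist:
        return False
    # Never hash past the end of the shorter file, or a truncated copy
    # would fail to match the file it was cut from.
    n = min(slice_bytes, os.path.getsize(path_a), os.path.getsize(path_b))
    if n == 0:
        return False
    return prefix_digest(path_a, n) == prefix_digest(path_b, n)
```

Checking the cheap filename distance first avoids reading file data for pairs whose names are not even close, which matches the comment's point that the filenames are much smaller than the contents.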
Or, you could write a little program like ndup/sh/ndup to create a "mirrored file tree" of such content-based slices; then you could use any true duplicate-file finder (like https://github.com/c-blake/bu/blob/main/dups.nim) on the little signature system to identify duplicates and go from path suffixes in those clusters back to the main filesystem. Of course, a single KV store within one or two files would be more efficient than thousands of tiny files. There are many possibilities.

SymSpell: 1M times faster spelling correction
8 projects | news.ycombinator.com | 6 Mar 2022

As jamra correctly points out, the entry point to this (which gets a lot of traction on HN) is indeed attacking a strawman tutorial-written-on-an-airplane algorithm. So, the 1M speed-up is majorly over-hyped.
That said, the technique is not wholly without merit, but it does carry certain "risk-reward" trade-offs related to latency in the memory/storage system because of SymSpell's reliance upon large hash tables. For details see https://github.com/c-blake/suggest
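To make the hash-table trade-off concrete, here is a minimal Python sketch of the symmetric-delete idea behind SymSpell. It is an illustration only and not the code in suggest or the original SymSpell: the names deletes, build_index, and suggest are hypothetical, the index is an in-memory dict rather than an mmap-persistent table, and plain Levenshtein distance is assumed for the final verification step.

```python
from collections import defaultdict
from itertools import combinations


def edit_distance(a, b):
    """Plain Levenshtein distance, used here to verify candidates."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def deletes(word, max_dist):
    """The word itself plus every string made by deleting up to max_dist chars."""
    out = {word}
    for k in range(1, min(max_dist, len(word)) + 1):
        for keep in combinations(range(len(word)), len(word) - k):
            out.add("".join(word[i] for i in keep))
    return out


def build_index(vocabulary, max_dist=2):
    """Map each delete-variant to the dictionary words that produce it."""
    index = defaultdict(set)
    for word in vocabulary:
        for variant in deletes(word, max_dist):
            index[variant].add(word)
    return index


def suggest(query, index, max_dist=2):
    """Look up the query's delete-variants, then verify by real edit distance."""
    candidates = set()
    for variant in deletes(query, max_dist):
        candidates |= index.get(variant, set())
    scored = ((edit_distance(query, w), w) for w in candidates)
    return sorted((d, w) for d, w in scored if d <= max_dist)
```

Every dictionary word contributes many delete-variants to the index, so the table grows much faster than the vocabulary. Lookups then become a handful of random probes into that large table, which is where the memory/storage latency mentioned above comes into play.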

What are some alternatives?

When comparing abydos and suggest you can also consider the following projects:

SymSpell - SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm

nimsearch - A nascent tutorial/intro to search engine ideas in Nim
