Java String Similarity vs polyleven

Compare Java String Similarity and polyleven to see how they differ.

Java String Similarity

Implementation of various string similarity and distance algorithms: Levenshtein, Jaro-Winkler, n-Gram, Q-Gram, Jaccard index, Longest Common Subsequence edit distance, cosine similarity, etc. (by tdebatty)

polyleven

Fast Levenshtein Distance Library for Python 3 (by fujimotos)
              Java String Similarity                      polyleven
Mentions      -                                           1
Stars         2,654                                       76
Growth        -                                           -
Activity      0.0                                         10.0
Last commit   almost 2 years ago                          over 1 year ago
Language      Java                                        C
License       GNU General Public License v3.0 or later    MIT License

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

Java String Similarity

Posts with mentions or reviews of Java String Similarity. We have used some of these posts to build our list of alternatives and similar projects.

We haven't tracked posts mentioning Java String Similarity yet.
Tracking mentions began in Dec 2020.

polyleven

Posts with mentions or reviews of polyleven. We have used some of these posts to build our list of alternatives and similar projects.
  • Spellcheck and Levenshtein distance
    1 project | /r/learnmachinelearning | 15 Nov 2022
    polyleven is the fastest Levenshtein distance library I've been able to find. It also has a threshold parameter which can be used to speed up the calculations. That said, I've had a lot more success speeding up the processing of large text datasets by converting the words to a vector space (e.g. using word2vec) and then calculating Euclidean distance, which is much faster than calculating Levenshtein distance (assuming you are using vectorized operations). The fastest solution would probably be approximate nearest neighbor search (see, for example, the faiss library), but again you'll have to embed your words in a vector space, and you'll need to decide whether this is viable for your use case.
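
The post above says a threshold can speed up Levenshtein calculations. The sketch below shows one way that can work, in pure Python: once the distance is known to exceed the threshold, the computation stops early. It is not polyleven's implementation (polyleven is a C extension), and the function name `bounded_levenshtein` and its return convention (`threshold + 1` meaning "more than threshold") are assumptions made for illustration only.

```python
def bounded_levenshtein(a: str, b: str, threshold: int) -> int:
    """Levenshtein distance between a and b, capped at threshold + 1.

    Hypothetical helper for illustration: any result greater than
    `threshold` is reported as `threshold + 1`.
    """
    if threshold < 0:
        raise ValueError("threshold must be non-negative")

    # The length difference is a lower bound on the edit distance,
    # so some pairs can be rejected without building any table.
    if abs(len(a) - len(b)) > threshold:
        return threshold + 1

    # Iterate over the longer string so each row stays short.
    if len(a) < len(b):
        a, b = b, a

    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        # The row minimum never decreases from one row to the next, so once
        # every cell exceeds the threshold the final distance must too.
        if min(current) > threshold:
            return threshold + 1
        previous = current

    return min(previous[-1], threshold + 1)


if __name__ == "__main__":
    print(bounded_levenshtein("kitten", "sitting", 3))
    print(bounded_levenshtein("kitten", "sitting", 1))
```

With a small threshold, most non-matching pairs in a large dataset are rejected by the length check or after a few rows, which is where the time saving comes from in this sketch.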

What are some alternatives?

When comparing Java String Similarity and polyleven you can also consider the following projects:

TextDistance - 📐 Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.

distlib - Distance-related functions (Damerau-Levenshtein, Jaro-Winkler, longest common substring & subsequence) implemented as a SQLite run-time loadable extension. Any UTF-8 strings are supported.

go-edlib - 📚 String comparison and edit distance algorithms library, featuring: Levenshtein, LCS, Hamming, Damerau-Levenshtein (OSA and adjacent transpositions algorithms), Jaro-Winkler, Cosine, etc.

SymSpell - SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm

Leetcode - Solutions to LeetCode problems; updated daily. Subscribe to my YouTube channel for more.

RapidFuzz - Rapid fuzzy string matching in Python using various string metrics

Java - All Algorithms implemented in Java

lev - Levenshtein distance function as C Extension for Python 3

StringDistances.jl - String Distances in Julia

interviews - Everything you need to know to get the job.

java-algorithms-implementation - Algorithms and Data Structures implemented in Java

Quickenshtein - Making the quickest and most memory efficient implementation of Levenshtein Distance with SIMD and Threading support