Metric | pybktree | TextDistance
---|---|---
Mentions | 2 | 6
Stars | 167 | 3,307
Growth | - | 0.7%
Activity | 0.0 | 6.1
Last commit | over 2 years ago | 17 days ago
Language | Python | Python
License | MIT License | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
pybktree
- Ask HN: What are some 'cool' but obscure data structures you know about?
The BK-Tree, which allows fast querying of "close" matches, such as Hamming distance (number of bits different). http://blog.notdot.net/2007/4/Damn-Cool-Algorithms-Part-1-BK...
I wrote a Python library implementing them a number of years ago: https://github.com/benhoyt/pybktree
- Find closest match to word in really large list
Alternatively, a BK-tree might suit your needs: https://github.com/benhoyt/pybktree/blob/master/pybktree.py
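To make the idea in the posts above concrete, here is a minimal BK-tree sketch in pure Python. It is an illustration only, not pybktree's actual implementation: the class name `BKTree`, its `add` and `find` methods, and the `hamming_distance` helper are written for this example. The tree stores each child under its distance to the parent, so a query can skip every subtree that the triangle inequality rules out.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# A node is (item, children), where children maps "distance to this item" -> child node.
Node = Tuple[int, Dict[int, "Node"]]


def hamming_distance(x: int, y: int) -> int:
    """Number of bit positions in which two non-negative integers differ."""
    return bin(x ^ y).count("1")


class BKTree:
    """Illustrative BK-tree over integers (hypothetical, not the pybktree source)."""

    def __init__(self, distance: Callable[[int, int], int],
                 items: Iterable[int] = ()) -> None:
        self.distance = distance
        self.root: Optional[Node] = None
        for item in items:
            self.add(item)

    def add(self, item: int) -> None:
        if self.root is None:
            self.root = (item, {})
            return
        node = self.root
        while True:
            parent, children = node
            d = self.distance(item, parent)
            if d not in children:
                # No child at this distance yet: the new item becomes that child.
                children[d] = (item, {})
                return
            # Otherwise descend into the child that shares this distance.
            node = children[d]

    def find(self, query: int, max_distance: int) -> List[Tuple[int, int]]:
        """Return sorted (distance, item) pairs within max_distance of query."""
        results: List[Tuple[int, int]] = []
        stack = [self.root] if self.root is not None else []
        while stack:
            item, children = stack.pop()
            d = self.distance(query, item)
            if d <= max_distance:
                results.append((d, item))
            # Triangle inequality: matches can only sit under edges whose
            # label lies in [d - max_distance, d + max_distance].
            for edge, child in children.items():
                if d - max_distance <= edge <= d + max_distance:
                    stack.append(child)
        return sorted(results)


tree = BKTree(hamming_distance, [0, 4, 5, 14])
tree.add(15)
print(tree.find(13, 1))  # items at most 1 bit away from 13: [(1, 5), (1, 15)]
```

The same structure should work for the "closest word in a large list" case if the distance function is swapped for an edit distance on strings, since the pruning only relies on the distance being a true metric.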
TextDistance
- textdistance: Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.
- Near duplicate image detection
Now let's compare the hash to all the image hashes using Levenshtein distance. We'll use the textdistance library for that.
- life4/textdistance: Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.
- Textdistance: Compute distance between sequences with 30 algorithms
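The near-duplicate post above compares one image hash against all the others using Levenshtein distance. The sketch below shows that comparison with a hand-written edit distance, so it runs without any third-party package; the textdistance library provides its own Levenshtein implementation, which the post uses instead. The function names, the file names, and the hash strings here are made up for illustration.

```python
from typing import Dict, List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        previous = current
    return previous[-1]


def near_duplicates(target_hash: str, known_hashes: Dict[str, str],
                    max_distance: int) -> List[Tuple[int, str]]:
    """Return sorted (distance, name) pairs whose hash is close to target_hash."""
    scored = ((levenshtein(target_hash, h), name)
              for name, h in known_hashes.items())
    return sorted((d, name) for d, name in scored if d <= max_distance)


# Hypothetical hex-encoded image hashes, invented for this example.
known = {
    "cat.jpg": "ffd8e0c0a0908080",
    "cat_small.jpg": "ffd8e0c0a0908081",
    "dog.jpg": "0f1e3c78f0e1c387",
}
print(near_duplicates("ffd8e0c0a0908080", known, max_distance=2))
# [(0, 'cat.jpg'), (1, 'cat_small.jpg')]
```

This linear scan computes one distance per stored hash, which is fine for small collections; for large ones, a BK-tree keyed on the same distance is one way to avoid comparing against every entry.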
What are some alternatives?
nutree - A Python library for tree data structures with an intuitive, yet powerful API.
jellyfish - 🪼 a python library for doing approximate and phonetic matching of strings.
multiversion-concurrency-control - Implementation of multiversion concurrency control, Raft, Left Right concurrency Hashmaps and a multi consumer multi producer Ringbuffer, concurrent and parallel load-balanced loops, parallel actors implementation in Main.java, Actor2.java and a parallel interpreter
fuzzywuzzy - Fuzzy String Matching in Python
Folly - An open-source C++ library developed and used at Facebook.
Levenshtein - The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity
minisketch - Minisketch: an optimized library for BCH-based set reconciliation
pydantic - Data validation using Python type hints
ann-benchmarks - Benchmarks of approximate nearest neighbor libraries in Python
Python Left-Right Parser - Python Parser
ceja - PySpark phonetic and string matching algorithms