-
LSH
Locality-sensitive hashing using MinHash in Python/Cython to detect near-duplicate text documents
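To make the MinHash idea concrete, here is a minimal pure-Python sketch of estimating how similar two documents are. It is not the API of the project listed above: the function names (`shingles`, `minhash_signature`, `estimated_jaccard`), the 5-character shingle size, and the 128 hash functions are all assumptions chosen for illustration. It also covers only the MinHash signature step, not the banding step that a full LSH index would use to find candidate pairs.

```python
import hashlib


def shingles(text, k=5):
    """Return the set of overlapping k-character shingles of the text.

    The text is lowercased and whitespace-normalized first (an assumption
    made for this sketch; other tokenizations are possible).
    """
    text = " ".join(text.lower().split())
    if len(text) <= k:
        return {text}
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def _seeded_hash(seed, shingle):
    """Hash a shingle to a 64-bit integer; each seed acts as a separate hash function."""
    data = f"{seed}:{shingle}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(text, num_hashes=128, k=5):
    """Return the MinHash signature: the minimum hash value per seeded hash function."""
    doc_shingles = shingles(text, k)
    return [
        min(_seeded_hash(seed, s) for s in doc_shingles)
        for seed in range(num_hashes)
    ]


def estimated_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching signature positions."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


if __name__ == "__main__":
    original = "The quick brown fox jumps over the lazy dog"
    near_dup = "The quick brown fox jumped over the lazy dog"
    unrelated = "Time series databases ingest billions of points"

    sig = minhash_signature(original)
    print(estimated_jaccard(sig, minhash_signature(near_dup)))
    print(estimated_jaccard(sig, minhash_signature(unrelated)))
```

The fraction of signature positions on which two documents agree approximates the Jaccard similarity of their shingle sets, so near duplicates score close to 1 and unrelated texts score close to 0. More hash functions give a tighter estimate at the cost of more computation.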
I'm looking into this. After reading the Wikipedia entry on it, it sounds promising! I already found a Python library for it, https://github.com/mattilyra/LSH, and I will get back to you once I have tested it!
NOTE:
The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives.
Hence, a higher number means a more popular project.
Related posts
-
VDO: Userspace tools for pools of deduplicated and compressed block storage
-
How I discovered Named Entity Recognition while trying to remove gibberish from a string.
-
DwarFS – The Deduplicating Warp-Speed Advanced Read-Only File System
-
Ask HN: Open-source Windows 11 backup solutions
-
Step by step guide to create customized chatbot by using spaCy (Python NLP library)