Python similarity

Open-source Python projects categorized as similarity

Top 5 Python similarity Projects

  • python-string-similarity

    A library implementing different string similarity and distance measures using Python.

  • Duplicate-Image-Finder

    difPy - Python package for finding duplicate or similar images within folders

  • unisim

    UniSim is a package for efficient similarity computation, fuzzy matching, and clustering of data.

    Project mention: Finding near-duplicates with Jaccard similarity and MinHash | news.ycombinator.com | 2024-07-04

    Hashing or tiny neural nets combined with a Vector Search engine with Tanimoto/Jaccard is a very common deduplication strategy for large datasets. It might be wiser than using linear-complexity MapReduce operations.

    There is a nice Google project that uses the 0.5M-parameter RETSim model and the USearch engine for that: https://github.com/google/unisim

  • pysimilar

    A Python library for computing the similarity between two text strings based on cosine similarity

  • image-deduplication-plugin

    Remove exact and approximate duplicates from your dataset in FiftyOne!

    Project mention: Aug 7, 2024 - Developing Data-Centric Visual AI Apps Workshop | dev.to | 2024-08-07

    From concept interpolation to image deduplication, optical character recognition, and even curating your own AI art gallery by adding generated images directly into a dataset, your imagination is the only limit. Join us to discover how you can unleash your creativity and interact with data like never before.
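
The unisim mention above describes deduplication based on Jaccard similarity and hashing. As a rough illustration only, the sketch below computes exact Jaccard similarity over character 3-grams and estimates it with a MinHash signature. It uses only the Python standard library and is not taken from any project on this list; the function names (shingles, jaccard, minhash_signature, estimate_jaccard), the 3-gram size, and the choice of seeded BLAKE2b hashes are all assumptions made for the example.

```python
from __future__ import annotations

import hashlib


def shingles(text: str, k: int = 3) -> set[str]:
    """Return the set of character k-grams of a lowercased string."""
    text = text.lower()
    if len(text) < k:
        return {text}
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Exact Jaccard similarity: |A & B| / |A | B|."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(a | b)


def _seeded_hash(token: str, seed: int) -> int:
    """64-bit hash of a token; each seed acts as a separate hash function."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(tokens: set[str], num_perm: int = 128) -> list[int]:
    """For each of num_perm hash functions, keep the minimum hash over the set."""
    if not tokens:
        raise ValueError("cannot build a MinHash signature for an empty set")
    return [min(_seeded_hash(t, seed) for t in tokens) for seed in range(num_perm)]


def estimate_jaccard(sig_a: list[int], sig_b: list[int]) -> float:
    """Fraction of signature positions that agree; approximates Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


if __name__ == "__main__":
    a = shingles("the quick brown fox jumps over the lazy dog")
    b = shingles("the quick brown fox jumped over the lazy dog")
    print("exact:   ", jaccard(a, b))
    print("estimate:", estimate_jaccard(minhash_signature(a), minhash_signature(b)))
```

The signatures have a fixed length regardless of document size, which is why they are usually stored and compared (or indexed in a vector search engine, as the mention suggests) instead of the full shingle sets. A longer signature gives a tighter estimate at the cost of more hashing.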

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python similarity related posts

  • Finding near-duplicates with Jaccard similarity and MinHash

    4 projects | news.ycombinator.com | 4 Jul 2024
  • Aurora: an open source Automated malware similarity platform with modularity in mind.

    1 project | /r/blueteamsec | 6 Jun 2021
  • Speeding up edit distance calculation process

    1 project | /r/learnpython | 20 May 2021

Index

What are some of the best open-source similarity projects in Python? This list will help you:

#  Project                      Stars
1  python-string-similarity     982
2  Duplicate-Image-Finder       446
3  unisim                       109
4  pysimilar                    19
5  image-deduplication-plugin   14
