Help checking for duplicate images with large number of images

This page summarizes the projects mentioned and recommended in the original post on /r/learnpython.

  • CLIP

    CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image

  • All of my research suggests that we will need some kind of lookup table, because we are dealing with tens of thousands of images, if not more, so it is not practical to take one image and open every other image for comparison on each iteration. So far it looks like we want an equality check rather than a direct pairwise comparison. The closest solution I have found is at https://stackoverflow.com/questions/71514124/find-near-duplicate-and-faked-images, where one answer uses the OpenAI Contrastive Language-Image Pre-Training (CLIP) model and another uses the OpenCV (cv2) Gaussian blur method. The problem is that we are not sure how to scale either approach so that it works with tens of thousands of images. Speed is not a major factor, since the job could run for a while, but opening tens of thousands of images for comparison on every iteration does not seem like the best approach. Or am I wrong about that?

  • imagededup

    😎 Finding duplicate images made easy!

  • Duplicate-Image-Finder

    difPy - Python package for finding duplicate or similar images within folders

  • difPy
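
The "lookup table" idea from the post can be sketched with the standard library alone: read each file once, compute a digest, and group paths by that digest, so no image is reopened per comparison. This is only a minimal sketch and not the method of any project listed above. It assumes byte-for-byte identical files count as duplicates, and the names `IMAGE_SUFFIXES`, `file_digest`, and `find_exact_duplicates` are made up for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumed set of extensions to treat as images; adjust for your collection.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp"}


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(root):
    """Group image files under root by digest; keep groups with 2+ files."""
    groups = defaultdict(list)  # digest -> list of paths with that content
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_exact_duplicates("photos")` on a hypothetical `photos` folder would return a list of groups, each a list of paths whose contents are identical. Every file is read exactly once, so the work grows linearly with the number of images. Resized, recompressed, or edited copies will not match under this scheme; catching those needs a perceptual hash or an embedding compared with a distance threshold, which is the kind of approach the projects listed above take.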
