- CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
- Duplicate-Image-Finder (difPy): Python package for finding duplicate or similar images within folders
All of my research suggests we are going to need some kind of table (an index), because we are dealing with tens of thousands of images, if not more, so taking one image and manually opening every other image on each iteration is not viable. So far it looks like we are checking for near-equality rather than doing a direct pairwise comparison.

The closest solution I have found is at https://stackoverflow.com/questions/71514124/find-near-duplicate-and-faked-images, where the two proposed answers use either the OpenAI Contrastive Language-Image Pre-Training (CLIP) model or the cv2 Gaussian blur method. The problem is that we are not sure how to scale either approach to the numbers mentioned above, i.e. thousands of images. Speed is not a huge factor (the job could run for a while), but opening tens of thousands of images for comparison on every iteration does not seem like the best approach, or am I incorrect in thinking this?
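To make the index idea concrete, here is a minimal, stdlib-only sketch of the usual way this is scaled: compute a perceptual hash (an "average hash") once per image, and use a dict keyed by that hash so each image costs one lookup instead of one comparison against every other file. The `average_hash` and `find_duplicates` names are hypothetical, and the function works on a plain grayscale pixel grid for illustration; in practice you would load files with Pillow and use a library such as imagehash or difPy, but the indexing structure is the same.

```python
def average_hash(pixels, size=8):
    """Downscale a grayscale pixel grid (list of rows of 0-255 ints) to a
    size x size grid by block averaging, then emit one bit per cell:
    1 if the cell is brighter than the overall mean."""
    h, w = len(pixels), len(pixels[0])
    cells = []
    for by in range(size):
        for bx in range(size):
            y0, y1 = by * h // size, (by + 1) * h // size
            x0, x1 = bx * w // size, (bx + 1) * w // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for c in cells:
        bits = (bits << 1) | (1 if c > mean else 0)
    return bits


def find_duplicates(images):
    """Group image names by identical hash: one pass over the collection,
    one dict lookup per image, instead of O(n^2) pairwise opens."""
    index = {}
    for name, pixels in images.items():
        index.setdefault(average_hash(pixels), []).append(name)
    return [names for names in index.values() if len(names) > 1]
```

For near-duplicates rather than exact ones, the same structure still applies: you would compare hashes by Hamming distance (or CLIP embeddings by cosine similarity in a nearest-neighbor index) instead of requiring exact key equality, but either way every image is hashed or embedded exactly once.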
- imagededup
- difPy