Simple, Fast, and Scalable Reverse Image Search

dhash

1 24 10.0 Python

Perceptual hashing algorithm (dhash) to find similar images (by Rayraegah)
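
For orientation, one common formulation of a difference hash (dHash) compares each pixel with its right-hand neighbour on a small grayscale grid. The sketch below assumes the image has already been converted to grayscale and downscaled to 8 rows of 9 values. The function name `dhash_bits` and the bit order are illustrative assumptions, not necessarily what the linked project does.

```python
def dhash_bits(gray):
    """Return a 64-bit difference hash as an int.

    `gray` is assumed to be 8 rows of 9 grayscale values (already
    downscaled); each row yields 8 bits, one per adjacent pixel pair.
    """
    h = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            # Bit is 1 when brightness decreases from left to right.
            h = (h << 1) | (1 if left > right else 0)
    return h


# A grid whose rows all get darker to the right sets every bit.
falling = [[8, 7, 6, 5, 4, 3, 2, 1, 0] for _ in range(8)]
rising = [[0, 1, 2, 3, 4, 5, 6, 7, 8] for _ in range(8)]
all_ones = dhash_bits(falling)
all_zeros = dhash_bits(rising)
```

Because only the sign of each brightness difference is kept, small changes in contrast or scale tend to flip few bits, which is why similar images end up at a small Hamming distance.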

Interesting read, especially the lookup method based on partitioning.
I tried to implement a similar reverse image search based on dHash, as explained here: https://github.com/Rayraegah/dhash . However, I also ran into lookup performance problems. Exact matches are not a problem, but matching within a Hamming distance threshold is. Because my project was in Python, I tried to eke out more performance by writing a BK-tree backend module in C++: https://github.com/mxmlnkn/cppbktree . It was 2 to 10x faster than an existing similar module but was still too slow for lookups in a database of millions of images. Since lookup time tended to depend heavily on the exact Hamming distance threshold, my next step would have been to optimize the hash, e.g., make it shorter so that only a small Hamming distance needs to be searched. But the multi-indexing method mentioned in the article looks much more promising and better tested.
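
The partition-based lookup mentioned in this comment can be sketched roughly as follows. This is a minimal illustration of the pigeonhole idea, not the article's actual implementation: if two 64-bit hashes differ in at most `r` bits and each hash is split into `r + 1` chunks, at least one chunk must match exactly, so exact-match tables on the chunks produce a small candidate set that is then verified. The class `MultiIndex` and its methods are hypothetical names.

```python
from collections import defaultdict

HASH_BITS = 64  # assumption: 64-bit perceptual hashes


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def split_chunks(h: int, num_chunks: int) -> list:
    """Split a hash into equal-width chunks (leftover high bits are ignored)."""
    chunk_bits = HASH_BITS // num_chunks
    mask = (1 << chunk_bits) - 1
    return [(h >> (i * chunk_bits)) & mask for i in range(num_chunks)]


class MultiIndex:
    """Exact-match tables on hash chunks, used to find near matches."""

    def __init__(self, max_distance: int):
        self.max_distance = max_distance
        # Pigeonhole: r differing bits cannot touch all r + 1 chunks.
        self.num_chunks = max_distance + 1
        self.tables = [defaultdict(list) for _ in range(self.num_chunks)]

    def add(self, h: int) -> None:
        for table, chunk in zip(self.tables, split_chunks(h, self.num_chunks)):
            table[chunk].append(h)

    def query(self, h: int) -> list:
        candidates = set()
        for table, chunk in zip(self.tables, split_chunks(h, self.num_chunks)):
            candidates.update(table.get(chunk, ()))
        # Verify candidates with the full Hamming distance.
        return sorted(c for c in candidates if hamming(c, h) <= self.max_distance)


index = MultiIndex(max_distance=3)
for stored in (0x0, 0x7, 0xF, 0xFFFF000000000000):
    index.add(stored)
matches = index.query(0x0)  # 0x7 is 3 bits away, 0xF is 4 bits away
```

The design trade-off is that a larger threshold means more, narrower chunks, so each exact-match table becomes less selective and the candidate sets grow.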

cppbktree

1 6 6.2 C++

Python BK-Tree module based on a C++ implementation
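
For readers unfamiliar with the data structure, a BK-tree over integer hashes can be sketched in pure Python as below. This is only an illustration of the general technique and does not reflect the cppbktree API; the class `BKTree` and its methods are hypothetical names.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


class BKTree:
    """Minimal BK-tree keyed by Hamming distance."""

    def __init__(self):
        # Each node is [value, {distance_to_child: child_node}].
        self.root = None

    def add(self, value: int) -> None:
        if self.root is None:
            self.root = [value, {}]
            return
        node = self.root
        while True:
            d = hamming(node[0], value)
            if d == 0:
                return  # value already stored
            child = node[1].get(d)
            if child is None:
                node[1][d] = [value, {}]
                return
            node = child

    def query(self, value: int, max_distance: int) -> list:
        """Return sorted (distance, stored_value) pairs within max_distance."""
        results = []
        stack = [self.root] if self.root is not None else []
        while stack:
            node_value, children = stack.pop()
            d = hamming(node_value, value)
            if d <= max_distance:
                results.append((d, node_value))
            # Triangle inequality: only subtrees whose edge distance lies
            # in [d - max_distance, d + max_distance] can hold matches.
            for edge, child in children.items():
                if d - max_distance <= edge <= d + max_distance:
                    stack.append(child)
        return sorted(results)


tree = BKTree()
for stored in (0b0000, 0b0001, 0b0011, 0b1111):
    tree.add(stored)
near = tree.query(0b0000, max_distance=1)
```

The pruning window widens with the threshold, so more subtrees are visited as `max_distance` grows.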

phashml

1 - -

I have found ML image classification models to be an excellent way to extract a unique descriptor. The image can then be compressed into a compact signature for matching and storage.
I did it here: https://github.com/starkdg/phashml
It is available as a Python module that uses a TensorFlow model.

pyphashml

1 25 10.0 Python

image perceptual hash based on ML

There are limits to how short you can make a perceptual hash. The more you compress it, the more information you lose.
ML image classification models can be used to extract a good descriptor that can be further reduced to a compact signature:
https://github.com/starkdg/pyphashml
For indexing, I've had some success with distance-based indexing. Here's a comparison of some structures I used:
https://github.com/starkdg/pyphashml
Feel free to contact me if you want to discuss this further.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com. Post date: 19 Oct 2022
