Simple, Fast, and Scalable Reverse Image Search

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • dhash

    Perceptual hashing algorithm (dhash) to find similar images (by Rayraegah)

  • Interesting read, especially the lookup method based on partitioning.

    I tried to implement a similar reverse image search based on dHash, as explained here (https://github.com/Rayraegah/dhash). However, I also had lookup performance problems. Exact matches are not a problem, but matching within a Hamming distance threshold is. Because my project was in Python, I tried to eke out more performance by writing a BK-tree backend module in C++ (https://github.com/mxmlnkn/cppbktree). It was 2 to 10x faster than an existing similar module but was still too slow for lookups in a database of millions of images. Because lookup time tended to depend on the exact Hamming distance threshold, my next step would have been to try to optimize the hash, e.g., make it shorter so that only a small Hamming distance needs to be searched. But the mentioned multi-indexing method looks much more promising and better tested.
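The partition-based lookup mentioned in this comment can be sketched roughly as follows. This is a minimal illustration, not the code from the original post or from either linked repository. It assumes 64-bit hashes split into 4 chunks of 16 bits, and the names `MultiIndex`, `chunks`, and `hamming` are hypothetical. The idea rests on the pigeonhole principle: if two hashes differ in fewer bits than there are chunks, at least one chunk must match exactly, so exact-match tables per chunk are enough to find all candidates.

```python
from collections import defaultdict

HASH_BITS = 64  # assumed hash width
CHUNKS = 4      # assumed partition count; supports thresholds up to CHUNKS - 1
CHUNK_BITS = HASH_BITS // CHUNKS
CHUNK_MASK = (1 << CHUNK_BITS) - 1


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def chunks(h: int) -> list:
    """Split a hash into CHUNKS equal-width pieces, lowest bits first."""
    return [(h >> (i * CHUNK_BITS)) & CHUNK_MASK for i in range(CHUNKS)]


class MultiIndex:
    """Hypothetical multi-index: one exact-match table per chunk position."""

    def __init__(self) -> None:
        # tables[i] maps the value of chunk i to the set of full hashes having it.
        self.tables = [defaultdict(set) for _ in range(CHUNKS)]

    def add(self, h: int) -> None:
        for table, piece in zip(self.tables, chunks(h)):
            table[piece].add(h)

    def query(self, h: int, max_dist: int) -> set:
        # The pigeonhole argument only holds while max_dist < CHUNKS.
        if max_dist >= CHUNKS:
            raise ValueError("threshold must be smaller than the number of chunks")
        candidates = set()
        for table, piece in zip(self.tables, chunks(h)):
            candidates |= table.get(piece, set())
        # Candidates share one chunk with the query; verify the full distance.
        return {c for c in candidates if hamming(c, h) <= max_dist}
```

Each query costs a handful of dictionary lookups plus a distance check on the candidates, instead of a scan or tree walk over the whole database. Larger thresholds would need more (and therefore narrower) chunks, which increases the number of false candidates to verify.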

  • cppbktree

    Python BK-Tree module based on a C++ implementation

  • phashml

  • I have found ML image categorization models to be an excellent way to extract a unique descriptor. The image can then be compressed into a compact signature for matching and storage.

    I did it here: https://github.com/starkdg/phashml

    It is available as a Python module that uses a TensorFlow model.

  • pyphashml

    image perceptual hash based on ML

  • There are limits to how short you can make the perceptual hash. The more you compress it, the more information you lose.

    The ML image classification models can be used to extract a good descriptor that can be further reduced into a compact signature.

    https://github.com/starkdg/pyphashml

    For indexing, I've had some success with distance-based indexing. Here's a comparison of some structures I used:

    https://github.com/starkdg/pyphashml

    Feel free to contact me if you want to discuss this further.
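As one illustration of distance-based indexing, here is a minimal BK-tree sketch over integer hashes with Hamming distance, the structure mentioned earlier in the thread. It is not the code from cppbktree or pyphashml, and the page does not say which structures the commenter actually compared. The class name `BKTree` and its methods are hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


class BKTree:
    """Hypothetical minimal BK-tree; each node is [value, {distance: child}]."""

    def __init__(self) -> None:
        self.root = None

    def add(self, value: int) -> None:
        if self.root is None:
            self.root = [value, {}]
            return
        node = self.root
        while True:
            d = hamming(value, node[0])
            if d == 0:
                return  # value is already stored
            child = node[1].get(d)
            if child is None:
                node[1][d] = [value, {}]
                return
            node = child  # descend along the edge labelled with distance d

    def find(self, query: int, max_dist: int) -> list:
        """Return sorted (distance, value) pairs within max_dist of query."""
        results = []
        stack = [self.root] if self.root is not None else []
        while stack:
            value, children = stack.pop()
            d = hamming(query, value)
            if d <= max_dist:
                results.append((d, value))
            # Triangle inequality: a match can only sit in a subtree whose
            # edge label lies within [d - max_dist, d + max_dist].
            for edge, child in children.items():
                if d - max_dist <= edge <= d + max_dist:
                    stack.append(child)
        return sorted(results)
```

The pruning window grows with `max_dist`, so larger thresholds visit more of the tree. That is consistent with the earlier observation in the thread that lookup speed depends strongly on the Hamming distance threshold.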
