Search inside YouTube videos using natural language queries

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

    Yes, this is definitely possible. You could try computing some kind of image distance between frames, or using keyframe extraction.
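To illustrate the frame-distance idea, here is a minimal sketch, not taken from the thread or the linked repository. It assumes each frame is already decoded into a flat sequence of grayscale pixel values; the names `mean_abs_diff` and `select_keyframes` and the threshold value are hypothetical. A frame is kept only when it differs enough from the last kept frame, so near-duplicate frames are skipped before any features are computed.

```python
from typing import List, Sequence


def mean_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute per-pixel difference between two equally sized frames."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("frames must be non-empty and have the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_keyframes(frames: Sequence[Sequence[float]], threshold: float) -> List[int]:
    """Return indices of frames that differ from the last kept frame by more than threshold."""
    kept: List[int] = []
    for i, frame in enumerate(frames):
        # Always keep the first frame; afterwards compare against the last kept one.
        if not kept or mean_abs_diff(frames[kept[-1]], frame) > threshold:
            kept.append(i)
    return kept


# Toy "video": two nearly static shots followed by a cut back to black.
toy_frames = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [100, 100, 100, 100],
    [101, 100, 100, 100],
    [0, 0, 0, 0],
]
keyframes = select_keyframes(toy_frames, threshold=10.0)
```

In a real pipeline the distance would more likely be computed on downscaled frames or histograms, and the threshold would need tuning per video.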

    Once the features are computed, the search is very efficient! I tried it on the Unsplash dataset of 2M photos, and a query takes about 2-3 seconds: https://github.com/haltakov/natural-language-image-search

    I plan to run my personal photos through it :)
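The reason search is cheap once features exist can be sketched as follows. This is an illustration under assumptions, not the linked project's code: the short toy vectors stand in for precomputed image embeddings, the query vector stands in for an embedded text query, and `normalize` and `search` are hypothetical names. Image features are normalized once up front, so each query costs only one dot product per image plus a sort.

```python
import math
from typing import List, Sequence, Tuple


def normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def search(query: Sequence[float],
           features: Sequence[Sequence[float]],
           top_k: int = 3) -> List[Tuple[int, float]]:
    """Return (index, cosine similarity) pairs for the top_k best-matching features."""
    q = normalize(query)
    scores = [(i, sum(a * b for a, b in zip(q, f))) for i, f in enumerate(features)]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Toy stand-ins for precomputed, normalized image features (assumption).
image_features = [normalize(v) for v in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])]
results = search([2.0, 0.0], image_features, top_k=2)
```

With millions of images the same computation is usually done as a single matrix-vector product on an array of features rather than a Python loop.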

  • CLIP

    CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image

  • Yes, I know that this is a bit slow. The problem is that you really need PyTorch 1.7.1, because 1.7.0 leads to some strange issues and broken results:

    https://github.com/openai/CLIP/issues/13#issuecomment-771143...

    Yes, I know. :D Your previous project with Unsplash made me try a similar approach [1] for banners of video games on Steam.

    [1] https://github.com/woctezuma/steam-image-search

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
