Show HN: I scraped 25M Shopify products to build a search engine

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • Geziyor

    Geziyor, blazing fast web crawling & scraping framework for Go. Supports JS rendering.

    As someone who has scraped millions of items myself, I had success using Geziyor (https://github.com/geziyor/geziyor), which is built in Go. Shopify sites are especially easy to scrape because they tend to share the same product data format and don't hide it behind JS rendering.
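
To illustrate the point about shared product data formatting, here is a minimal Python sketch that flattens one JSON product listing into rows. It assumes the listing looks like the `/products.json` response that many Shopify storefronts are commonly reported to serve; that endpoint, the field names, the sample data, and the `parse_products` helper are all assumptions for illustration and do not come from the original post. It does not use Geziyor and makes no network requests.

```python
import json
from decimal import Decimal

# Hypothetical sample payload, shaped like the /products.json response that
# many Shopify storefronts are reported to serve (an assumption, not taken
# from the original post).
SAMPLE_PAYLOAD = """
{
  "products": [
    {
      "id": 1001,
      "title": "Canvas Tote",
      "handle": "canvas-tote",
      "vendor": "Example Co",
      "variants": [
        {"id": 1, "title": "Small", "price": "18.00"},
        {"id": 2, "title": "Large", "price": "24.50"}
      ]
    },
    {
      "id": 1002,
      "title": "Enamel Mug",
      "handle": "enamel-mug",
      "vendor": "Example Co",
      "variants": []
    }
  ]
}
"""


def parse_products(payload: str, store_url: str) -> list[dict]:
    """Flatten a products listing into one row per product (hypothetical helper)."""
    data = json.loads(payload)
    rows = []
    for product in data.get("products", []):
        # Prices arrive as strings; Decimal avoids float rounding surprises.
        prices = [
            Decimal(variant["price"])
            for variant in product.get("variants", [])
            if variant.get("price")
        ]
        rows.append({
            "id": product["id"],
            "title": product["title"],
            "url": f"{store_url.rstrip('/')}/products/{product['handle']}",
            "vendor": product.get("vendor", ""),
            # Cheapest variant price, or None when no variant lists a price.
            "min_price": min(prices) if prices else None,
        })
    return rows


if __name__ == "__main__":
    for row in parse_products(SAMPLE_PAYLOAD, "https://shop.example.com"):
        print(row)
```

Because the same parser would apply to every store that uses this layout, a crawler only needs to vary the base URL. Real stores may disable the endpoint, paginate it, or rate-limit requests, so treat the field names as something to verify per site.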

  • usearch

    Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍

    As you scale, you may benefit from these two projects I maintain, which Big Tech also uses :)

    https://github.com/unum-cloud/usearch - for faster search

    https://github.com/unum-cloud/uform - for cheaper multi-lingual multi-modal embeddings

  • uform

    Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
