Show HN: I scraped 25M Shopify products to build a search engine

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  1. Geziyor

    Geziyor, blazing fast web crawling & scraping framework for Go. Supports JS rendering.

    As someone who has scraped millions of items myself, I had success using Geziyor (https://github.com/geziyor/geziyor), which is built in Go. Shopify sites are especially easy to scrape because they tend to share the same product data formatting and don't hide it behind JS rendering.

  2. usearch

    Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍

    As you scale, you may benefit from these two projects I maintain, which Big Tech also uses :)

    https://github.com/unum-cloud/usearch - for faster search

    https://github.com/unum-cloud/uform - for cheaper multi-lingual multi-modal embeddings

  3. uform

    Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️

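The Geziyor comment above notes that Shopify stores tend to share one product data format and do not hide it behind JS rendering. A minimal sketch of why that makes scraping simple, using only the Go standard library: one struct definition can decode product data from many stores. The field names below (`products`, `title`, `handle`, `vendor`, `variants`, `price`) and the JSON layout are assumptions about what a typical store exposes, not something the original post specifies, and `parseProducts` is a hypothetical helper. Fetching the page (for example with Geziyor) is left out so the sketch stays offline.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Product holds a small subset of product fields. The field set and the
// JSON key names are assumptions for illustration.
type Product struct {
	ID       int64     `json:"id"`
	Title    string    `json:"title"`
	Handle   string    `json:"handle"`
	Vendor   string    `json:"vendor"`
	Variants []Variant `json:"variants"`
}

// Variant is one purchasable option of a product.
type Variant struct {
	SKU   string `json:"sku"`
	Price string `json:"price"` // assumed to be serialized as a decimal string
}

// productPage is the assumed top-level wrapper around the product array.
type productPage struct {
	Products []Product `json:"products"`
}

// parseProducts (hypothetical helper) decodes one page of product JSON.
// Because the layout is shared across stores, no per-site parsing is needed.
func parseProducts(body string) ([]Product, error) {
	var page productPage
	if err := json.Unmarshal([]byte(body), &page); err != nil {
		return nil, fmt.Errorf("decode product page: %w", err)
	}
	return page.Products, nil
}

func main() {
	// Made-up sample payload standing in for a fetched response body.
	sample := `{"products":[{"id":1,"title":"Canvas Tote","handle":"canvas-tote",
		"vendor":"Example Co","variants":[{"sku":"TOTE-1","price":"24.00"}]}]}`

	products, err := parseProducts(sample)
	if err != nil {
		panic(err)
	}
	for _, p := range products {
		fmt.Printf("%s (%s): %d variant(s)\n", p.Title, p.Vendor, len(p.Variants))
	}
}
```

In a real crawl the same decoding step would run inside the crawler's response callback, with the resulting products handed to whatever indexing or embedding stage follows.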


Related posts

  • AWS Graviton 3 > Graviton 4 for Vector Similarity Search

    3 projects | dev.to | 30 Mar 2025
  • Why HNSW Is Not the Answer

    1 project | news.ycombinator.com | 23 Dec 2024
  • Usearch: Single-File Similarity Search

    1 project | news.ycombinator.com | 9 Aug 2024
  • SIMD-accelerated distance functions for SQLite

    1 project | news.ycombinator.com | 16 Jun 2024
  • Recapping the AI, Machine Learning and Data Science Meetup - May 30, 2024

    3 projects | dev.to | 4 Jun 2024
