Create a search engine with PostgreSQL: Postgres vs Elasticsearch

This page summarizes the projects mentioned and recommended in the original post on dev.to

Our great sponsors
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • WorkOS - The modern identity platform for B2B SaaS
  • SaaSHub - Software Alternatives and Reviews
  • examples

    Your one-stop-shop to try Xata out. From packages to apps, whatever you need to get started. (by xataio)

  • And yes, a data platform like Xata solves most of points 1 and 2, by taking that complexity and offering it as a service, together with other goodies.

  • zombodb

    Making Postgres and Elasticsearch work together like it's 2023

  • Point 2 is generally solvable via engineering effort and careful dedicated code. From the existing tools, PGSync is an open source project that aims to specifically solve this problem. ZomboDB is an interesting Postgres extension that tackles point 2 (and I think partially point 3), by controlling and querying Elasticsearch through Postgres. I haven't yet tried either of these two projects, so I can't comment on their trade-offs, but I wanted to mention them.

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
  • csv2opensearch

    Import CSV files into OpenSearch or Elasticsearch

  • I was curious to know at roughly what amount of data Postgres slows down compared to Elasticsearch. On the movies dataset (34K rows) that we used in part 1, all queries were reasonably fast (<300 ms). So for the testing here, I chose a larger data set: a recipes dataset from Kaggle, containing 2.3M recipes. The commands to load the CSV file in PostgreSQL can be found in this gist. For Elasticsearch, I've loaded the same CSV file using this tool.

  • pgvector

    Open-source vector similarity search for Postgres

  • Postgres supports vector search via the pgvector extension, while Elasticsearch has it built-in via the KNN search. You can find benchmarks on ann-benchmarks (look for pgvector and luceneknn) but keep in mind that both implementations are under active development and their performance is being improved.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts