Postgres Full Text Search vs. the Rest

MeiliSearch

Mentions: 129 | Stars: 43,284 | Activity: 9.8 | Language: Rust

A lightning-fast search API that fits effortlessly into your apps, websites, and workflow

Hi Quentin, thank you for your reply. I was just referring to this comment of yours: https://github.com/meilisearch/MeiliSearch/discussions/870#d.... It is good to hear that Meilisearch cloud is encrypted though. Will definitely keep an eye on Meilisearch for future projects.

tantivy

Mentions: 48 | Stars: 9,896 | Activity: 9.1 | Language: Rust

Tantivy is a full-text search engine library inspired by Apache Lucene and written in Rust

Yup! I'd heard of Quickwit and sonic, but Quickwit seems to have pivoted to being a log-search focused engine. Quickwit is built on Tantivy[0] IIRC, so I could have used something like Toshi[1] instead.
Sonic[2] I know much less about, but it also seems good. Honestly, anything except ES is what I like to hear about (though OpenSearch is interesting).
Another thing I think the world really needs is a CLI +/- API tool (ideally Rust lib + CLI + API) that unifies interacting with these things. I got REALLY close to writing it while working on this article, but I was already running late and I have a penchant for yak shaving.
This won't be the last thing I write about search engines -- there's been a LOT of movement in the space that has nothing to do with the Elastic/OpenSearch debacle, and I don't see enough tires getting kicked.
[0]: https://github.com/quickwit-oss/tantivy
[1]: https://github.com/toshi-search/Toshi
[2]: https://github.com/valeriansaliou/sonic

Toshi

Mentions: 12 | Stars: 4,117 | Activity: 6.1 | Language: Rust

A full-text search engine in rust

quickwit

Mentions: 64 | Stars: 6,052 | Activity: 9.8 | Language: Rust

Cloud-native search engine for observability. An open-source alternative to Datadog, Elasticsearch, Loki, and Tempo.

I wish we had an extension like ZomboDB but using a lighter search engine like https://github.com/quickwit-oss/quickwit, https://github.com/toshi-search/Toshi or https://github.com/mosuka/bayard
Here I'm listing engines based on https://github.com/quickwit-oss/tantivy (tantivy is comparable to Lucene in its scope), but I'm sure there are other engines that could take on Elasticsearch.
Another option might be to embed tantivy directly in Postgres using an extension.

Typesense

Mentions: 129 | Stars: 17,876 | Activity: 9.8 | Language: C++

Open Source alternative to Algolia + Pinecone and an Easier-to-Use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo tolerant, in-memory fuzzy Search Engine for building delightful search experiences

litestream

Mentions: 165 | Stars: 9,964 | Activity: 7.5 | Language: Go

Streaming replication for SQLite.

This hasn't made the front page of HN (maybe someone should post it), but Litestream actually doesn't do replication that way any more -- I stumbled upon this commit like 2 days ago:
https://github.com/benbjohnson/litestream/pull/411

sonic

Mentions: 48 | Stars: 19,419 | Activity: 7.0 | Language: Rust

🦔 Fast, lightweight & schema-less search backend. An alternative to Elasticsearch that runs on a few MBs of RAM.

pgx

Mentions: 19 | Stars: 2,376 | Activity: 9.6 | Language: Rust

Discontinued. Build Postgres Extensions with Rust! [Moved to: https://github.com/tcdi/pgrx] (by tcdi)

> That thread led me to a project/product idea: take an existing Postgres instance used for normal products or whatever, replicate it to various read-only clusters with a custom search extension loaded, put an orchestrator on top (I've written most of one in Rust that uses 0mq to communicate with its nodes), and create drop-in search from existing databases, with a nice guided web GUI for automatic tuning suitable for most business use cases.
Very interesting idea -- just want to add one thing: write it in Rust (with pgx?[0]) :)
[0]: https://github.com/tcdi/pgx

PostgreSQL

Mentions: 405 | Stars: 14,673 | Activity: 10.0 | Language: C

Mirror of the official PostgreSQL GIT repository. Note that this is just a *mirror* - we don't work with pull requests on github. To contribute, please see https://wiki.postgresql.org/wiki/Submitting_a_Patch

There are EXTREME limitations in the source code around full text search that no one really talks about but that have important implications.
Phrase searching doesn't really work the way our analysts would have liked and needed. There are a bunch of technical limitations in the source on how much data is tracked about the tokens. I can't remember exactly, but there was something weird about stemming or lemmatization in phrase search too.
The following constants need to be bumped up to make phrase searching more accurate:
- MAXSTRLEN (2047) https://github.com/postgres/postgres/blob/master/src/include...
- MAXSTRPOS (1048575) https://github.com/postgres/postgres/blob/master/src/include...
- MAXENTRYPOS (16384) https://github.com/postgres/postgres/blob/master/src/include...
- MAXNUMPOS (256) https://github.com/postgres/postgres/blob/master/src/include...
ts_headline for highlighting doesn't consider phrase searching, so you can get weird results. It probably needs to be rewritten to match websearch_to_tsquery.
The accuracy issue drained the blood from my BAs' faces. I eventually just went with on-premise Solr because it's easier to add new hardware for it than for Elasticsearch.
TLDR: Postgres search is probably fine for short-form content, but there are major gotchas once you go past those max limits. Also, phrase searching will probably not work the way people are used to.
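
To get a feel for when a document runs into the limits listed in the comment above, here is a rough Python sketch. It is not how Postgres tokenizes text: it assumes a naive lowercase whitespace split instead of the real text search parser and dictionaries, and the constants are copied by hand from what src/include/tsearch/ts_type.h defines as far as we can tell. The function name check_tsvector_limits is made up for this illustration.

```python
# Limits mirrored from PostgreSQL's ts_type.h (copied by hand; treat as assumptions)
MAXSTRLEN = (1 << 11) - 1   # longest lexeme in bytes (2047)
MAXENTRYPOS = 1 << 14       # word positions at or past this are clamped to MAXENTRYPOS - 1
MAXNUMPOS = 256             # positions kept per lexeme


def check_tsvector_limits(text: str) -> dict:
    """Estimate where a document would hit tsvector limits.

    Uses a naive whitespace tokenizer, so the numbers are only a rough
    approximation of what to_tsvector() would actually produce.
    """
    positions = {}
    too_long = 0
    clamped = 0
    for pos, token in enumerate(text.lower().split(), start=1):
        if len(token.encode("utf-8")) > MAXSTRLEN:
            too_long += 1        # such words are not indexed at all
            continue
        if pos >= MAXENTRYPOS:
            clamped += 1         # position information is no longer exact here
        positions.setdefault(token, []).append(pos)

    # Lexemes that occur more often than the per-lexeme position budget
    over = sorted(t for t, p in positions.items() if len(p) > MAXNUMPOS)
    return {"too_long": too_long, "clamped_positions": clamped, "over_maxnumpos": over}
```

Any non-zero value in the result hints that phrase search over that document may be less exact than expected, which matches the "fine for short-form content" conclusion above.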

zombodb

Mentions: 23 | Stars: 4,608 | Activity: 8.3 | Language: PLpgSQL

Making Postgres and Elasticsearch work together like it's 2023

charabia

Mentions: 5 | Stars: 210 | Activity: 8.4 | Language: Rust

Library used by Meilisearch to tokenize queries and documents

Good to know. If you find any improvements we could make to Russian language support, don't hesitate to create an issue on our tokenizer: https://github.com/meilisearch/charabia.
Moreover, it's Hacktoberfest. If you want to help us improve language support, that would be awesome!

bayard

Mentions: 4 | Stars: 1,839 | Activity: 0.0 | Language: Rust

A full-text search and indexing server written in Rust.

rum

Mentions: 11 | Stars: 693 | Activity: 4.0 | Language: C

RUM access method - inverted index with additional information in posting lists (by postgrespro)

My experience with Postgres FTS (I did a comparison with Elastic a couple of years back) is that filtering works fine and is speedy enough, but ranking crumbles when the result set is large.
If you have a large-ish data set with lots of similar data (4M addresses and location names was the test case), Postgres FTS just doesn't perform.
There is no index that helps with scoring results. You would have to install an extension like the RUM index (https://github.com/postgrespro/rum) to improve this, which may or may not be an option (often not if you use managed databases).
If you want the best of both worlds, you could investigate this extension (again, often not an option for managed databases): https://github.com/matthewfranglen/postgres-elasticsearch-fd...
Either way, writing something that indexes your Postgres database into Elastic/OpenSearch is a one-time investment that usually pays off in the long run.
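
The filtering-versus-ranking gap described above can be illustrated with a toy model in Python. This is a sketch under simplifying assumptions, not Postgres internals: the inverted index stands in for what a plain index can answer (which documents contain a term), and the scoring function is a made-up term-frequency ratio rather than ts_rank. The names build_index and search are hypothetical. The point is only that the top results are not known until every matching document has been read and scored.

```python
from collections import defaultdict


def build_index(docs):
    """Inverted index: term -> set of document ids (membership only, no counts)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in set(text.lower().split()):
            index[term].add(doc_id)
    return index


def search(docs, index, query, limit=3):
    """Return (top document ids, number of documents that had to be scored)."""
    terms = query.lower().split()
    if not terms:
        return [], 0
    # Filtering: intersect posting sets; no document body is read here
    matches = set.intersection(*(index.get(t, set()) for t in terms))
    # Ranking: every match is read and scored before the top `limit` is known
    scored = []
    for doc_id in matches:
        words = docs[doc_id].lower().split()
        score = sum(words.count(t) for t in terms) / len(words)
        scored.append((score, doc_id))
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [doc_id for _, doc_id in scored[:limit]], len(matches)
```

With millions of near-identical rows, such as the address data set mentioned above, the number of scored documents grows with the match count even when only a handful of results are returned.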

postgres-elasticsearch-fdw

Mentions: 3 | Stars: 106 | Activity: 4.2 | Language: Python

Postgres to Elastic Search Foreign Data Wrapper

Searchkick

Mentions: 10 | Stars: 6,389 | Activity: 7.3 | Language: Ruby

Intelligent search made easy

You're right, that's actually what we implemented: application-level hooks. But they needed development and maintenance effort that comes for free with the adapter we're using for OpenSearch integration, which also brings welcome features: synonyms, partial matches, and many others.
Spoiler, the adapter is Searchkick: https://github.com/ankane/searchkick

pg-tsquery

Mentions: 1 | Stars: 56 | Activity: 4.3 | Language: JavaScript

:mag: Parse user input into a valid PostgreSQL tsquery

You could have a bit of logic on top of PG to better handle the search query, for example using https://github.com/caub/pg-tsquery
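
As an illustration of the kind of logic the comment above means, here is a minimal Python sketch that turns free-form user input into a tsquery string. It is not pg-tsquery's algorithm or API: the function name to_tsquery_string and the supported syntax (bare words ANDed together, "quoted phrases" joined with <->, and a leading - for negation) are assumptions made for this example. Postgres also ships websearch_to_tsquery, which accepts a similar syntax.

```python
import re


def to_tsquery_string(user_input: str) -> str:
    """Build a tsquery string from free-form input (simplified illustration).

    Bare words are ANDed, "quoted phrases" become <-> chains, and a leading
    '-' negates a word or phrase. Everything else is discarded.
    """
    parts = []
    # Each match: optional '-', then either a quoted phrase or a bare token
    for neg, phrase, word in re.findall(r'(-?)(?:"([^"]*)"|(\S+))', user_input):
        raw = phrase if phrase else word
        # Keep only word characters so operators like & | ! ( ) : cannot leak in
        terms = re.findall(r"\w+", raw.lower())
        if not terms:
            continue
        expr = " <-> ".join(terms)
        if len(terms) > 1:
            expr = f"({expr})"
        parts.append(f"!{expr}" if neg else expr)
    return " & ".join(parts)
```

For example, the input postgres "full text search" -elastic becomes postgres & (full <-> text <-> search) & !elastic. The result should still be passed to to_tsquery as a bound query parameter rather than concatenated into SQL.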

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

ZincSearch – lightweight alternative to Elasticsearch written in Go
24 projects | news.ycombinator.com | 22 Sep 2022

Meilisearch raises a $5M Seed to change the world of user-facing search
5 projects | news.ycombinator.com | 4 Feb 2022

Code Search Is Hard
13 projects | news.ycombinator.com | 10 Apr 2024

SeekStorm VS tantivy - a user suggested alternative
2 projects | 22 Mar 2024

YaCy, a distributed Web Search Engine, based on a peer-to-peer network
9 projects | news.ycombinator.com | 5 Mar 2024

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
Rust search-engine Search Postgresql ElasticSearch
Post date: 14 Oct 2022
