pgvecto.rs VS txtai

Compare pgvecto.rs and txtai and see how they differ.

               pgvecto.rs          txtai
Mentions       17                  356
Stars          1,429               7,080
Growth         14.3%               3.8%
Activity       9.3                 9.3
Last commit    1 day ago           6 days ago
Language       Rust                Python
License        Apache License 2.0  Apache License 2.0
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

pgvecto.rs

Posts with mentions or reviews of pgvecto.rs. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-03-13.
  • My binary vector search is better than your FP32 vectors
    1 project | dev.to | 25 Mar 2024
    To evaluate the performance metrics in comparison to the original vector approach, we conducted benchmarking using the dbpedia-entities-openai3-text-embedding-3-large-3072-1M dataset. The benchmark was performed on a Google Cloud virtual machine (VM) with specifications of n2-standard-8, which includes 8 virtual CPUs and 32GB of memory. We used pgvecto.rs v0.2.1 as the vector database.
  • pgvecto.rs 0.2: Unifying Relational Queries and Vector Search in PostgreSQL
    2 projects | dev.to | 13 Mar 2024
    Please check out our documentation for more details. We encourage you to try out pgvecto.rs, benchmark it against your workloads, and contribute your indexing innovations. Join our Discord community to connect with the developers and other users working to improve pgvecto.rs!
  • pgvecto.rs alternatives - qdrant and Weaviate
    3 projects | 13 Mar 2024
  • Milvus VS pgvecto.rs - a user suggested alternative
    2 projects | 13 Mar 2024
  • You Shouldn't Invest in Vector Databases?
    4 projects | news.ycombinator.com | 25 Nov 2023
    It's kind of a tradeoff. Performance is just one factor when choosing a vector database. In pgvecto.rs (https://github.com/tensorchord/pgvecto.rs), we store the index separately from PostgreSQL's internal storage, unlike pgvector's approach. This enables multi-threaded indexing, async indexing that does not block insertion, and faster search than pgvector.

    I don't see any fundamental reason why the index in Postgres would be slower than a specialized vector database. The query pattern of the vector database is simply a point query using an index, similar to other queries in an OLTP system.

    The only limitation I see is scalability. It's not easy to make PostgreSQL distributed, but solutions like Citus exist, making it still possible.

    (I'm the author of pgvecto.rs)

  • How We Made PostgreSQL a Better Vector Database
    2 projects | news.ycombinator.com | 25 Sep 2023
    Hi, we've solved the problem you mentioned! Please take a look at our open source Postgres vector extension https://github.com/tensorchord/pgvecto.rs.

    Our index building process is significantly faster than pgvector's on HNSW because we can use all the cores, whereas pgvector can only use one. As for filtering, we support pre-filtering, which guarantees enough results no matter what the condition is.

  • First Postgres Vector Extension with Filtering Support
    1 project | news.ycombinator.com | 28 Aug 2023
    Hi,

    In our previous post titled “Do we really need a specialized vector database?” on HN (https://news.ycombinator.com/item?id=37097004), we discussed the importance of using a Postgres-based solution for vector search. However, we acknowledged that existing Postgres vector extensions lack support for metadata filtering.

    We are excited to announce that we have now addressed this limitation. We are proud to be the first (https://github.com/tensorchord/pgvecto.rs) to enable conditional filtering directly on HNSW indexes within Postgres. This breakthrough allows for efficient and effective metadata filtering in combination with vector search, eliminating the tradeoff previously associated with using Postgres for this purpose.

    We invite you to explore our updated offering and experience the benefits of seamless metadata filtering within a Postgres-based vector search system.

  • A Summary of LLMOps
    2 projects | news.ycombinator.com | 10 Aug 2023
    Yeah, I think in many cases you just need a vector search lib, instead of a DB.

    And in some other cases, you may want postgres vector extension e.g. https://github.com/tensorchord/pgvecto.rs instead of a specialized vector db.

  • An early look at HNSW performance with pgvector
    2 projects | news.ycombinator.com | 10 Aug 2023
    Seems that pgvector has a viable competitor extension: https://github.com/tensorchord/pgvecto.rs
  • 20x Faster as the Beginning: Introducing pgvecto.rs extension written in Rust
    1 project | /r/rust | 8 Aug 2023
    We are thrilled to announce the release of https://github.com/tensorchord/pgvecto.rs, a powerful Postgres extension for vector similarity search written in Rust. Its HNSW algorithm is 20x faster than pgvector at 90% recall. But speed is just the start - pgvecto.rs is architected to add new algorithms easily. We've made it an extensible architecture for contributors to implement the new indexes quickly, and we look forward to the open-source community driving pgvecto.rs to new heights!
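
The pre-filtering claim in the "How We Made PostgreSQL a Better Vector Database" post above (that filtering before the nearest-neighbor step guarantees enough results) can be illustrated with a small brute-force sketch. This is not how pgvecto.rs implements filtering on its HNSW index; the table, the `color` column, and both function names are hypothetical and exist only for this example.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def post_filter_search(rows, query, k, predicate):
    """Take the k nearest rows first, then drop those failing the predicate."""
    nearest = sorted(rows, key=lambda r: l2(r["vec"], query))[:k]
    return [r for r in nearest if predicate(r)]


def pre_filter_search(rows, query, k, predicate):
    """Apply the predicate first, then take the k nearest of the survivors."""
    candidates = [r for r in rows if predicate(r)]
    return sorted(candidates, key=lambda r: l2(r["vec"], query))[:k]


# Hypothetical toy table: ten rows on a line; odd ids are "red", even ids "blue".
rows = [
    {"id": i, "vec": [float(i), 0.0], "color": "red" if i % 2 else "blue"}
    for i in range(10)
]
query = [0.0, 0.0]


def is_red(row):
    return row["color"] == "red"


post = post_filter_search(rows, query, 3, is_red)
pre = pre_filter_search(rows, query, 3, is_red)

print([r["id"] for r in post])  # fewer than k rows survive the late filter
print([r["id"] for r in pre])   # k rows, all matching the condition
```

With post-filtering, the three nearest rows are picked before the condition is checked, so a selective condition can leave fewer than k results. Filtering first avoids that, at the cost of ranking every matching row in this naive version.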

txtai

Posts with mentions or reviews of txtai. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-05-01.
  • Show HN: FileKitty – Combine and label text files for LLM prompt contexts
    5 projects | news.ycombinator.com | 1 May 2024
  • What contributing to Open-source is, and what it isn't
    1 project | news.ycombinator.com | 27 Apr 2024
    I tend to agree with this sentiment. Many junior devs and/or those in college want to contribute. Then they feel entitled to merge a PR that they worked hard on, often without guidance. I'm all for working with people but projects have standards and not all ideas make sense. In many cases, especially with commercial open source, the project is the base of a company's identity. So it's not just for drive-by ideas to pad a resume or finish a school project.

    For those who do want to do this, I'd recommend writing an issue and/or reaching out to the developers to engage in a dialogue. This takes work but it will increase the likelihood of a PR being merged.

    Disclaimer: I'm the primary developer of txtai (https://github.com/neuml/txtai), an open-source vector database + RAG framework

  • Build knowledge graphs with LLM-driven entity extraction
    1 project | dev.to | 21 Feb 2024
    txtai is an all-in-one embeddings database for semantic search, LLM orchestration and language model workflows.
  • Bootstrap or VC?
    1 project | news.ycombinator.com | 5 Feb 2024
    Bootstrapping only works if you have the runway to do it and you don't feel the need to grow fast.

    With NeuML (https://neuml.com), I've gone the bootstrapping route. I've been able to build a fairly successful open source project (txtai 6K stars https://github.com/neuml/txtai) and a revenue positive company. It's a "live within your means" strategy.

    VC funding can have a snowball effect where you need more and more. Then you're in the loop of needing funding rounds to survive. The hope is someday you're acquired or start turning a profit.

    I would say both have their pros and cons. Not all ideas have the luxury of time.

  • txtai: An embeddings database for semantic search, graph networks and RAG
    1 project | news.ycombinator.com | 3 Feb 2024
  • Ask HN: What happened to startups, why is everything so polished?
    2 projects | news.ycombinator.com | 27 Jan 2024
    I agree that in many cases people are puffing their feathers to try to be something they're not (at least not yet). Some believe in the fake it until you make it mentality.

    With NeuML (https://neuml.com), the website is a simple HTML page. On social media, I'm honest about what NeuML is, that I'm in my 40s with a family and not striving to be the next Steve Jobs. I've been able to build a fairly successful open source project (txtai 6K stars https://github.com/neuml/txtai) and a revenue positive company. For me, authenticity and being genuine is most important. I would say that being genuine has been way more of an asset than liability.

  • Are we at peak vector database?
    8 projects | news.ycombinator.com | 25 Jan 2024
    I'll add txtai (https://github.com/neuml/txtai) to the list.

    There is still plenty of room for innovation in this space. Just need to focus on the right projects that are innovating and not the ones (re)working on problems solved in 2020/2021.

  • Txtai: An all-in-one embeddings database for semantic search and LLM workflows
    1 project | news.ycombinator.com | 24 Jan 2024
  • Generate knowledge with Semantic Graphs and RAG
    1 project | dev.to | 23 Jan 2024
    txtai is an all-in-one embeddings database for semantic search, LLM orchestration and language model workflows.
  • Show HN: Open-source Rule-based PDF parser for RAG
    9 projects | news.ycombinator.com | 23 Jan 2024
    Nice project! I've long used Tika for document parsing given its maturity and the wide range of formats it supports. The XHTML output helps with chunking documents for RAG.

    Here's a couple examples:

    - https://neuml.hashnode.dev/build-rag-pipelines-with-txtai

    - https://neuml.hashnode.dev/extract-text-from-documents

    Disclaimer: I'm the primary author of txtai (https://github.com/neuml/txtai).
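
The point in the PDF parser post above, that XHTML output helps with chunking documents for RAG, can be sketched with the Python standard library alone. This is a minimal illustration, assuming the XHTML wraps each paragraph in a `<p>` element. The sample markup is made up for the example and is not real Tika output, and `ParagraphChunker` and `chunk_xhtml` are hypothetical names, not part of txtai or Tika.

```python
from html.parser import HTMLParser


class ParagraphChunker(HTMLParser):
    """Collect the text of each <p> element as one chunk."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._buffer = None  # None while the parser is outside a <p> element

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._buffer is not None:
            # Collapse runs of whitespace and skip empty paragraphs.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.chunks.append(text)
            self._buffer = None


def chunk_xhtml(xhtml):
    """Return one text chunk per non-empty paragraph in the XHTML string."""
    parser = ParagraphChunker()
    parser.feed(xhtml)
    parser.close()
    return parser.chunks


# Made-up sample document (assumption: one <p> per paragraph).
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<p>First paragraph of the
document.</p>
<p>Second paragraph.</p>
<p>   </p>
</body></html>"""

chunks = chunk_xhtml(SAMPLE)
print(chunks)
```

Each chunk could then be passed to whatever embedding or indexing step a RAG pipeline uses. Paragraph boundaries are only one possible chunking unit, and headings or fixed-size windows may suit other documents better.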

What are some alternatives?

When comparing pgvecto.rs and txtai you can also consider the following projects:

pgvector - Open-source vector similarity search for Postgres

sentence-transformers - Multilingual Sentence & Image Embeddings with BERT

modelz-llm - OpenAI compatible API for LLMs and embeddings (LLaMA, Vicuna, ChatGLM and many others)

tika-python - Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.

pgvecto.rs-bench

transformers - 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Awesome-LLMOps - An awesome & curated list of best LLMOps tools for developers

faiss - A library for efficient similarity search and clustering of dense vectors.

faiss-rs - Rust language bindings for Faiss

CLIP - CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image

DocumentGPT - DocumentGPT is a web application that allows you to chat over your research document using OpenAI's chat API and perform semantic search using vector databases. This tool provides a seamless interface for interacting with your research document, exploring search results, and engaging in a conversation with an AI chatbot.

paperai - 📄 🤖 Semantic search and workflows for medical/scientific papers