postgresml vs oban

postgresml

The GPU-powered AI application database. Get your app to market faster using the simplicity of SQL and the latest NLP, ML + LLM models. (by postgresml)

postgresml.org

oban

💎 Robust job processing in Elixir, backed by modern PostgreSQL and SQLite3 (by sorentwo)

Queue

getoban.pro

Project	postgresml	oban
Mentions	23	27
Stars	5,442	3,056
Growth	1.8%	-
Activity	9.7	9.3
Latest Commit	5 days ago	2 days ago
Language	Rust	Elixir
License	MIT License	Apache License 2.0

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

postgresml

Posts with mentions or reviews of postgresml. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-08-18.

PostgresML
1 project | /r/programming | 30 Aug 2023
[P] pgml-chat: A command-line tool for deploying low-latency knowledge-based chatbots
1 project | /r/MachineLearning | 18 Aug 2023

The Python client SDK is so small because it's just a wrapper around the Rust client SDK: https://github.com/postgresml/postgresml/tree/master/pgml-sdks/rust/pgml. We also support JS/TypeScript SDKs, all generated from the same safe and efficient underlying Rust implementation using some fancy Rust macros.
Pg_later: Asynchronous Queries for Postgres
4 projects | news.ycombinator.com | 18 Aug 2023

I don't think you'd replace a materialized view with pg_later, but it might help you populate or update your materialized view if you are trying to do that asynchronously. pglater.exec() works with DDL too!
I use it a lot for long-running queries when doing data science and machine learning work, often when executing queries from a Jupyter notebook or CLI. That way, if my Jupyter kernel dies or the network or my environment has an issue, my query execution continues. I've started using it a bit more with https://github.com/postgresml/postgresml for model training tasks too, since those can be quite long-running depending on the situation.
Replace pinecone.
3 projects | /r/LocalLLaMA | 16 Jun 2023

PostgresML comes with pgvector as a vector database. The cool thing is it can run your models in the same memory space as a database extension. We’re also working on ggml support for huggingface transformers, but could use some help testing more LLMs for compatibility. https://github.com/postgresml/postgresml/pull/748
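As a rough illustration of what "vector recall" means in the post above, here is a minimal pure-Python sketch of nearest-neighbour lookup by cosine distance, the same metric pgvector exposes through its `<=>` operator. The function names and the in-memory `rows` list are hypothetical and only for illustration; this is not how PostgresML or pgvector is implemented, and a real deployment would run the search inside the database.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


def nearest(query, rows, k=1):
    """Return the k (label, vector) rows closest to the query vector.

    `rows` is a hypothetical stand-in for a table of stored embeddings.
    """
    return sorted(rows, key=lambda row: cosine_distance(query, row[1]))[:k]


if __name__ == "__main__":
    rows = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
    print(nearest([1.0, 0.1], rows, k=2))
```

A linear scan like this is only workable for small collections; the point of a vector database is to index the embeddings so the scan can be avoided.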
Python SDK for PostgresML with scalable LLM embedding memory and text generation
1 project | news.ycombinator.com | 2 Jun 2023

We've been working on a Python SDK[1] for PostgresML to make it easier for application developers to get the performance and scalability benefits of integrated memory for LLMs, by combining embedding generation, vector recall and LLM tasks from HuggingFace in a single database query.
This work builds on our previous efforts that give a 10x performance improvement from generating the LLM embedding[2] from input text along with tuning vector recall[3] in a single process to avoid excessive network transit.
We'd love your feedback on our roadmap[4] for this extension, if you have other use cases for an ML application database. So far, we've implemented our best practices for scalable vector storage to provide an example reference implementation for interacting with an ML application database based on Postgres.
[1]: https://github.com/postgresml/postgresml/tree/master/pgml-sd...
[P] Python SDK for PostgresML w/ scalable LLM embedding memory and text generation
1 project | /r/MachineLearning | 2 Jun 2023

We've been working on a Python SDK for PostgresML to make it easier for application developers to get the performance and scalability benefits of integrated memory for LLMs, by combining embedding generation, vector recall and LLM tasks from HuggingFace in a single database query.
Show HN: We unified LLMs, vector memory, ranking, pruning models in one process
2 projects | news.ycombinator.com | 12 May 2023

Links:
[1]: https://huggingface.co/spaces/mteb/leaderboard
[2]: https://postgresml.org/blog/generating-llm-embeddings-with-o...
[3]: https://postgresml.org/blog/tuning-vector-recall-while-gener...
[4]: https://postgresml.org/blog/personalize-embedding-vector-sea...
Github: https://github.com/postgresml/postgresml
Personalize embedding results with application data in your database
1 project | news.ycombinator.com | 11 May 2023
[P] We've unified LLMs w/ vector memory + reranking & pruning models in a single process for better performance
1 project | /r/MachineLearning | 10 May 2023

Github: https://github.com/postgresml/postgresml
How to store hugging face model in postgreSQL
1 project | /r/LanguageTechnology | 5 Feb 2023

I'd encourage you to do inference outside of PostgreSQL (use TF serving and make requests against it, or do batch inference), but if you're determined to do so, they have an extension that integrates with the transformers library and allows for calling models directly from SQL.

oban

Posts with mentions or reviews of oban. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-16.

How to Use Flume in your Elixir Application
2 projects | dev.to | 16 Apr 2024

Oban, backed by PostgreSQL or SQLite, also provides a queue-based job processing system. Exq, on the other hand, is backed by Redis. It provides features similar to Flume, but without built-in rate limiting and batch processing capabilities.
Postgres as Queue
8 projects | news.ycombinator.com | 9 Feb 2024

In Elixir land Oban[0] uses Postgres as queue and seems to work quite well.
[0] - https://github.com/sorentwo/oban
Zero Downtime Postgres Upgrades
4 projects | news.ycombinator.com | 12 Dec 2023

I hear you on that, and can say that Postgres is incredibly capable at going beyond typical relational database workloads. One example is durable queues that are transactionally consistent with the rest of the database; they play a unique role in our architecture that would otherwise require more ceremony. More details here: https://getoban.pro
We are also working on shifting some workloads, like logging, off of Postgres and onto more appropriate systems as we scale. But we intentionally chose to minimize dependencies by pushing Postgres further to move faster, with migration plans ready as we continue to reach new levels of scale (e.g. using a dedicated log storage solution like Elasticsearch or ClickHouse).
Deno Cron
15 projects | news.ycombinator.com | 29 Nov 2023
Switching to Elixir
11 projects | news.ycombinator.com | 9 Nov 2023

You can actually have "background jobs" in very different ways in Elixir.
> I want background work to live on different compute capacity than http requests, both because they have very different resources usage
In Elixir, because of the way the BEAM works (the unit of parallelism is much cheaper and consumes little memory), "incoming http requests" and related "workers" are far less expensive than in other stacks (for instance Ruby and Python), where it is quite critical to release "http workers" and not hold the connection (which is what led to the creation of background job tools like Resque, DelayedJob, Sidekiq, Celery...).
This means that you can actually hold incoming HTTP connections a lot longer without troubles.
A consequence of this is that implementing "reverse proxies", or anything calling third party servers _right in the middle_ of your own HTTP call, is usually perfectly acceptable (something I've done more than a couple of times, the latest one powering the reverse proxy behind https://transport.data.gouv.fr - code available at https://github.com/etalab/transport-site/tree/master/apps/un...).
As a consequence, what would be a bad pattern in Python or Ruby (holding the incoming HTTP connection) is not a problem with Elixir.
> because I want to have state or queues in front of background work so there's a well-defined process for retry, error handling, and back-pressure.
Unless you deal with immediate stuff like reverse proxying or cheap "one off async tasks" (like recording a metric), there are also solutions for more "stateful" background work in Elixir.
A popular background job queue is https://github.com/sorentwo/oban (roughly similar to Sidekiq et al.), which uses Postgres.
It handles retries, errors etc.
But it's not the only solution, as you have other tools dedicated to processing, such as Broadway (https://github.com/dashbitco/broadway), which handles back-pressure, fault-tolerance, batching etc natively.
You also have simpler options, such as flow (https://github.com/dashbitco/flow), gen_stage (https://github.com/elixir-lang/gen_stage), Task.async_stream (https://hexdocs.pm/elixir/1.12/Task.html#async_stream/5) etc.
This makes it quite easy to use the "right tool for the job".
It is also interesting to note there is no need to "go evented" if you need to fetch data from multiple HTTP servers: it can happen in the exact same process (even in a background task attached to your HTTP server), as done here https://transport.data.gouv.fr/explore (if you zoom you will see vehicles moving in realtime, and ~80 data sources are being polled every 10 seconds & broadcast to the visitors via pubsub & websockets).
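For readers outside Elixir, the "poll many sources with bounded concurrency" idea from the comment above can be sketched in Python with a thread pool. This is only a loose analogue of Task.async_stream, not an equivalent: OS threads are far heavier than BEAM processes. The names `poll_all` and `fetch` are hypothetical, and `fetch` stands in for whatever HTTP call a real poller would make.

```python
from concurrent.futures import ThreadPoolExecutor


def poll_all(sources, fetch, max_concurrency=8):
    """Call fetch(source) for every source, at most max_concurrency at a time.

    Returns a dict mapping each source to ("ok", value) or ("error", exception),
    so one failing source does not prevent the others from being reported.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # Submit everything up front; the pool enforces the concurrency bound.
        futures = {source: pool.submit(fetch, source) for source in sources}
        for source, future in futures.items():
            try:
                results[source] = ("ok", future.result())
            except Exception as exc:  # isolate failures per source
                results[source] = ("error", exc)
    return results


if __name__ == "__main__":
    def fake_fetch(source):
        # Placeholder for a real HTTP request.
        if source == "bad":
            raise RuntimeError("source unavailable")
        return source.upper()

    print(poll_all(["bus", "tram", "bad"], fake_fetch, max_concurrency=2))
```

Capturing errors per source loosely mirrors the fault isolation the comment attributes to the BEAM, where a crashing task does not take its siblings down.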
Show HN: A simple API/CLI for scheduling HTTP requests
2 projects | news.ycombinator.com | 27 Sep 2023

Hi HN!
This is something I've been tinkering on for the past couple months. It's basically just an API/CLI for scheduling delayed or recurring jobs as HTTP requests.
I initially built it as a personal tool to save myself a bit of time on little side projects where I've needed scheduled/recurring alerts, but decided it could be a good opportunity to practice building out a nice landing page [0] and documentation [1]. And who knows, maybe someone else will find it useful ¯\_(ツ)_/¯
The tool relies heavily on Elixir's Oban [2] library for managing jobs, and Mintlify [3] for documentation. I also shamelessly stole most of the frontend design from Resend [4] because I'm a fan of the aesthetic and thought it would be good for my design chops to use their design as a guide. I also discovered Radix [5] UI while working on this, which ended up being immensely helpful for moving quickly on the frontend.
Anyways, I almost certainly spent a bit too much time on small UX details that are most likely utterly inconsequential, but it was a fun exercise in polish :)
All feedback is welcome!
[0] https://www.booper.dev/
[1] https://docs.booper.dev/
[2] https://github.com/sorentwo/oban
[3] https://mintlify.com/
[4] https://resend.com/
[5] https://www.radix-ui.com/
Choose Postgres Queue Technology
17 projects | news.ycombinator.com | 24 Sep 2023
Pg_later: Asynchronous Queries for Postgres
4 projects | news.ycombinator.com | 18 Aug 2023

Idk about pgagent but any table is a resilient queue with the multiple locks available in pg along with some SELECT pg_advisory_lock or SELECT FOR UPDATE queries, and/or LISTEN/NOTIFY.
Several bg job libs are built around native locking functionality
> Relies upon Postgres integrity, session-level Advisory Locks to provide run-once safety and stay within the limits of schema.rb, and LISTEN/NOTIFY to reduce queuing latency.
https://github.com/bensheldon/good_job
> |> lock("FOR UPDATE SKIP LOCKED")
https://github.com/sorentwo/oban/blob/8acfe4dcfb3e55bbf233aa...
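The "any table is a queue" idea above can be sketched without a Postgres server. The example below uses Python's built-in sqlite3 module and swaps in a different technique: SQLite has no `FOR UPDATE SKIP LOCKED`, so the claim step takes the database write lock with `BEGIN IMMEDIATE` instead. The table layout, state names, and function names are hypothetical and are not Oban's or good_job's schema.

```python
import sqlite3


def make_queue(conn):
    """Create a minimal jobs table (hypothetical schema, for illustration only)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id INTEGER PRIMARY KEY, "
        "payload TEXT NOT NULL, "
        "state TEXT NOT NULL DEFAULT 'available')"
    )


def enqueue(conn, payload):
    """Insert a job and return its id."""
    cur = conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))
    return cur.lastrowid


def claim_next(conn):
    """Atomically mark the oldest available job as executing and return it.

    BEGIN IMMEDIATE acquires the write lock before the SELECT, so two workers
    cannot claim the same row. In Postgres the equivalent step would typically
    be SELECT ... FOR UPDATE SKIP LOCKED, which locks only the claimed row.
    """
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT id, payload FROM jobs "
            "WHERE state = 'available' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE jobs SET state = 'executing' WHERE id = ?", (row[0],)
            )
        conn.execute("COMMIT")
        return row
    except Exception:
        conn.execute("ROLLBACK")
        raise


if __name__ == "__main__":
    # isolation_level=None lets the code manage transactions explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    make_queue(conn)
    enqueue(conn, "send-email")
    enqueue(conn, "resize-image")
    print(claim_next(conn))
    print(claim_next(conn))
    print(claim_next(conn))
```

The trade-off is concurrency: a database-wide write lock serializes all workers, whereas row-level locking with `SKIP LOCKED` lets many workers claim different jobs at once.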
Keep the Monolith, but Split the Workloads
8 projects | news.ycombinator.com | 24 Apr 2023

> Bad code in a specific part of the codebase bringing down the whole app, as in our November incident.
This is a non-issue if you're using an Elixir/Erlang monolith, given its fault-tolerant nature.
The noisy neighbour issue (resource hogging) is still something you need to manage though. If you use something like Oban[1] (for background job queues and cron jobs), you can set both local and global limits. Local being the current node, and global the cluster.
Operating in a shared cluster (vs split workload deployments) gives you the benefit of being much more efficient with your hardware. I've heard many stories of massive infra savings due to moving to an Elixir/Erlang system.
1. https://github.com/sorentwo/oban
Library for reliably running jobs
2 projects | /r/elixir | 23 Apr 2023

What are some alternatives?

When comparing postgresml and oban you can also consider the following projects:

MindsDB - The platform for customizing AI from enterprise data

broadway - Concurrent and multi-stage data ingestion and data processing with Elixir

Postico - Public issue tracking for Postico

exq - Job processing library for Elixir - compatible with Resque / Sidekiq

Activeloop Hub - Data Lake for Deep Learning. Build, manage, query, version, & visualize datasets. Stream data real-time to PyTorch/TensorFlow. https://activeloop.ai [Moved to: https://github.com/activeloopai/deeplake]

Rihanna - Rihanna is a high performance postgres-backed job queue for Elixir

deepchecks - Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling you to thoroughly test your data and models from research to production.

kafka_ex - Kafka client library for Elixir

dskueb

verk - A job processing system that just verks! 🧛‍

metaflow - :rocket: Build and manage real-life ML, AI, and data science projects with ease!

honeydew - Job Queue for Elixir. Clustered or Local. Straight BEAM. Optional Ecto. 💪🍈
