Ray vs Faust

Ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. (by ray-project)

ray.io

Faust

Python Stream Processing (by robinhood)

Concurrency and Parallelism, kafka-streams, Kafka, Python, Asyncio, Distributed Systems, Stream Processing

Project	Ray	Faust
Mentions	42	8
Stars	30,879	6,668
Growth	2.8%	0.0%
Activity	10.0	1.4
Latest Commit	5 days ago	5 months ago
Language	Python	Python
License	Apache License 2.0	BSD 3-clause "New" or "Revised" License

Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
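
The growth and activity definitions above can be sketched in a few lines of Python. This is only an illustration of the stated ideas (month-over-month star growth, recent commits weighing more, and a score that places a project in a percentile band), not the formula this site actually uses. The function names, the exponential decay, the 30-day half-life, and the sample numbers are all assumptions made for the example.

```python
from datetime import date


def monthly_growth(stars_now: int, stars_month_ago: int) -> float:
    """Month-over-month star growth, as a percentage."""
    if stars_month_ago <= 0:
        raise ValueError("stars_month_ago must be positive")
    return 100.0 * (stars_now - stars_month_ago) / stars_month_ago


def weighted_commit_score(commit_dates, today, half_life_days=30.0):
    """Sum of per-commit weights; a commit's weight halves every half_life_days.

    The exponential decay and the 30-day half-life are assumptions.
    """
    return sum(0.5 ** ((today - d).days / half_life_days) for d in commit_dates)


def activity_index(score, all_scores):
    """Scale a raw score to 0-10 by the share of tracked projects scoring lower."""
    lower = sum(1 for s in all_scores if s < score)
    return round(10.0 * lower / len(all_scores), 1)


if __name__ == "__main__":
    today = date(2024, 1, 31)  # illustrative date, not taken from the page
    # Illustrative star counts, not real project data.
    print(monthly_growth(1028, 1000))
    # One commit today and one commit 30 days ago.
    print(weighted_commit_score([date(2024, 1, 31), date(2024, 1, 1)], today))
    # Ten hypothetical projects with raw scores 1..10; rank the highest one.
    print(activity_index(10, list(range(1, 11))))
```

Under this sketch, a project that scores higher than 90% of the tracked projects gets an index of 9.0, which matches the "top 10%" reading given above.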

Ray

Posts with mentions or reviews of Ray. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-01-05.

Open Source Advent Fun Wraps Up!
10 projects | dev.to | 5 Jan 2024

22. Ray | Github | tutorial
TransformerXL + PPO Baseline + MemoryGym
10 projects | /r/reinforcementlearning | 15 Feb 2023

RLlib
Elixir Livebook now as a desktop app
12 projects | news.ycombinator.com | 2 Aug 2022

I've wondered whether it's easier to add to Elixir the data-analysis tooling that Python has, or to add to Python the features that Erlang (and by extension Elixir) provides out of the box.
From what I can see, if you want multiprocessing in Python in an easier way (let's say running async), you have to use something like ray core[0], then if you want multiple machines you need redis(?). Elixir/Erlang supports this out of the box.
Explorer[1] is an interesting approach: it uses Rust via Rustler (an Elixir library for calling Rust code) and uses Polars as its dataframe library. I think Rustler needs to be reworked for this use case, as it can be slow to return data. I made initial improvements which drastically improve encoding (https://github.com/elixir-nx/explorer/pull/282 and https://github.com/elixir-nx/explorer/pull/286; TL;DR: 20+ seconds down to 3).
[0] https://github.com/ray-project/ray
preprocessing millions of records - how to speed up the processing
2 projects | /r/datascience | 3 Jun 2022

Dask, Ray (ray.io), or PySpark (if you have a cluster)
3% of 666 Python codebases we checked had a silently failing unit test
20 projects | /r/Python | 15 Feb 2022

https://github.com/ansible-community/ara/pull/358 https://github.com/b12io/orchestra/pull/830 https://github.com/batiste/django-page-cms/pull/210 https://github.com/carpentries/amy/pull/2130 https://github.com/celery/django-celery/pull/612 https://github.com/django-cms/django-cms/pull/7241 https://github.com/django-oscar/django-oscar/pull/3867 https://github.com/esrg-knights/Squire/pull/253 https://github.com/Frojd/django-react-templatetags/pull/64 https://github.com/groveco/django-sql-explorer/pull/474 https://github.com/jazzband/django-silk/pull/550 https://github.com/keras-team/keras/pull/16073 https://github.com/ministryofjustice/cla_backend/pull/773 https://github.com/nitely/Spirit/pull/306 https://github.com/python/pythondotorg/pull/1987 https://github.com/rapidpro/rapidpro/pull/1610 https://github.com/ray-project/ray/pull/22396 https://github.com/saltstack/salt/pull/61647 https://github.com/Swiss-Polar-Institute/project-application/pull/483 https://github.com/UEWBot/dipvis/pull/216
Rust OpenCV - Simple Guide
3 projects | /r/rust | 14 Feb 2022

I'd really like to use Rust+OpenCV instead of Python+OpenCV to process a lot of images (xxxxxx pieces on a central NAS). I would also want to split the work over multiple worker nodes for speed. Unfortunately, I've so far not had the time to figure this out... Meanwhile, a Rust API for Ray is being worked on! https://github.com/ray-project/ray/issues/20609
Blazer - HPC python library for MPI workflows
2 projects | /r/HPC | 10 Feb 2022

ray.io doesn't support MPI natively, and thus is not "supercomputer" friendly. Blazer runs on MPI, which runs across the NUMA (non-uniform memory access) setup of a supercomputer. The compute interconnect is hundreds of times faster than the network remoting that ray.io uses.
JORLDY: OpenSource Reinforcement Learning Framework
2 projects | /r/reinforcementlearning | 8 Nov 2021

Distributed RL algorithms are provided using ray
Python stands to lose its GIL, and gain a lot of speed
5 projects | /r/programming | 20 Oct 2021

I had a similar use case and ended up using ray. https://github.com/ray-project/ray
How to deploy a rllib-trained model?
3 projects | /r/reinforcementlearning | 16 Oct 2021

Currently, rllib's "--export-formats" does nothing; I have folders of checkpoints, but no models. Looks like currently the internal export_model function isn't implemented: https://github.com/ray-project/ray/issues/19021

Faust

Posts with mentions or reviews of Faust. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-12-07.

Faust VS quix-streams - a user suggested alternative
2 projects | 7 Dec 2023
Kafka ETL tool, is there any?
4 projects | /r/apachekafka | 14 Feb 2023

If you really want a "modern" language (I assume you just want Python based on your other comments), there's Robinhood's Faust, though it's been deprecated for a while. It'll still probably do what you want given your criteria, but it's not really suitable for long-term use given it hasn't been updated since October 2020.
How to join using Faust Streaming (Python implementation of Kafka Streams API)?
2 projects | /r/apachekafka | 31 Jan 2023
Using Kafka with Python... is Confluent the only option?
3 projects | /r/apachekafka | 8 May 2022

Unfortunately Faust is dead. Robinhood abandoned it in 2020; there are no new commits and they don't respond to any questions, etc.: https://github.com/robinhood/faust
Why did Robinhood abandon Faust?
2 projects | /r/dataengineering | 25 Apr 2022
Event-Driven Architectures with Kafka and Python
2 projects | /r/apachekafka | 25 Oct 2021

The best bet seemed to be the open source Faust (from Robinhood, with Celery in its lineage). It was too heavy-duty for our simple needs, and moreover it unfortunately seems abandoned. There is a community fork that was in a bit of disarray last I looked (tests failed, etc.).

What are some alternatives?

When comparing Ray and Faust you can also consider the following projects:

optuna - A hyperparameter optimization framework

stable-baselines3 - PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

gevent - Coroutine-based concurrency library for Python

stable-baselines - A fork of OpenAI Baselines, implementations of reinforcement learning algorithms

SCOOP - Scalable COncurrent Operations in Python

Thespian Actor Library - Python Actor concurrency library

Dask - Parallel computing with task scheduling

Wallaroo - Distributed Stream Processing

django-celery - Old Celery integration project for Django

vibora - Fast, asynchronous and elegant Python web framework.
