opteryx vs datafusion-python

opteryx

🦖 A SQL-on-everything Query Engine you can execute over multiple databases and file formats. Query your data, where it lives. (by mabel-dev)

Source Code

opteryx.dev

datafusion-python

Apache DataFusion Python Bindings (by apache)

Source Code

datafusion.apache.org

Project          opteryx               datafusion-python
Mentions         1                     2
Stars            43                    296
Growth           -                     5.7%
Activity         9.8                   8.4
Latest Commit    6 days ago            3 days ago
Language         Python                Rust
License          Apache License 2.0    Apache License 2.0

Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

opteryx

Posts with mentions or reviews of opteryx. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-12-30.

Pure Python Distributed SQL Engine
9 projects | news.ycombinator.com | 30 Dec 2022

Thanks for sharing.
I have a SQL Engine in Python too (https://github.com/mabel-dev/opteryx). I focused my initial effort on supporting SQL statements and making the usage feel like a database. That probably reflects the problem I had in front of me when I set out: only handling handfuls of gigabytes in a batch environment for ETLs with a group of new-to-data-engineering engineers. I have recently started looking more at real-time performance, such as distributing work. I am interested in how you've approached it.

datafusion-python

Posts with mentions or reviews of datafusion-python. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-12-30.

Pure Python Distributed SQL Engine
9 projects | news.ycombinator.com | 30 Dec 2022

Hmm, I wasn't aware of https://github.com/apache/arrow-datafusion-python... thanks for the pointer.
Time series: target release by April this year. The main challenge is supporting them in the SQL API; execution engine support is already done.

What are some alternatives?

When comparing opteryx and datafusion-python you can also consider the following projects:

quokka - Making data lake work for time series

sqlglot - Python SQL Parser and Transpiler

nomad - Deprecated and re-branded as Alto

pg8000 - A Pure-Python PostgreSQL Driver

influxdb3-python - Python module that provides a simple and convenient way to interact with InfluxDB 3.0.

datafusion-ballista - Apache Arrow Ballista Distributed Query Engine

sqlparser-rs - Extensible SQL Lexer and Parser for Rust

emr-serverless-samples - Example code for running Spark and Hive jobs on EMR Serverless.
