Optimus
fugue
Metric | Optimus | fugue
---|---|---
Mentions | 0 | 11
Stars | 1,434 | 1,853
Growth | 1.0% | 1.5%
Activity | 1.9 | 6.7
Last commit | 10 days ago | 10 days ago
Language | Python | Python
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
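The growth figure can be read as a simple percentage change. The sketch below assumes growth is the change in star count relative to the previous month's count; the page does not state its exact formula, and the function name and the star counts are hypothetical, chosen only for illustration.

```python
def month_over_month_growth(previous_stars: int, current_stars: int) -> float:
    """Return the percentage change in stars from one month to the next.

    Assumption: growth is measured relative to the previous month's count.
    """
    if previous_stars <= 0:
        raise ValueError("previous_stars must be positive")
    return (current_stars - previous_stars) * 100.0 / previous_stars


# Hypothetical star counts, not taken from either project.
growth = month_over_month_growth(previous_stars=1000, current_stars=1015)
print(f"{growth:.1f}%")  # prints 1.5%
```

A project that loses stars gets a negative value under this definition, which is why the function does not clamp the result at zero.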
Optimus
We haven't tracked posts mentioning Optimus yet.
Tracking mentions began in Dec 2020.
fugue
- FLaNK Stack Weekly 22 January 2024
- Daft: A High-Performance Distributed Dataframe Library for Multimodal Data
Please integrate it with Fugue.
- Ask HN: How do you test SQL?
- Replacing Pandas with Polars. A Practical Guide
Fugue is an interesting library in this space, though I haven’t tried it
https://github.com/fugue-project/fugue
A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark, Dask and Ray without any rewrites.
- The hand-picked selection of the best Python libraries and tools of 2022
fugue — distributed computing done easy
- [P] Open data transformations in Python, no SQL required
This looks similar to fugue, am I right? How do they compare?
- Pyspark now provides a native Pandas API
There's dask-sql, but I think it is being abandoned for fugue-project. I'm actually excited for this project as it is trying to provide a backend agnostic solution, which would seem like a difficult, lofty goal. I wish them luck.
What are some alternatives?
AWS Data Wrangler - pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
modin - Modin: Scale your Pandas workflows by changing a single line of code
sweetviz - Visualize and compare datasets, target values and associations, with one line of code.
ga-extractor - Tool for extracting Google Analytics data suitable for migrating to other platforms/databases
data-science-ipython-notebooks - Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
zef - Toolkit for graph-relational data across space and time
anovos - Anovos - An Open Source Library for Scalable feature engineering Using Apache-Spark
flashtext - Extract Keywords from sentence or Replace keywords in sentences.
ydata-profiling - 1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
mlToolKits - learningOrchestra is a distributed Machine Learning integration tool that facilitates and streamlines iterative processes in a Data Science project.
ploomber - The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️
ormolu - A formatter for Haskell source code