[P] Open data transformations in Python, no SQL required

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning.

  1. RasgoQL

    Write Python locally, execute SQL in your data warehouse

    You can check it out here: https://github.com/rasgointelligence/RasgoQL

  2. fugue

    A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.

    This looks similar to fugue, am I right? How do they compare?

  3. ploomber

    The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️

    Yeah, I fully agree with you. SQL has many disadvantages (although many good things as well). In any case, I'm not advocating for SQL, it's just what I've seen recently. I'm a Python fan building tools for data analysis in Python so I hope this SQL trend doesn't go too far as in "let's only do SQL" :)
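
The "write Python locally, execute SQL in your data warehouse" idea from the first project can be illustrated with a rough sketch. This is not RasgoQL's API: the `Query` class, its methods, and the `orders` table are hypothetical names made up for illustration, and an in-memory SQLite database stands in for a real warehouse. The point is only that transformations are chained in Python while the engine receives a single SQL statement.

```python
import sqlite3


class Query:
    """Hypothetical builder: records transforms in Python, renders one SQL string."""

    def __init__(self, table, columns=("*",), filters=()):
        self.table = table
        self.columns = tuple(columns)
        self.filters = tuple(filters)

    def select(self, *columns):
        # Return a new Query so each step leaves the previous one unchanged.
        return Query(self.table, columns, self.filters)

    def where(self, condition):
        # Conditions are raw SQL fragments; fine for a sketch, unsafe for untrusted input.
        return Query(self.table, self.columns, self.filters + (condition,))

    def to_sql(self):
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.filters:
            sql += " WHERE " + " AND ".join(f"({c})" for c in self.filters)
        return sql


def run(conn, query):
    # The "warehouse" only ever sees the rendered SQL text.
    return conn.execute(query.to_sql()).fetchall()


# In-memory SQLite stands in for the data warehouse (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "EU", 120.0), (2, "US", 80.0), (3, "EU", 40.0)],
)

q = Query("orders").select("id", "amount").where("region = 'EU'").where("amount > 50")
sql_text = q.to_sql()
rows = run(conn, q)
```

Here `sql_text` holds the generated statement and `rows` holds whatever the database returns for it. Real tools in this space add much more (dialect handling, joins, lazy execution), so treat this only as a picture of the general approach.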

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.


