[P] Open data transformations in Python, no SQL required

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning.

  • RasgoQL

    Write Python locally, execute SQL in your data warehouse

    You can check it out here: https://github.com/rasgointelligence/RasgoQL

  • fugue

    A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.

    This looks similar to fugue, am I right? How do they compare?

  • ploomber

    The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️

    Yeah, I fully agree with you. SQL has many disadvantages (although many good things as well). In any case, I'm not advocating for SQL, it's just what I've seen recently. I'm a Python fan building tools for data analysis in Python so I hope this SQL trend doesn't go too far as in "let's only do SQL" :)
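
The "write Python locally, execute SQL in your data warehouse" idea from the list above can be illustrated with a small sketch. This is not RasgoQL's API: the `Query` class and its `select`, `where`, and `to_sql` methods are hypothetical names invented for illustration, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3


class Query:
    """Hypothetical, minimal query builder: Python calls in, one SQL string out."""

    def __init__(self, table):
        self.table = table
        self.columns = ["*"]
        self.conditions = []

    def select(self, *columns):
        self.columns = list(columns)
        return self

    def where(self, condition):
        # Conditions are plain SQL fragments; a real tool would validate
        # or parameterize them instead of interpolating strings.
        self.conditions.append(condition)
        return self

    def to_sql(self):
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(self.conditions)
        return sql


# In-memory SQLite stands in for the data warehouse (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "EU", 120.0), (2, "US", 80.0), (3, "EU", 40.0)],
)

# The transformation is written as chained Python calls...
query = Query("orders").select("id", "amount").where("region = 'EU'").where("amount > 50")
sql = query.to_sql()

# ...but the work itself runs as SQL inside the database.
rows = conn.execute(sql).fetchall()
print(sql)
print(rows)
```

The design point is that no data is pulled into Python until the final fetch: the Python layer only assembles the SQL statement, and the database does the filtering.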

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
