ibis
PySpark-Boilerplate
Our great sponsors
ibis | PySpark-Boilerplate | |
---|---|---|
23 | 1 | |
4,074 | 390 | |
7.9% | - | |
10.0 | 2.5 | |
6 days ago | 3 months ago | |
Python | Python | |
Apache License 2.0 | - |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
ibis
-
Show HN: Hashquery, a Python library for defining reusable analysis
I really don't understand the appeal of dbt vs a proper programming language. The templating approach leads to massive spaghetti. I look forward to trying out something like Ibis [0]
0: https://ibis-project.org/
-
This Week In Python
ibis – portable Python dataframe library
- Ibis: The portable Python dataframe library
- FLaNK Stack 26 February 2024
-
Quarto
The main benefit is that you get a Python (or R, Julia or Rust) interpreter. So you can evaluate code. A good example of the value of this is the Ibis docs which use Quarto: https://ibis-project.org/
-
Polars – A bird's eye view of Polars
Ive found polars quite intuitive, though for python, I lean more towards [ibis](https://ibis-project.org/). The interface is nearly identical, but ibis has the benefit if building sql queries before pulling any actual data (like dbplyr) — whereas polars requires the data to be in-memory (at least for rdb’s, though correct me if Im wrong)
this to me seems like a good argument for only using ibis, but Im happy to be convinced otherwise
- Ibis – Universal Interface for Data Wrangling
-
Vanna.ai: Chat with your SQL database
Please add Ibis Birdbrain https://ibis-project.github.io/ibis-birdbrain/ to the list. Birdbrain is an AI-powered data bot, built on Ibis and Marvin, supporting more than 18 database backends.
See https://github.com/ibis-project/ibis and https://ibis-project.org for more details.
- Ibis
PySpark-Boilerplate
What are some alternatives?
snowflake-connector-python - Snowflake Connector for Python
Optimus - :truck: Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Apache Impala - Apache Impala
cookiecutter-django - Cookiecutter Django is a framework for jumpstarting production-ready Django projects quickly.
pangres - SQL upsert using pandas DataFrames for PostgreSQL, SQlite and MySQL with extra features
etl-markup-toolkit - ETL Markup Toolkit is a spark-native tool for expressing ETL transformations as configuration
sqlite_scanner - DuckDB extension to read and write to SQLite databases
soda-spark - Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes
katacoda
Traffic-Data-Analysis-with-Apache-Spark-Based-on-Mobile-Robot-Data - Mobile robot data were analyzed with Apache-Spark to extract five different statistical result such as travel time, waiting time, average speed, occupancy and density were produced.
nodejs-polars - nodejs front-end of polars
tdigest - t-Digest data structure in Python. Useful for percentiles and quantiles, including distributed enviroments like PySpark