-
I've built popular PySpark (quinn, chispa) and Scala Spark (spark-daria, spark-fast-tests) libraries.
-
spark-fast-tests
Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit)
-
If you are interested in using/learning Python, SQL and data warehouse skills, take a look at https://github.com/sodadata/soda-sql
-
ballista
Distributed compute platform implemented in Rust and powered by Apache Arrow (discontinued).
His newer project, Ballista, was also donated to Apache Arrow. I hope to get the Rust skills to collaborate with him on open source work someday too. He's also doing really cool work on spark-rapids FYI.
-
airbyte
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Airbyte and Singer/Meltano if you want to learn more about ingestion pipelines. The Airbyte and Meltano teams are very welcoming. SQLFluff is a shiny SQL linter, a beautiful project with awesome maintainers.
-
sqlfluff
A modular SQL linter and auto-formatter with support for multiple dialects and templated code.
-
DataGristle by u/kenfar who influenced many of us in this sub.
-
Metabase
The easy-to-use open source Business Intelligence and Embedded Analytics tool that lets everyone work with data
If you want to work more on the visualization side, maybe Metabase, Superset and Streamlit.
-
Skytrax-Data-Warehouse
A full data warehouse infrastructure with ETL pipelines running inside Docker, using Apache Airflow for data orchestration, AWS Redshift as the cloud data warehouse, and Metabase for data visualizations such as analytical dashboards (discontinued).
I'm always open to contributions to my project (Skytrax Data Warehouse). If you are into data stuff, support my work on YouTube as well (One Developer Pirate); I mostly make data-oriented videos. These days I'm making a SQL course from a data analysis perspective that is expected to be released next week.
-
Prefect! Specifically the Task Library: https://github.com/PrefectHQ/prefect
-
It's a near crime that Dagster hasn't been mentioned already.