beginner_de_project
Skytrax-Data-Warehouse
 | beginner_de_project | Skytrax-Data-Warehouse
---|---|---
Mentions | 1 | 1
Stars | 389 | 131
Growth | - | -
Activity | 2.8 | 0.0
Last commit | about 1 month ago | about 4 years ago
Language | HCL | Python
License | MIT License | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
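The site does not publish the exact formulas behind these metrics, so the following is only a minimal sketch of how such numbers could be computed. The function names, the 30-day half-life, and the percentile mapping are all assumptions made for illustration, not the tracker's actual method. It shows month-over-month growth as a percentage, a commit score where recent commits weigh more than older ones, and a relative 0-10 scale on which 9.0 corresponds to the top 10% of tracked projects.

```python
from datetime import date


def month_over_month_growth(stars_last_month, stars_now):
    """Percentage change in stars between two consecutive months."""
    return 100.0 * (stars_now - stars_last_month) / stars_last_month


def recency_weighted_activity(commit_dates, today, half_life_days=30.0):
    """Raw activity: each commit's weight halves every `half_life_days`.

    The exponential decay and the 30-day half-life are assumptions.
    """
    return sum(
        0.5 ** ((today - d).days / half_life_days) for d in commit_dates
    )


def relative_score(value, all_values):
    """Map a raw value to a 0-10 scale (assumed mapping).

    The score is ten times the fraction of tracked projects whose raw
    value is strictly lower, so 9.0 means 90% of projects rank below.
    """
    below = sum(1 for v in all_values if v < value)
    return round(10.0 * below / len(all_values), 1)


if __name__ == "__main__":
    today = date(2024, 1, 31)
    commits = [date(2024, 1, 31), date(2024, 1, 1), date(2023, 12, 2)]
    print(recency_weighted_activity(commits, today))
    print(relative_score(10, list(range(1, 11))))
    print(month_over_month_growth(100, 110))
```

Under these assumptions, a project with no recent commits drifts toward a raw activity of zero, which is in line with the 0.0 shown for a repository last updated about 4 years ago.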
beginner_de_project
-
Data Engineering project for beginners V2
Repo: https://github.com/josephmachado/beginner_de_project
Skytrax-Data-Warehouse
-
Open source contributions for a Data Engineer?
I am always open to accepting contributions to my project (Skytrax Data Warehouse). If you are into data, please support my work on YouTube as well (One Developer Pirate); I mostly make data-oriented videos. These days I'm making a SQL course from a data analysis perspective, which is expected to be released next week.
What are some alternatives?
docker-airflow - Docker Apache Airflow
dbd - dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.
AWS Data Wrangler - pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
sqlfluff - A modular SQL linter and auto-formatter with support for multiple dialects and templated code.
sec-airflow-ingester - Use Airflow to pull in remote data via API, pub/sub, kinesis, s3 etc. and then store it in s3 for later consumption by other services.
jaydebeapi - JayDeBeApi module allows you to connect from Python code to databases using Java JDBC. It provides a Python DB-API v2.0 to that database.
dbt-spotify-analytics - Containerized end-to-end analytics of Spotify data using Python, dbt, Postgres, and Metabase
airflow-api-tests - A collection of pytest tests for the stable REST API of Apache Airflow 2.0. I have another repo where you can set up Airflow locally and play around with these. I am used to RestAssured, but I am trying out pytest here.
dagster - An orchestration platform for the development, production, and observation of data assets.
DataGristle - Tough and flexible tools for data analysis, transformation, validation and movement.
airbyte - The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.