Top 16 Python ELT Projects
-
airbyte
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
-
dbt-core
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
-
Mage
🧙 The modern replacement for Airflow. Mage is an open-source data pipeline tool for transforming and integrating data. https://github.com/mage-ai/mage-ai
-
meltano
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
-
astro-sdk
Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
-
reddit-detective
Play detective on Reddit: Discover political disinformation campaigns, secret influencers and more
-
sayn
Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).
-
Meltano Singer SDK
Write 70% less code by using the SDK to build custom extractors and loaders that adhere to the Singer standard: https://sdk.meltano.com (by meltano)
-
dbd
dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.
Level 1 of MLOps is when you've put each lifecycle stage and its interfaces into an automated pipeline. The pipeline could be a Python or bash script, or it could be a directed acyclic graph run by an orchestration framework such as Airflow, Dagster, or one of the cloud-provider offerings. AI- or data-specific platforms like MLflow, ClearML, and DVC also feature pipeline capabilities.
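As a rough illustration of the idea above, here is a minimal sketch of a pipeline written as a plain Python script, with the stages run in dependency order like a small directed acyclic graph. The stage names (`ingest`, `prepare`, `train`, `evaluate`), the toy data, and the stand-in "model" are all hypothetical; a real orchestrator such as Airflow or Dagster adds scheduling, retries, and monitoring that this sketch leaves out.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


# Hypothetical lifecycle stages; each one reads and updates a shared context dict.
def ingest(ctx):
    ctx["raw"] = [1.0, 2.0, 3.0, 4.0]


def prepare(ctx):
    # Scale the raw values by their maximum.
    peak = max(ctx["raw"])
    ctx["features"] = [x / peak for x in ctx["raw"]]


def train(ctx):
    # Stand-in "model": the mean of the prepared features.
    ctx["model"] = sum(ctx["features"]) / len(ctx["features"])


def evaluate(ctx):
    # Largest absolute deviation of any feature from the stand-in model.
    ctx["error"] = max(abs(x - ctx["model"]) for x in ctx["features"])


STAGES = {"ingest": ingest, "prepare": prepare, "train": train, "evaluate": evaluate}

# Each stage maps to the set of stages that must finish before it runs.
DEPENDENCIES = {
    "ingest": set(),
    "prepare": {"ingest"},
    "train": {"prepare"},
    "evaluate": {"prepare", "train"},
}


def run_pipeline():
    """Run every stage once, in an order that respects DEPENDENCIES."""
    ctx = {}
    order = list(TopologicalSorter(DEPENDENCIES).static_order())
    for name in order:
        STAGES[name](ctx)
    return order, ctx


if __name__ == "__main__":
    order, ctx = run_pipeline()
    print("ran stages:", order)
    print("evaluation error:", ctx["error"])
```

Keeping the dependency map separate from the stage functions is one possible design choice: it makes the graph easy to inspect and is close to how orchestration frameworks describe pipelines.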
Project mention: How to Build a Chat App with Your Postgres Data using Agent Cloud | dev.to | 2024-05-13
AgentCloud uses Airbyte to build data pipelines, which allow us to split, chunk, and embed data from over 300 data sources, including Postgres.
Project mention: Show HN: Automatically extract data from APIs with dlt and OpenAPI | news.ycombinator.com | 2024-05-29
- You always have the last say. The generated code is declarative and ready to hack in case we pick the wrong paginator or response entity.
The tool and dlt are open source, find the code here: https://github.com/dlt-hub/dlt-init-openapi and here: https://github.com/dlt-hub/dlt
Project mention: meltano VS cloudquery - a user suggested alternative | libhunt.com/r/meltano | 2023-06-02
Project mention: Launch HN: Serra (YC S23) – Open-source, Python-based dbt alternative | news.ycombinator.com | 2023-08-14
There is also sqlmesh (https://sqlmesh.com/). Pretty new as well. It introduces some interesting concepts. For smaller dbt projects it could be a drop-in replacement as it allows importing dbt projects.
Project mention: Orchestration: Thoughts on Dagster, Airflow and Prefect? | /r/dataengineering | 2023-06-01
Have you tried the Astro SDK? https://github.com/astronomer/astro-sdk
Hi everyone, this is my first DE project: Baitur5/reddit_api_elt (github.com). It is a data pipeline that extracts Reddit data for a Google Data Studio report, focusing on a specific subreddit. Can you check it out and give some advice and tips on how to improve it, or on what I should add next?
Python ELT related posts
-
Show HN: Meltano Cloud (Gitlab spinout) – Managed infra for open source ELT
-
DBT lays off 15% of their staff
-
SQL Mesh - Auto DAG generation!!
-
Data transformation tools other than DBT
-
Semantic Understanding of SQL
-
Virtual Data Environments
-
Index
What are some of the best open-source ELT projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | Airflow | 34,877 |
2 | airbyte | 14,379 |
3 | dbt-core | 9,025 |
4 | Mage | 7,202 |
5 | dlt | 1,837 |
6 | meltano | 1,626 |
7 | sqlmesh | 1,383 |
8 | dbt-metabase | 432 |
9 | versatile-data-kit | 412 |
10 | astro-sdk | 324 |
11 | dbt-coves | 216 |
12 | reddit-detective | 207 |
13 | sayn | 117 |
14 | Meltano Singer SDK | 86 |
15 | dbd | 56 |
16 | reddit_api_elt | 2 |