Python etl-pipeline

Open-source Python projects categorized as etl-pipeline

Top 12 Python etl-pipeline Projects

  • pyspark-example-project

    Implementing best practices for PySpark ETL jobs and applications.

  • Udacity-Data-Engineering-Projects

    A few projects related to data engineering, including data modeling, cloud infrastructure setup, data warehousing, and data lake development.

  • Project mention: Pitanje za data engineering? (Question about data engineering?) | /r/programiranje | 2023-06-30
  • patterns-devkit

    Data pipelines from re-usable components

  • unstract

    No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents

  • Project mention: Ask HN: Is RAG the Future of LLMs? | news.ycombinator.com | 2024-04-14

    Fast-changing libraries are a huge pain. That's why a no-code approach like Unstract (https://github.com/zipstack/unstract) makes sense.

  • prism

    Prism is the easiest way to develop, orchestrate, and execute data pipelines in Python. (by runprism)

  • Project mention: Prism: the easiest way to create robust data workflows. Accessible via CLI | /r/coolgithubprojects | 2023-09-21
  • bitcoinMonitor

    Near real time ETL to populate a dashboard.

  • Project mention: Best place to learn APIS? | /r/dataengineering | 2023-06-26

    I have some sample code here that pulls data from an API and loads it into a DB, scheduled by cron, which can help with some ideas.

  • Spooq

  • dados-censup

    Automates the ingestion of data published by INEP on the Brazilian higher education census.

  • pipeline-docs-data-extractor

    ETL-Texts aims to be a simple and efficient pipeline designed for extracting, translating, cleaning, and transforming text files.

  • Project mention: ETL Texts | news.ycombinator.com | 2024-01-14
  • workshop-realtime-data-pipelines

    You will inspect and run a sample architecture making use of Apache Pulsar™ and Pulsar Functions for real-time, event-streaming-based data ingestion, cleaning and processing.

  • ticker_selection_BI_dashboard

    Data engineering project: extracts stock data for 4 shares and uploads it to MySQL for use in a BI project.

  • reddit_api_elt

  • Project mention: Reddit ELT Pipeline | /r/dataengineering | 2023-12-11

    Hi everyone, this is my first DE project: Baitur5/reddit_api_elt (github.com). It is basically a data pipeline that extracts Reddit data for a Google Data Studio report, focusing on a specific subreddit. Can you guys check it out and give some advice and tips on how to improve it, or on what I should add next?
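
The bitcoinMonitor mention above describes a job that pulls data from an API, loads it into a DB, and runs on a cron schedule. The following is a minimal sketch of that general pattern, not the code from that repository. The endpoint URL, the `prices` table, and the `id`/`symbol`/`price` fields are all assumptions made for illustration, and SQLite stands in for whatever database a real job would use.

```python
import json
import sqlite3
import urllib.request


def fetch_records(url, timeout=10):
    """Download a JSON array of objects from `url` (a hypothetical endpoint)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def transform(records):
    """Keep only the assumed fields and skip records that are malformed."""
    rows = []
    for rec in records:
        try:
            rows.append((str(rec["id"]), str(rec["symbol"]), float(rec["price"])))
        except (KeyError, TypeError, ValueError):
            continue  # drop the bad record rather than failing the whole run
    return rows


def load(conn, rows):
    """Upsert rows so that re-running the job does not create duplicates."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS prices ("
        "id TEXT PRIMARY KEY, symbol TEXT NOT NULL, price REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO prices (id, symbol, price) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


def run(url, db_path):
    """One extract-transform-load pass; intended to be called once per cron tick."""
    conn = sqlite3.connect(db_path)
    try:
        return load(conn, transform(fetch_records(url)))
    finally:
        conn.close()
```

Keying the table on `id` and using `INSERT OR REPLACE` makes each run idempotent, which matters when a scheduler may re-run or overlap jobs. A script that calls `run(...)` could then be scheduled with a crontab entry such as `*/5 * * * * /usr/bin/python3 /path/to/job.py`, where the path and the five-minute interval are placeholders.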

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source etl-pipeline projects in Python? This list will help you:

#   Project                            Stars
1   pyspark-example-project            1,370
2   Udacity-Data-Engineering-Projects  1,295
3   patterns-devkit                    106
4   unstract                           106
5   prism                              79
6   bitcoinMonitor                     53
7   Spooq                              8
8   dados-censup                       6
9   pipeline-docs-data-extractor       5
10  workshop-realtime-data-pipelines   3
11  ticker_selection_BI_dashboard      2
12  reddit_api_elt                     2
