Top 15 etl-framework Open-Source Projects
-
hamilton
Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage and metadata. Runs and scales everywhere Python does.
-
Cinchoo ETL
ETL framework for .NET (Parser / Writer for CSV, Flat, Xml, JSON, Key-Value, Parquet, Yaml, Avro formatted files)
-
dataall
A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.
-
csvplus
csvplus extends the standard Go encoding/csv package with a fluent interface, lazy stream operations, indices, and joins.
-
Shift
Shift is a high-performance alternative to Airbyte, Singer, and Meltano (by piyushsingariya)
-
flowrunner
Flowrunner is a lightweight package to organize and represent Data Engineering/Science workflows
-
DataPowerTools
Bridging the gap between IEnumerable and IDataReader for dealing with unstructured and loosely-structured data, plus fast ETL + SQL Bulk Copy.
Project mention: We might want to regularly keep track of how important each server is | news.ycombinator.com | 2024-02-06
Check out CloudQuery - https://github.com/cloudquery/cloudquery for an easy cloud asset inventory.
Project mention: Show HN: Hamilton's UI – observability, lineage, and catalog for data pipelines | news.ycombinator.com | 2024-05-02
Project mention: Why do companies still build data ingestion tooling instead of using a third-party tool like Airbyte? | /r/dataengineering | 2023-12-06
Coincidentally, I saw a presentation today on a nice halfway-house solution: using embeddable Python libraries like Sling and dlt, both open-source. See https://www.youtube.com/watch?v=gAqOLgG2iYY There is also singer.io, which is more of a protocol than a library but can also be installed, although it looks like a true community effort and is not so well maintained.
An awesome read!
Something related that I found out about from HN a few months back is another engine called quokka. It's particularly interesting and applicable how quokka schedules distributed queries to outperform Spark https://github.com/marsupialtail/quokka/blob/master/blog/why...
Project mention: Flow PHP: the first and most advanced PHP ETL framework | news.ycombinator.com | 2024-04-16
As a side hobby, I started working on this personal project: https://github.com/piyushsingariya/Kaku
Project mention: Recommended patterns or tools for data/row migration between databases? | /r/dotnet | 2023-06-22
etl-framework related posts
-
Flow PHP: the first and most advanced PHP ETL framework
-
FLaNK Weekly 31 December 2023
-
Why do companies still build data ingestion tooling instead of using a third-party tool like Airbyte?
-
SymmetricDS: Open-Source, cross platform database replication software
-
Breakthrough in the book search field! Use Apache SeaTunnel to improve the efficiency of book title similarity search
-
Quokka – Distributed Polars on Ray
-
Questions Regarding design DW
-
Index
What are some of the best open-source etl-framework projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | Logstash | 14,014 |
2 | cloudquery | 5,591 |
3 | hamilton | 1,373 |
4 | getting-started | 1,220 |
5 | quokka | 1,084 |
6 | Cinchoo ETL | 738 |
7 | metorikku | 576 |
8 | flow | 352 |
9 | kgtk | 341 |
10 | dataall | 210 |
11 | patterns-devkit | 106 |
12 | csvplus | 66 |
13 | Shift | 9 |
14 | flowrunner | 8 |
15 | DataPowerTools | 8 |