Go ETL

Open-source Go projects categorized as ETL

Top 23 Go ETL Projects

  • Benthos

    Fancy stream processing made operationally mundane

  • Project mention: Ask HN: Who is hiring? (December 2023) | news.ycombinator.com | 2023-12-01
  • steampipe

    Zero-ETL, infinite possibilities. Live query APIs, code & more with SQL. No DB required.

  • Project mention: Steampipe: Dynamically query APIs, code and more with SQL | news.ycombinator.com | 2024-04-04
  • cloudquery

    The open source high performance ELT framework powered by Apache Arrow

  • Project mention: We might want to regularly keep track of how important each server is | news.ycombinator.com | 2024-02-06

    Check out CloudQuery - https://github.com/cloudquery/cloudquery for an easy cloud asset inventory.

  • Rudderstack

    Privacy and Security focused Segment-alternative, in Golang and React

  • Project mention: Rudderstack Switches to Elastic License | news.ycombinator.com | 2023-09-08
  • incubator-devlake

    Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth.

  • go-streams

    A lightweight stream processing library for Go

  • peerdb

    A fast, simple, and cost-effective tool to replicate data from Postgres to data warehouses, queues, and storage

  • Project mention: Pgwire: a Rust library for PostgreSQL compatible application | news.ycombinator.com | 2024-03-20

    We at PeerDB (https://github.com/PeerDB-io/peerdb) were early adopters of Pgwire to implement our Postgres-compatible SQL Layer to do ETL. Very easy to work with. Saved us multiple months of effort to build it from scratch.

  • aistore

    AIStore: scalable storage for AI applications

  • optimus

    Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management. (by raystack)

  • Project mention: Data Engineering Tools in Go | /r/dataengineering | 2023-05-18

    You can check the odpf GitHub; they created some DataOps tools using Go. One example is Optimus (https://github.com/odpf/optimus), which is a data pipeline orchestrator.

  • onepanel

    The open source, end-to-end computer vision platform. Label, build, train, tune, deploy and automate in a unified platform that runs on any cloud and on-premises.

  • omniparser

    omniparser: a native Golang ETL streaming parser and transform library for CSV, JSON, XML, EDI, text, etc.

  • conduit

    Conduit streams data between data stores. Kafka Connect replacement. No JVM required. (by ConduitIO)

  • Project mention: Pulling CDC data from Postgres | /r/dataengineering | 2023-04-30

    I'd like to mention Conduit + its Postgres connector. The Pg connector comes built-in, so all that is needed is a single Conduit binary to get started. It relies on WAL, but the connector creates the replication slot itself (if needed).

  • sling-cli

    Sling is a CLI tool that extracts data from a source storage/database and loads it in a target storage/database.

  • Project mention: FLaNK 04 March 2024 | dev.to | 2024-03-04
  • Dataplane

    Dataplane is a data platform that makes it easy to construct a data mesh with automated data pipelines and workflows.

  • steampipe-plugin-aws

    Use SQL to instantly query AWS resources across regions and accounts. Open source CLI. No DB required.

  • Project mention: Osquery: An sqlite3 virtual table exposing operating system data to SQL | news.ycombinator.com | 2024-02-25

    be mindful of its AGPLv3 https://github.com/turbot/steampipe/blob/v0.21.8/LICENSE (AFAIK v0.4.3 is the last MIT release https://github.com/turbot/steampipe/blob/v0.4.3/LICENSE ) and the actual providers are Apache 2 <https://github.com/turbot/steampipe-plugin-aws/blob/v0.131.0...> (but I don't know if provider drift makes them compatible with 0.4 or not)

    iasql seems to be AWS only, but good for them for taking this on:

  • grate

    A Go native tabular data extraction package. Currently supports .xls, .xlsx, .csv, .tsv formats.

  • peaks-consolidation

    The Peaks Consolidation is equipped with state-of-the-art algorithms and data structures that support high-performance databending exercises. It specializes in management accounting and consolidation, with some special topics in machine learning and bioinformatics.

  • Project mention: Filter a 7 billion-row dataset using 32GB Memory | /r/bigdata | 2023-06-29

    Script and Data

  • beneath

    Beneath is a serverless real-time data platform ⚡️

  • cuetils

    CLI and library for diff, patch, and ETL operations on CUE, JSON, and YAML

  • zeus

    When no one can tell the difference between art, and an empty canvas, the meaning is lost. (by zeus-fyi)

  • Project mention: Kubernetes Compute Search Engine | /r/kubernetes | 2023-11-11
  • csvplus

    csvplus extends the standard Go encoding/csv package with fluent interface, lazy stream operations, indices and joins.

  • avro

    Apache Avro for Go (by khezen)

  • steampipe-sqlite

    Steampipe SQLite is a zero-ETL engine for SQLite. Virtual tables translate queries into live API calls for cloud services and APIs. Hundreds of plugins with thousands of documented examples.

  • Project mention: Steampipe SQLite – Virtual tables translated for common APIs | news.ycombinator.com | 2023-12-20
NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source ETL projects in Go? This list will help you:

 #  Project               Stars
 1  Benthos               7,559
 2  steampipe             6,379
 3  cloudquery            5,584
 4  Rudderstack           3,926
 5  incubator-devlake     2,424
 6  go-streams            1,753
 7  peerdb                1,615
 8  aistore               1,089
 9  optimus                 737
10  onepanel                696
11  omniparser              634
12  conduit                 345
13  sling-cli               251
14  Dataplane               183
15  steampipe-plugin-aws    171
16  grate                   133
17  peaks-consolidation     102
18  beneath                  78
19  cuetils                  76
20  zeus                     72
21  csvplus                  66
22  avro                     45
23  steampipe-sqlite         43
