Go data-engineering

Open-source Go projects categorized as data-engineering

Top 17 Go data-engineering Projects

  1. argo

    Workflow Engine for Kubernetes

    Project mention: Data on Kubernetes: Part 4 - Argo Workflows: Simplify parallel jobs : Container-native workflow engine for Kubernetes 🔮 | dev.to | 2024-07-28

    Remember to meet the prerequisites, including the AWS CLI, kubectl, Terraform, and the Argo Workflow CLI.

  3. connect

    Fancy stream processing made operationally mundane (by redpanda-data)

    Project mention: Fivetran to Acquire Census | news.ycombinator.com | 2025-05-01
  4. cloudquery

    The developer-first cloud governance platform

  5. lakeFS

    lakeFS - Data version control for your data lake | Git for data

  6. Rudderstack

    Privacy- and security-focused Segment alternative, in Golang and React

  7. memphis

    Memphis.dev is a highly scalable and effortless data streaming platform

  8. incubator-devlake

    Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth.

    Project mention: Apache DevLake | news.ycombinator.com | 2025-01-19
  10. bacalhau

    Community-driven, simple, yet powerful framework for fast, cost-effective distributed Compute over Data.

    Project mention: Show HN: Sample NCSA Log Generator | news.ycombinator.com | 2025-03-15

    Absolutely no business model behind this - just Apache2/MIT. If you like it, just use it! If you don't, happy to tweak it!

    [1] https://github.com/bacalhau-project/bacalhau

    [2] https://github.com/bacalhau-project/examples/tree/main/utili...

    [3] https://github.com/orgs/bacalhau-project/packages/container/...

  11. conduit

    Conduit streams data between data stores. Kafka Connect replacement. No JVM required. (by ConduitIO)

    Project mention: Is there an Alternative to Debezium + Kafka? | dev.to | 2024-11-03

    Conduit

  12. Dataplane

    Dataplane is a data platform that makes it easy to construct a data mesh with automated data pipelines and workflows.

  13. dud

    A lightweight CLI tool for versioning data alongside source code and building data pipelines.

  14. bulker

    Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake, BigQuery, ClickHouse, Postgres, MySQL) (by jitsucom)

    Project mention: Bulker: Streaming and batching large amount of data into data warehouses | news.ycombinator.com | 2025-02-14
  15. rtdl

    rtdl makes it easy to build and maintain a real-time data lake (by realtimedatalake)

  16. pippin

    Go library to create and manage data pipelines on your machine

  17. csv2opensearch

    Import CSV files into OpenSearch or Elasticsearch

  18. amplify

    Bacalhau Amplify: automatic enrichment, enhancement, and explanation of your data (by bacalhau-project)

  19. Gear5

    A high-performance alternative to Airbyte, Singer, and Meltano

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Go data-engineering related posts

  • Fivetran to Acquire Census

    3 projects | news.ycombinator.com | 1 May 2025
  • Apache DevLake

    1 project | news.ycombinator.com | 19 Jan 2025
  • Code Quality at Scale with AST Grep and LLMs

    2 projects | news.ycombinator.com | 17 Oct 2024
  • Databrew Blink: Open-Source Database CDC Tool

    1 project | news.ycombinator.com | 1 Aug 2024
  • connect VS goka - a user suggested alternative

    2 projects | 23 Jul 2024
  • Engineering Metrics Are Overrated

    1 project | dev.to | 3 Jul 2024
  • Go concurrency simplified. Part 1: Channels and goroutines

    2 projects | dev.to | 8 Dec 2023

Index

What are some of the best open-source data-engineering projects in Go? This list will help you:

# Project Stars
1 argo 15,713
2 connect 8,375
3 cloudquery 6,120
4 lakeFS 4,716
5 Rudderstack 4,207
6 memphis 3,303
7 incubator-devlake 2,748
8 bacalhau 806
9 conduit 528
10 Dataplane 226
11 dud 210
12 bulker 178
13 rtdl 45
14 pippin 14
15 csv2opensearch 12
16 amplify 12
17 Gear5 3

Did you know that Go is the 4th most popular programming language based on number of references?