Go Data

Open-source Go projects categorized as Data

Top 23 Go Data Projects

  1. flyte

    Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.

    Project mention: Boost your ML pipeline performance with efficient parallelism | dev.to | 2025-04-09

    Flyte is a distributed computation framework that uses a Kubernetes Pod as the fundamental execution environment for each task in a pipeline. When you use MapTasks, Flyte automatically distributes the load among multiple Pods that run in parallel and limits each Pod to downloading and processing only a specific index from the inputs list, preventing inefficient duplicate data movement.
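As a rough analogy only, the index-per-worker idea described above can be sketched in plain Go. This is not Flyte's API: Flyte runs each task in its own Kubernetes Pod, while this sketch uses goroutines in a single process. The names `processIndex` and `mapOverInputs` and the `concurrency` parameter are hypothetical, chosen for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// processIndex stands in for one task instance (hypothetical helper).
// It touches only inputs[i], mirroring the idea that each worker
// handles a single index of the input list.
func processIndex(inputs []string, i int) string {
	return strings.ToUpper(inputs[i])
}

// mapOverInputs starts one worker per input index and allows at most
// `concurrency` workers to run at the same time.
func mapOverInputs(inputs []string, concurrency int) []string {
	if concurrency < 1 {
		concurrency = 1 // avoid blocking forever on an unbuffered semaphore
	}
	results := make([]string, len(inputs))
	sem := make(chan struct{}, concurrency) // counting semaphore
	var wg sync.WaitGroup

	for i := range inputs {
		wg.Add(1)
		sem <- struct{}{} // wait for a free slot
		go func(i int) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			// Each worker writes only to its own slot, so no lock is needed.
			results[i] = processIndex(inputs, i)
		}(i)
	}

	wg.Wait()
	return results
}

func main() {
	out := mapOverInputs([]string{"a", "b", "c"}, 2)
	fmt.Println(out) // prints [A B C]
}
```

Because every worker reads and writes only its own index, no input is processed twice and the results keep the order of the inputs.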

  2. InfluxDB

    InfluxDB high-performance time series database. Collect, organize, and act on massive volumes of high-resolution data to power real-time intelligent systems.

  3. cloudquery

    The developer-first cloud governance platform

  4. cue

    The home of the CUE language! Validate and define text-based and dynamic configuration

    Project mention: cue VS rcl - a user suggested alternative | libhunt.com/r/cue | 2025-03-15
  5. gofakeit

    Random fake data generator written in Go

  6. memphis

    Memphis.dev is a highly scalable and effortless data streaming platform

  7. Stats

    A well-tested and comprehensive Golang statistics library package with no dependencies. (by montanaflynn)

  8. incubator-devlake

    Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth.

    Project mention: Apache DevLake | news.ycombinator.com | 2025-01-19
  9. CodeRabbit

    CodeRabbit: AI Code Reviews for Developers. Revolutionize your code reviews with AI. CodeRabbit offers PR summaries, code walkthroughs, 1-click suggestions, and AST-based analysis. Boost productivity and code quality across all major languages with each PR.

  10. rill

    Rill is a tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL. BI-as-code.

    Project mention: The DuckDB Local UI | news.ycombinator.com | 2025-03-12

    Rill founder here, I have no comment on the UI similarity :) but I would emphasize our vision is building DuckDB-powered metrics layers and exploratory dashboards -- which we presented at DuckCon #6 last month, PDF below [1] -- and less on notebook style UIs like Hex and Jupyter.

    Rill is fully open-source under the Apache license. [2]

    [1] https://blobs.duckdb.org/events/duckcon6/mike-driscoll-rill-...

    [2] https://github.com/rilldata/rill

  11. tigris

    Tigris is an Open Source Serverless NoSQL Database and Search Platform.

    Project mention: Hetzner Object Storage | news.ycombinator.com | 2024-10-07

    I should mention Tigris[0] here. They're also a new Object Storage service, but they have this two-way replication facility with another S3-compatible service. The primary purpose they built it for is to mirror files from your existing S3 to Tigris as files are requested.

    However they also have an option to copy files that are added to Tigris, to S3 automatically [1] (`--shadow-write-through`). I asked their founder if it's okay to use it as an extra redundancy continuously instead of a one-time migration, and they said they have no issues with it.

    [0] https://www.tigrisdata.com

  12. pg_flo

    Stream, transform, and route PostgreSQL data in real-time.

    Project mention: Kuvasz-streamer: open-source CDC for Postgres for low latency replication | news.ycombinator.com | 2025-01-03

    * pg_flo: https://github.com/pgflo/pg_flo

    Are there others? Each of them has slightly different angles and messaging, but it is interesting to see.

  13. finance-go

    📊 Financial markets data library implemented in Go.

  14. tyson

    🥊 TypeScript as a Configuration Language. TySON stands for TypeScript Object Notation

  15. aqueduct

    Aqueduct is no longer being maintained. Aqueduct allows you to run LLM and ML workloads on any cloud infrastructure. (by RunLLM)

  16. webpalm

    πŸ•ΈοΈ Crawl in the web network

  17. ArtiVC

    A version control system to manage large files.

  18. Dataplane

    Dataplane is a data platform that makes it easy to construct a data mesh with automated data pipelines and workflows.

  19. guardian

    Guardian is a universal data access management tool with automated access workflows and security controls across data stores, analytical systems, and cloud products. (by raystack)

  20. pgsink

    Logically replicate data out of Postgres into sinks (files, Google BigQuery, etc)

  21. steampipe-postgres-fdw

    The Steampipe foreign data wrapper (FDW) is a zero-ETL product that provides Postgres foreign tables which translate queries into API calls to cloud services and APIs. It's bundled with Steampipe and also available as a set of standalone extensions for use in your own Postgres database.

  22. steampipe-sqlite

    Steampipe SQLite is a zero-ETL engine for SQLite. Virtual tables translate queries into live API calls for cloud services and APIs. Hundreds of plugins with thousands of documented examples.

  23. rtdl

    rtdl makes it easy to build and maintain a real-time data lake (by realtimedatalake)

  24. go-notebook

    Go-Notebook is a tool for documenting Go code, inspired by the Jupyter Project.

  25. pippin

    Go library to create and manage data pipelines on your machine

  26. SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Go Data related posts

  • cue VS rcl - a user suggested alternative

    2 projects | libhunt.com/r/cue | 15 Mar 2025
  • Apache DevLake

    1 project | news.ycombinator.com | 19 Jan 2025
  • Show HN: Holos – Configure Kubernetes with CUE data structures instead of YAML

    4 projects | news.ycombinator.com | 9 Dec 2024
  • Ask HN: Happy Thanksgiving What technology are you thankful for?

    1 project | news.ycombinator.com | 28 Nov 2024
  • Stream, transform, and route PostgreSQL data in real-time

    1 project | news.ycombinator.com | 3 Nov 2024
  • Stream, transform, and route PostgreSQL data in real-time (early build)

    1 project | news.ycombinator.com | 28 Oct 2024
  • Pg_flo: Move and transform data between PostgreSQL databases

    1 project | news.ycombinator.com | 26 Oct 2024

Index

What are some of the best open-source Data projects in Go? This list will help you:

#   Project                 Stars
1   flyte                   6,182
2   cloudquery              6,076
3   cue                     5,399
4   gofakeit                4,873
5   memphis                 3,294
6   Stats                   2,966
7   incubator-devlake       2,703
8   rill                    2,010
9   tigris                    934
10  pg_flo                    773
11  finance-go                735
12  tyson                     553
13  aqueduct                  520
14  webpalm                   366
15  ArtiVC                    298
16  Dataplane                 226
17  guardian                  136
18  pgsink                     89
19  steampipe-postgres-fdw     78
20  steampipe-sqlite           55
21  rtdl                       45
22  go-notebook                38
23  pippin                     14

Did you know that Go is the 4th most popular programming language based on number of references?