Top 23 Go Data Science Projects

excelize

15 17,279 8.8 Go

Go language library for reading and writing Microsoft Excel™ (XLAM / XLSM / XLSX / XLTM / XLTX) spreadsheets

Project mention: Recommend a powerful excel processing library, @zurmokeeper/exceljs, which supports encryption and decryption of xlsx files and flexible setting of multiple table headers when exporting, etc. | /r/node | 2023-07-01

Then I found out that WPS only supports ECMA-376 standard encryption for xlsx files. Then I referred to the official documentation and libraries in other languages, such as msoffcrypto-tool, written in Python, and Go's excelize. Since I don't know much about encryption and decryption, the implementation process was also a bit convoluted.

gop

23 8,777 9.8 Go

The Go+ programming language is designed for engineering, STEM education, and data science

Project mention: Go Enums Suck | news.ycombinator.com | 2024-03-01

https://github.com/goplus/gop, but they go slightly too overboard imo.

pachyderm

8 6,074 9.8 Go

Data-Centric Pipelines and Data Versioning

Project mention: Open Source Advent Fun Wraps Up! | dev.to | 2024-01-05

20. Pachyderm | Github | tutorial

flyte

31 4,761 9.8 Go

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.

Project mention: First 15 Open Source Advent projects | dev.to | 2023-12-15

9. Flyte by Union AI | Github | tutorial

gophernotes

10 3,766 3.0 Go

The Go kernel for Jupyter notebooks and nteract.

Project mention: Go: What We Got Right, What We Got Wrong | news.ycombinator.com | 2024-01-04

https://github.com/gopherdata/gophernotes
I've had this bookmarked for some time and just haven't gotten around to it.

determined

10 2,851 9.9 Go

Determined is an open-source machine learning platform that simplifies distributed training, hyperparameter tuning, experiment tracking, and resource management. Works with PyTorch and TensorFlow.

Project mention: Open Source Advent Fun Wraps Up! | dev.to | 2024-01-05

17. Determined AI | Github | tutorial

lgo

0 2,347 0.0 Go

Interactive Go programming with Jupyter
dataframe-go

3 1,112 0.0 Go

DataFrames for Go: For statistics, machine-learning, and data manipulation/exploration

Project mention: packages similar to Pandas | /r/golang | 2023-05-10

Numpy functionality is largely covered by https://www.gonum.org/ but for pandas I'm not sure if there is an equivalent as widely accepted. However, you might try https://github.com/rocketlaunchr/dataframe-go which I have not tried but it looks like it covers some of what you're looking for

FlowMeter

3 1,071 3.0 Go

⭐ ⭐ Use ML to classify flows and packets as benign or malicious. ⭐ ⭐
reflow

7 952 6.2 Go

A language and runtime for distributed, incremental data processing in the cloud
bacalhau

12 602 9.8 Go

Compute over Data framework for public, transparent, and optionally verifiable computation

Project mention: Deno Cron | news.ycombinator.com | 2023-11-29

This is really interesting - we’ve tried really hard to solve some of these with Bacalhau[1] - a much simpler distributed compute platform. Would love your feedback!
[1] https://github.com/bacalhau-project/bacalhau
Disclosure: I co-founded Bacalhau

aqueduct

2 521 8.7 Go

Aqueduct is no longer being maintained. Aqueduct allows you to run LLM and ML workloads on any cloud infrastructure. (by RunLLM)
decimal

3 490 0.0 Go

A high-performance, arbitrary-precision, floating-point decimal library. (by ericlagergren)
gonb

5 421 9.3 Go

GoNB, a Go Notebook Kernel for Jupyter

Project mention: Go, Python, Rust, and production AI applications | news.ycombinator.com | 2024-03-12

I've had these strong feelings and the OP describes it really well. Despite being a polyglot programmer, I really struggle with Python, both in expression and performance (unless it's just config for GPUs).
Some of this frustration was recently an "Unpopular Opinion" on the Go Time Podcast regarding Python being great for "data exploration" but not for "data engineering": https://changelog.com/gotime/304#t=3196
I've been yearning for better interactive tooling and ML-related libraries to bridge this gap, and started using some just in the last week:
* GoNB (Golang-support for Jupyter notebooks, also from a Googler) https://github.com/janpfeifer/gonb
* That uses Go-Plotly for graphs/UI: https://github.com/MetalBlueberry/go-plotly
* GoMLX (GoNB author is also on that project, many thanks Jan!) https://github.com/gomlx/gomlx
* Hidden at the end of OP is LangChainGo for LLMs, which I haven't used yet: https://github.com/tmc/langchaingo
Pick those up and let's make the Go community stronger together!

qframe

1 385 5.1 Go

Immutable data frame for Go
goro

3 368 0.0 Go

A High-level Machine Learning Library for Go
terraform-provider-iterative

24 287 6.0 Go

☁️ Terraform plugin for machine learning workloads: spot instance recovery & auto-termination | AWS, GCP, Azure, Kubernetes
Dataplane

1 183 8.3 Go

Dataplane is a data platform that makes it easy to construct a data mesh with automated data pipelines and workflows.
dud

14 166 6.3 Go

A lightweight CLI tool for versioning data alongside source code and building data pipelines.

Project mention: Ask HN: How do your ML teams version datasets and models? | news.ycombinator.com | 2023-09-28

I've used DVC in the past and generally liked its approach. That said, I wholeheartedly agree that it's clunky. It does a lot of things implicitly, which can make it hard to reason about. It was also extremely slow for medium-sized datasets (low 10s of GBs).
In response, I created a command-line tool that addresses these issues[0]. To reduce the comparison to an analogy: Dud : DVC :: Flask : Django.
[0]: https://github.com/kevin-hanselman/dud

wallet-tracker

48 109 2.7 Go

Detect real scammers with Wallet-Tracker CLI from anywhere.
igop

1 101 7.9 Go

The Go/Go+ Interpreter
beneath

2 78 0.0 Go

Beneath is a serverless real-time data platform ⚡️
go-dataframe

4 62 6.9 Go

A simple package to abstract away the process of creating usable DataFrames for data analytics. This package is heavily inspired by the amazing Python library, Pandas.

Project mention: How is Go in data analytics? | /r/golang | 2023-05-25

Just use my library go-dataframe and you’ll be good to go!

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Go Data Science related posts

Frawk: An efficient Awk-like programming language. (2021)
4 projects | news.ycombinator.com | 21 Apr 2024
Go Enums Suck
1 project | news.ycombinator.com | 1 Mar 2024
Fix: Hong Kong is not in China
1 project | news.ycombinator.com | 23 Jan 2024
Why bad scientific code beats code following "best practices"
3 projects | news.ycombinator.com | 6 Jan 2024
Jupyter Lab Extension to run your GPU-heavy stuff (for free for now) on somebody's else server without blocking yours
2 projects | /r/datascience | 22 Sep 2023
Fix: Hong Kong locale does not always mean China
1 project | news.ycombinator.com | 21 Jul 2023
packages similar to Pandas
2 projects | /r/golang | 10 May 2023

Index

What are some of the best open-source Data Science projects in Go? This list will help you:

#	Project	Stars
1	excelize	17,279
2	gop	8,777
3	pachyderm	6,074
4	flyte	4,761
5	gophernotes	3,766
6	determined	2,851
7	lgo	2,347
8	dataframe-go	1,112
9	FlowMeter	1,071
10	reflow	952
11	bacalhau	602
12	aqueduct	521
13	decimal	490
14	gonb	421
15	qframe	385
16	goro	368
17	terraform-provider-iterative	287
18	Dataplane	183
19	dud	166
20	wallet-tracker	109
21	igop	101
22	beneath	78
23	go-dataframe	62