Top 23 Go Data Science Projects
-
excelize
Go language library for reading and writing Microsoft Excel™ (XLAM / XLSM / XLSX / XLTM / XLTX) spreadsheets
-
flyte
Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
-
determined
Determined is an open-source machine learning platform that simplifies distributed training, hyperparameter tuning, experiment tracking, and resource management. Works with PyTorch and TensorFlow.
-
aqueduct
Aqueduct is no longer being maintained. Aqueduct allows you to run LLM and ML workloads on any cloud infrastructure. (by RunLLM)
-
terraform-provider-iterative
☁️ Terraform plugin for machine learning workloads: spot instance recovery & auto-termination | AWS, GCP, Azure, Kubernetes
-
Dataplane
Dataplane is a data platform that makes it easy to construct a data mesh with automated data pipelines and workflows.
-
go-dataframe
A simple package to abstract away the process of creating usable DataFrames for data analytics. This package is heavily inspired by the amazing Python library, Pandas.
Project mention: Recommend a powerful excel processing library, @zurmokeeper/exceljs, which supports encryption and decryption of xlsx files and flexible setting of multiple table headers when exporting, etc. | /r/node | 2023-07-01
Then I found out that WPS only supports ECMA-376 standard encryption for xlsx files. Then I referred to the official documentation and libraries in other languages, such as msoffcrypto-tool (written in Python) and Go's excelize. Since I don't know much about encryption and decryption, the implementation process was also a bit convoluted.
https://github.com/goplus/gop, but they go slightly too overboard imo.
20. Pachyderm | Github | tutorial
9. Flyte by Union AI | Github | tutorial
https://github.com/gopherdata/gophernotes
I've had this bookmarked for some time and just haven't gotten around to it.
17. Determined AI | Github | tutorial
Numpy functionality is largely covered by https://www.gonum.org/ but for pandas I'm not sure if there is an equivalent as widely accepted. However, you might try https://github.com/rocketlaunchr/dataframe-go which I have not tried but it looks like it covers some of what you're looking for
This is really interesting - we’ve tried really hard to solve some of these with Bacalhau[1] - a much simpler distributed compute platform. Would love your feedback!
[1] https://github.com/bacalhau-project/bacalhau
Disclosure: I co-founded Bacalhau
Project mention: Go, Python, Rust, and production AI applications | news.ycombinator.com | 2024-03-12
I've had these strong feelings and the OP describes it really well. Despite being a polyglot programmer, I really struggle with Python, both in expression and performance (unless it's just config for GPUs).
Some of this frustration was recently an "Unpopular Opinion" on the Go Time Podcast regarding Python being great for "data exploration" but not for "data engineering": https://changelog.com/gotime/304#t=3196
I've been yearning for better interactive tooling and ML-related libraries to bridge this gap, and started using some even in just the last week:
* GoNB (Golang-support for Jupyter notebooks, also from a Googler) https://github.com/janpfeifer/gonb
* That uses Go-Plotly for graphs/UI: https://github.com/MetalBlueberry/go-plotly
* GoMLX (GoNB author is also on that project, many thanks Jan!) https://github.com/gomlx/gomlx
* Hidden at the end of OP is LangChainGo for LLMs, which I haven't used yet: https://github.com/tmc/langchaingo
Pick those up and let's make the Go community stronger together!
Project mention: Ask HN: How do your ML teams version datasets and models? | news.ycombinator.com | 2023-09-28
I've used DVC in the past and generally liked its approach. That said, I wholeheartedly agree that it's clunky. It does a lot of things implicitly, which can make it hard to reason about. It was also extremely slow for medium-sized datasets (low 10s of GBs).
In response, I created a command-line tool that addresses these issues[0]. To reduce the comparison to an analogy: Dud : DVC :: Flask : Django.
[0]: https://github.com/kevin-hanselman/dud
Just use my library go-dataframe and you’ll be good to go!
Go Data Science related posts
- Frawk: An efficient Awk-like programming language. (2021)
- Go Enums Suck
- Fix: Hong Kong is not in China
- Why bad scientific code beats code following "best practices"
- Jupyter Lab Extension to run your GPU-heavy stuff (for free for now) on somebody's else server without blocking yours
- Fix: Hong Kong locale does not always mean China
- packages similar to Pandas
Index
What are some of the best open-source Data Science projects in Go? This list will help you:
# | Project | Stars |
---|---|---|
1 | excelize | 17,279 |
2 | gop | 8,777 |
3 | pachyderm | 6,074 |
4 | flyte | 4,761 |
5 | gophernotes | 3,766 |
6 | determined | 2,851 |
7 | lgo | 2,347 |
8 | dataframe-go | 1,112 |
9 | FlowMeter | 1,071 |
10 | reflow | 952 |
11 | bacalhau | 602 |
12 | aqueduct | 521 |
13 | decimal | 490 |
14 | gonb | 421 |
15 | qframe | 385 |
16 | goro | 368 |
17 | terraform-provider-iterative | 287 |
18 | Dataplane | 183 |
19 | dud | 166 |
20 | wallet-tracker | 109 |
21 | igop | 101 |
22 | beneath | 78 |
23 | go-dataframe | 62 |