Go Data Science

Open-source Go projects categorized as Data Science

Top 23 Go Data Science Projects

  • excelize

    Go language library for reading and writing Microsoft Excel™ (XLAM / XLSM / XLSX / XLTM / XLTX) spreadsheets

  • Project mention: Recommend a powerful excel processing library, @zurmokeeper/exceljs, which supports encryption and decryption of xlsx files and flexible setting of multiple table headers when exporting, etc. | /r/node | 2023-07-01

    Then I found out that WPS only supports ECMA-376 standard encryption for xlsx files. I then referred to the official documentation and to libraries in other languages, such as msoffcrypto-tool (written in Python) and Go's excelize. Since I don't know much about encryption and decryption, the implementation process was a bit convoluted.

  • gop

    The Go+ programming language is designed for engineering, STEM education, and data science

  • Project mention: Go Enums Suck | news.ycombinator.com | 2024-03-01

    https://github.com/goplus/gop, but they go slightly too overboard imo.

  • pachyderm

    Data-Centric Pipelines and Data Versioning

  • Project mention: Open Source Advent Fun Wraps Up! | dev.to | 2024-01-05

    20. Pachyderm | Github | tutorial

  • flyte

    Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.

  • Project mention: First 15 Open Source Advent projects | dev.to | 2023-12-15

    9. Flyte by Union AI | Github | tutorial

  • gophernotes

    The Go kernel for Jupyter notebooks and nteract.

  • Project mention: Go: What We Got Right, What We Got Wrong | news.ycombinator.com | 2024-01-04

    https://github.com/gopherdata/gophernotes

    I've had this bookmarked for some time and just haven't gotten around to it.

  • determined

    Determined is an open-source machine learning platform that simplifies distributed training, hyperparameter tuning, experiment tracking, and resource management. Works with PyTorch and TensorFlow.

  • Project mention: Open Source Advent Fun Wraps Up! | dev.to | 2024-01-05

    17. Determined AI | Github | tutorial

  • lgo

    Interactive Go programming with Jupyter

  • dataframe-go

    DataFrames for Go: For statistics, machine-learning, and data manipulation/exploration

  • Project mention: packages similar to Pandas | /r/golang | 2023-05-10

    Numpy functionality is largely covered by https://www.gonum.org/ but for pandas I'm not sure if there is an equivalent as widely accepted. However, you might try https://github.com/rocketlaunchr/dataframe-go which I have not tried but it looks like it covers some of what you're looking for

  • FlowMeter

    ⭐ ⭐ Use ML to classify flows and packets as benign or malicious. ⭐ ⭐

  • reflow

    A language and runtime for distributed, incremental data processing in the cloud

  • bacalhau

    Compute over Data framework for public, transparent, and optionally verifiable computation

  • Project mention: Deno Cron | news.ycombinator.com | 2023-11-29

    This is really interesting - we’ve tried really hard to solve some of these with Bacalhau[1] - a much simpler distributed compute platform. Would love your feedback!

    [1] https://github.com/bacalhau-project/bacalhau

    Disclosure: I co-founded Bacalhau

  • aqueduct

    Aqueduct is no longer being maintained. Aqueduct allows you to run LLM and ML workloads on any cloud infrastructure. (by RunLLM)

  • decimal

    A high-performance, arbitrary-precision, floating-point decimal library. (by ericlagergren)

  • gonb

    GoNB, a Go Notebook Kernel for Jupyter

  • Project mention: Go, Python, Rust, and production AI applications | news.ycombinator.com | 2024-03-12

    I've had these strong feelings and the OP describes it really well. Despite being a polyglot programmer, I really struggle with Python, both in expression and performance (unless it's just config for GPUs).

    Some of this frustration was recently an "Unpopular Opinion" on the Go Time Podcast regarding Python being great for "data exploration" but not for "data engineering": https://changelog.com/gotime/304#t=3196

    I've been yearning for better interactive tooling and ML-related libraries to bridge this gap, and started using some even in just the last week:

    * GoNB (Golang-support for Jupyter notebooks, also from a Googler) https://github.com/janpfeifer/gonb

    * That uses Go-Plotly for graphs/UI: https://github.com/MetalBlueberry/go-plotly

    * GoMLX (GoNB author is also on that project, many thanks Jan!) https://github.com/gomlx/gomlx

    * Hidden at the end of OP is LangChainGo for LLMs, which I haven't used yet: https://github.com/tmc/langchaingo

    Pick those up and let's make the Go community stronger together!

  • qframe

    Immutable data frame for Go

  • goro

    A High-level Machine Learning Library for Go

  • terraform-provider-iterative

    ☁️ Terraform plugin for machine learning workloads: spot instance recovery & auto-termination | AWS, GCP, Azure, Kubernetes

  • Dataplane

    Dataplane is a data platform that makes it easy to construct a data mesh with automated data pipelines and workflows.

  • dud

    A lightweight CLI tool for versioning data alongside source code and building data pipelines.

  • Project mention: Ask HN: How do your ML teams version datasets and models? | news.ycombinator.com | 2023-09-28

    I've used DVC in the past and generally liked its approach. That said, I wholeheartedly agree that it's clunky. It does a lot of things implicitly, which can make it hard to reason about. It was also extremely slow for medium-sized datasets (low 10s of GBs).

    In response, I created a command-line tool that addresses these issues[0]. To reduce the comparison to an analogy: Dud : DVC :: Flask : Django.

    [0]: https://github.com/kevin-hanselman/dud

  • wallet-tracker

    Detect real scammers with Wallet-Tracker CLI from anywhere.

  • igop

    The Go/Go+ Interpreter

  • beneath

    Beneath is a serverless real-time data platform ⚡️

  • go-dataframe

    A simple package to abstract away the process of creating usable DataFrames for data analytics. This package is heavily inspired by the amazing Python library, Pandas.

  • Project mention: How is Go in data analytics? | /r/golang | 2023-05-25

    Just use my library go-dataframe and you’ll be good to go!

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Data Science projects in Go? This list will help you:

Rank  Project                       Stars
1     excelize                      17,279
2     gop                           8,777
3     pachyderm                     6,074
4     flyte                         4,761
5     gophernotes                   3,766
6     determined                    2,851
7     lgo                           2,347
8     dataframe-go                  1,112
9     FlowMeter                     1,071
10    reflow                        952
11    bacalhau                      602
12    aqueduct                      521
13    decimal                       490
14    gonb                          421
15    qframe                        385
16    goro                          368
17    terraform-provider-iterative  287
18    Dataplane                     183
19    dud                           166
20    wallet-tracker                109
21    igop                          101
22    beneath                       78
23    go-dataframe                  62
