Go Dataset

Open-source Go projects categorized as Dataset

Top 4 Go Dataset Projects

  • ChainWalker

    Rapid Smart Contract Crawler

  • dud

    A lightweight CLI tool for versioning data alongside source code and building data pipelines.

  • Project mention: Ask HN: How do your ML teams version datasets and models? | news.ycombinator.com | 2023-09-28

    I've used DVC in the past and generally liked its approach. That said, I wholeheartedly agree that it's clunky. It does a lot of things implicitly, which can make it hard to reason about. It was also extremely slow for medium-sized datasets (low 10s of GBs).

    In response, I created a command-line tool that addresses these issues[0]. To reduce the comparison to an analogy: Dud : DVC :: Flask : Django.

    [0]: https://github.com/kevin-hanselman/dud

  • ndn-sync

    ndn-sync: A Go Library for NDN Distributed Dataset Synchronization "Sync" Protocols.

  • tinygpkg-data

    Small geographic datasets based on open data + tools

  • Project mention: Protomaps – A free and open source map of the world | news.ycombinator.com | 2023-10-23

    SVG is kind-of terrible for maps, but you can get pretty small with GeoPackage (read: sqlite). I recently spent a bit too long on exactly this problem and ended up with the following.

    116KB - 5MB for country borders

    16MB - 52MB for ~50K city/county level borders based on geoBoundaries

    The range of sizes depends on how much custom compression/simplification you put into it. The source files are about 10x bigger, but that's already pretty small.

    Topojson might be even smaller though.

    Check the repo for details /selfplug https://github.com/SmilyOrg/tinygpkg-data

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Go Dataset related posts

  • 🐂 🌾 Oxen.ai - Blazing Fast Unstructured Data Version Control, built in Rust

    5 projects | /r/rust | 16 Feb 2023
  • Tup – an instrumenting file-based build system

    4 projects | news.ycombinator.com | 15 Aug 2022
  • Alternative to Git LFS or DVC

    1 project | /r/CKsTechNews | 3 Jan 2022
  • Show HN: A small and simple alternative to Git LFS or DVC

    1 project | news.ycombinator.com | 3 Jan 2022
  • Dud: a lightweight tool for versioning data alongside source code and building data pipelines.

    1 project | /r/datascience | 6 Sep 2021
  • Dud: a tool for versioning data alongside source code. A faster and simpler alternative to DVC.

    1 project | /r/mlops | 21 Jun 2021

Index

What are some of the best open-source Dataset projects in Go? This list will help you:

#  Project        Stars
1  ChainWalker    193
2  dud            166
3  ndn-sync       3
4  tinygpkg-data  2
