Top 4 Go Dataset Projects
Project mention: Ask HN: How do your ML teams version datasets and models? | news.ycombinator.com | 2023-09-28
I've used DVC in the past and generally liked its approach. That said, I wholeheartedly agree that it's clunky. It does a lot of things implicitly, which can make it hard to reason about. It was also extremely slow for medium-sized datasets (low 10s of GBs).
In response, I created a command-line tool that addresses these issues[0]. To reduce the comparison to an analogy: Dud : DVC :: Flask : Django.
[0]: https://github.com/kevin-hanselman/dud
Project mention: Protomaps – A free and open source map of the world | news.ycombinator.com | 2023-10-23
SVG is kind-of terrible for maps, but you can get pretty small with GeoPackage (read: sqlite). I recently spent a bit too long on exactly this problem and ended up with the following.
116KB - 5MB for country borders
16MB - 52MB for ~50K city/county level borders based on geoBoundaries
The range of sizes depends on how much custom compression/simplification you put into it. The source files are about 10x bigger, but that's already pretty small.
Topojson might be even smaller though.
Check the repo for details /selfplug https://github.com/SmilyOrg/tinygpkg-data
Go Dataset related posts
- 🐂 🌾 Oxen.ai - Blazing Fast Unstructured Data Version Control, built in Rust
- Tup – an instrumenting file-based build system
- Alternative to Git LFS or DVC
- Show HN: A small and simple alternative to Git LFS or DVC
- Dud: a lightweight tool for versioning data alongside source code and building data pipelines.
- Dud: a tool for versioning data alongside source code. A faster and simpler alternative to DVC.
Index
What are some of the best open-source Dataset projects in Go? This list will help you:
# | Project | Stars |
---|---|---|
1 | ChainWalker | 193 |
2 | dud | 166 |
3 | ndn-sync | 3 |
4 | tinygpkg-data | 2 |