Top 4 Go Big Data Projects
There are a couple of other contenders in this space. DVC (https://dvc.org/) seems most similar.
If you're interested in something you can self-host... I work on Pachyderm (https://github.com/pachyderm/pachyderm), which doesn't have a Git-like interface, but also implements data versioning. Our approach de-duplicates between files (even very small files), and our storage algorithm doesn't create a number of objects proportional to directory nesting depth, as Xet's appears to. (Xet is very much like Git in that respect.)
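To make the idea of de-duplicating between files concrete, here is a minimal sketch of a content-addressed chunk store. It is not Pachyderm's storage algorithm, which the comment does not describe. The `ChunkStore` type, its methods, and the fixed-size chunking are all assumptions chosen to keep the example short.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// ChunkStore is a hypothetical in-memory store that keeps each distinct
// chunk once, keyed by the SHA-256 hash of its contents.
type ChunkStore struct {
	chunkSize int
	chunks    map[[32]byte][]byte
}

// NewChunkStore returns a store that splits data into fixed-size chunks.
// Fixed-size chunking is an assumption made for simplicity.
func NewChunkStore(chunkSize int) *ChunkStore {
	if chunkSize < 1 {
		chunkSize = 1
	}
	return &ChunkStore{chunkSize: chunkSize, chunks: make(map[[32]byte][]byte)}
}

// Put splits data into chunks, stores only the chunks it has not seen
// before, and returns the ordered chunk hashes that describe the file.
func (s *ChunkStore) Put(data []byte) [][32]byte {
	var refs [][32]byte
	for start := 0; start < len(data); start += s.chunkSize {
		end := start + s.chunkSize
		if end > len(data) {
			end = len(data)
		}
		chunk := data[start:end]
		h := sha256.Sum256(chunk)
		if _, ok := s.chunks[h]; !ok {
			s.chunks[h] = append([]byte(nil), chunk...)
		}
		refs = append(refs, h)
	}
	return refs
}

// Get reassembles a file from its ordered chunk hashes.
func (s *ChunkStore) Get(refs [][32]byte) []byte {
	var out []byte
	for _, h := range refs {
		out = append(out, s.chunks[h]...)
	}
	return out
}

// Len reports how many distinct chunks are stored.
func (s *ChunkStore) Len() int { return len(s.chunks) }

func main() {
	store := NewChunkStore(4)
	a := store.Put([]byte("aaaabbbbcccc"))
	b := store.Put([]byte("aaaabbbbdddd"))
	// Two files of three chunks each share two chunks, so four are stored.
	fmt.Println(len(a), len(b), store.Len()) // 3 3 4
}
```

Because chunks are keyed by content hash, two files that share data also share stored chunks, regardless of where the files sit in a directory tree.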
The data versioning system enables us to run pipelines based on changes to your data; the pipelines declare what files they read, and that allows us to schedule processing jobs that only reprocess new or changed data, while still giving you a full view of what "would" have happened if all the data had been reprocessed. This, to me, is the key advantage of data versioning; you can save hundreds of thousands of dollars on compute. Being able to undo an oopsie is just icing on the cake.
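The scheduling idea can be sketched roughly as follows: compare two versions of the data, keep only the files a pipeline declares as input, and hand just the new or changed ones to a job. This is an illustrative sketch, not Pachyderm's API. The `Commit` type, `NewCommit`, `ChangedInputs`, and the use of a glob pattern to declare inputs are all assumptions.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"path"
	"sort"
)

// Commit is a hypothetical snapshot of the data: file path -> content hash.
type Commit map[string][32]byte

// NewCommit builds a Commit from in-memory file contents.
func NewCommit(files map[string]string) Commit {
	c := make(Commit, len(files))
	for p, data := range files {
		c[p] = sha256.Sum256([]byte(data))
	}
	return c
}

// ChangedInputs returns, in sorted order, the paths in next that match the
// pipeline's declared input pattern and are new or modified relative to prev.
func ChangedInputs(prev, next Commit, pattern string) ([]string, error) {
	var changed []string
	for p, h := range next {
		ok, err := path.Match(pattern, p)
		if err != nil {
			return nil, err
		}
		if !ok {
			continue // the pipeline does not read this file
		}
		if old, seen := prev[p]; !seen || old != h {
			changed = append(changed, p)
		}
	}
	sort.Strings(changed)
	return changed, nil
}

func main() {
	prev := NewCommit(map[string]string{"a.csv": "1,2", "b.csv": "3,4", "notes.txt": "x"})
	next := NewCommit(map[string]string{"a.csv": "1,2", "b.csv": "3,5", "c.csv": "6", "notes.txt": "y"})

	todo, err := ChangedInputs(prev, next, "*.csv")
	if err != nil {
		panic(err)
	}
	// Only the modified and the new CSV files need a processing job.
	fmt.Println(todo) // [b.csv c.csv]
}
```

Here `a.csv` is skipped because it is unchanged, and `notes.txt` is skipped because the pipeline never declared it as an input, so neither costs any compute.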
Xet's system for mounting a remote repo as a filesystem is a good idea. We do that too :)
lake
DevLake: the open-source dev data platform & dashboard for your DevOps tools. *Note*: We have moved to the Apache Software Foundation: https://github.com/apache/incubator-devlake.
Go Big Data related posts
- pachyderm: Data-Centric Pipelines and Data Versioning
- "There's a Dashboard for That" New Open Source Template for Bug Tracking
- parquet-tools
- Jinja2 not formatting my text correctly. Any advice?
- We Built an Open Source DevOps Dashboard with Go (mostly!)
- Open Source & Free Dev Dashboard (900+ GitHub Stars in First Week!)
- Just Launched (900 GitHub Stars in One Week) Free Dev Dashboard!
A note from our sponsor - SonarQube
www.sonarqube.org | 6 Jun 2023
Index
What are some of the best open-source Big Data projects in Go? This list will help you:
# | Project | Stars
---|---|---
1 | pachyderm | 5,919
2 | hazelcast-go-client | 176
3 | lake | 96
4 | rtdl | 41