go-mysql-server
lakeFS
| | go-mysql-server | lakeFS |
|---|---|---|
| Mentions | 21 | 48 |
| Stars | 1,270 | 3,798 |
| Growth | 2.8% | 2.4% |
| Activity | 0.0 | 9.8 |
| Last commit | about 20 hours ago | 6 days ago |
| Language | Go | Go |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
go-mysql-server
- I created an in-memory SQL database called MemSQL as a learning project
Might be interested in https://github.com/dolthub/go-mysql-server, which also does this
- Implementing the MySQL server protocol for fun and profit
https://github.com/dolthub/go-mysql-server
One item under "Scope of this project":
Provide a runnable server speaking the MySQL wire protocol, connected to data sources of your choice.
- MySQL-mimic - Python implementation of the MySQL server wire protocol.
- Parsing SQL
- Litetree – SQLite with Branches
I just wanted to say thanks for https://github.com/dolthub/go-mysql-server
This is incredibly useful for anyone who wants to build their own DB or wrap another datasource so it's queryable via MySQL protocol.
- Dolt Is Git for Data
a very cool project they also maintain is a MySQL server framework for arbitrary backends (in Go): https://github.com/dolthub/go-mysql-server
You can define a "virtual" table (schema, how to retrieve rows/columns) and then a MySQL client can connect and execute arbitrary queries on your table (which could just be an API or other source)
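To make the "virtual table" idea above concrete, here is a minimal, self-contained Go sketch. It does not use go-mysql-server itself: the `VirtualTable` interface, the `Column` and `Row` types, and the `SelectColumn` and `newUsersTable` helpers are all hypothetical names invented for illustration, and the real library defines its own interfaces. The sketch only shows the shape of the idea: a table is a schema plus a way to fetch rows, and a query layer can answer a simple projection without knowing where the rows come from.

```go
package main

import "fmt"

// Column describes one column of a virtual table.
// Hypothetical type for illustration, not the go-mysql-server API.
type Column struct {
	Name string
	Type string
}

// Row is one record; values are ordered like the schema's columns.
type Row []any

// VirtualTable is the assumed contract: a name, a schema, and a row source.
type VirtualTable interface {
	Name() string
	Schema() []Column
	Rows() ([]Row, error)
}

// funcTable serves rows from a fetch function, which stands in for
// an API call or any other data source.
type funcTable struct {
	name   string
	schema []Column
	fetch  func() ([]Row, error)
}

func (t funcTable) Name() string         { return t.name }
func (t funcTable) Schema() []Column     { return t.schema }
func (t funcTable) Rows() ([]Row, error) { return t.fetch() }

// SelectColumn returns every value of one column, roughly what a query
// layer would do for `SELECT col FROM table`.
func SelectColumn(t VirtualTable, col string) ([]any, error) {
	idx := -1
	for i, c := range t.Schema() {
		if c.Name == col {
			idx = i
			break
		}
	}
	if idx < 0 {
		return nil, fmt.Errorf("unknown column %q in table %q", col, t.Name())
	}
	rows, err := t.Rows()
	if err != nil {
		return nil, err
	}
	out := make([]any, 0, len(rows))
	for _, r := range rows {
		if idx >= len(r) {
			return nil, fmt.Errorf("row has %d values, schema needs at least %d", len(r), idx+1)
		}
		out = append(out, r[idx])
	}
	return out, nil
}

// newUsersTable builds a small example table backed by hard-coded rows.
func newUsersTable() VirtualTable {
	return funcTable{
		name:   "users",
		schema: []Column{{Name: "id", Type: "INT"}, {Name: "name", Type: "TEXT"}},
		fetch: func() ([]Row, error) {
			return []Row{{1, "ada"}, {2, "grace"}}, nil
		},
	}
}

func main() {
	names, err := SelectColumn(newUsersTable(), "name")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(names)
}
```

In a real server, the query layer would parse SQL arriving over the MySQL wire protocol and call into the table the same way; swapping the hard-coded `fetch` function for an HTTP call is what makes the table "virtual".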
- The world of PostgreSQL wire compatibility
Thanks for this write up! I've been really interested in postgres compatibility in the context of a tool I maintain (https://github.com/mergestat/mergestat) that uses SQLite. I've been looking for a way to expose the SQLite capabilities over a more commonly used wire-protocol like postgres (or mysql) so that existing BI and visualization tools can access the data.
This project is an interesting one: https://github.com/dolthub/go-mysql-server. It provides a MySQL interface (wire and SQL) to arbitrary "backends" implemented in Go.
It's really interesting how compatibility with existing protocols has become an important feature of new databases. There's so much existing tooling that already speaks postgres (or mysql), and being able to leverage that is a huge advantage IMO.
- calling Format() on a time struct in a golang program changes the default Location's timezone information in the rest of the program
- Let's write a compiler, part 5: A code generator
lakeFS
- Jujutsu: A Git-compatible DVCS that is both simple and powerful
Might want to look at purpose built tools for that such as lakeFS (https://github.com/treeverse/lakeFS/)
* Disclaimer: I'm one of the creators/maintainers of the project.
- Data diffs: Algorithms for explaining what changed in a dataset (2022)
Might want to check out lakeFS: https://github.com/treeverse/lakeFS
(full disclosure: I'm one of the creators)
- Dolt Is Git for Data
Also in the same vein, check out https://lakefs.io/
- [P] ArtiV: Version control system for large files
- Data Science Workflows - Notebook to Production
Git was designed for managing software development projects and for versioning text/code files. As a result, Git doesn’t handle large files well. Git LFS (Large File Storage) was released to address large-file versioning; it handles such files better than plain Git but struggles at scale. Also, neither Git nor Git LFS is optimized for data science workflows. To overcome this challenge, many powerful tools have emerged in recent years, such as DVC, Delta Lake, LakeFS, and more.
- Unstructured Data Governance for ML
LakeFS: https://lakefs.io/
- LakeFS Turns 1 and Raises 15M in a Week: (Enable Git for Large-Scale Data Lakes)
Hello HN!
We're Oz and Einat, co-founders of lakeFS (https://lakefs.io/), an open-source project that allows the creation of performant git-like repositories over an object store (e.g. S3).
Prior to starting lakeFS we were VP of R&D and CTO at SimilarWeb, a (now-public) Israeli web analytics company whose business model is based on the collection and analysis of the internet's activity.
Recovering from a pernicious error in a million S3 files shouldn't require a full day or even week of work to fix… instead let's make it an instantaneous revert operation to a previous commit.
Implementing this type of functionality is a technical challenge, one we took upon ourselves to solve. It's been 1 year since the first public commit on lakeFS and we've now raised a $15M Series A to continue building and improving our vision.
We've evolved a ton in the past year, completely refactoring the data model to remove the dependency on Postgres. Fittingly, we now use RocksDB on the object store to persist the metadata lakeFS manages (with the added benefit of simplifying the installation process). Check out the roadmap to follow our progress on building out native integrations with other important technologies in the open data stack such as Spark, Hive Metastore, and Delta Lake.
We encourage you to check out our GitHub repo (https://github.com/treeverse/lakeFS) and documentation pages (https://docs.lakefs.io/).
We're proud of how far we've come, but know there's lots more work to do. Please do let us know your thoughts!
- Gopher Gold #14 - Wed Oct 07 2020
treeverse/lakeFS (Go): An open source platform that delivers resilience and manageability to object-storage based data lakes
What are some alternatives?
dvc - 🦉 ML Experiments Management with Git
vitess-sqlparser - simply SQL Parser for Go ( powered by vitess and TiDB )
delta - An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
git-lfs - Git extension for versioning large files
Ory Kratos - Next-gen identity server (think Auth0, Okta, Firebase) with Ory-hardened authentication, PassKeys, MFA, FIDO2, TOTP, WebAuthn, profile management, identity schemas, social sign in, registration, account recovery, passwordless. Golang, headless, API-only - without templating or theming headaches. Available as a cloud service.
MLflow - Open source platform for the machine learning lifecycle
alasql - AlaSQL.js - JavaScript SQL database for browser and Node.js. Handles both traditional relational tables and nested JSON data (NoSQL). Export, store, and import data from localStorage, IndexedDB, or Excel.
duf - Disk Usage/Free Utility - a better 'df' alternative
helm-operator - Successor: https://github.com/fluxcd/helm-controller — The Flux Helm Operator, once upon a time a solution for declarative Helming.
spark-on-k8s-operator - Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Hey - HTTP load generator, ApacheBench (ab) replacement
quilt - Quilt is a data mesh for connecting people with actionable data