Similar projects and alternatives to lakeFS
🦉Data Version Control | Git for Data & Models | ML Experiments Management
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs (by delta-io)
Git extension for versioning large files
Successor: https://github.com/fluxcd/helm-controller — The Flux Helm Operator, once upon a time a solution for declarative Helming.
Next-gen identity server (think Auth0, Okta, Firebase) with Ory-hardened authentication, MFA, FIDO2, TOTP, WebAuthn, profile management, identity schemas, social sign in, registration, account recovery, passwordless. Golang, headless, API-only - without templating or theming headaches. Available as a cloud service. (by ory)
Disk Usage/Free Utility - a better 'df' alternative
HTTP load generator, ApacheBench (ab) replacement
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
An HTTP toolkit for security research.
A MySQL-compatible relational database with a storage agnostic query engine. Implemented in pure Go.
An extremely fast bundler for the web
Dolt – Git for Data
Open source platform for the machine learning lifecycle
Fast and Simple Serverless Functions for Kubernetes
Concourse is a container-based continuous thing-doer written in Go.
Kubebuilder - SDK for building Kubernetes APIs using CRDs
A horizontally scalable, highly available, multi-tenant, long term Prometheus. (by cortexproject)
Create beautiful system diagrams with Go
Enhancements tracking repo for Kubernetes
Discord Bot to automute Among Us players at round transitions, in conjunction with https://github.com/automuteus/amonguscapture
lakeFS reviews and mentions
Dolt Is Git for Data
5 projects | news.ycombinator.com | 23 Jun 2022
Also in the same vein, check out https://lakefs.io/
[P] ArtiV: Version control system for large files
2 projects | reddit.com/r/MachineLearning | 8 Mar 2022
Data Science Workflows — Notebook to Production
7 projects | dev.to | 8 Feb 2022
Git was designed for managing software development projects and for versioning text and code files, so it does not handle large files well. Git LFS (Large File Storage) was introduced to address large-file versioning; it improves on plain Git but still struggles at scale. Neither Git nor Git LFS is optimized for data science workflows. To overcome this challenge, many powerful tools have emerged in recent years, such as DVC, Delta Lake, and lakeFS.
Unstructured Data Governance for ML
4 projects | reddit.com/r/dataengineering | 31 Dec 2021
LakeFS Turns 1 and Raises 15M in a Week: (Enable Git for Large-Scale Data Lakes)
2 projects | news.ycombinator.com | 8 Aug 2021
We're Oz and Einat, co-founders of lakeFS (https://lakefs.io/), an open-source project that allows the creation of performant Git-like repositories over an object store (e.g. S3).
Prior to starting lakeFS we were VP of R&D and CTO at SimilarWeb, a (now-public) Israeli web analytics company whose business model is based on the collection and analysis of the internet's activity.
Recovering from a pernicious error in a million S3 files shouldn't require a full day or even week of work to fix… instead let's make it an instantaneous revert operation to a previous commit.
The challenge of implementing this type of functionality is a technical one, and one we took upon ourselves to solve. It's been one year since the first public commit on lakeFS, and we've now raised a $15M Series A to continue building and improving our vision.
We've evolved a ton in the past year, completely refactoring the data model to remove the dependency on Postgres. Fittingly, we now use RocksDB on the object store to persist the metadata lakeFS manages (with the added benefit of simplifying the installation process). Check out the roadmap to follow our progress on building out native integrations with other important technologies in the open data stack such as Spark, Hive Metastore, and Delta Lake.
We encourage you to check out our GitHub repo (https://github.com/treeverse/lakeFS) and documentation pages (https://docs.lakefs.io/).
We're proud of how far we've come, but know there's lots more work to do. Please do let us know your thoughts!
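To illustrate why reverting to a previous commit can be near-instantaneous, here is a minimal toy sketch in Python. It is not the lakeFS API or its data model: the `ToyRepo` class and its methods are hypothetical names made up for this illustration, and the in-memory dictionaries stand in for an object store such as S3. It assumes only that a commit is an immutable mapping from paths to content hashes, so restoring an earlier state changes metadata and never copies object data.

```python
import hashlib


class ToyRepo:
    """Hypothetical in-memory model of a versioned object store (not the lakeFS API)."""

    def __init__(self):
        self.blobs = {}    # content hash -> bytes (stands in for objects in S3)
        self.commits = []  # each commit is an immutable {path: content hash} snapshot
        self.staged = {}   # current working mapping of path -> content hash
        self.head = None   # index of the current commit, or None before any commit

    def put(self, path, data):
        # Store the object under its content hash and point the path at it.
        digest = hashlib.sha256(data).hexdigest()
        self.blobs[digest] = data
        self.staged[path] = digest

    def commit(self):
        # Snapshot the path -> hash mapping; no object data is copied.
        self.commits.append(dict(self.staged))
        self.head = len(self.commits) - 1
        return self.head

    def get(self, path):
        return self.blobs[self.staged[path]]

    def revert_to(self, commit_id):
        # Restore an earlier snapshot: only the mapping changes, not the objects.
        self.staged = dict(self.commits[commit_id])
        self.head = commit_id


repo = ToyRepo()
repo.put("data/a.csv", b"good")
good = repo.commit()

repo.put("data/a.csv", b"corrupted")  # the "pernicious error"
repo.commit()

repo.revert_to(good)
print(repo.get("data/a.csv"))  # b'good'
```

In this sketch the cost of restoring the earlier snapshot depends on the size of the path mapping rather than on the size of the stored objects, which is the intuition behind treating recovery as a metadata operation.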
Gopher Gold #14 - Wed Oct 07 2020
22 projects | dev.to | 7 Oct 2020
treeverse/lakeFS (Go): An open source platform that delivers resilience and manageability to object-storage based data lakes
treeverse/lakeFS is an open source project licensed under the Apache License 2.0, which is an OSI-approved license.