lakeFS

lakeFS - Data version control for your data lake | Git for data (by treeverse)

lakeFS Alternatives

Similar projects and alternatives to lakeFS

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a better lakeFS alternative or higher similarity.

lakeFS reviews and mentions

Posts with mentions or reviews of lakeFS. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-06-23.
  • Dolt Is Git for Data
    5 projects | news.ycombinator.com | 23 Jun 2022
    Also in the same vein, check out https://lakefs.io/
  • [P] ArtiV: Version control system for large files
    2 projects | reddit.com/r/MachineLearning | 8 Mar 2022
  • Data Science Workflows — Notebook to Production
    7 projects | dev.to | 8 Feb 2022
    Git was designed for managing software development projects and for versioning text and code files, so it does not handle large files well. Git LFS (Large File Storage) was released to address large-file versioning; it is better than plain Git but still struggles at scale. Neither Git nor Git LFS is optimized for data science workflows. To overcome this challenge, many powerful tools have emerged in recent years, such as DVC, Delta Lake, lakeFS, and more.
  • Unstructured Data Governance for ML
    4 projects | reddit.com/r/dataengineering | 31 Dec 2021
    LakeFS: https://lakefs.io/
  • LakeFS Turns 1 and Raises 15M in a Week: (Enable Git for Large-Scale Data Lakes)
    2 projects | news.ycombinator.com | 8 Aug 2021
    Hello HN!

    We're Oz and Einat, co-founders of lakeFS (https://lakefs.io/), an open-source project that allows the creation of performant Git-like repositories over an object store (e.g., S3).

    Prior to starting lakeFS we were VP of R&D and CTO at SimilarWeb, a (now-public) Israeli web analytics company whose business model is based on the collection and analysis of the internet's activity.

    Recovering from a pernicious error in a million S3 files shouldn't require a full day or even week of work to fix… instead let's make it an instantaneous revert operation to a previous commit.

    Implementing this type of functionality is a technical challenge, and one we took it upon ourselves to solve. It's been one year since the first public commit on lakeFS, and we've now raised a $15M Series A to continue building and improving our vision.

    We've evolved a ton in the past year, completely refactoring the data model to remove the dependency on Postgres. Fittingly, we now use RocksDB on the object store to persist the metadata lakeFS manages (with the added benefit of simplifying the installation process). Check out the roadmap to follow our progress on building out native integrations with other important technologies in the open data stack, such as Spark, Hive Metastore, and Delta Lake.

    We encourage you to check out our GitHub repo (https://github.com/treeverse/lakeFS) and documentation pages (https://docs.lakefs.io/).

    We're proud of how far we've come, but know there's lots more work to do. Please do let us know your thoughts!
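
To illustrate why reverting to a previous commit can be near-instantaneous, here is a minimal sketch. It assumes a toy model in which each commit is an immutable mapping from logical paths to physical object addresses and a branch is just a pointer to a commit. It is not lakeFS's implementation or API; `ToyRepo`, `Commit`, `revert_to`, and the `s3://bucket/...` addresses are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Commit:
    """Immutable snapshot: maps logical paths to physical object addresses."""
    objects: Dict[str, str]
    parent: Optional["Commit"] = None


class ToyRepo:
    """Hypothetical toy model of a single branch; not the lakeFS API."""

    def __init__(self) -> None:
        self.head = Commit(objects={})

    def commit(self, changes: Dict[str, str]) -> Commit:
        # Only the path -> address mapping is copied; stored objects are never rewritten.
        merged = {**self.head.objects, **changes}
        self.head = Commit(objects=merged, parent=self.head)
        return self.head

    def revert_to(self, target: Commit) -> None:
        # Moving the branch pointer is the whole operation; no object data is touched.
        self.head = target

    def read(self, path: str) -> str:
        return self.head.objects[path]


repo = ToyRepo()
good = repo.commit({"events/part-0.parquet": "s3://bucket/data/a1"})
repo.commit({"events/part-0.parquet": "s3://bucket/data/bad"})  # the erroneous write
repo.revert_to(good)
print(repo.read("events/part-0.parquet"))
```

In this model the cost of the revert does not depend on how many files the bad commit touched, which is the property the post describes. A real system would also need durable metadata storage, concurrency control, and garbage collection of unreferenced objects.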

  • Gopher Gold #14 - Wed Oct 07 2020
    22 projects | dev.to | 7 Oct 2020
    treeverse/lakeFS (Go): An open source platform that delivers resilience and manageability to object-storage based data lakes

Stats

Basic lakeFS repo stats
44
3,307
9.4
1 day ago