LakeFS Turns 1 and Raises 15M in a Week: (Enable Git for Large-Scale Data Lakes)

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

Our great sponsors
  • WorkOS - The modern identity platform for B2B SaaS
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • SaaSHub - Software Alternatives and Reviews
  • lakeFS

    lakeFS - Data version control for your data lake | Git for data

  • Recovering from a pernicious error in a million S3 files shouldn't require a full day or even week of work to fix… instead let's make it an instantaneous revert operation to a previous commit.

    The challenge to implement this type of functionality is a technical one, one we took it upon ourselves to solve. It's been 1 year since the first public commit on lakeFS and we've now raised a $15M Series A to continue building and improving our vision.

    We've evolved a ton in the past year, completely refactoring the data model to remove dependency on Postgres. Fittingly, we now use rocksDB on the object store to persist the metadata lakeFS manages (with the added benefit of simplifying the installation process). Check out the roadmap to follow our progress on building out native integrations with other important technologies in the open data stack such as Spark, Hive Metastore, and Delta Lake.

    We encourage you to check out our Github repo: (https://github.com/treeverse/lakeFS) and documentation pages: (https://docs.lakefs.io/).

    We're proud of how far we've come, but know there's lots more work to do. Please do let us know your thoughts!

  • WorkOS

    The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.

    WorkOS logo
NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts