Delta Lake file compaction / optimization of small files

This page summarizes the projects mentioned and recommended in the original post on reddit.com/r/apachespark

  • delta

    An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs (by delta-io)

    They have just announced the open-source version of OPTIMIZE and Z-Order in the 2.0.0rc1 release: https://github.com/delta-io/delta/releases/tag/v2.0.0rc1
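    For readers who want to try the compaction and Z-Order features from that release, here is a minimal sketch of how the corresponding SQL statements could be put together. The helper name `build_optimize_sql`, the path `/tmp/events`, and the columns `date` and `event_type` are hypothetical and only serve as illustration; the sketch assumes a Spark session already configured for Delta Lake 2.0 or later.

```python
from typing import Optional, Sequence


def build_optimize_sql(
    table: str,
    where: Optional[str] = None,
    zorder_by: Sequence[str] = (),
) -> str:
    """Build a Delta Lake OPTIMIZE statement as a string (hypothetical helper).

    `table` is a table name or a path reference such as delta.`/tmp/events`.
    """
    parts = [f"OPTIMIZE {table}"]
    if where:
        # Optional predicate; OPTIMIZE expects it to filter on partition columns.
        parts.append(f"WHERE {where}")
    if zorder_by:
        # Co-locate rows with similar values in these columns within the rewritten files.
        parts.append("ZORDER BY (" + ", ".join(zorder_by) + ")")
    return " ".join(parts)


# Plain compaction: rewrite many small files into fewer, larger ones.
compaction_sql = build_optimize_sql("delta.`/tmp/events`")

# Compaction limited to some partitions, with Z-Ordering on one column.
zorder_sql = build_optimize_sql(
    "delta.`/tmp/events`",
    where="date >= '2022-01-01'",
    zorder_by=["event_type"],
)

print(compaction_sql)
print(zorder_sql)
```

    Each string could then be run with `spark.sql(compaction_sql)`. The Python API in the same release should offer an equivalent builder, along the lines of `DeltaTable.forPath(spark, path).optimize().executeCompaction()` or `.optimize().executeZOrderBy("event_type")`; check the release notes for the exact signatures before relying on them.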

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives, so a higher number indicates a more popular project.
