Nessie: Transactional Catalog for Data Lakes with Git-like semantics
The Dremio Sonar query engine can query your data where it lives, whether that is AWS Glue, S3, Nessie catalogs, MySQL, Postgres, Redshift, or any of an ever-growing list of sources.
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs (by delta-io)
You can query data organized in many open table formats, such as Apache Iceberg and Delta Lake. (Here is a good article on what a table format is and how the different formats compare.)
The Evolution of the Data Engineer Role
2 projects | news.ycombinator.com | 24 Oct 2022
Delta 2.0 - The Foundation of your Data Lakehouse is Open
2 projects | reddit.com/r/apachespark | 5 Aug 2022
How do we bridge SQL and Python.
1 project | reddit.com/r/datascience | 29 Jul 2022
Performance aside, what's the difference between Iceberg, Hudi & Delta
1 project | reddit.com/r/apachespark | 5 Jul 2022
[D] How do you share big datasets with your team and others?
1 project | reddit.com/r/MachineLearning | 4 Jul 2022