Databricks platform for small data, is it worth it?

This page summarizes the projects mentioned and recommended in the original post on reddit.com/r/dataengineering

  • delta

    An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs (by delta-io)

    Currently our infrastructure consists of some custom-made pipelines that load the data onto S3, and I use Delta tables here and there for their convenience: ACID transactions, time travel, merges, CDC, etc.

  • Rudderstack

    Privacy and Security focused Segment-alternative, in Golang and React

    Disclaimer: I work for this company. You should check out Rudderstack. It’s free for up to 5M API calls and it supports sending data to S3 or Databricks. I’m at the Databricks conference as I’m typing this.

  • dbt-duckdb

    dbt (http://getdbt.com) adapter for DuckDB (http://duckdb.org)

    I like the idea of using DuckDB + dbt-duckdb.

