dbt

Top 23 dbt Open-Source Projects

  • data-engineering-zoomcamp

    Free Data Engineering course!

  • Project mention: Data Engineering Zoomcamp Week 6 - using redpanda 1 | dev.to | 2024-04-09

    References: Data engineering zoomcamp week 6 course and homework notes: https://github.com/DataTalksClub/data-engineering-zoomcamp/tree/main/cohorts/2024/06-streaming

  • doris

    Apache Doris is an easy-to-use, high performance and unified analytics database.

  • Project mention: Variant in Apache Doris 2.1.0: a new data type 8 times faster than JSON for semi-structured data analysis | dev.to | 2024-03-27

    As an open-source real-time data warehouse, Apache Doris provides semi-structured data processing capabilities, and the newly-released version 2.1.0 makes a stride in this direction. Before V2.1, Apache Doris stored semi-structured data as JSON files. However, during query execution, the real-time parsing of JSON data leads to high CPU and I/O consumption in addition to high query latency, especially when the dataset is huge and complicated. Moreover, the lack of a pre-defined schema means there is no handle for query optimization.

  • Mage

    🧙 The modern replacement for Airflow. Mage is an open-source data pipeline tool for transforming and integrating data. https://github.com/mage-ai/mage-ai

  • Project mention: FLaNK AI-April 22, 2024 | dev.to | 2024-04-22
  • OpenMetadata

    Open Standard for Metadata. A Single place to Discover, Collaborate and Get your data right.

  • Project mention: How to Dynamically Adjust the Height of a Textarea in ReactJS | dev.to | 2023-10-25

    In this blog post, I have demonstrated how I addressed the challenge of dynamically adjusting the height of a textarea element based on its content, preventing the need for vertical scrolling in the title section of the OpenMetadata Knowledge article page.

  • lightdash

    Self-serve BI to 10x your data team ⚡️

  • Project mention: Apache Superset | news.ycombinator.com | 2024-02-26

    > YAML, pivoting being done in the frontend, no symmetric aggregates

    (one of the maintainers of Lightdash) You touched on some of our most interesting problems here! Would be especially interested to hear about what you liked / didn't like about symmetric aggregates in Looker and how you find dev with YAML. If you have an idea of how you'd like these to look in Lightdash, the team would be really open to making that a reality.

    For pivoting in the backend, this is coming! Issue here: https://github.com/lightdash/lightdash/issues/2907

  • evidence

    Business intelligence as code: build fast, interactive data visualizations in pure SQL and markdown

  • Project mention: SQLPage – Building a full web application with nothing but SQL queries [video] | news.ycombinator.com | 2024-03-11

    It’s interesting to me how far you have pushed the SQL language in this framework, such that it truly is “SQL only”.

    The challenge as I see it with enabling analysts to build websites is that you need to build abstractions to get from familiar (SQL, yaml) - the language of analytics, to new (HTML, CSS, JS) - the language of the web browser

    As one of the maintainers of Evidence (https://evidence.dev), one of the things I’ve often considered is how accessible our syntax is to analysts. Our syntax combines SQL and Markdown, with MDX style components e.g.

    These are inherently webdev-ey, and I do think they put off potential users.

    On the flip-side, by adhering to web standards, you get extensibility out of the box, and working out what to do is just a Google search away.

    Anyway, thanks for the thought provoking piece.

  • data-diff

    Compare tables within or across databases

  • Project mention: How to Check 2 SQL Tables Are the Same | news.ycombinator.com | 2023-07-26

    If the issue happens a lot, there is also: https://github.com/datafold/data-diff

    That is a nice tool to do it cross database as well.

    I think it's based on checksum method.
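
The checksum idea mentioned in this comment can be illustrated with a minimal sketch. This is not data-diff's actual implementation: the function names `table_checksum` and `tables_match` are hypothetical, and the example assumes two SQLite databases reachable through Python's built-in `sqlite3` module. Each row is hashed, and the row hashes are summed modulo 2^256 so the result does not depend on row order.

```python
import hashlib
import sqlite3

MODULUS = 2 ** 256  # SHA-256 digests are 256 bits wide


def table_checksum(conn, table):
    """Return (row_count, checksum) for every row in `table`.

    Hypothetical helper for illustration only. The table name is
    interpolated directly into the query, so it must come from trusted code.
    """
    total = 0
    count = 0
    for row in conn.execute(f'SELECT * FROM "{table}"'):
        # Hash the textual form of the row, then fold it into a running sum.
        row_hash = hashlib.sha256(repr(row).encode("utf-8")).digest()
        total = (total + int.from_bytes(row_hash, "big")) % MODULUS
        count += 1
    return count, total


def tables_match(conn_a, table_a, conn_b, table_b):
    """Compare two tables by row count and order-independent checksum."""
    return table_checksum(conn_a, table_a) == table_checksum(conn_b, table_b)
```

Calling `tables_match(conn_a, "orders", conn_b, "orders")` with two open connections returns `True` when both tables hold the same rows in any order. A summed checksum only says whether the tables differ, not where; locating the differing rows would need checksums over smaller key ranges.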

  • soda-core

    ⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io

  • elementary

    The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.

  • re_data

    re_data - fix data issues before your users & CEO would discover them 😊

  • sqlmesh

    Efficient data transformation and modeling framework that is backwards compatible with dbt.

  • Project mention: Launch HN: Serra (YC S23) – Open-source, Python-based dbt alternative | news.ycombinator.com | 2023-08-14

    There is also sqlmesh (https://sqlmesh.com/). Pretty new as well. It introduces some interesting concepts. For smaller dbt projects it could be a drop-in replacement as it allows importing dbt projects.

  • dbt-expectations

    Port(ish) of Great Expectations to dbt test macros

  • Project mention: Dbt tests vs Soda SQL | /r/dataengineering | 2023-05-26

    Have not used Soda, but dbt indeed is pretty good especially when adding dbt-expectations

  • awesome-dbt

    A curated list of awesome dbt resources

  • Project mention: Good list of dbt tools/resources | /r/dataengineering | 2023-12-05
  • kuwala

    Kuwala is the no-code data platform for BI analysts and engineers enabling you to build powerful analytics workflows. We are set out to bring state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations together in one intuitive interface built with React Flow. In addition we provide third-party data into data science models and products with a focus on geospatial data. Currently, the following data connectors are available worldwide: a) High-resolution demograp

  • Project mention: Show HN: GeoSage – A ETL Webtool for Geo and Demographics Data from the Open Web | news.ycombinator.com | 2023-10-05

    --> Google Trends Data for Regions (Coming Soon)

    The tool goes beyond our previously published CLI tool (https://github.com/kuwala-io/kuwala/tree/master/kuwala) by providing a hostable solution with a user-friendly interface. We have not open-sourced it yet but a demo is available here: https://geosage.kuwala.io/.

    Urban planners can utilize movement data to analyze foot traffic in different city zones. Marketers can leverage demographic data to tailor campaigns more effectively. Developers can build their apps on top of it.

    To round it up .... GeoSage brings...

    Unified Data Management: Access data from OSM, Facebook, and soon Google, all in one place.

  • dbt-duckdb

    dbt (http://getdbt.com) adapter for DuckDB (http://duckdb.org)

  • multiwoven

    🔥 Open Source Reverse ETL and Customer Data Platform (CDP). An open-source alternative to Hightouch, Census, and RudderStack.

  • Project mention: Multiwoven Reverse ETL (0.2.0) – Open-Source Alternative to Hightouch and Census | news.ycombinator.com | 2024-04-19

    Multiwoven is now a leading Open Source Alternative to Hightouch, Census, and Rudderstack.

    It's been a great journey so far, and we are excited to announce a major update to Multiwoven - our new release, Multiwoven 0.2.0, is now available!

    Repo: https://github.com/Multiwoven/multiwoven

    This release brings a host of new features, enhancements, and bug fixes to streamline data syncs and user experience.

    From new connectors to advanced reporting dashboards, as a team, we have been working hard on these updates based on the feedback and requests from our customers and the community.

    - 10+ new connectors added to Multiwoven, including

  • streamify

    A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!

  • piperider

    Code review for data in dbt

  • Project mention: Show HN: PipeRider – open-source Data Impact Analysis for dbt changes | news.ycombinator.com | 2023-09-06
  • automate-dv

    A free to use dbt package for creating and loading Data Vault 2.0 compliant Data Warehouses (powered by dbt, an open source data engineering tool, registered trademark of dbt Labs)

  • astronomer-cosmos

    Run your dbt Core projects as Apache Airflow DAGs and Task Groups with a few lines of code

  • Project mention: Run dbt projects as Apache Airflow DAGs and Task Groups with a few lines of code | news.ycombinator.com | 2023-05-01
  • dbt-metabase

    dbt + Metabase integration

  • faros-community-edition

    BI, API and Automation layer for your Engineering Operations data

  • airflow-dbt

    Apache Airflow integration for dbt

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

What are some of the best open-source dbt projects? This list will help you:

Rank Project Stars
1 data-engineering-zoomcamp 22,446
2 doris 11,314
3 Mage 7,001
4 OpenMetadata 4,140
5 lightdash 3,399
6 evidence 3,320
7 data-diff 2,842
8 soda-core 1,751
9 elementary 1,739
10 re_data 1,521
11 sqlmesh 1,249
12 dbt-expectations 939
13 awesome-dbt 917
14 kuwala 755
15 dbt-duckdb 729
16 multiwoven 617
17 streamify 474
18 piperider 467
19 automate-dv 456
20 astronomer-cosmos 449
21 dbt-metabase 425
22 faros-community-edition 403
23 airflow-dbt 379
