Top 23 dbt Open-Source Projects
-
Mage
🧙 The modern replacement for Airflow. Mage is an open-source data pipeline tool for transforming and integrating data. https://github.com/mage-ai/mage-ai
-
OpenMetadata
Open Standard for Metadata. A Single place to Discover, Collaborate and Get your data right.
-
evidence
Business intelligence as code: build fast, interactive data visualizations in pure SQL and markdown
-
soda-core
⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
-
elementary
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
-
kuwala
Kuwala is the no-code data platform for BI analysts and engineers enabling you to build powerful analytics workflows. We are set out to bring state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations together in one intuitive interface built with React Flow. In addition we provide third-party data into data science models and products with a focus on geospatial data. Currently, the following data connectors are available worldwide: a) High-resolution demograp
-
multiwoven
🔥 Open Source Reverse ETL and Customer Data Platform (CDP). An open-source alternative to Hightouch, Census, and RudderStack.
-
streamify
A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!
-
automate-dv
A free to use dbt package for creating and loading Data Vault 2.0 compliant Data Warehouses (powered by dbt, an open source data engineering tool, registered trademark of dbt Labs)
-
astronomer-cosmos
Run your dbt Core projects as Apache Airflow DAGs and Task Groups with a few lines of code
References: Data engineering zoomcamp week 6 course and homework notes: https://github.com/DataTalksClub/data-engineering-zoomcamp/tree/main/cohorts/2024/06-streaming
Project mention: Variant in Apache Doris 2.1.0: a new data type 8 times faster than JSON for semi-structured data analysis | dev.to | 2024-03-27
As an open-source real-time data warehouse, Apache Doris provides semi-structured data processing capabilities, and the newly released version 2.1.0 makes a stride in this direction. Before V2.1, Apache Doris stored semi-structured data as JSON files. However, during query execution, the real-time parsing of JSON data leads to high CPU and I/O consumption in addition to high query latency, especially when the dataset is huge and complicated. Moreover, the lack of a pre-defined schema means there is no handle for query optimization.
Project mention: How to Dynamically Adjust the Height of a Textarea in ReactJS | dev.to | 2023-10-25
In this blog post, I have demonstrated how I addressed the challenge of dynamically adjusting the height of a textarea element based on its content, preventing the need for vertical scrolling in the title section of the OpenMetadata Knowledge article page.
> YAML, pivoting being done in the frontend, no symmetric aggregates
(one of the maintainers of Lightdash) You touched on some of our most interesting problems here! Would be especially interested to hear about what you liked / didn't like about symmetric aggregates in Looker and how you find dev with YAML. If you have an idea of how you'd like these to look in Lightdash, the team would be really open to making that a reality.
For pivoting in the backend, this is coming! Issue here: https://github.com/lightdash/lightdash/issues/2907
Project mention: SQLPage – Building a full web application with nothing but SQL queries [video] | news.ycombinator.com | 2024-03-11
It’s interesting to me how far you have pushed the SQL language in this framework, such that it truly is “SQL only”.
The challenge as I see it with enabling analysts to build websites is that you need to build abstractions to get from familiar (SQL, yaml) - the language of analytics, to new (HTML, CSS, JS) - the language of the web browser
As one of the maintainers of Evidence (https://evidence.dev), one of the things I’ve often considered is how accessible our syntax is to analysts. Our syntax combines SQL and Markdown, with MDX style components e.g.
These are inherently webdev-ey, and I do think they put off potential users.
On the flip-side, by adhering to web standards, you get extensibility out of the box, and working out what to do is just a Google search away.
Anyway, thanks for the thought provoking piece.
If the issue happens a lot, there is also: https://github.com/datafold/data-diff
That is a nice tool to do it cross database as well.
I think it's based on a checksum method.
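To make the checksum idea concrete, here is a minimal Python sketch of one way such a comparison could work. It is not taken from data-diff's code: the function names (`checksum`, `diff_keys`), the use of MD5, and the in-memory dicts standing in for two database tables are all assumptions made for illustration. The idea is to hash a whole key range on both sides, and only split the range and look closer when the hashes disagree.

```python
import hashlib


def _row(table, key):
    # Represent a row as (key, present?, value) so a missing row differs
    # from a row whose value happens to be None.
    return (key, key in table, table.get(key))


def checksum(rows):
    """Return one MD5 hex digest over a sequence of rows, in order."""
    digest = hashlib.md5()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


def diff_keys(table_a, table_b, keys=None, min_segment=2):
    """Return the sorted keys whose rows differ between two tables.

    table_a and table_b are dicts mapping a primary key to a row value
    (hypothetical stand-ins for tables in two different databases).
    """
    if keys is None:
        keys = sorted(set(table_a) | set(table_b))
    rows_a = [_row(table_a, k) for k in keys]
    rows_b = [_row(table_b, k) for k in keys]
    # Matching checksums: treat the whole segment as identical.
    if checksum(rows_a) == checksum(rows_b):
        return []
    # Small segment: compare row by row.
    if len(keys) <= max(min_segment, 1):
        return [k for k in keys if _row(table_a, k) != _row(table_b, k)]
    # Otherwise bisect the key range and recurse into each half.
    mid = len(keys) // 2
    return (diff_keys(table_a, table_b, keys[:mid], min_segment)
            + diff_keys(table_a, table_b, keys[mid:], min_segment))


source = {1: "x", 2: "y", 3: "z", 4: "w", 5: "v"}
target = {1: "x", 2: "Y", 3: "z", 5: "v", 6: "u"}
print(diff_keys(source, target))  # [2, 4, 6]
```

In a real cross-database setting the per-segment checksum would be computed inside each database, so only the digests travel over the network, not the rows.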
Project mention: Launch HN: Serra (YC S23) – Open-source, Python-based dbt alternative | news.ycombinator.com | 2023-08-14
There is also sqlmesh (https://sqlmesh.com/). Pretty new as well. It introduces some interesting concepts. For smaller dbt projects it could be a drop-in replacement as it allows importing dbt projects.
Have not used Soda, but dbt indeed is pretty good especially when adding dbt-expectations
Project mention: Show HN: GeoSage – A ETL Webtool for Geo and Demographics Data from the Open Web | news.ycombinator.com | 2023-10-05
--> Google Trends Data for Regions (Coming Soon)
The tool goes beyond our previously published CLI tool (https://github.com/kuwala-io/kuwala/tree/master/kuwala) by providing a hostable solution with a user-friendly interface. We have not open-sourced it yet but a demo is available here: https://geosage.kuwala.io/.
Urban planners can utilize movement data to analyze foot traffic in different city zones. Marketers can leverage demographic data to tailor campaigns more effectively. Developers can build their apps on top of it.
To round it up, GeoSage brings:
Unified Data Management: Access data from OSM, Facebook, and soon Google, all in one place.
Project mention: Multiwoven Reverse ETL (0.2.0) – Open-Source Alternative to Hightouch and Census | news.ycombinator.com | 2024-04-19
Multiwoven is now a leading Open Source Alternative to Hightouch, Census, and Rudderstack.
It's been a great journey so far, and we are excited to announce a major update to Multiwoven - our new release, Multiwoven 0.2.0, is now available!
Repo: https://github.com/Multiwoven/multiwoven
This release brings a host of new features, enhancements, and bug fixes to streamline data syncs and user experience.
From new connectors to advanced reporting dashboards, as a team, we have been working hard on these updates based on the feedback and requests from our customers and the community.
- 10+ new connectors added to Multiwoven, including
Project mention: Show HN: PipeRider – open-source Data Impact Analysis for dbt changes | news.ycombinator.com | 2023-09-06
Project mention: Run dbt projects as Apache Airflow DAGs and Task Groups with a few lines of code | news.ycombinator.com | 2023-05-01
dbt related posts
- Data Engineering Zoomcamp Week 6 - using redpanda 1
- Final project part 5
- Building a project in DBT
- Testing and documenting DBT models
- Extracting data with dlt
- Data engineering at home?
- Rockstar Data Engineers making big bucks: what are you doing exactly?
Index
What are some of the best open-source dbt projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | data-engineering-zoomcamp | 22,446 |
2 | doris | 11,314 |
3 | Mage | 7,001 |
4 | OpenMetadata | 4,140 |
5 | lightdash | 3,399 |
6 | evidence | 3,320 |
7 | data-diff | 2,842 |
8 | soda-core | 1,751 |
9 | elementary | 1,739 |
10 | re_data | 1,521 |
11 | sqlmesh | 1,249 |
12 | dbt-expectations | 939 |
13 | awesome-dbt | 917 |
14 | kuwala | 755 |
15 | dbt-duckdb | 729 |
16 | multiwoven | 617 |
17 | streamify | 474 |
18 | piperider | 467 |
19 | automate-dv | 456 |
20 | astronomer-cosmos | 449 |
21 | dbt-metabase | 425 |
22 | faros-community-edition | 403 |
23 | airflow-dbt | 379 |