Top 10 dbt Open-Source Projects
Data profiling, testing, and monitoring for SQL-accessible data.
Project mention: Being constantly shut down by more senior team members when I mention adding some QA in our work | reddit.com/r/dataengineering | 2022-01-10
As many have said, there might be a business side of things to deliver. Somebody above promised delivery with tight deadlines. Trust me, I am not a fan, but this is how the world works and it sucks. I would say in your free time, explore tools like Great Expectations (https://greatexpectations.io/) or https://github.com/sodadata/soda-sql, which are modern ways of testing, as part of your learning curve.
An open source alternative to Looker built using dbt. Made for analysts ❤️
Project mention: Launch HN: Metaplane (YC W20) – Datadog for Data | news.ycombinator.com | 2021-11-15
1) An integration with Metabase Cloud is on our roadmap for Q1! We'd love to integrate with Lightdash, but they don't have a public API just yet.
2) Several of our customers use us to alert on schema changes in Postgres, specifically so they can get ahead of application database changes that will end up in the warehouse, so you're definitely not alone! Here's a link on how to connect postgres: https://docs.metaplane.dev/docs/postgres
That's an excellent stack and one we kept front and center when building out Metaplane, so definitely let us know if you have any feedback or suggestions here!
Do more with dbt. fal helps you run Python alongside dbt, so you can send Slack alerts, detect anomalies and build machine learning models.
Project mention: Do I need orchestration for a Fivetran-dbt stack? | reddit.com/r/dataengineering | 2021-12-05
Yes, I agree with you that having Fivetran/Airbyte and dbt covers a lot of the Airflow use cases. That being said, you might still want to run some scripts after the dbt transformation is over. We ran into this exact problem and built a useful CLI tool for running Python scripts alongside the dbt run.
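The idea of running a script once the dbt transformation has finished can be sketched in plain Python. This is only an illustration of the ordering, not how fal works internally: `run_steps` is a hypothetical helper, and `notify_slack.py` is a made-up script name. The only real command assumed is `dbt run`.

```python
import subprocess
import sys


def run_steps(steps, runner=subprocess.run):
    """Run each command in order and stop at the first non-zero exit code.

    Returns (completed_steps, failed_step); failed_step is None on success.
    """
    completed = []
    for cmd in steps:
        result = runner(cmd)
        if result.returncode != 0:
            # A failed dbt run means the follow-up script never starts.
            return completed, cmd
        completed.append(cmd)
    return completed, None


# Hypothetical pipeline: transform first, then run a follow-up script.
STEPS = [
    ["dbt", "run"],
    [sys.executable, "notify_slack.py"],  # assumed script name, for illustration
]

# To execute for real (requires dbt installed and the script to exist):
#   completed, failed = run_steps(STEPS)
```

Passing the runner as a parameter keeps the ordering logic testable without dbt installed.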
The metrics layer for your data. Join us at https://metriql.com/slack
Project mention: Open source Business intelligence platform made with Python | news.ycombinator.com | 2021-11-28
We're using Superset to enable our analysts to explore our clients' SEM/SEO/analytics data. It also posts alerts to Slack when, say, the daily session count of a website isn't what was expected given the historical data.
Yeah, it's a little rough to get going, but once it is, we've found it to be a really powerful (and actively developed!) BI tool. It's even better with dbt + MetriQL, which can automatically sync Superset's dataset metadata directly with properties you set up in dbt.
Adding custom visualizations is much harder than it should be, but they're very much aware of that, and working to address it. Their Slack community is super-helpful, too.
Containerized end-to-end analytics of Spotify data using Python, dbt, Postgres, and Metabase
Project mention: Has anyone taken Coursera Data Engineering Foundations Course? | reddit.com/r/dataengineering | 2021-06-03
One-stop-shop for docs and test coverage of dbt projects.
Project mention: dbt Coalesce 2021 takeaways | reddit.com/r/dataengineering | 2021-12-10
Slido developed dbt-coverage to get a documentation coverage number you could put in your CI/CD to naively push back analysts' merge requests :D
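To make the "coverage number in CI/CD" idea concrete, here is a minimal sketch that counts documented columns and compares the ratio to a threshold. It does not reproduce dbt-coverage's implementation or input format. The `models` structure, the `doc_coverage` function, and the 50% threshold are all assumptions made for illustration.

```python
def doc_coverage(models):
    """Return the fraction of columns that have a non-empty description."""
    total = 0
    documented = 0
    for model in models:
        for column in model["columns"]:
            total += 1
            if column.get("description", "").strip():
                documented += 1
    # Treat a project with no columns as fully covered.
    return documented / total if total else 1.0


# Hypothetical project metadata, not a real dbt manifest.
models = [
    {"name": "orders", "columns": [
        {"name": "id", "description": "Primary key"},
        {"name": "amount", "description": ""},
    ]},
    {"name": "customers", "columns": [
        {"name": "id", "description": "Primary key"},
        {"name": "email"},
    ]},
]

THRESHOLD = 0.5  # assumed minimum coverage for a merge request to pass
coverage = doc_coverage(models)
passes = coverage >= THRESHOLD
print(f"documentation coverage: {coverage:.0%}, passes: {passes}")
```

In a CI job, a failing check would typically exit with a non-zero status so the merge request is blocked.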
Using dbt for Creating Session Abstractions on RudderStack - an open-source, warehouse-first customer data pipeline and Segment alternative.
Project mention: Send Form Data From Marketo to Multiple Destinations Using RudderStack | dev.to | 2022-01-13
By using RudderStack to understand how users are finding and interacting with your site and then combining that with the data collected by your Marketo forms, you'll get deeper insights about your potential customers and provide higher quality leads to your sales team.
Proof of concept on how to gain insights with Trino across different databases from a distributed data mesh
Project mention: What even is data mesh | news.ycombinator.com | 2021-07-29
Not central to the main ideas of this article, but if you want to have a data mesh that is self-service, why force folks to use a particular storage medium like a data warehouse? That still requires centralization of the data.
Why not instead have a tool like Trino (https://trino.io) that allows you to let different domains use whatever datastore they happen to use. You still would need to enforce schema, but this can be done in tools like schema registry as mentioned in the article along with a data cataloging tool.
These tools facilitate the distributed nature of the problem nicely and encourage healthy standards to be discussed and then formalized in schema definitions and catalogs that remove the ambiguity of discourse and documentation.
A nice example is laid out in this repo of how Trino can accomplish data mesh principles 1 and 3 (https://github.com/findinpath/trino_data_mesh).
Some examples of dbt schema tests and data tests inside a simple dbt model. This is for anyone interested in learning how to implement dbt tests and the limitations around them.
Project mention: 2 Critical Fixes For Installing dbt 0.19.0 | dev.to | 2021-03-24
I hope these quick facts about dbt installation are helpful for you. If you'd like to see a dbt project in action, please feel free to clone my dbtTestExamples repository on GitHub and learn how to connect a dbt model and tests to a Google BigQuery instance.
Using dbt for Customer Journey Analysis on RudderStack - an open-source, warehouse-first customer data pipeline and Segment alternative.
Project mention: Customer Session Analysis Using dbt and RudderStack | dev.to | 2022-01-03
dbt_project.yml - Every dbt project has a dbt_project.yml file. It is written in YAML and defines common conventions and properties. For our project, the highlights from this file include the name, the version, and that we want our models to be materialized as views.
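As a rough illustration of the three highlights mentioned (name, version, view materialization), the sketch below renders a stripped-down dbt_project.yml as text. The project name `session_analysis`, the version string, and the `render_dbt_project` helper are assumptions, not taken from the original post. A real project file normally carries more keys (for example a `profile`), which are left out here.

```python
def render_dbt_project(name, version, materialized="view"):
    """Render a minimal, partial dbt_project.yml as a string."""
    lines = [
        f"name: '{name}'",
        f"version: '{version}'",
        "config-version: 2",
        "models:",
        f"  {name}:",
        # Default materialization for every model in this project.
        f"    +materialized: {materialized}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical project name and version, for illustration only.
project_yml = render_dbt_project("session_analysis", "1.0.0")
print(project_yml)
```

Setting the materialization at the project level means individual models become views unless they override it in their own configuration.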
dbt related posts
Send Form Data From Marketo to Multiple Destinations Using RudderStack
1 project | dev.to | 13 Jan 2022
Data Warehouse Integration: Refining Your Customer Data Stack
1 project | dev.to | 4 Jan 2022
How To Event Stream Data From Your Nuxt.Js App Using RudderStack
4 projects | dev.to | 22 Dec 2021
Do I need orchestration for a Fivetran-dbt stack?
1 project | reddit.com/r/dataengineering | 5 Dec 2021
Why is Data Build Tool (DBT) is so popular? What are some other alternatives?
4 projects | reddit.com/r/dataengineering | 4 Dec 2021
CI/CD in data engineering - help a noob
2 projects | reddit.com/r/dataengineering | 3 Dec 2021
Clickstream Data Mining Techniques: An Introduction
3 projects | dev.to | 16 Sep 2021
What are some of the best open-source dbt projects? This list will help you: