dbt-data-reliability
sqllineage
Metric | dbt-data-reliability | sqllineage
---|---|---
Mentions | 2 | 3
Stars | 338 | 1,120
Growth | 4.7% | -
Activity | 9.7 | 8.6
Last commit | 7 days ago | 10 days ago
Language | Python | Python
License | Apache License 2.0 | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
dbt-data-reliability
- How to store dbt run and test results in tables + code example
  The entire implementation is available in our open source dbt package.
- Launch HN: Elementary (YC W22) – Open-source data observability
  For any dbt users, their reliability package has the best and most comprehensive way to upload artifacts directly to the warehouse after a dbt invocation.
  https://github.com/elementary-data/dbt-data-reliability
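
The idea behind the posts above, storing dbt run and test results in a table, can be illustrated with a minimal sketch. This is not the package's implementation. It assumes a small hand-written sample shaped like a subset of dbt's `run_results.json` artifact (`metadata.invocation_id`, and `unique_id`, `status`, `execution_time` per result), uses an in-memory SQLite database as a stand-in for a warehouse, and the table name `dbt_run_results` and the function `store_run_results` are hypothetical.

```python
import json
import sqlite3

# Hypothetical sample shaped like a small subset of dbt's run_results.json artifact.
SAMPLE_RUN_RESULTS = json.loads("""
{
  "metadata": {"invocation_id": "demo-invocation"},
  "results": [
    {"unique_id": "model.demo.orders", "status": "success", "execution_time": 1.25},
    {"unique_id": "test.demo.not_null_orders_id", "status": "fail", "execution_time": 0.4}
  ]
}
""")


def store_run_results(conn, run_results):
    """Insert one row per result node into dbt_run_results (hypothetical table name)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS dbt_run_results (
            invocation_id TEXT,
            unique_id TEXT,
            status TEXT,
            execution_time REAL
        )
        """
    )
    invocation_id = run_results["metadata"]["invocation_id"]
    rows = [
        (invocation_id, r["unique_id"], r["status"], r["execution_time"])
        for r in run_results["results"]
    ]
    conn.executemany("INSERT INTO dbt_run_results VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# In-memory SQLite stands in for a real warehouse connection (assumption).
conn = sqlite3.connect(":memory:")
inserted = store_run_results(conn, SAMPLE_RUN_RESULTS)

# Once results are in a table, failures can be queried with plain SQL.
failed = conn.execute(
    "SELECT unique_id FROM dbt_run_results WHERE status = 'fail'"
).fetchall()
```

Keeping one row per node per invocation makes it possible to track status and execution time across runs with ordinary SQL, which is the main appeal of landing these artifacts in the warehouse.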
sqllineage
- FLaNK Stack Weekly for 12 September 2023
- Dependency Lineage & Scripting
  For the open source there is this library https://github.com/reata/sqllineage.
- Launch HN: Elementary (YC W22) – Open-source data observability
  Is the idea here that it's inspired by re_data due to using dbt transformations underneath or because it's reposted looking nearly the same? (or both?)
  Looks like much of the lineage code is also largely a wrapper around this library: https://github.com/reata/sqllineage
  Would be curious to understand the project's purpose and unique contributions vs. the underlying dependencies powering it as there seems to be some ambiguity. Is this just a wrapper around dbt transformations and a lineage library in one package? Can I just use them directly?
What are some alternatives?
deequ - Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
elementary - The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
re_data - re_data - fix data issues before your users & CEO would discover them 😊
soda-core - ⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
hrequests - 🚀 Web scraping for humans
versatile-data-kit - One framework to develop, deploy and operate data workflows with Python and SQL.
open-interpreter - A natural language interface for computers
dbt-documentor - ✍️ dbt doc generator for advanced data teams
rivet - The open-source visual AI programming environment and TypeScript library