dbt-data-reliability
dbt package that is part of Elementary, the dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
-
deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Sure, please compare these two: https://re-data.github.io/dbt-re-data/#!/overview?g_v=1 and the graph PNG at https://docs.elementary-data.com/
Look at Elementary models like data_monitors_thread1, data_monitors_thread2, data_monitors_thread3, data_monitors_thread4, data_monitoring_metrics, latest_metrics, metrics_stats_for_anomalies, z_score, anomaly_detection, schema_changes, etc.
Nice project. At re_data we just went over a lot of your new updates, and it seems quite a large part of your project is "inspired" by code from our library https://github.com/re-data/re-data, even the parts we are not especially proud of ;)
If you decide to copy not only ideas but also a big part of the internal implementation, I think you should include that information in your LICENSE.
Cheers
For any dbt users, their reliability package has the best and most comprehensive way to upload artifacts directly to the warehouse after a dbt invocation.
https://github.com/elementary-data/dbt-data-reliability
Is the idea here that it's inspired by re_data due to using dbt transformations underneath or because it's reposted looking nearly the same? (or both?)
Looks like much of the lineage code is also largely a wrapper around this library: https://github.com/reata/sqllineage
Would be curious to understand the project's purpose and unique contributions vs. the underlying dependencies powering it as there seems to be some ambiguity. Is this just a wrapper around dbt transformations and a lineage library in one package? Can I just use them directly?
Is this in essence similar to the AWS deequ project, but fancier and more inclusive of edge cases and common scenarios? (https://github.com/awslabs/deequ)