Top 10 data-observability Open-Source Projects
-
OpenMetadata
Open Standard for Metadata. A Single place to Discover, Collaborate and Get your data right.
-
soda-core
:zap: Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
-
elementary
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
-
dbt-data-reliability
dbt package that is part of Elementary, the dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
-
swiple
Swiple enables you to easily observe, understand, validate and improve the quality of your data
-
soda-spark
Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes
-
dqo
Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observability. Configure data quality checks from the UI or in YAML files, let DQOps run the data quality checks daily to detect data quality issues.
Project mention: How to Dynamically Adjust the Height of a Textarea in ReactJS | dev.to | 2023-10-25
In this blog post, I have demonstrated how I addressed the challenge of dynamically adjusting the height of a textarea element based on its content, preventing the need for vertical scrolling in the title section of the OpenMetadata Knowledge article page.
Project mention: Show HN: PipeRider - open-source Data Impact Analysis for dbt changes | news.ycombinator.com | 2023-09-06
Project mention: Open-Source Observability for the Semantic Layer | news.ycombinator.com | 2024-01-16
Think of Datadrift as a simple & open-source Monte Carlo for the semantic layer era. The repo is at https://github.com/data-drift/data-drift
Datadrift started as an internal tool built at our former company, a large European B2B Fintech. We had data reliability challenges impacting key metrics used for financial and regulatory reporting.
However, when we tried existing data quality tools, we were always frustrated. They provide row-level static testing (e.g. uniqueness or nullness), which does not address time-varying metrics like revenue. And commercial observability solutions cost many thousands of dollars a month and bring compliance and security overhead.
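To illustrate the gap described above, here is a minimal Python sketch. It is not Datadrift code: the row layout, the function names `row_level_checks` and `revenue_by_month`, and the sample figures are all assumptions made up for this example. It shows a restated amount that passes uniqueness and nullness checks while the monthly revenue metric still changes between two snapshots.

```python
from collections import defaultdict


def row_level_checks(rows):
    """Static row-level checks: ids are unique and no amount is null."""
    ids = [row["id"] for row in rows]
    unique = len(ids) == len(set(ids))
    no_nulls = all(row["amount"] is not None for row in rows)
    return unique and no_nulls


def revenue_by_month(rows):
    """Aggregate the time-varying metric: total amount per month."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["month"]] += row["amount"]
    return dict(totals)


# Two hypothetical snapshots of the same table, taken a day apart.
yesterday = [
    {"id": 1, "month": "2024-01", "amount": 100.0},
    {"id": 2, "month": "2024-01", "amount": 250.0},
]
today = [
    {"id": 1, "month": "2024-01", "amount": 100.0},
    {"id": 2, "month": "2024-01", "amount": 400.0},  # historical row restated
]

# Both snapshots pass the static checks.
checks_pass = row_level_checks(yesterday) and row_level_checks(today)

# Comparing the metric across snapshots reveals the change.
before = revenue_by_month(yesterday)
after = revenue_by_month(today)
drifted_months = sorted(m for m in after if after[m] != before.get(m))
```

Here `checks_pass` is true for both snapshots, yet `drifted_months` flags January, which is the kind of change a snapshot-to-snapshot comparison catches and a static test does not.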
We designed Datadrift to solve these problems. Datadrift works by simply adding a monitor where your metric is computed. It then understands how your metric is computed and on which upstream tables it depends. When an issue occurs, it pinpoints exactly which rows were updated and introduced the change.
You can also set up alerting and customise it. For example, you can decide to open a GitHub issue and assign it to the analyst who owns the revenue metric when a +10% change is detected. We tried to make it easy to customise and developer friendly.
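A threshold rule like the one in this example could be sketched as follows. This is a hypothetical illustration, not Datadrift's API: `relative_change`, `build_alert`, the returned dictionary, and the choice to alert on changes in either direction are all assumptions. The sketch only builds a description of the issue to open and makes no call to GitHub.

```python
def relative_change(previous, current):
    """Return the relative change from previous to current (0.10 means +10%)."""
    if previous == 0:
        raise ValueError("relative change is undefined when previous is 0")
    return (current - previous) / previous


def build_alert(metric, owner, previous, current, threshold=0.10):
    """Describe an issue to open when the change reaches the threshold.

    Returns None when the change is below the threshold. The dictionary
    keys are hypothetical and would be mapped to a real issue tracker.
    """
    change = relative_change(previous, current)
    if abs(change) < threshold:
        return None
    return {
        "title": f"{metric} changed by {change:+.1%}",
        "assignee": owner,
    }


# Hypothetical values: revenue moves from 1000 to 1120, a +12% change.
alert = build_alert("revenue", "revenue-analyst", 1000.0, 1120.0)
quiet = build_alert("revenue", "revenue-analyst", 1000.0, 1050.0)
```

In this sketch `alert` carries a title and an assignee, while `quiet` is `None` because a 5% move stays under the 10% threshold.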
We are thinking of adding features around automated root cause analysis and issue pattern analysis to help data teams improve metric quality over time. We'd love to hear your feature requests.
Datadrift is built with Python and Go, and licensed under GPL. Our docs are here: https://github.com/data-drift/data-drift?tab=readme-ov-file#...
Dev setup and demo: https://app.claap.io/sammyt/drift-db-demo-a18-c-ApwBh9kt4p-0...
We're very eager to get your feedback!
DQOps is an open-source data quality platform (GitHub: https://github.com/dqops/dqo) that supports all stages of building or operating a data platform.
data-observability related posts
-
Open-Source Observability for the Semantic Layer
-
Would learn Go to contribute to an OS project ? Or should I stick to python ?
-
Ask HN: Dear startup founders, what have you developed in-house?
-
Show HN: Lineage X Snapshot Tooling
-
Non-moving data is a journey
-
Snowflake SQL AST parser?
-
Launch HN: Elementary (YC W22) - Open-source data observability
-
A note from our sponsor - InfluxDB
www.influxdata.com | 21 May 2024
Index
What are some of the best open-source data-observability projects? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | OpenMetadata | 4,271 |
2 | soda-core | 1,776 |
3 | elementary | 1,746 |
4 | re_data | 1,526 |
5 | piperider | 469 |
6 | dbt-data-reliability | 348 |
7 | data-drift | 301 |
8 | swiple | 78 |
9 | soda-spark | 60 |
10 | dqo | 57 |