data-observability

Open-source projects categorized as data-observability

Top 10 data-observability Open-Source Projects

  • OpenMetadata

    Open Standard for Metadata. A Single place to Discover, Collaborate and Get your data right.

  • Project mention: How to Dynamically Adjust the Height of a Textarea in ReactJS | dev.to | 2023-10-25

    In this blog post, I have demonstrated how I addressed the challenge of dynamically adjusting the height of a textarea element based on its content, preventing the need for vertical scrolling in the title section of the OpenMetadata Knowledge article page.

  • soda-core

    :zap: Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io

  • elementary

    The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.

  • re_data

    re_data - fix data issues before your users & CEO would discover them 😊

  • piperider

    Code review for data in dbt

  • Project mention: Show HN: PipeRider – open-source Data Impact Analysis for dbt changes | news.ycombinator.com | 2023-09-06

  • dbt-data-reliability

    dbt package that is part of Elementary, the dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.

  • data-drift

    Metrics Observability & Troubleshooting

  • Project mention: Open-Source Observability for the Semantic Layer | news.ycombinator.com | 2024-01-16

    Think of Datadrift as a simple & open-source Monte Carlo for the semantic layer era. The repo is at https://github.com/data-drift/data-drift

    Datadrift started as an internal tool built at our former company, a large European B2B Fintech. We had data reliability challenges impacting key metrics used for financial and regulatory reporting.

    However, when we tried existing data quality tools we were always frustrated. They provide row-level static testing (e.g. uniqueness or nullness), which does not address time-varying metrics like revenues. And commercial observability solutions cost $manyK a month and bring compliance and security overhead.

    We designed Datadrift to solve these problems. Datadrift works by simply adding a monitor where your metric is computed. It then understands how your metric is computed and on which upstream tables it depends. When an issue occurs, it pinpoints exactly which rows were updated and introduced the change.

    You can also set up alerting and customise it. For example, you can decide to open a GitHub issue and assign it to the analyst owning the revenue metric when a +10% change is detected. We tried to make it easy to customise and developer friendly.

    We are thinking of adding features around root cause analysis automation and issue pattern analysis to help data teams improve metrics quality over time. We’d love to hear your feature requests.

    Datadrift is built with Python and Go, and licensed under GPL. Our docs are here: https://github.com/data-drift/data-drift?tab=readme-ov-file#...

    Dev setup and demo: https://app.claap.io/sammyt/drift-db-demo-a18-c-ApwBh9kt4p-0...

    We’re very eager to get your feedback!
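
    The alerting rule mentioned in the post above (flag a metric when a +10% change is detected) can be illustrated with a minimal sketch. This is not Datadrift's API: the names `MetricAlert` and `check_metric_drift` are hypothetical, and treating a change of 10% or more in either direction as drift is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MetricAlert:
    """Hypothetical alert record; not part of any real library."""
    metric: str
    previous: float
    current: float
    relative_change: float


def check_metric_drift(
    metric: str,
    previous: float,
    current: float,
    threshold: float = 0.10,  # assumed: 10% relative change triggers an alert
) -> Optional[MetricAlert]:
    """Return an alert if the metric moved by at least `threshold`, else None."""
    if previous == 0:
        # Relative change is undefined against a zero baseline; skip it here.
        return None
    change = (current - previous) / abs(previous)
    if abs(change) >= threshold:
        return MetricAlert(metric, previous, current, change)
    return None


if __name__ == "__main__":
    alert = check_metric_drift("revenue", previous=100.0, current=112.0)
    if alert is not None:
        print(f"{alert.metric} changed by {alert.relative_change:+.1%}")
```

    In a real setup the returned alert would be handed to whatever notification step the team uses, such as opening an issue for the metric owner.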

  • swiple

    Swiple enables you to easily observe, understand, validate and improve the quality of your data

  • soda-spark

    Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes

  • dqo

    Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observability. Configure data quality checks from the UI or in YAML files, let DQOps run the data quality checks daily to detect data quality issues.

  • Project mention: Launching end-to-end data quality platform | news.ycombinator.com | 2024-03-27

    DQOps is an open-source data quality platform (GitHub: https://github.com/dqops/dqo) that supports all stages of building or operating a data platform.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source data-observability projects? This list will help you:

 #  Project               Stars
 1  OpenMetadata          4,271
 2  soda-core             1,776
 3  elementary            1,746
 4  re_data               1,526
 5  piperider               469
 6  dbt-data-reliability    348
 7  data-drift              301
 8  swiple                   78
 9  soda-spark               60
10  dqo                      57
