Grai-core Alternatives
Similar projects and alternatives to grai-core
-
awesome-data-catalogs
📙 Awesome Data Catalogs and Observability Platforms.
-
InfluxDB
Collect and Analyze Billions of Data Points in Real Time. Manage all types of time series data in a single, purpose-built database. Run at any scale in any environment in the cloud, on-premises, or at the edge.
-
Mage
🧙 The modern replacement for Airflow. Mage is an open-source data pipeline tool for transforming and integrating data. https://github.com/mage-ai/mage-ai
-
django-dbbackup
Management commands to help backup and restore your project database and media files
-
dbt-snowflake-monitoring
A dbt package from SELECT to help you monitor Snowflake performance and costs
-
Sonar
Write Clean Python Code. Always. Sonar helps you commit clean code every time. With over 225 unique rules to find Python bugs, code smells & vulnerabilities, Sonar finds the issues while you focus on the work.
-
grai-core reviews and mentions
-
Launch HN: Grai (YC S22) – Open-Source Data Observability Platform
Elastic v2 if one is interested in such things: https://github.com/grai-io/grai-core/blob/v0.1.33/LICENSE
Hi HN, my name is Ian. My co-founder Edward and I started Grai (https://grai.io), an open-source data observability platform. It helps prevent production data outages by evaluating changes to your data pipelines in CI, rather than at runtime.
Ever experienced a production outage due to changes in upstream data sources? That's a problem we regularly encountered, whether deploying machine learning or keeping a data warehouse operational, and it led us to create Grai.
Systematically testing the impact of data changes on the rest of your stack turns out to be quite difficult when the same data is copied and used across many different services and applications. Simple changes like renaming a column in a database can result in broken BI dashboards, incorrect training data for ML models, and data pipeline failures. For example, business users regularly deal with questions like "why does revenue look different in different dashboards?"
These sorts of problems are commonly dealt with by passively monitoring application execution logs for anomalies that might indicate an outage. Our goal was to move that task out of runtime, where an outage has already occurred, and back into testing.
At its core, Grai is a graph of the relationships between the data in your organization, from columns in a database to JSON fields in an API. This graph allows Grai to analyze the downstream impact of proposed changes during CI and before they go live.
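To make the idea of downstream impact concrete, here is a minimal sketch in Python. It is not Grai's implementation or API: the `LINEAGE` dictionary, the node names, and the `downstream_impact` function are all hypothetical, and the traversal is a plain breadth-first search over an adjacency list.

```python
from collections import deque

# Hypothetical lineage graph (illustrative names only, not Grai's data model).
# Each key is a data node; each value lists the nodes that read from it directly.
LINEAGE = {
    "postgres.orders.amount": ["dbt.stg_orders.amount"],
    "dbt.stg_orders.amount": ["dbt.fct_revenue.total", "ml.features.order_value"],
    "dbt.fct_revenue.total": ["bi.revenue_dashboard"],
    "ml.features.order_value": [],
    "bi.revenue_dashboard": [],
    "postgres.users.email": ["dbt.stg_users.email"],
    "dbt.stg_users.email": [],
}


def downstream_impact(graph, changed):
    """Return every node reachable from `changed`, in breadth-first order."""
    seen = {changed}
    queue = deque([changed])
    affected = []
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                affected.append(child)
                queue.append(child)
    return affected


# Renaming this column would touch everything listed in `affected`.
affected = downstream_impact(LINEAGE, "postgres.orders.amount")
for node in affected:
    print(node)
```

In a CI setting, a check along these lines could fail the build or post the affected list for review whenever a proposed change touches a node with downstream dependents.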
It includes a variety of pre-built integrations with common data tools such as PostgreSQL, Snowflake, dbt, and Fivetran, which automatically extract metadata and synchronize the state of your graph. It's built on a flexible data model backed by REST and GraphQL APIs and a Python client library. This way, users can directly build on top of Grai as they see fit. For example, because every object in Grai serializes to a YAML definition file, sort of like a CRD in Kubernetes, it's fairly easy to manually create or script a custom solution even if a pre-built integration doesn't exist.
We made the decision to build open-source from the beginning in part because we believe lineage is underutilized both organizationally and technologically. We hope to provide a foundation for the community to build cool concepts on top and have already had companies come to us with amazing ideas, like optimizing their real-time query pipelines to take advantage of spot price arbitrage between cloud and on-prem.
We try not to be overly opinionated about how organizations work, so it doesn't really matter whether you maintain a development database or run service containers in GitHub Actions. When your tests are triggered, we evaluate the new state of the environment, check for any impacts, and report back as a comment in the pull request.
Data observability can have unexpected benefits. One of our customers uses us because we make onboarding new engineers easier. Because we render an infinitely zoomable, Figma-like graph of the entire data stack, new engineers can visually explore end-to-end data flows and application dependencies.
You can find a quick demo here: https://vimeo.com/824026569. We've also put together an example getting started guide if you want to try things out yourself: https://docs.grai.io/examples/enhanced-dbt. Since everything is open source, you can always explore the code (https://github.com/grai-io/grai-core) and docs (https://docs.grai.io), where we have example deployment configurations for docker-compose and Kubernetes.
We would love to hear your feedback. If there's a feature we're missing, we'll build it. If you have a UX or developer experience suggestion, we'll fix it. If it's something else, we want to hear about it. We can't wait to hear your feedback and thank you in advance!
The license chosen [1] (Elastic License 2.0) is one that isn't considered open source by many, because it isn't OSD [2] compatible. Were you aware of this before marketing as open source and, out of interest, do the license and the usage of "open source" come up in conversation when going through the YC process?
[1] https://github.com/grai-io/grai-core/blob/master/LICENSE
-
Standalone lineage tool
I’m not sure if this is precisely what you’re looking for, but Grai might serve your needs. The backend data model allows you to push any arbitrary metadata you want or need onto the lineage graph and retrieve it through either the REST or GraphQL API. I’m one of the authors, so I'm happy to answer any questions you might have.
-
Data Load Diagram
We've been looking at building something like this for Grai specifically to support Airflow but haven't yet prioritized it.
-
Stats
grai-io/grai-core is an open source project licensed under the GNU General Public License v3.0 or later, which is an OSI-approved license.
The primary programming language of grai-core is Python.