great_expectations vs seldon-core
| | great_expectations | seldon-core |
|---|---|---|
| Mentions | 15 | 14 |
| Stars | 9,466 | 4,212 |
| Growth | 2.0% | 1.7% |
| Last commit | about 11 hours ago | 2 days ago |
| Activity | 9.9 | 7.8 |
| Language | Python | HTML |
| License | Apache License 2.0 | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
great_expectations
- Data Quality at Scale with Great Expectations, Spark, and Airflow on EMR
Great Expectations (GE) is an open-source data validation tool that helps ensure data quality.
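The snippet below is a minimal sketch of what such a validation looks like, using GE's classic Pandas-backed API; entry points vary between GE releases, so treat the exact imports and method names as assumptions:

```python
import pandas as pd
import great_expectations as ge

# Wrap a plain DataFrame so expectation methods become available on it
# (classic GE API; newer releases restructure this).
df = ge.from_pandas(pd.DataFrame({
    "order_id": [1, 2, 3],
    "amount": [9.99, 24.50, 101.00],
}))

# Each expectation returns a validation result with a `success` flag.
print(df.expect_column_values_to_not_be_null("order_id").success)
print(df.expect_column_values_to_be_between("amount",
                                            min_value=0,
                                            max_value=10_000).success)
```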
- Looking for Unit Testing framework in Database Migration Process
- Soda Core (OSS) is now GA! So, why should you add checks to your data pipelines?
GE is arguably the best-known OSS alternative to Soda Core. The third option is deequ, originally developed and released as OSS by AWS. Our community has told us that Soda Core is different because it's easy to get going and embed into data pipelines. It also allows some of the check-authoring work to be moved to other members of the data team. I'm sure there are also scenarios where Soda Core is not the best option, for example when you only use Pandas dataframes or develop in Scala.
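For context on what "adding checks to your data pipelines" looks like in code, here is a hedged sketch of embedding a Soda Core scan in a pipeline step via its programmatic Scan API; the data source and file names are placeholders, and method names may differ between soda-core releases:

```python
from soda.scan import Scan

scan = Scan()
scan.set_data_source_name("my_warehouse")               # assumed data source name
scan.add_configuration_yaml_file("configuration.yml")   # warehouse connection config
scan.add_sodacl_yaml_file("checks.yml")                 # SodaCL checks, e.g. row_count > 0

scan.execute()
scan.assert_no_checks_fail()  # raise, failing the pipeline step, if any check failed
```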
- GreatExpectations – Always know what to expect from your data
- Package for drift detection
great_expectations: https://github.com/great-expectations/great_expectations
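As a rough illustration of what a drift-detection package automates, here is a hand-rolled version of the core check: compare a current feature sample against a reference window with a two-sample Kolmogorov-Smirnov test from scipy. The data and threshold below are made up:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
reference = rng.normal(loc=0.0, scale=1.0, size=5_000)  # training-time distribution
current = rng.normal(loc=0.4, scale=1.0, size=5_000)    # shifted production sample

# Low p-value means the two samples are unlikely to share a distribution.
stat, p_value = ks_2samp(reference, current)
if p_value < 0.01:
    print(f"drift suspected (KS statistic={stat:.3f}, p={p_value:.2e})")
```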
- [D] Do you use data engineering pipelines for real-life projects?
For example, I just found "Great Expectations", "Kedro", and "Flyte", and I was wondering at what point in time and project complexity we should choose one of these tools instead of the ancient caveman way?
- Data pipeline suggestions
Testing: GreatExpectations
- Where can I find free data engineering (big data) projects online?
Ingestion / ETL: Airbyte, Singer, Jitsu
Transformation: dbt
Orchestration: Airflow, Dagster
Testing: GreatExpectations
Observability: Monosi
Reverse ETL: Grouparoo, Castled
Visualization: Lightdash, Superset
- [P] Deepchecks: an open-source tool for high standards validations for ML models and data.
seldon-core
- seldon-core VS MLDrop - a user-suggested alternative
- [D] Feedback on a worked Continuous Deployment Example (CI/CD/CT)
ZenML is an extensible, open-source MLOps framework for creating production-ready machine learning pipelines. Built for data scientists, it has a simple, flexible syntax, is cloud- and tool-agnostic, and has interfaces/abstractions catered towards ML workflows. Seldon Core is a production-grade open-source model serving platform. It packs a wide range of features built around deploying models as REST/gRPC microservices, including monitoring and logging, model explainers, outlier detectors, and continuous deployment strategies such as A/B testing and canary deployments.
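For a sense of what "deploying a model as a REST/gRPC microservice" means in practice, below is a minimal sketch of the kind of model class Seldon Core's Python wrapper serves; the parameter naming follows Seldon's wrapper convention, and the linear stub itself is a made-up placeholder (packaging and the Kubernetes CRD are omitted):

```python
import numpy as np

class MyModel:
    def __init__(self):
        # A real deployment would load trained weights here; this linear
        # stub is a placeholder.
        self.coef = np.array([0.5, -0.25])

    def predict(self, X, features_names=None):
        # The wrapper calls predict() with the request payload as a numpy
        # array and returns the result to the client over REST/gRPC.
        return np.asarray(X) @ self.coef
```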
- [D] BentoML's Compatibility with Seldon
I am using BentoML to build the Docker container for a BERT model, and then deploying that using Seldon on GKE. The model's REST API endpoint works fine. In terms of compatibility with Seldon, the metrics are being scraped by Prometheus and visualized in Grafana. The only Seldon component that doesn't appear to be working is request logging, which I have working for other applications deployed on Seldon. I am using the Elastic stack from here. From my understanding, request logging should still be compatible, and the only lost functionality should be Seldon's model metadata. Any insight on how to get centralized request logging working? No errors are shown; the logs just aren't being captured and sent to Elasticsearch. Has anyone had success using BentoML with Seldon without losing any of Seldon's features?
- Building a Responsible AI Solution - Principles into Practice
While tools in the model experimentation space normally include diagnostic charts on a model's performance, there are also specialised solutions that help ensure a deployed model continues to perform as expected. This includes the likes of seldon-core, why-labs and fiddler.ai.
- Ask HN: Who is hiring? (January 2022)
Seldon | Multiple positions | London/Cambridge UK | Onsite/Remote | Full time | seldon.io
At Seldon we are building industry leading solutions for deploying, monitoring, and explaining machine learning models. We are an open-core company with several successful open source projects like:
* https://github.com/SeldonIO/seldon-core
* https://github.com/SeldonIO/mlserver
* https://github.com/SeldonIO/alibi
* https://github.com/SeldonIO/alibi-detect
* https://github.com/SeldonIO/tempo
We are hiring for a range of positions, including software engineers (Go, k8s), ML engineers (Python, Go), frontend engineers (JS), UX designers, and product managers. All open positions can be found at https://www.seldon.io/careers/
- Ask HN: Who is hiring? (December 2021)
- Has anyone implemented Seldon?
Also note our github repo has a link to our slack where you can ask active users: https://github.com/SeldonIO/seldon-core
- [Discussion] Looking for a service to upload a model and receive a REST API endpoint for serving predictions
If you want to serve your model at scale, with a bunch of production features you should have a look at the open-source framework Seldon Core. It does what you're asking for plus a bunch of other cool stuff like routing, logging and monitoring.
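As an illustration, a deployed SeldonDeployment can be called over REST using Seldon's v1 prediction protocol; the host, namespace, and deployment names below are placeholders for your cluster:

```python
import requests

# Placeholders: replace with your ingress host, namespace, and deployment name.
url = ("http://<ingress-host>/seldon/<namespace>/<deployment-name>"
       "/api/v1.0/predictions")
payload = {"data": {"ndarray": [[5.1, 3.5, 1.4, 0.2]]}}

resp = requests.post(url, json=payload, timeout=10)
resp.raise_for_status()
print(resp.json())  # response follows the same protocol: {"data": {"ndarray": [...]}}
```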
- Seldon Core : Open-source platform for rapidly deploying machine learning models on Kubernetes
- Looking for an open-source model serving framework with a dashboard for testing data quality
Seldon ticks most of those boxes if you already have some experience with Kubernetes. You can set up A/B tests, do payload logging to Elastic and then build monitoring on top of that, and it has drift detection and model explainer modules too. I don't know about Great Expectations integration, but you could probably do something with a custom transformer module as part of the inference graph; see the sketch below.
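A hedged sketch of that idea: a Seldon Python component implementing transform_input(), which runs ahead of the model in the inference graph and applies a simple data-quality gate to each request. The checks and thresholds here are made up for illustration, not an actual Great Expectations integration, and the parameter naming follows Seldon's wrapper convention:

```python
import numpy as np

class QualityGate:
    def transform_input(self, X, features_names=None):
        # Runs before the model in the inference graph; reject obviously
        # bad payloads instead of scoring them.
        X = np.asarray(X, dtype=float)
        if np.isnan(X).any():
            raise ValueError("payload failed quality check: NaNs present")
        if (np.abs(X) > 1e6).any():
            raise ValueError("payload failed quality check: value out of range")
        return X  # pass through unchanged when the checks succeed
```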
What are some alternatives?
evidently - Evaluate and monitor ML models from validation to production.
BentoML - The most flexible way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Inference Graph/Pipelines, Compound AI systems, Multi-Modal, RAG as a Service, and more!
kedro-great - The easiest way to integrate Kedro and Great Expectations
MLServer - An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more
deepchecks - Deepchecks: Tests for Continuous Validation of ML Models & Data. A holistic open-source solution for all of your AI & ML validation needs, enabling you to thoroughly test your data and models from research to production.
re_data - Fix data issues before your users & CEO discover them 😊
alibi-detect - Algorithms for outlier, adversarial and drift detection
streamlit - Streamlit — A faster way to build and share data apps.
transformers - 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
fastapi - FastAPI framework, high performance, easy to learn, fast to code, ready for production
huggingface_hub - The official Python client for the Huggingface Hub.