seldon-core
transformers
| | seldon-core | transformers |
|---|---|---|
| Mentions | 14 | 175 |
| Stars | 4,212 | 125,021 |
| Growth | 1.7% | 3.1% |
| Activity | 7.8 | 10.0 |
| Last commit | 5 days ago | 4 days ago |
| Language | HTML | Python |
| License | GNU General Public License v3.0 or later | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
seldon-core
-
seldon-core VS MLDrop - a user suggested alternative
2 projects | 20 Feb 2023
-
[D] Feedback on a worked Continuous Deployment Example (CI/CD/CT)
ZenML is an extensible, open-source MLOps framework to create production-ready machine learning pipelines. Built for data scientists, it has a simple, flexible syntax, is cloud- and tool-agnostic, and has interfaces/abstractions that are catered towards ML workflows. Seldon Core is a production grade open source model serving platform. It packs a wide range of features built around deploying models to REST/GRPC microservices that include monitoring and logging, model explainers, outlier detectors and various continuous deployment strategies such as A/B testing, canary deployments and more.
-
[D] BentoML's Compatibility with Seldon;
I am using BentoML to build the Docker container for a BERT model, and then deploying that using Seldon on GKE. The model's REST API endpoint works fine. In terms of compatibility with Seldon, the metrics are being scraped by Prometheus and visualized in Grafana. The only Seldon component that doesn't appear to be working is the request logging, which I have working for other applications that were deployed on Seldon. I am using the elastic stack from here. From my understanding, request logging should still be compatible and the only lost functionality should be Seldon's model metadata. Any insight on how to get the centralized request logging working? No errors were shown; it's just that the logs aren't being captured and sent to ElasticSearch. Has anyone had any success using BentoML with Seldon without losing any of Seldon's features?
-
Building a Responsible AI Solution - Principles into Practice
While tools in the model experimentation space normally include diagnostic charts on a model's performance, there are also specialised solutions that help ensure that the deployed model continues to perform as expected. These include the likes of seldon-core, why-labs and fiddler.ai.
-
Ask HN: Who is hiring? (January 2022)
Seldon | Multiple positions | London/Cambridge UK | Onsite/Remote | Full time | seldon.io
At Seldon we are building industry leading solutions for deploying, monitoring, and explaining machine learning models. We are an open-core company with several successful open source projects like:
* https://github.com/SeldonIO/seldon-core
* https://github.com/SeldonIO/mlserver
* https://github.com/SeldonIO/alibi
* https://github.com/SeldonIO/alibi-detect
* https://github.com/SeldonIO/tempo
We are hiring for a range of positions, including software engineers (Go, k8s), ML engineers (Python, Go), frontend engineers (JS), UX designers, and product managers. All open positions can be found at https://www.seldon.io/careers/
- Ask HN: Who is hiring? (December 2021)
-
Has anyone implemented Seldon?
Also note our github repo has a link to our slack where you can ask active users: https://github.com/SeldonIO/seldon-core
-
[Discussion] Look for service to upload a model and receive a REST API endpoint, for serving predictions
If you want to serve your model at scale, with a bunch of production features, you should have a look at the open-source framework Seldon Core. It does what you're asking for plus a bunch of other cool stuff like routing, logging and monitoring.
- Seldon Core : Open-source platform for rapidly deploying machine learning models on Kubernetes
-
Looking for open-source model serving framework with dashboard for test data quality
Seldon ticks most of those boxes if you already have some experience with kubernetes. You can set up a/b tests, do payload logging to elastic and then do monitoring on top of that, and it has drift detection and model explainer modules too. Idk about great expectations integration, but you could probably do something with a custom transformer module as part of the inference graph.
transformers
-
Maxtext: A simple, performant and scalable Jax LLM
Is t5x an encoder/decoder architecture?
Some more general options.
The Flax ecosystem
https://github.com/google/flax?tab=readme-ov-file
or dm-haiku
https://github.com/google-deepmind/dm-haiku
were some of the best developed communities in the Jax AI field
Perhaps the “trax” repo? https://github.com/google/trax
Some HF examples https://github.com/huggingface/transformers/tree/main/exampl...
Sadly it seems much of the work is proprietary these days, but one example could be Grok-1, if you customize the details. https://github.com/xai-org/grok-1/blob/main/run.py
-
Lossless Acceleration of LLM via Adaptive N-Gram Parallel Decoding
The HuggingFace transformers library already has support for a similar method called prompt lookup decoding that uses the existing context to generate an ngram model: https://github.com/huggingface/transformers/issues/27722
I don't think it would be that hard to switch it out for a pretrained ngram model.
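The core idea of prompt lookup decoding, as described above, can be sketched in a few lines of plain Python. This is a minimal illustration and not the transformers implementation: the function name `prompt_lookup_candidates` and its parameters are hypothetical, and tokens are plain integers. It finds the most recent earlier occurrence of the trailing n-gram in the context and proposes the tokens that followed it as draft candidates, which a real decoder would then verify against the model in one forward pass.

```python
from typing import List


def prompt_lookup_candidates(
    tokens: List[int], ngram_size: int = 3, num_candidates: int = 5
) -> List[int]:
    """Propose draft tokens by matching the trailing n-gram earlier in the context.

    Hypothetical helper for illustration only.
    """
    if len(tokens) < ngram_size:
        return []
    tail = tokens[-ngram_size:]
    # Scan from the most recent position backwards, skipping the tail itself.
    for start in range(len(tokens) - ngram_size - 1, -1, -1):
        if tokens[start:start + ngram_size] == tail:
            follow = tokens[start + ngram_size:start + ngram_size + num_candidates]
            if follow:
                return follow
    return []


context = [1, 2, 3, 4, 5, 6, 1, 2, 3]
draft = prompt_lookup_candidates(context, ngram_size=3, num_candidates=3)
print(draft)
```

Swapping in a pretrained n-gram model, as the comment suggests, would amount to replacing the backward scan with a lookup in a precomputed table. In transformers itself, the feature is, to my knowledge, enabled through a `prompt_lookup_num_tokens` argument to `generate`; check the current documentation before relying on that name.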
-
AI enthusiasm #6 - Finetune any LLM you want💡
Most of this tutorial is based on the Hugging Face course on Transformers and on Niels Rogge's Transformers tutorials: make sure to check their work and give them a star on GitHub, if you please ❤️
-
Schedule-Free Learning – A New Way to Train
* Superconvergence + LR range finder + Fast AI's Ranger21 optimizer was the go-to optimizer for CNNs, and worked fabulously well, but on transformers, the learning rate range finder said 1e-3 was the best, whilst 1e-5 was better. However, the 1 cycle learning rate stuck. https://github.com/huggingface/transformers/issues/16013
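For readers unfamiliar with the 1 cycle learning rate mentioned above, here is a rough sketch of such a schedule in plain Python. It is not the code from the linked issue; the function name `one_cycle_lr` is hypothetical, and the defaults (`pct_start=0.3`, `div_factor=25`, `final_div_factor=1e4`, cosine interpolation) are assumptions chosen to mirror the defaults of PyTorch's `OneCycleLR`. The learning rate warms up from a small value to `max_lr`, then anneals to a much smaller value.

```python
import math


def one_cycle_lr(
    step: int,
    total_steps: int,
    max_lr: float = 1e-3,
    pct_start: float = 0.3,
    div_factor: float = 25.0,
    final_div_factor: float = 1e4,
) -> float:
    """Learning rate at `step` for a two-phase cosine one-cycle schedule (sketch)."""
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    warmup_steps = max(1, int(pct_start * total_steps))
    if step < warmup_steps:
        # Phase 1: rise from initial_lr to max_lr.
        progress = step / warmup_steps
        start, end = initial_lr, max_lr
    else:
        # Phase 2: decay from max_lr to min_lr.
        progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
        start, end = max_lr, min_lr
    # Cosine interpolation: weight is 1 at progress=0 and 0 at progress=1.
    weight = (1 + math.cos(math.pi * progress)) / 2
    return end + (start - end) * weight


# Peak set to 1e-5, the value the comment reports working better on transformers.
schedule = [one_cycle_lr(s, 100, max_lr=1e-5) for s in range(101)]
print(max(schedule), schedule[0], schedule[-1])
```

The peak value passed as `max_lr` is the quantity a range finder is meant to suggest, which is why a misleading suggestion (1e-3 instead of 1e-5) matters even when the schedule shape is kept.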
-
Gemma doesn't suck anymore – 8 bug fixes
Thanks! :) I'm pushing them into transformers, pytorch-gemma and collabing with the Gemma team to resolve all the issues :)
The RoPE fix should already be in transformers 4.38.2: https://github.com/huggingface/transformers/pull/29285
My main PR for transformers which fixes most of the issues (some still left): https://github.com/huggingface/transformers/pull/29402
- HuggingFace Transformers: Qwen2
- HuggingFace Transformers Release v4.36: Mixtral, Llava/BakLlava, SeamlessM4T v2
- HuggingFace: Support for the Mixtral Moe
-
Paris-Based Startup and OpenAI Competitor Mistral AI Valued at $2B
If you want to tinker with the architecture Hugging Face has a FOSS implementation in transformers: https://github.com/huggingface/transformers/blob/main/src/tr...
If you want to reproduce the training pipeline, you couldn't do that even if you wanted to because you don't have access to thousands of A100s.
-
Fail to reproduce the same evaluation metrics score during inference.
I am aware that using mixed precision reduces the numerical stability of the weights and that some inconsistency is to be expected, but I did not expect it to be this much. I have attached the graph of evaluation metrics. If someone can give me some insight into this issue, that would be great.
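As a small illustration of why reduced precision can make results depend on evaluation order, the sketch below rounds every intermediate value to IEEE 754 half precision using only the standard library. It is not a diagnosis of the poster's issue, and the helper names `to_fp16` and `sum_fp16` are hypothetical. It only shows that the same numbers summed in a different order can give different half-precision results.

```python
import struct
from typing import Iterable


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def sum_fp16(values: Iterable[float]) -> float:
    """Accumulate in half precision, rounding after every addition."""
    total = 0.0
    for v in values:
        total = to_fp16(total + to_fp16(v))
    return total


# Near 2048 the spacing between half-precision values is 2,
# so adding 1.0 to 2048.0 is lost to rounding.
large_first = sum_fp16([2048.0, 1.0, 1.0])
small_first = sum_fp16([1.0, 1.0, 2048.0])
print(large_first, small_first)
```

In a real model the same effect can show up when batch size, kernel choice, or reduction order differs between training-time evaluation and inference, so pinning those down is a reasonable first check.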
What are some alternatives?
BentoML - The most flexible way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Inference Graph/Pipelines, Compound AI systems, Multi-Modal, RAG as a Service, and more!
fairseq - Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
MLServer - An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more
sentence-transformers - Multilingual Sentence & Image Embeddings with BERT
evidently - Evaluate and monitor ML models from validation to production. Join our Discord: https://discord.com/invite/xZjKRaNp8b
llama - Inference code for Llama models
great_expectations - Always know what to expect from your data.
transformer-pytorch - Transformer: PyTorch Implementation of "Attention Is All You Need"
alibi-detect - Algorithms for outlier, adversarial and drift detection
text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
huggingface_hub - The official Python client for the Huggingface Hub.