optimum-intel vs optimum

| | optimum-intel | optimum |
|---|---|---|
| Mentions | 1 | 8 |
| Stars | 328 | 2,174 |
| Growth (stars, month over month) | 8.8% | 4.9% |
| Activity | 9.6 | 9.5 |
| Latest commit | 5 days ago | 2 days ago |
| Language | Jupyter Notebook | Python |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
optimum-intel

- If Stable Diffusion "stores" images in lossy compression, as per the lawsuit's claim, how can you retrieve the original training images?
  No, I haven't. There's an article from Intel about doing it with some of their tools, though (code is here).
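The Intel tooling being referred to maps onto what optimum-intel ships today: an OpenVINO backend for diffusers pipelines. As a hedged sketch of that workflow using the library's documented OVStableDiffusionPipeline API (the model ID and prompt are illustrative, not taken from the article):

```python
# Sketch: Stable Diffusion inference on Intel hardware via optimum-intel's
# OpenVINO integration. Requires `pip install optimum[openvino] diffusers`.
from optimum.intel import OVStableDiffusionPipeline

# export=True converts the PyTorch checkpoint to OpenVINO IR on the fly
pipe = OVStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # illustrative model ID
    export=True,
)

image = pipe("a photo of an astronaut riding a horse").images[0]
image.save("astronaut.png")
```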
optimum

- FastEmbed: Fast and Lightweight Embedding Generation for Text
  Shout-out to Hugging Face's Optimum, which made it easier to quantize models.
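As a concrete illustration of the quantization workflow that comment refers to, here is a minimal sketch using Optimum's documented ONNX Runtime quantizer; the model ID and save directory are placeholders, not FastEmbed's actual build script:

```python
# Sketch: dynamic INT8 quantization of a Transformers model with Optimum's
# ONNX Runtime backend. Requires `pip install optimum[onnxruntime]`.
from optimum.onnxruntime import ORTModelForFeatureExtraction, ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

# Export the PyTorch checkpoint to ONNX (model ID is a placeholder)
model = ORTModelForFeatureExtraction.from_pretrained(
    "sentence-transformers/all-MiniLM-L6-v2", export=True
)

quantizer = ORTQuantizer.from_pretrained(model)
# Dynamic quantization targeting AVX512-VNNI-capable CPUs
qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)
quantizer.quantize(save_dir="minilm-quantized", quantization_config=qconfig)
```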
- [D] Is ML doomed to end up closed-source?
  Optimum to accelerate inference of transformers with hardware optimization
- [P] BetterTransformer: PyTorch-native free-lunch speedups for Transformer-based models
  Yes, the Optimum library's documentation is unfortunately not yet in the best shape. I would be really thankful if you filed an issue detailing where the docs can be improved: https://github.com/huggingface/optimum/issues. Also, if you have feature requests, such as a more flexible API, we are eager for community contributions and suggestions!
- BetterTransformer: PyTorch-native free-lunch speedups for Transformer-based models
  To support BetterTransformer with the canonical Transformer models from the Transformers library, an integration was done with the open-source Optimum library as a one-liner; a sketch follows.
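A minimal, self-contained version of that documented one-liner, using Optimum's BetterTransformer API (the checkpoint name is illustrative):

```python
# Sketch: converting a Transformers model to BetterTransformer via Optimum.
# Requires `pip install optimum transformers`.
from transformers import AutoModel
from optimum.bettertransformer import BetterTransformer

model = AutoModel.from_pretrained("bert-base-uncased")  # illustrative checkpoint
# The one-liner: swaps supported encoder layers for the
# torch.nn.TransformerEncoderLayer fast path (fused attention kernels,
# sparsity-aware handling of padding tokens)
model = BetterTransformer.transform(model)
```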
  Why is self-attention not as deployment-friendly?
- [P] Accelerated Inference with Optimum and Transformers Pipelines
  It's Lewis here from the open-source team at Hugging Face 🤗. I'm excited to share the latest release of our Optimum library, which provides a suite of performance optimization tools to make Transformers run fast on accelerated hardware!
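As a sketch of what that release enables, an ONNX-exported model can be dropped straight into the familiar Transformers pipeline API. The model ID is illustrative, and note that releases contemporary with that announcement spelled the export flag `from_transformers=True` rather than `export=True`:

```python
# Sketch: accelerated inference with an Optimum ONNX Runtime model plugged
# into a standard Transformers pipeline.
from transformers import AutoTokenizer, pipeline
from optimum.onnxruntime import ORTModelForSequenceClassification

model_id = "distilbert-base-uncased-finetuned-sst-2-english"  # illustrative
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(classifier("Optimum made this model run on ONNX Runtime."))
```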
- [N] Hugging Face raised $100M at $2B to double down on community, open-source & ethics
  Create libraries to optimize ML models during training and inference for specific hardware: https://github.com/huggingface/optimum
- [P] Python library to optimize Hugging Face transformer for inference: < 0.5 ms latency / 2850 infer/sec
  Have you seen this article from HF: https://huggingface.co/blog/bert-cpu-scaling-part-2? There is also a library: https://github.com/huggingface/optimum. Is the gain worth the tweaking? Is the OneDNN stuff easy to deploy on Triton?
What are some alternatives?
WhitenBlackBox - Towards Reverse-Engineering Black-Box Neural Networks, ICLR'18
FasterTransformer - Transformer-related optimization, including BERT, GPT
transformer-deploy - Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
safetensors - Simple, safe way to store and distribute tensors
TensorRT - NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
text-generation-inference - Large Language Model Text Generation Inference
kernl - Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.
transformers - 🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
serve - Serve, optimize and scale PyTorch models in production
Open-Assistant - OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
noisyopt - Python library for optimizing noisy functions.
onnxruntime - ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator