kernl vs onnxruntime

kernl

Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable. (by ELS-RD)

kernl.ai

onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator (by microsoft)

Deep Learning Onnx neural-networks Machine Learning ai-framework hardware-acceleration Pytorch Tensorflow scikit-learn

onnxruntime.ai

Project	kernl	onnxruntime
Mentions	8	53
Stars	1,458	12,583
Growth	1.9%	4.0%
Activity	1.5	10.0
Latest Commit	2 months ago	3 days ago
Language	Jupyter Notebook	C++
License	Apache License 2.0	MIT License

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
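
The two derived metrics above can be illustrated with a short sketch. The site does not publish its exact formulas, so everything here is an assumption: `monthly_growth` and `activity_score` are hypothetical helper names, growth is taken as a simple percentage change in stars, and the activity score is taken as ten times the share of tracked projects that are less active, which is one mapping consistent with "9.0 means top 10%".

```python
from bisect import bisect_left


def monthly_growth(stars_last_month: int, stars_now: int) -> float:
    """Month-over-month star growth in percent (assumed definition)."""
    if stars_last_month <= 0:
        raise ValueError("stars_last_month must be positive")
    return 100.0 * (stars_now - stars_last_month) / stars_last_month


def activity_score(raw: float, all_raw: list[float]) -> float:
    """Map a raw activity value to a 0-10 score from its percentile rank.

    Assumption: score = 10 * (share of tracked projects strictly less active).
    """
    ranked = sorted(all_raw)
    below = bisect_left(ranked, raw)  # count of projects with a lower raw value
    return round(10.0 * below / len(ranked), 1)


if __name__ == "__main__":
    # 1,000 stars last month and 1,040 now gives 4.0% growth.
    print(monthly_growth(1000, 1040))
    # A project more active than 90 of 100 tracked projects scores 9.0.
    print(activity_score(90, list(range(100))))
```

Under this reading, a score of 10.0 cannot be reached exactly by any project in the list, so the real site presumably rounds or weights differently; the sketch only shows the general shape of a percentile-based score.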

kernl

Posts with mentions or reviews of kernl. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-02-08.

[P] Get 2x Faster Transcriptions with OpenAI Whisper Large on Kernl
7 projects | /r/MachineLearning | 8 Feb 2023

I periodically check kernl.ai to see whether the documentation and tutorial sections have been expanded. My advice is to put some real effort and focus into examples and tutorials. It is key for an optimization/acceleration library. 10x-ing the users of a library like this is much more likely to come from spending 10 out of every 100 developer hours writing tutorials, as opposed to spending 8 or 9 of those tutorial-writing hours on developing new features that only a small minority understand how to apply.
[P] BetterTransformer: PyTorch-native free-lunch speedups for Transformer-based models
3 projects | /r/MachineLearning | 22 Nov 2022

FlashAttention + quantization has, to the best of my knowledge, not yet been explored, but I think it would be a great engineering direction. I would not expect to see this any time soon natively in PyTorch's BetterTransformer though. /u/pommedeterresautee & folks at ELS-RD did awesome work releasing kernl, where custom implementations (through OpenAI Triton) could maybe easily live.
[D] How to get the fastest PyTorch inference and what is the "best" model serving framework?
8 projects | /r/MachineLearning | 28 Oct 2022

Check https://github.com/ELS-RD/kernl/blob/main/src/kernl/optimizer/linear.py for an example.
[P] Up to 12X faster GPU inference on Bert, T5 and other transformers with OpenAI Triton kernels
8 projects | /r/MachineLearning | 25 Oct 2022

https://github.com/ELS-RD/kernl/issues/141 > Would it be possible to use kernl to speed up Stable Diffusion?

onnxruntime

Posts with mentions or reviews of onnxruntime. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-16.

AI Inference now available in Supabase Edge Functions
4 projects | dev.to | 16 Apr 2024

Embedding generation uses the ONNX runtime under the hood. This is a cross-platform inferencing library that supports multiple execution providers from CPU to specialized GPUs.
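
As a rough illustration of what "multiple execution providers" means in practice, the sketch below picks providers in order of preference and falls back to CPU. The helper names `choose_providers` and `make_session` are hypothetical; `onnxruntime.get_available_providers()` and the `providers` argument of `onnxruntime.InferenceSession` are part of the ONNX Runtime Python API, and the model path is whatever ONNX file you have. This is not how Supabase wires it up internally, which the excerpt does not describe.

```python
DEFAULT_PREFERENCE = ("CUDAExecutionProvider", "CPUExecutionProvider")


def choose_providers(available, preferred=DEFAULT_PREFERENCE):
    """Return the preferred providers that are actually available, in order.

    Falls back to the CPU provider if none of the preferred ones is present.
    """
    chosen = [name for name in preferred if name in available]
    return chosen or ["CPUExecutionProvider"]


def make_session(model_path):
    """Create an inference session using the best available providers."""
    # Imported lazily so choose_providers can be used without onnxruntime installed.
    import onnxruntime as ort

    providers = choose_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)


if __name__ == "__main__":
    # On a CPU-only machine only the CPU provider is selected.
    print(choose_providers(["CPUExecutionProvider"]))
```

ONNX Runtime tries the listed providers in order, so putting the GPU provider first and the CPU provider last gives a graceful fallback for operators the GPU provider does not support.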
Deep Learning in JavaScript
11 projects | news.ycombinator.com | 28 Mar 2024

tfjs is dead, looking at the commit history. The standard now is to convert PyTorch to onnx, then use onnxruntime (https://github.com/microsoft/onnxruntime/tree/main/js/web) to run the model in the browser.
FLaNK Stack 05 Feb 2024
49 projects | dev.to | 5 Feb 2024
Vcc – The Vulkan Clang Compiler
9 projects | news.ycombinator.com | 9 Jan 2024

- slang[2] has the potential, but the meta programming part is not as strong as C++, existing libraries cannot be used.
The above conclusion is drawn from my work on https://github.com/microsoft/onnxruntime/tree/dev/opencl; it is a pure nightmare to work with those drivers and JIT compilers. Hopefully Vcc can take compute shaders more seriously.
[1]: https://www.circle-lang.org/
Oracle-samples/sd4j: Stable Diffusion pipeline in Java using ONNX Runtime
2 projects | news.ycombinator.com | 1 Jan 2024

I did. It depends on what you want. For an overview of how ONNX Runtime works, Microsoft have a bunch of things on https://onnxruntime.ai, but the Java content is a bit lacking on there as I've not had time to write much. Eventually I'll probably write something similar to the C# SD tutorial they have on there but for the Java API.
For writing ONNX models from Java we added an ONNX export system to Tribuo in 2022 which can be used by anything on the JVM to export ONNX models in an easier way than writing a protobuf directly. Tribuo doesn't have full coverage of the ONNX spec, but we're happy to accept PRs to expand it, otherwise it'll fill out as we need it.
Mamba-Chat: A Chat LLM based on State Space Models
6 projects | /r/LocalLLaMA | 7 Dec 2023
VectorDB: Vector Database Built by Kagi Search
9 projects | news.ycombinator.com | 26 Nov 2023

What about models besides GPT? Most of the popular vector encoding models aren't using this architecture.
If you really didn't want PyTorch/Transformers, you could consider exporting your models to ONNX (https://github.com/microsoft/onnxruntime).
ONNX runtime: Cross-platform accelerated machine learning
1 project | /r/hackernews | 27 Jul 2023
Onnx Runtime: “Cross-Platform Accelerated Machine Learning”
1 project | /r/hypeurls | 27 Jul 2023

5 projects | news.ycombinator.com | 25 Jul 2023

What are some alternatives?

When comparing kernl and onnxruntime you can also consider the following projects:

openai-whisper-cpu - Improving transcription performance of OpenAI Whisper for CPU based deployment

onnx - Open standard for machine learning interoperability

flash-attention - Fast and memory-efficient exact attention

onnx-tensorrt - ONNX-TensorRT: TensorRT backend for ONNX

diffusers - 🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

onnx-simplifier - Simplify your onnx model

stable-diffusion-webui - Stable Diffusion web UI

ONNX-YOLOv7-Object-Detection - Python scripts performing object detection using the YOLOv7 model in ONNX.

BentoML - The most flexible way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Inference Graph/Pipelines, Compound AI systems, Multi-Modal, RAG as a Service, and more!

onnx-tensorflow - Tensorflow Backend for ONNX

deepsparse - Sparsity-aware deep learning inference runtime for CPUs

MLflow - Open source platform for the machine learning lifecycle
