| | llama.onnx | openvino |
|---|---|---|
| Mentions | 2 | 17 |
| Stars | 324 | 5,996 |
| Growth | - | 4.4% |
| Activity | 7.3 | 10.0 |
| Latest commit | 10 months ago | 5 days ago |
| Language | Python | C++ |
| License | GNU General Public License v3.0 only | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
llama.onnx
- Qnap TS-264: You can find LLM models in the ONNX format here: https://github.com/tpoisonooo/llama.onnx
- Langchain question and answer without openai: You also need an LLM to do this. Please check this out to pick one from the llama family. Other works like llama.onnx, alpaca-native, and the LLaMA models on Hugging Face are also worth checking out.
openvino
- FLaNK Stack 05 Feb 2024
- QUIK is a method for quantizing post-training LLM weights to 4-bit precision
- Intel OpenVINO 2023.1.0 released, open-source toolkit for optimizing and deploying AI inference
- Powering Anomaly Detection for Industry 4.0: Anomalib is an open-source deep learning library developed by Intel that makes it easy to benchmark different anomaly detection algorithms on both public and custom datasets, all by simply modifying a config file. As the largest public collection of anomaly detection algorithms and datasets, it has a strong focus on image-based anomaly detection. It’s a comprehensive, end-to-end solution that includes cutting-edge algorithms, relevant evaluation methods, prediction visualizations, hyperparameter optimization, and inference deployment code with Intel’s OpenVINO Toolkit.
What are some alternatives?
llama.cpp - LLM inference in C/C++
TensorRT - NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
Chinese-LLaMA-Alpaca - Chinese LLaMA & Alpaca large language models with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
deepsparse - Sparsity-aware deep learning inference runtime for CPUs
fastT5 - ⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x.
mediapipe - Cross-platform, customizable ML solutions for live and streaming media.
motorhead - 🧠 Motorhead is a memory and information retrieval server for LLMs.
stable-diffusion - Go to lstein/stable-diffusion for all the best stuff and a stable release. This repository is my testing ground and it's very likely that I've done something that will break it.
AST-1 - Join the movement led by IZX.ai to create the world's best open-source LLM.
neural-compressor - SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
llama2.openvino - This sample shows how to implement a llama-based model with OpenVINO runtime
nebuly - The user analytics platform for LLMs