16 GB fits a 30B model at Q3_K_S quantization. Maybe try making it work with an IQ script and have it run overnight! https://github.com/kroll-software/babyagi4all
Kobold uses llama.cpp under the hood, if I remember correctly. That means you need to set the compiler flags for the hardware accelerator you want to use. Unfortunately there are a bunch of options for that on ARM platforms. I found a good overview here: https://github.com/ggerganov/whisper.cpp/issues/7. Whisper.cpp is for running speech-to-text models, but it is made by the same author as llama.cpp, and all the compiler flags I found are identical, so it might be worth a shot.
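As a rough sketch of what that looks like when building from source: the exact flags depend on your board and toolchain, and the NEON flags below are the ones discussed in that issue for 32-bit ARM, so treat them as a starting point rather than a known-good recipe.

```shell
# On a 64-bit ARM (aarch64) board like the RK3588, NEON is part of the
# baseline instruction set, so a plain native build usually picks it up:
make clean && make

# On 32-bit ARM you typically have to enable NEON explicitly via CFLAGS
# (flags taken from the whisper.cpp discussion linked above; adjust for
# your CPU and compiler):
make CFLAGS="-mfpu=neon-fp-armv8 -mno-unaligned-access -funsafe-math-optimizations"
```

If the build silently falls back to scalar code, checking the compile log for the NEON flags is a quick way to confirm the acceleration actually got enabled.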
Vulkan is a low-level API that theoretically has very good performance. You can use mlc-llm to run LLMs on Vulkan-enabled GPUs. Unfortunately, the documentation and driver support from Rockchip are spotty at best.
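For what it's worth, a hypothetical invocation might look like the following; the package names, model URL, and flags are illustrative and change between mlc-llm releases, so check the MLC LLM docs for your platform before copying this.

```shell
# Illustrative only: install the mlc-llm nightly wheels, then chat using
# the Vulkan device. The model reference is a placeholder from the MLC
# model collection on Hugging Face.
python -m pip install --pre mlc-llm-nightly mlc-ai-nightly
mlc_llm chat HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC --device vulkan
```

Whether this actually runs on an RK3588 comes down to the Mali Vulkan driver, which is exactly the spotty part mentioned above.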
The RK3588 also has an NPU for accelerating neural networks. The bad news is that its API is not supported by any of the inference engines (AFAIK), but the NPU can directly run models that have been converted to the RKNN format. It is a long shot, but you can find details here.
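To give a feel for what that conversion involves, here is a minimal sketch of the RKNN-Toolkit2 flow for turning an ONNX model into an .rknn file. The call sequence follows Rockchip's documented config/load/build/export pattern, but the file paths and options are placeholders, and the toolkit itself only ships for specific platforms, so treat this as an outline rather than a tested recipe.

```python
# Hypothetical sketch: convert an ONNX model to RKNN for the RK3588 NPU
# using Rockchip's rknn-toolkit2 (paths and options are placeholders).
try:
    from rknn.api import RKNN
except ImportError:
    RKNN = None  # rknn-toolkit2 is only distributed for certain platforms


def convert_onnx_to_rknn(onnx_path: str, rknn_path: str) -> bool:
    """Convert an ONNX model so it can run directly on the RK3588 NPU."""
    if RKNN is None:
        raise RuntimeError("rknn-toolkit2 is not installed")
    rknn = RKNN()
    rknn.config(target_platform="rk3588")       # target the RK3588's NPU
    if rknn.load_onnx(model=onnx_path) != 0:    # import the source model
        return False
    if rknn.build(do_quantization=False) != 0:  # compile for the NPU
        return False
    ok = rknn.export_rknn(rknn_path) == 0       # write the .rknn file
    rknn.release()
    return ok
```

Note this is the generic vision-model path; getting an LLM through it is the long-shot part, since transformer support on the NPU is its own adventure.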