Orange Pi 5 Plus Koboldcpp Demo (MPT, Falcon, Mini-Orca, Openllama)

This page summarizes the projects mentioned and recommended in the original post on /r/LocalLLaMA

  • babyagi4all

    BabyAGI to run with GPT4All

  • 16 GB fits a 30B model at q3_k_s. Maybe try making it work with an IQ script and have it run overnight! https://github.com/kroll-software/babyagi4all

  • whisper.cpp

    Port of OpenAI's Whisper model in C/C++

  • Kobold uses llama.cpp under the hood, if I remember correctly. That means you need to set the compiler flags for the hardware accelerator you want to use. Unfortunately, there are a bunch of options for that on ARM platforms. I found a good overview here: https://github.com/ggerganov/whisper.cpp/issues/7. whisper.cpp is for running speech-to-text models, but it is made by the same author as llama.cpp and all the compiler flags I found are identical, so it might be worth a shot.

  • mlc-llm

    Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.

  • Vulkan is a low-level API that theoretically has very good performance. You can use mlc-llm to run LLMs on Vulkan-enabled GPUs. Unfortunately, the documentation and driver support from Rockchip are spotty at best.

  • mmdeploy

    OpenMMLab Model Deployment Framework

  • The RK3588 also has an NPU for accelerating neural networks. The bad news is that its API is not supported by any of the inference engines (afaik), but the NPU can directly run models that have been converted to the RKNN format. It is a long shot, but you can find details here.
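
The "16 GB fits 30B q3_k_s" comment earlier in this list can be sanity-checked with rough arithmetic: parameter count times bits per weight, divided by 8, gives the weight size in bytes. The sketch below is only an estimate under stated assumptions. The function name is made up for illustration, and the figure of about 3.5 bits per weight for q3_k_s is an assumed approximation, not a number from the thread.

```python
def estimate_weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB (weights only)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # ASSUMPTION: roughly 3.5 bits per weight for q3_k_s; the real average
    # depends on the model and on how individual tensors are quantized.
    size = estimate_weights_gib(30, 3.5)
    print(f"30B at ~3.5 bits/weight: about {size:.1f} GiB of weights")
```

This counts weights only. The context cache, the runtime, and the operating system also need memory, so the real headroom on a 16 GB board is smaller than the estimate suggests.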

