coriander
- How to run Llama 13B with a 6GB graphics card
-
Is it possible to virtualize a CUDA processor?
It’s not a full implementation of CUDA and requires some contortions to use, but https://github.com/hughperkins/coriander is as good as anything else I’ve tried. It has been a few years, though.
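To give a rough idea of what coriander takes as input: it builds ordinary NVIDIA CUDA source for OpenCL 1.2 devices. Below is a minimal sketch, assuming coriander's `cocl` compiler driver is installed as described in the project README; the file name and sizes are illustrative only.

```cuda
// Minimal CUDA vector add. With coriander installed, the README's cocl
// driver would build this for OpenCL 1.2 devices, roughly:
//   cocl vector_add.cu && ./vector_add
#include <cuda_runtime.h>
#include <cstdio>

__global__ void vector_add(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1024;
    float ha[n], hb[n], hc[n];
    for (int i = 0; i < n; ++i) { ha[i] = (float)i; hb[i] = 2.0f * i; }

    // Allocate device buffers and copy the inputs over.
    float *da, *db, *dc;
    cudaMalloc(&da, n * sizeof(float));
    cudaMalloc(&db, n * sizeof(float));
    cudaMalloc(&dc, n * sizeof(float));
    cudaMemcpy(da, ha, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, n * sizeof(float), cudaMemcpyHostToDevice);

    // One thread per element, 256 threads per block.
    vector_add<<<(n + 255) / 256, 256>>>(da, db, dc, n);

    cudaMemcpy(hc, dc, n * sizeof(float), cudaMemcpyDeviceToHost);
    printf("c[42] = %f\n", hc[42]);  // expect 126.0

    cudaFree(da); cudaFree(db); cudaFree(dc);
    return 0;
}
```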
-
EVGA will no longer make NVIDIA GPUs due to “disrespectful treatment” - Dexerto
It’s possible to run CUDA on anything, and there have been attempts to do this, e.g. https://github.com/hughperkins/coriander. Unfortunately, it seems development has stalled.
What are some alternatives?
mlc-llm - Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
intel-extension-for-pytorch - A Python package that extends the official PyTorch to easily obtain performance gains on Intel platforms
gptq - Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
RadeonClockEnforcer - AHK script that forces maximum clocks while important applications are open. Automates OverdriveNTool's clock/voltage switching functionality for GPU and VRAM, with the purpose of enforcing maximum clocks while whitelisted applications are in focus.
HIPIFY - Convert CUDA to Portable C++ Code [Moved to: https://github.com/ROCm/HIPIFY] (see the sketch after this list)
sparsegpt - Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".
FastChat - An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
llama-cpp-python - Python bindings for llama.cpp
openai-whisper-cpu - Improving transcription performance of OpenAI Whisper for CPU based deployment
llama.cpp - LLM inference in C/C++
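For contrast with coriander's run-CUDA-on-OpenCL approach, HIPIFY instead rewrites CUDA source into HIP C++ that targets AMD GPUs. Below is a minimal sketch of the mechanical cudaX -> hipX renaming that tools like hipify-perl perform; the translations are shown as comments and are illustrative, not exhaustive.

```cuda
// Input CUDA source; comments show what a hipify pass typically emits.
#include <cuda_runtime.h>          // -> #include <hip/hip_runtime.h>

__global__ void scale(float *x, float s, int n) {   // kernel body is unchanged
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= s;
}

int main() {
    const int n = 256;
    float *dx;
    cudaMalloc(&dx, n * sizeof(float));   // -> hipMalloc(&dx, n * sizeof(float));
    scale<<<1, n>>>(dx, 2.0f, n);         // -> hipLaunchKernelGGL(scale, dim3(1), dim3(n), 0, 0, dx, 2.0f, n);
    cudaDeviceSynchronize();              // -> hipDeviceSynchronize();
    cudaFree(dx);                         // -> hipFree(dx);
    return 0;
}
```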