Calculate tokens/s & GPU memory requirements for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization.
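The GPU memory side of that calculation can be sketched with a rough back-of-envelope estimate: weight bytes at the chosen quantization width plus a fudge factor for activations and KV cache. This is an illustrative approximation under stated assumptions, not gpu_poor's actual formula; the function name and the 20% overhead factor are assumptions for the example.

```python
def estimate_gpu_memory_gib(n_params_billion: float,
                            bits_per_weight: float,
                            overhead_factor: float = 1.2) -> float:
    """Rough VRAM estimate in GiB.

    Weights are stored at `bits_per_weight` (e.g. 16 for fp16, 4 for
    4-bit QLoRA/ggml-style quantization). `overhead_factor` is an assumed
    ~20% margin for activations, KV cache, and runtime buffers.
    """
    weight_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead_factor / 1024**3

# e.g. a 7B-parameter model at 4-bit quantization needs roughly 4 GiB
print(round(estimate_gpu_memory_gib(7, 4), 1))
```

A real calculator also accounts for context length (KV cache grows linearly with it), batch size, and framework-specific overhead, which is why tools like gpu_poor exist rather than a one-line formula.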
Why do you think https://github.com/BruceMacD/chatd is a good alternative to gpu_poor?