An endpoint server for efficiently serving quantized open-source LLMs for code.
Why do you think that https://github.com/ymcui/Chinese-LLaMA-Alpaca is a good alternative to llm-vscode-inference-server?