Run inference on MPT-30B using CPU
Why do you think https://github.com/vllm-project/vllm is a good alternative to mpt-30B-inference?