text-generation-webui
A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
To set things up, follow the instructions on oobabooga's page, but replace the PyTorch installation line with the nightly build (conda install pytorch torchvision torchaudio -c pytorch-nightly). For some reason, this gives better performance on the Mac in CPU mode.
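A quick way to confirm that the nightly build is the one actually installed is to look at the version string. This is only a rough sketch: it assumes nightly builds carry a ".dev" marker in their version (for example 2.1.0.dev20230601) and that the package metadata is registered under the name "torch". The helper names is_nightly and installed_torch_version are made up for illustration.

```python
from importlib import metadata


def is_nightly(version: str) -> bool:
    """Heuristic: assume nightly builds have '.dev' in the version string."""
    return ".dev" in version


def installed_torch_version():
    """Return the installed torch version string, or None if it is not installed."""
    try:
        # Assumption: the distribution is registered as "torch".
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_torch_version()
    if version is None:
        print("PyTorch is not installed in this environment")
    elif is_nightly(version):
        print(f"Nightly build detected: {version}")
    else:
        print(f"Stable build detected: {version}")
```

Run it inside the same conda environment that the web UI uses, otherwise it reports on a different install.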
There is also some hope of using the GPU on the M1/M2. I did some testing and got it hooked up, with some caveats: not all PyTorch functions are mapped to work properly with the new MPS functionality Apple has provided so far. Both PyTorch and Apple appear to be working on this, so it should improve. The memory requirements of loading the models with GPU functionality also seem to be very high. That could be a side effect of the prototyping I did, but I'm not sure. If you're interested, more detail can be found here.
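For anyone who wants to try the GPU path, here is a minimal sketch of how a script might choose between MPS and CPU. It is not the prototype described above, just an illustration. It uses torch.backends.mps.is_built() and torch.backends.mps.is_available(), and sets PYTORCH_ENABLE_MPS_FALLBACK so that operations without an MPS implementation can fall back to the CPU. The function name pick_device is made up, and the sketch returns "cpu" if PyTorch is missing or too old to have an MPS backend.

```python
import os

# Let ops that have no MPS implementation fall back to the CPU.
# Assumption: this has to be set before torch is imported to take effect.
os.environ.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")


def pick_device() -> str:
    """Return "mps" when the Apple GPU backend is usable, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older PyTorch builds have no torch.backends.mps at all.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_built() and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print(f"Using device: {pick_device()}")
```

The returned string can be passed to the usual .to(device) calls. Given the memory usage mentioned above, it is probably worth testing with a small model first.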