Our great sponsors
- ONLYOFFICE ONLYOFFICE Docs — document collaboration in your environment
- Sonar - Write Clean Python Code. Always.
- CodiumAI - TestGPT | Generating meaningful tests for busy devs
- InfluxDB - Access the most powerful time series database as a service
-
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Git Clone the deepspeed repo, https://github.com/microsoft/DeepSpeed We need this to finetune without using more vram than any consumer model has. Build Deepspeed, in the deepspeed directory, DS_BUILD_OPS=1 DS_BUILD_AIO=0 pip install .
-
finetune-gpt2xl
Guide: Finetune GPT2-XL (1.5 Billion Parameters) and finetune GPT-NEO (2.7 B) on a single GPU with Huggingface Transformers using DeepSpeed
git clone the finetuning repo https://github.com/Xirider/finetune-gpt2xl go into the finetuning repo, install the rest of the requirements, pip install -r requirements.txt
-
ONLYOFFICE
ONLYOFFICE Docs — document collaboration in your environment. Powerful document editing and collaboration in your app or environment. Ultimate security, API and 30+ ready connectors, SaaS or on-premises
Related posts
- Using --deepspeed requires lots of manual tweaking
- DeepSpeed Hybrid Engine for reinforcement learning with human feedback (RLHF)
- Whether the ML computation engineering expertise will be valuable, is the question.
- I'm Stephen Gou, Manager of ML / Founding Engineer at Cohere. Our team specializes in developing large language models. Previously at Uber ATG on perception models for self-driving cars. AMA!
- Microsoft AI Open-Sources DeepSpeed Chat: An End-To-End RLHF Pipeline To Train ChatGPT-like Models