Reference implementation of the Transformer architecture optimized for Apple Neural Engine (ANE)
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
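Much of DeepSpeed's behavior is driven by a JSON configuration passed to `deepspeed.initialize()`. A minimal sketch of such a config follows; the specific values (batch size, ZeRO stage, fp16) are illustrative assumptions, not recommendations.

```python
# Minimal sketch of a DeepSpeed config dict; the exact values here
# (train_batch_size, ZeRO stage, fp16) are assumptions for illustration.
import json

ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},          # half-precision training
    "zero_optimization": {"stage": 2},  # ZeRO stage 2: shard optimizer
                                        # state and gradients across GPUs
}

# The dict (or a path to its JSON form) is handed to deepspeed.initialize(),
# e.g.:
#   model_engine, optimizer, _, _ = deepspeed.initialize(
#       model=model, model_parameters=model.parameters(), config=ds_config)

print(json.dumps(ds_config, indent=2))
```

The same dict can be written out as `ds_config.json` and referenced from the `deepspeed` launcher instead of being built in code.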
I'm following this closely, together with other efforts like GPTQ Quantization and Microsoft's DeepSpeed, all of which are bringing down the hardware requirements of these advanced AI models.
NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
Using --deepspeed requires lots of manual tweaking
3 projects | /r/Oobabooga | 11 May 2023
DeepSpeed Hybrid Engine for reinforcement learning with human feedback (RLHF)
1 project | /r/u_waynerad | 26 Apr 2023
I'm Stephen Gou, Manager of ML / Founding Engineer at Cohere. Our team specializes in developing large language models. Previously at Uber ATG on perception models for self-driving cars. AMA!
1 project | /r/IAmA | 19 Apr 2023
Microsoft AI Open-Sources DeepSpeed Chat: An End-To-End RLHF Pipeline To Train ChatGPT-like Models
1 project | /r/machinelearningnews | 13 Apr 2023
DeepSpeed Chat: Easy, fast and affordable RLHF training of ChatGPT-like models
1 project | /r/patient_hackernews | 12 Apr 2023