DeepSpeed Alternatives
Similar projects and alternatives to DeepSpeed
- fairscale: PyTorch extensions for high performance and large-scale training.
- accelerate: 🚀 A simple way to train and use PyTorch models with multi-GPU, TPU, and mixed precision.
- TensorRT: NVIDIA® TensorRT™, an SDK for high-performance deep learning inference, includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for inference applications.
- mesh-transformer-jax: Model parallel transformers in JAX and Haiku.
- text-generation-webui: A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), and Llama models.
- petals: 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading.
- gpt-neox: An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
- Pytorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration.
- ollama: Get up and running with Llama 2 and other large language models locally.
- AnimatedDrawings: Code to accompany "A Method for Animating Children's Drawings of the Human Figure".
- FlexGen: Running large language models on a single GPU for throughput-oriented scenarios.
- diffusers: 🤗 State-of-the-art diffusion models for image and audio generation in PyTorch (by ShivamShrirao).
- phasellm: Large language model evaluation and workflow framework from Phase AI.
DeepSpeed reviews and mentions
- Why hasn't async gradient update become popular in the LLM community?
- A comprehensive guide to running Llama 2 locally
While on the surface a 192GB Mac Studio seems like a great deal (it's not much more than a 48GB A6000!), there are several reasons why it might not be a good idea:
* I assume most people have never used llama.cpp Metal w/ large models. It will drop to CPU speeds whenever the context window is full: https://github.com/ggerganov/llama.cpp/issues/1730#issuecomm... - sure, this might be fixed in the future, but it's been an issue since Metal support was added, and it's a significant problem if you're actually trying to use it for inference. With 192GB of memory you could probably run larger models w/o quantization, but I've never seen anyone post benchmarks of their experience. Note that at that point, the limited memory bandwidth will be a big factor.
* If you are planning on using Apple Silicon for ML/training, I'd also be wary. There are multi-year-old open bugs in PyTorch[1], and most major LLM libs like deepspeed, bitsandbytes, etc. don't have Apple Silicon support[2][3] (see the capability-check sketch after these links).
You can see similar patterns w/ Stable Diffusion support[4][5]: support lagging by months, lots of problems and poor performance with inference, much less fine-tuning. You can apply this to basically any ML application you want (stt, tts, video, etc.).
Macs are fine to poke around with, but if you actually plan to do more than run a small LLM and say "neat", especially for a business, recommending a Mac to anyone getting started w/ ML workloads is a bad take. (In general, for anyone getting started, unless you're just burning budget, renting cloud GPUs is going to be the best cost/perf, although on-prem/local obviously has other advantages.)
[1] https://github.com/pytorch/pytorch/issues?q=is%3Aissue+is%3A...
[2] https://github.com/microsoft/DeepSpeed/issues/1580
[3] https://github.com/TimDettmers/bitsandbytes/issues/485
[4] https://github.com/AUTOMATIC1111/stable-diffusion-webui/disc...
[5] https://forums.macrumors.com/threads/ai-generated-art-stable...
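To make the PyTorch point concrete, here is a minimal sketch (mine, not the commenter's) of the runtime capability check worth doing before committing to Apple Silicon. PyTorch exposes flags for its MPS (Metal) backend; note that ops unimplemented on MPS raise errors unless you opt into CPU fallback via the PYTORCH_ENABLE_MPS_FALLBACK environment variable.

```python
# Minimal device-selection sketch; assumes a reasonably recent PyTorch build.
import torch

def pick_device() -> torch.device:
    """Prefer CUDA, then Apple's MPS backend, then plain CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    # is_built() checks compile-time support; is_available() checks the runtime.
    if torch.backends.mps.is_built() and torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()
x = torch.randn(1024, 1024, device=device)
print(device, (x @ x).sum().item())
```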
- Microsoft Research proposes new framework, LongMem, allowing for unlimited context length along with reduced GPU memory usage and faster inference speed. Code will be open-sourced
And https://github.com/microsoft/deepspeed
- DeepSpeed Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales (April 2023): https://github.com/microsoft/DeepSpeed/tree/master/blogs/deepspeed-chat
- Using --deepspeed requires lots of manual tweaking
Filed a discussion item on the deepspeed project: https://github.com/microsoft/DeepSpeed/discussions/3531
Solution: I don't know; this is where I am stuck. https://github.com/microsoft/DeepSpeed/issues/1037 suggests that I just need to 'apt install libaio-dev', but I've done that and it doesn't help.
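As an illustration of the kind of manual tweaking involved, here is a hedged sketch under my own assumptions, not the poster's actual setup: the libaio dependency comes from DeepSpeed's async_io op, which is only exercised for NVMe offload, so a config that offloads optimizer state to CPU instead can sidestep it.

```python
# Illustrative-only config; key names follow the DeepSpeed config schema,
# but the values are placeholder assumptions, not a known-working recipe.
import torch
import deepspeed

model = torch.nn.Linear(512, 512)  # stand-in for a real model

ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 8,
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,
        # "device": "nvme" is the path that exercises async_io (libaio);
        # CPU offload avoids it at the cost of host-RAM pressure.
        "offload_optimizer": {"device": "cpu"},
    },
}

engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```

If you genuinely need NVMe offload, libaio-dev must be present when the async_io op is compiled; since DeepSpeed JIT-builds ops on first use, installing the header and then forcing a rebuild (e.g. with DS_BUILD_AIO=1, if I recall the build flag correctly) may be required rather than just installing the package afterwards.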
- Whether ML computation engineering expertise will remain valuable is the question.
There could be some spectrum of this expertise. For instance: https://github.com/NVIDIA/FasterTransformer, https://github.com/microsoft/DeepSpeed
- FLiPN-FLaNK Stack Weekly for 17 April 2023
- DeepSpeed Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-Like Models
Stats
microsoft/DeepSpeed is an open source project licensed under the Apache License 2.0, an OSI-approved license.
The primary programming language of DeepSpeed is Python.