Solutions like Dependabot or Renovate update but don't merge dependencies. You need to do it manually while it could be fully automated! Add a Merge Queue to your workflow and stop caring about PR management & merging. Try Mergify for free. Learn more →
Megatron-LM Alternatives
Similar projects and alternatives to Megatron-LM
-
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
-
TensorRT
NVIDIA® TensorRT™, an SDK for high-performance deep learning inference, includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for inference applications.
-
InfluxDB
Collect and Analyze Billions of Data Points in Real Time. Manage all types of time series data in a single, purpose-built database. Run at any scale in any environment in the cloud, on-premises, or at the edge.
-
-
server
The Triton Inference Server provides an optimized cloud and edge inferencing solution. (by triton-inference-server)
-
DeepLearningExamples
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
-
-
ChatGPT-Siri
Shortcuts for Siri using ChatGPT API gpt-3.5-turbo & gpt-4 model, supports continuous conversations, configure the API key & save chat records. 由 ChatGPT API gpt-3.5-turbo & gpt-4 模型驱动的智能 Siri,支持连续对话,配置API key,配置系统prompt,保存聊天记录。
-
Sonar
Write Clean Python Code. Always.. Sonar helps you commit clean code every time. With over 225 unique rules to find Python bugs, code smells & vulnerabilities, Sonar finds the issues while you focus on the work.
Megatron-LM reviews and mentions
-
Why Did Google Brain Exist?
GPU cluster scaling has come a long way. Just checkout the scaling plot here: https://github.com/NVIDIA/Megatron-LM
-
I asked ChatGPT to rate the intelligence level of current AI systems out there.
Google's PaLM, Facebook's LLaMA, Nvidia's Megatron, I am missing some surely and Apple sure has something cooking as well but these are the big ones, of course none of them are publicly available, but research papers are reputable. All of the ones mentioned should beat GPT-3 although GPT-3.5 (chatGPT) should be bit better and ability to search (Bing) should level the playing field even further, but Google's PaLM with search functionality should be clearly ahead. This is why people are excited about GPT-4, GPT-3 was way ahead of anyone else when it came out but others were able to catch up since, we'll see if GPT-4 will be another bing jump among LLMs.
-
Nvidia Fiscal Q3 2022 Financial Result
Described a collaboration involving NVIDIA Megatron-LM and Microsoft DeepSpeed to create an efficient, scalable, 3D parallel system capable of combining data, pipeline and tensor-slicing-based parallelism.
-
Microsoft and NVIDIA AI Introduces MT-NLG: The Largest and Most Powerful Monolithic Transformer Language NLP Model
Microsoft and NVIDIA present the Megatron-Turing Natural Language Generation model (MT-NLG), powered by DeepSpeed and Megatron, the largest and robust monolithic transformer language model trained with 530 billion parameters.
-
[R] Data Movement Is All You Need: A Case Study on Optimizing Transformers
Nvidia's own implementation of Transformers, i.e, Megatron on NVIDIA's Selene supercomputer (where GPT-3 is possible too) -https://github.com/NVIDIA/Megatron-LM
-
A note from our sponsor - Mergify
blog.mergify.com | 24 Sep 2023
Stats
NVIDIA/Megatron-LM is an open source project licensed under GNU General Public License v3.0 or later which is an OSI approved license.
The primary programming language of Megatron-LM is Python.