New capabilities in the open-source NVIDIA Triton Inference Server software, which provides cross-platform inference for AI models from any framework, and in NVIDIA TensorRT™, which optimizes trained models for high-performance inference on NVIDIA GPUs.
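To make the client side of this concrete, here is a minimal sketch of querying a running Triton server over HTTP with the official `tritonclient` Python package (`pip install tritonclient[http]`). The model name `my_model` and the tensor names and shapes are placeholders; in practice they must match the model's `config.pbtxt` in the server's model repository.

```python
# Minimal Triton HTTP client sketch; model/tensor names are hypothetical.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Build the request: one FP32 input tensor filled with dummy data.
input_tensor = httpclient.InferInput("INPUT0", [1, 3, 224, 224], "FP32")
input_tensor.set_data_from_numpy(
    np.random.rand(1, 3, 224, 224).astype(np.float32)
)

response = client.infer(
    model_name="my_model",  # placeholder; use a model from your repository
    inputs=[input_tensor],
    outputs=[httpclient.InferRequestedOutput("OUTPUT0")],
)
print(response.as_numpy("OUTPUT0").shape)
```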
Described a collaboration between NVIDIA Megatron-LM and Microsoft DeepSpeed to create an efficient, scalable 3D parallel system that combines data, pipeline, and tensor-slicing-based parallelism.
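The idea behind 3D parallelism is that the GPU pool factors into three axes: tensor slicing splits individual layers across GPUs, pipeline parallelism splits the stack of layers into stages, and data parallelism replicates the resulting pipeline. The sketch below is purely illustrative (not Megatron or DeepSpeed code), and the parallel degrees are made-up example values, not the settings used in the actual collaboration.

```python
# Illustrative sketch of the 3D factorization; degrees are example values.
TENSOR_PARALLEL = 8    # tensor-slicing within a node (e.g. 8 GPUs per node)
PIPELINE_PARALLEL = 4  # pipeline stages across nodes
DATA_PARALLEL = 12     # replicas of the whole model pipeline

world_size = TENSOR_PARALLEL * PIPELINE_PARALLEL * DATA_PARALLEL  # 384 GPUs

def rank_to_coords(rank: int) -> tuple[int, int, int]:
    """Map a global rank to (data, pipeline, tensor) grid coordinates."""
    tensor = rank % TENSOR_PARALLEL
    pipeline = (rank // TENSOR_PARALLEL) % PIPELINE_PARALLEL
    data = rank // (TENSOR_PARALLEL * PIPELINE_PARALLEL)
    return data, pipeline, tensor

assert rank_to_coords(0) == (0, 0, 0)
assert rank_to_coords(world_size - 1) == (
    DATA_PARALLEL - 1, PIPELINE_PARALLEL - 1, TENSOR_PARALLEL - 1
)
```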
Tools for developing and deploying large language models: NVIDIA NeMo Megatron, for training models with trillions of parameters; the Megatron 530B customizable LLM, which can be trained for new domains and languages; and NVIDIA Triton Inference Server™ with multi-GPU, multi-node distributed inference functionality.