GPUs for Deep Learning in 2023 – An In-depth Analysis

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • PyTorch

    Tensors and Dynamic neural networks in Python with strong GPU acceleration

  • Not exactly an answer to your question, but from a PyTorch standpoint there are still many operations that are not supported on MPS [1]. I can't recall a circumstance where an architecture I wanted to train was fully supported on MPS, so some of the training ends up happening on the CPU at that point.

    [1] https://github.com/pytorch/pytorch/issues/77764
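    When an op is missing on MPS, PyTorch can be told to fall back to the CPU op-by-op instead of raising an error, via the `PYTORCH_ENABLE_MPS_FALLBACK` environment variable. A minimal sketch of that setup (the `pick_device` helper is a hypothetical name for illustration, not part of PyTorch):

    ```python
    import os

    # Must be set before torch is imported: tells PyTorch to run ops the MPS
    # backend doesn't implement on the CPU instead of raising NotImplementedError.
    os.environ.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")

    def pick_device():
        """Prefer MPS on Apple silicon, then CUDA, then CPU (hypothetical helper)."""
        import torch  # imported here so the env var above is already in effect
        if torch.backends.mps.is_available():
            return torch.device("mps")
        if torch.cuda.is_available():
            return torch.device("cuda")
        return torch.device("cpu")
    ```

    Note the fallback silently moves tensors between devices per-op, so it keeps training running but can hide significant slowdowns, which matches the experience described above.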

  • TransformerEngine

    A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.

  • Would be curious to see your benchmarks. Btw, Nvidia will be providing support for fp8 in a future release of CUDA - https://github.com/NVIDIA/TransformerEngine/issues/15

    I think TMA may not matter as much for consumer cards given the disproportionate amount of fp32 / int32 compute that they have.

    Would be interesting to see how close to theoretical folks are able to get once CUDA support comes through.
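    For context on what FP8 trades away: both FP8 formats halve storage relative to FP16, but split the remaining 7 bits between exponent and mantissa differently. A stdlib-only sketch of the largest finite values of the two formats, derived from the published E4M3/E5M2 encodings (this is plain arithmetic, not TransformerEngine code):

    ```python
    # Largest finite values of the two FP8 formats used on Hopper/Ada tensor
    # cores, per the NVIDIA/Arm/Intel FP8 proposal.

    # E4M3: 4 exponent bits, 3 mantissa bits, no infinities. Dropping inf frees
    # the top exponent (2**8) for finite values; the all-ones mantissa there
    # encodes NaN, so the largest usable mantissa is 0b1.110 = 1.75.
    e4m3_max = 2**8 * 1.75    # 448.0

    # E5M2: 5 exponent bits, 2 mantissa bits, IEEE-754-style (keeps inf/NaN),
    # so the largest finite value pairs exponent 2**15 with mantissa 0b1.11 = 1.75.
    e5m2_max = 2**15 * 1.75   # 57344.0

    print(e4m3_max, e5m2_max)
    ```

    E4M3's tight range (max 448) is why training with it typically needs the per-tensor scaling machinery that TransformerEngine manages, while E5M2's wider range (max 57344) is usually reserved for gradients.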

  • nanoGPT

    The simplest, fastest repository for training/finetuning medium-sized GPTs.

  • Can you please run the bench from https://github.com/karpathy/nanoGPT?

  • Whisper

    High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model (by Const-me)

  • My implementation of Whisper uses slightly over 4GB VRAM running their large multilingual model: https://github.com/Const-me/Whisper

NOTE: The number of mentions on this list reflects mentions in common posts plus user-suggested alternatives, so a higher number indicates a more popular project.

Related posts