GPUs for Deep Learning in 2023 – An In-depth Analysis

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • PyTorch

    Tensors and Dynamic neural networks in Python with strong GPU acceleration

    Not exactly an answer to your question, but from a PyTorch standpoint, there are still many operations that are not supported on MPS [1]. I can't recall a circumstance where an architecture I wanted to train was fully supported on MPS, so some of the training ends up happening on the CPU at that point.

    [1] https://github.com/pytorch/pytorch/issues/77764
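For ops that MPS doesn't yet support, PyTorch can fall back to the CPU automatically instead of raising an error. A minimal sketch, assuming PyTorch >= 1.13 (the fallback is opt-in via an environment variable that must be set before `torch` is imported):

```python
import os

# Opt in to CPU fallback for operations the MPS backend doesn't implement
# (PyTorch >= 1.13). Without this, unsupported ops raise NotImplementedError.
# Must be set BEFORE importing torch.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

# import torch  # uncomment in a real script, after the env var is set
# device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
```

Note the fallback keeps training running but silently moves the unsupported ops to the CPU, which is exactly the slowdown described above.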

  • TransformerEngine

    A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.

    Would be curious to see your benchmarks. Btw, Nvidia will be providing support for fp8 in a future release of CUDA - https://github.com/NVIDIA/TransformerEngine/issues/15

    I think TMA may not matter as much for consumer cards given the disproportionate amount of fp32 / int32 compute that they have.

    Would be interesting to see how close to theoretical folks are able to get once CUDA support comes through.

  • nanoGPT

    The simplest, fastest repository for training/finetuning medium-sized GPTs.

    Can you please run bench from https://github.com/karpathy/nanoGPT ?

  • Whisper

    High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model (by Const-me)

    My implementation of Whisper uses slightly over 4GB VRAM running their large multilingual model: https://github.com/Const-me/Whisper
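That ~4 GB figure is plausible on a back-of-the-envelope check (my numbers, not from the repo): Whisper large has roughly 1.55 billion parameters, so at FP16 the weights alone account for most of the footprint:

```python
# Rough VRAM estimate for Whisper large, assuming ~1.55e9 parameters
# stored at FP16 (2 bytes each). Weights only; activations, KV caches,
# and framework buffers come on top.
params = 1.55e9
weights_gib = params * 2 / 2**30
print(f"{weights_gib:.1f} GiB")  # ~2.9 GiB
```

That leaves roughly a gigabyte for activations and buffers, consistent with "slightly over 4 GB".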

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
