Accelerate PyTorch with Taichi: Data Preprocessing & High-performance ML Operator Customization

This page summarizes the projects mentioned and recommended in the original post on /r/Python

Our great sponsors
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • WorkOS - The modern identity platform for B2B SaaS
  • SaaSHub - Software Alternatives and Reviews
  • RWKV-CUDA

    The CUDA version of the RWKV language model ( https://github.com/BlinkDL/RWKV-LM )

  • This repo introduces an interesting example of customizing an ML operator in CUDA. The author developed an RWKV language model using sort of a one-dimensional depthwise convolution custom operator. The model in itself does not involve large amounts of computation, but still runs slow because PyTorch does not have native support for it. So, the author customized the operator in CUDA and used a set of optimization techniques, such as loop fusion and Shared Memory, achieving a performance 20x better than he did with PyTorch.

  • blog_code

  • Pure PyTorch padding

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts