Image convolution optimisation strategies.

This page summarizes the projects mentioned and recommended in the original post on /r/OpenCL.

  • Halide

    a language for fast, portable data-parallel computation

  • You might want to write the kernel in Halide to see what a reasonably well-tuned kernel looks like: https://github.com/halide/Halide/blob/master/apps/blur/halide_blur_generator.cpp
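
When comparing an OpenCL kernel against a tuned one, a small CPU reference helps check correctness first. The sketch below is plain C++, not Halide, and is not taken from the linked file. It assumes a 3x3 box filter, computed as a horizontal pass followed by a vertical pass over the valid region only, so the output is two pixels smaller in each dimension. The function name `box_blur_3x3` and the row-major `std::vector<float>` layout are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference 3x3 box blur on a row-major image, valid region only.
// Returns an image of size (width - 2) x (height - 2).
// Hypothetical helper for checking a GPU kernel's output, not a tuned kernel.
std::vector<float> box_blur_3x3(const std::vector<float>& in,
                                std::size_t width, std::size_t height) {
    assert(width >= 3 && height >= 3 && in.size() == width * height);
    const std::size_t out_w = width - 2;
    const std::size_t out_h = height - 2;

    // Horizontal pass: average each pixel with its two right-hand neighbours.
    std::vector<float> tmp(out_w * height);
    for (std::size_t y = 0; y < height; ++y)
        for (std::size_t x = 0; x < out_w; ++x)
            tmp[y * out_w + x] = (in[y * width + x] +
                                  in[y * width + x + 1] +
                                  in[y * width + x + 2]) / 3.0f;

    // Vertical pass: average each intermediate value with the two rows below.
    std::vector<float> out(out_w * out_h);
    for (std::size_t y = 0; y < out_h; ++y)
        for (std::size_t x = 0; x < out_w; ++x)
            out[y * out_w + x] = (tmp[y * out_w + x] +
                                  tmp[(y + 1) * out_w + x] +
                                  tmp[(y + 2) * out_w + x]) / 3.0f;
    return out;
}
```

Splitting the filter into two passes costs 6 additions per output pixel instead of 9 for the direct 3x3 sum. This only works when the filter is separable, as a box filter is.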

  • blislab

    BLISlab: A Sandbox for Optimizing GEMM

  • Efficient matrix multiplications or convolutions on CPU use layered tiling to make the best use of registers, the L1 and L2 caches, the TLB, and the L3 cache (if it exists). This improves speed by over 150x vs. a naive triple for-loop matrix multiplication, and the same thing applies to convolution. See the overview at https://www.cs.utexas.edu/users/flame/laff/pfhp/week3-goto.html and the actual exercises at https://github.com/flame/blislab
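
To make the tiling idea concrete, here is a minimal sketch of a naive triple loop next to a version with a single level of tiling. It is not the BLISlab code: a real implementation adds several nested tile levels, packs the tiles into contiguous buffers, and uses a register-blocked micro-kernel. The function names, the square row-major layout, and the default tile size of 64 are assumptions made for illustration; the best tile size depends on the cache sizes of the target machine.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Naive triple loop: C = A * B for row-major n x n matrices.
std::vector<float> matmul_naive(const std::vector<float>& a,
                                const std::vector<float>& b,
                                std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    std::vector<float> c(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < n; ++k)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
    return c;
}

// One level of tiling: work on tile x tile blocks so that the pieces of
// A, B and C touched by the inner loops are small enough to stay in cache.
// The tile size is a hypothetical default, not a tuned value.
std::vector<float> matmul_tiled(const std::vector<float>& a,
                                const std::vector<float>& b,
                                std::size_t n,
                                std::size_t tile = 64) {
    assert(a.size() == n * n && b.size() == n * n && tile > 0);
    std::vector<float> c(n * n, 0.0f);
    for (std::size_t i0 = 0; i0 < n; i0 += tile)
        for (std::size_t k0 = 0; k0 < n; k0 += tile)
            for (std::size_t j0 = 0; j0 < n; j0 += tile) {
                // Clamp the block edges so n need not be a multiple of tile.
                const std::size_t i1 = std::min(i0 + tile, n);
                const std::size_t k1 = std::min(k0 + tile, n);
                const std::size_t j1 = std::min(j0 + tile, n);
                for (std::size_t i = i0; i < i1; ++i)
                    for (std::size_t k = k0; k < k1; ++k) {
                        const float aik = a[i * n + k];  // reused across the j loop
                        for (std::size_t j = j0; j < j1; ++j)
                            c[i * n + j] += aik * b[k * n + j];
                    }
            }
    return c;
}
```

Both functions compute the same product; only the order in which memory is visited changes. The same blocking can be applied to a convolution by tiling the output image and the filter taps.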

  • maxas

    Discontinued Assembler for NVIDIA Maxwell architecture

  • The same layered tiling is needed on GPUs, as explained in https://github.com/NervanaSystems/maxas/wiki/SGEMM

  • Thanks, here is the code: https://github.com/Omeganx/Image-Convolutaion-OpenCL (I removed the other code I was using so that the repository focuses on the convolution code)
