You might want to create a kernel in Halide to check a reasonable tuned kernel: https://github.com/halide/Halide/blob/master/apps/blur/halide_blur_generator.cpp
Efficient matrix multiplications or convolutions on CPU use layered tiling to optimize for registers, L1, L2, TLB, and L3 cache (if it exists). This improves speed by over 150x versus a naive triple for-loop matrix multiplication, and the same applies to convolution. See the overview at https://www.cs.utexas.edu/users/flame/laff/pfhp/week3-goto.html and hands-on exercises at https://github.com/flame/blislab
As explained in https://github.com/NervanaSystems/maxas/wiki/SGEMM, you need to do the same on GPUs, tiling for shared memory and registers instead of CPU caches.
Thanks, here is the code: https://github.com/Omeganx/Image-Convolutaion-OpenCL (I removed the other code I was using so it stays focused on the convolution code)