YOLOv5 on CPUs: Sparsifying to Achieve GPU-Level Performance

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • deepsparse

    Inference runtime offering GPU-class performance on CPUs and APIs to integrate ML into your application

    Disclosure: I work for Neural Magic.

    Hi deepnotderp, as others have noted, the speeds listed here compare a GPU throughput figure from Ultralytics with a GPU latency figure from Neural Magic. We did also include throughput measurements, where YOLOv5s came in at around 3 ms per image on a V100 at FP16 in our testing. All benchmarks were run on AWS instances for repeatability and availability, which is likely where the 2 ms vs 3 ms discrepancy comes from (slower memory transfer on the AWS machine than on the one Ultralytics used). Note, though, that a slower machine overall will affect the CPU results as well.

    We benchmarked with the available PyTorch APIs, mirroring how Ultralytics ran its benchmarks. The code is open source and available to view and use here: https://github.com/neuralmagic/deepsparse/blob/main/examples...
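
The latency-versus-throughput distinction in the comment above can be illustrated with a small timing sketch. This is not the benchmark code linked above: `fake_model` is a hypothetical stand-in (a sleep with a fixed per-call overhead plus a per-image cost), and the function names, batch size, and run counts are assumptions chosen only for illustration.

```python
import time
from statistics import median


def fake_model(batch):
    # Hypothetical stand-in for a detector forward pass (an assumption, not a
    # real model): a fixed per-call overhead plus a small per-image cost.
    time.sleep(0.002 + 0.0005 * len(batch))
    return [None] * len(batch)


def latency_ms(model, n_runs=20):
    """Median wall-clock time in ms to process one image at batch size 1."""
    times = []
    for _ in range(n_runs):
        start = time.perf_counter()
        model([0])
        times.append((time.perf_counter() - start) * 1000)
    return median(times)


def throughput_ms_per_image(model, batch_size=32, n_runs=5):
    """Median per-image time in ms when images are processed in batches."""
    times = []
    for _ in range(n_runs):
        start = time.perf_counter()
        model([0] * batch_size)
        times.append((time.perf_counter() - start) * 1000 / batch_size)
    return median(times)


if __name__ == "__main__":
    print(f"latency:    {latency_ms(fake_model):.2f} ms per image (batch size 1)")
    print(f"throughput: {throughput_ms_per_image(fake_model):.2f} ms per image (batch size 32)")
```

Because the fixed overhead is amortized across the batch, the per-image time in the batched run comes out lower than the batch-size-1 latency, which is why the two kinds of figures should not be compared directly.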

  • yolov5

    YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

    I'm pretty sure this isn't using the Tensor cores on the GPU.

    According to the README (https://github.com/ultralytics/yolov5/blob/master/README.md), YOLOv5s inference on a V100 should take 2 ms per image, or 500 img/s, not the 44.6 img/s reported here.

    This matters because the reported figure is off by more than an order of magnitude.
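
The arithmetic behind these figures can be checked with a few lines of Python. The inputs are the numbers quoted in the thread (2 ms from the README, 3 ms from Neural Magic's own testing, 44.6 img/s as reported); the helper name `images_per_second` is made up for this sketch.

```python
def images_per_second(ms_per_image):
    """Convert a per-image time in milliseconds to images per second."""
    return 1000.0 / ms_per_image


readme_rate = images_per_second(2.0)        # 2 ms per image from the README
neural_magic_rate = images_per_second(3.0)  # 3 ms per image from Neural Magic's testing
reported_rate = 44.6                        # img/s figure questioned in the comment

# Ratio between the README throughput and the reported figure
gap = readme_rate / reported_rate

if __name__ == "__main__":
    print(f"README:       {readme_rate:.1f} img/s")
    print(f"Neural Magic: {neural_magic_rate:.1f} img/s")
    print(f"README rate is {gap:.1f}x the reported {reported_rate} img/s")
```

The ratio comes to roughly 11x, consistent with the "more than an order of magnitude" claim.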
