vision
TensorRT
| | vision | TensorRT |
|---|---|---|
| Mentions | 19 | 5 |
| Stars | 15,268 | 2,286 |
| Growth | 1.3% | 2.3% |
| Activity | 9.5 | 9.6 |
| Last commit | 5 days ago | 6 days ago |
| Language | Python | Python |
| License | BSD 3-clause "New" or "Revised" License | BSD 3-clause "New" or "Revised" License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
vision
-
Transitioning From PyTorch to Burn
Let's start by defining the ResNet module according to the Residual Network architecture, as replicated[1] by the torchvision implementation of the model we will import. Detailed architecture variants with a depth of 18, 34, 50, 101 and 152 layers can be found in the table below.
-
Reading a DL paper: YOLO summary and discussion
Found relevant code at https://github.com/pytorch/vision + all code implementations here
-
Open discussion and useful links people trying to do Object Detection
* Why doesn't PyTorch have YOLO! https://github.com/pytorch/vision/issues/6341
-
My Neural Net is stuck, I've run out of ideas
Sorry to be annoying, but I thought it would be nice to give you some news as well. I was confused as to why there isn't YOLO in PyTorch; here is why: https://github.com/pytorch/vision/issues/6341
-
[Discussion] Stochastic Depth with BatchNorm ?
My question is more about the variance within batches. If one batch contains both samples that skip a connection and samples that do not ('row' mode in the torchvision implementation), then even if the values are adjusted to preserve the expected value, the variance will be much higher, because in practice we have two distributions (x_n and x_n + f(x_n)/p), which will interfere with the update of the batch normalization statistics. Also, at inference time, all forward passes are computed as x_{n+1} = x_n + f(x_n), which has a different variance. The torchvision implementation also offers a 'batch' mode that somewhat reduces this issue (because the global variance computed this way is the mean of the two distributions' variances rather than the variance of the joint distribution), but it does not seem to be the default mode (and it does not exist at all in the timm implementation).
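To make the 'row' versus 'batch' distinction concrete, here is a minimal pure-Python sketch, not the torchvision or timm code. It assumes each sample's branch output f(x_n) is a single float, uses p as the survival probability (as in the post above), and the function name `stochastic_depth` and its parameters are chosen here for illustration only.

```python
import random
from statistics import fmean, pvariance


def stochastic_depth(branch, survival_prob, mode, rng):
    """Rescale residual-branch outputs f(x_n) for one training batch.

    branch: list with one float per sample (a stand-in for a real tensor).
    survival_prob: probability p of keeping the branch.
    mode: "row" draws one keep/drop decision per sample,
          "batch" draws a single decision shared by the whole batch.
    """
    if mode == "row":
        keeps = [rng.random() < survival_prob for _ in branch]
    elif mode == "batch":
        keeps = [rng.random() < survival_prob] * len(branch)
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    # Kept branches are divided by p so the expected value is unchanged.
    return [value / survival_prob if keep else 0.0
            for value, keep in zip(branch, keeps)]


rng = random.Random(0)
branch = [1.0] * 10_000  # f(x_n) fixed at 1.0 for every sample (assumption)
row_out = stochastic_depth(branch, 0.5, "row", rng)
batch_out = stochastic_depth(branch, 0.5, "batch", rng)

print("row   mean/var within batch:", fmean(row_out), pvariance(row_out))
print("batch mean/var within batch:", fmean(batch_out), pvariance(batch_out))
```

With a constant branch output, 'row' mode mixes zeros and 1/p within the same batch, while 'batch' mode gives every sample the same value, so the extra within-batch variance the post worries about only appears in 'row' mode.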
-
Solution for "RuntimeError: Couldn't load custom C++ ops"
RuntimeError: Couldn't load custom C++ ops. This can happen if your PyTorch and torchvision versions are incompatible, or if you had errors while compiling torchvision from source. For further information on the compatible versions, check https://github.com/pytorch/vision#installation for the compatibility matrix. Please check your PyTorch version with torch.__version__ and your torchvision version with torchvision.__version__ and verify if they are compatible, and if not please reinstall torchvision so that it matches your PyTorch install.
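The version check the message asks for can be sketched without importing either package, which helps when the import itself is what fails. This is a hedged example: the helper names (`major_minor`, `installed_version`, `check_pair`) are hypothetical, and `KNOWN_PAIRS` is only an illustrative excerpt, so the compatibility matrix linked in the error message remains the authoritative source.

```python
from importlib import metadata

# Illustrative excerpt only (torch release -> matching torchvision release).
# The authoritative matrix is the one linked in the error message.
KNOWN_PAIRS = {
    "1.10": "0.11",
    "1.11": "0.12",
    "1.12": "0.13",
    "1.13": "0.14",
    "2.0": "0.15",
    "2.1": "0.16",
}


def major_minor(version):
    """'1.11.0+cu102' -> '1.11' (drops the patch number and local build tag)."""
    public = version.split("+")[0]
    return ".".join(public.split(".")[:2])


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pair(torch_version, torchvision_version):
    """True/False for pairs covered by the excerpt, None when it has no entry."""
    expected = KNOWN_PAIRS.get(major_minor(torch_version))
    if expected is None:
        return None
    return major_minor(torchvision_version) == expected


torch_v = installed_version("torch")
vision_v = installed_version("torchvision")
print("torch:", torch_v, "| torchvision:", vision_v)
if torch_v and vision_v:
    print("compatible according to the excerpt:", check_pair(torch_v, vision_v))
```

Reading the versions from package metadata avoids triggering the failing C++ op load, and a result of `None` simply means the excerpt does not cover that release.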
-
[D] My experience with running PyTorch on the M1 GPU
$ python vgg16-cifar10.py --device "cuda"
torch 1.11.0+cu102
device cuda
Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to data/cifar-10-python.tar.gz
170499072it [00:46, 3628424.66it/s]
Extracting data/cifar-10-python.tar.gz to data
Downloading: "https://github.com/pytorch/vision/archive/v0.11.0.zip" to /home/md/.cache/torch/hub/v0.11.0.zip
Epoch: 001/001 | Batch 0000/1406 | Loss: 2.6563
Epoch: 001/001 | Batch 0100/1406 | Loss: 2.4686
Epoch: 001/001 | Batch 0200/1406 | Loss: 2.1224
Epoch: 001/001 | Batch 0300/1406 | Loss: 2.1879
Epoch: 001/001 | Batch 0400/1406 | Loss: 2.1733
Epoch: 001/001 | Batch 0500/1406 | Loss: 2.2413
Epoch: 001/001 | Batch 0600/1406 | Loss: 2.0518
Epoch: 001/001 | Batch 0700/1406 | Loss: 2.1621
Epoch: 001/001 | Batch 0800/1406 | Loss: 1.9033
Epoch: 001/001 | Batch 0900/1406 | Loss: 1.8379
Epoch: 001/001 | Batch 1000/1406 | Loss: 1.9572
Epoch: 001/001 | Batch 1100/1406 | Loss: 1.8823
Epoch: 001/001 | Batch 1200/1406 | Loss: 1.7978
Epoch: 001/001 | Batch 1300/1406 | Loss: 2.0239
Epoch: 001/001 | Batch 1400/1406 | Loss: 1.8389
Time / epoch without evaluation: 6.75 min <------------------
Epoch: 001/001 | Train: 25.52% | Validation: 26.40% | Best Validation (Ep. 001): 26.40%
Time elapsed: 9.03 min
Total Training Time: 9.03 min
Test accuracy 26.54%
Total Time: 9.48 min
-
Hey, I'm trying to find some materials about object detection in Pytorch but I'm having a hard time finding it.
And there are explanations: the research articles, as well as the blog articles that discuss them. And 99% of the code you'll find is open source:
- The official torchvision has various models described here with their reference papers, and the code of these models is found on the GitHub page.
- You can find an almost-official YOLO implementation on this GitHub page.
-
[D] Efficiently loading videos in PyTorch without extracting frames
Maybe VideoClips? see the discussion here: https://github.com/pytorch/vision/issues/1446.
-
PyTorch 1.10
haha, yes, but that requires you to modify existing code to do so (which isn't always possible!).
There might also be other things you want to do (like adding profiling after each op) that would be tedious to do manually, but can easily be automated with FX (https://pytorch.org/tutorials/intermediate/fx_profiling_tuto...).
Another example is the recent support from torchvision for extracting intermediate feature activations (https://github.com/pytorch/vision/releases/tag/v0.11.0). Like, sure, it was probably possible to refactor all of their code to enable users to specify extracting an intermediate feature, but it's much cleaner to do with FX.
TensorRT
- Learn TensorRT optimization
- I made TensorRT example. I hope this will help beginners. And I also have a question about TensorRT best practice.
- [P] [D] I made TensorRT example. I hope this will help beginners. And I also have a question about TensorRT best practice.
-
[P] 4.5 times faster Hugging Face transformer inference by modifying some Python AST
Have you tried the new Torch-TensorRT compiler from NVIDIA?
-
PyTorch 1.10
You can also quantize your model to FP16 or INT8 using PTQ (post-training quantization), which should give you an additional inference speedup.
Here is a tutorial[2] to leverage TRTorch.
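As a rough illustration of what INT8 post-training quantization does numerically, here is a minimal sketch of symmetric per-tensor quantization with simple max calibration. It is not the TensorRT or Torch-TensorRT API, whose calibrators work differently; the function names and the sample weights are assumptions made for this example.

```python
def calibrate_scale(samples):
    """Max calibration: map the largest observed magnitude to 127."""
    max_abs = max(abs(v) for v in samples)
    return max_abs / 127 if max_abs > 0 else 1.0


def quantize(values, scale):
    """Round to the nearest int8 step and clamp to [-127, 127]."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def dequantize(quantized, scale):
    """Map int8 codes back to approximate float values."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.031, 1.0]  # made-up values for illustration
scale = calibrate_scale(weights)
q = quantize(weights, scale)
restored = dequantize(q, scale)

print("int8 codes:", q)
print("max round-trip error:", max(abs(a - b) for a, b in zip(weights, restored)))
```

The round-trip error stays within half a quantization step for values inside the calibrated range, which is the accuracy cost traded for the faster integer arithmetic.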
What are some alternatives?
torch2trt - An easy to use PyTorch to TensorRT converter
onnxruntime - ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
cutlass - CUDA Templates for Linear Algebra Subroutines
onnx-simplifier - Simplify your onnx model
yolov5 - YOLOv5 in PyTorch > ONNX > CoreML > TFLite
TensorRT - NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
apple_m1_pro_python - A collection of ML scripts to test the M1 Pro MacBook Pro
transformer-deploy - Efficient, scalable and enterprise-grade CPU/GPU inference server for Hugging Face transformer models
jetson - Self-driving AI toy car.
nn - 60 Implementations/tutorials of deep learning papers with side-by-side notes; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), reinforcement learning (ppo, dqn), capsnet, distillation, ...
functorch - functorch is JAX-like composable function transforms for PyTorch.