nn vs torch2trt
| | nn | torch2trt |
|---|---|---|
| Mentions | 26 | 5 |
| Stars | 47,503 | 4,376 |
| Growth | 7.6% | 1.4% |
| Activity | 7.7 | 3.1 |
| Latest commit | 27 days ago | 25 days ago |
| Language | Jupyter Notebook | Python |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
nn
- Can't remember name of website that has explanations side-by-side with code
  Hey are you talking about https://nn.labml.ai/?
- [D] Recent ML papers to implement from scratch
- [P] GPT-NeoX inference with LLM.int8() on 24GB GPU: Implementation & LM Eval Harness Results
- [P] Fine-tuned the GPT-NeoX Model to Generate Quotes
  GitHub: https://github.com/labmlai/annotated_deep_learning_paper_implementations/tree/master/labml_nn/neox
- Best resources to learn recent transformer papers and stay updated [D]
  Regarding implementations this helps me: https://nn.labml.ai/
- Introductory papers to implement
- How to convert research papers to code?
- [D] How to convert papers to code?
  Dunno if this is directly helpful, but this website has implementations with the math side by side: https://nn.labml.ai/
- [D] Looking for open source projects to contribute
- Resource for papers explanation
torch2trt
- [D] How you deploy your ML model?
- PyTorch 1.10
  The main thing you want for server inference is auto-batching. It's a feature that's included in onnxruntime, TorchServe, NVIDIA Triton Inference Server, and Ray Serve.
  If you have a lot of pre- and post-processing logic in your model, it can be hard to export it for onnxruntime or Triton, so I usually recommend starting with Ray Serve (https://docs.ray.io/en/latest/serve/index.html) and using an actor that runs inference with a quantized model or one optimized with TensorRT (https://github.com/NVIDIA-AI-IOT/torch2trt). A minimal deployment sketch along these lines appears after this list.
- Jetson Nano: TensorFlow model. Possibly I should use PyTorch instead?
  https://github.com/NVIDIA-AI-IOT/torch2trt <- pretty straightforward
  https://github.com/jkjung-avt/tensorrt_demos <- this helped me a lot
- How to get TensorFlow model to run on Jetson Nano?
  I find PyTorch easier to work with generally. NVIDIA has a PyTorch -> TensorRT converter that yields significant speedups and has a simple Python API. Convert the PyTorch model on the Nano itself (see the conversion example after this list).
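The converter mentioned in the answers above is used by tracing a model with an example input and getting back a module with the same call signature. The sketch below mirrors the usage documented in the torch2trt README; the resnet18 model, input shape, and fp16_mode setting are illustrative placeholders rather than anything taken from these posts.

```python
import torch
from torch2trt import torch2trt
from torchvision.models import resnet18

# Any PyTorch model works the same way; resnet18 is just a stand-in here.
model = resnet18(pretrained=True).eval().cuda()

# torch2trt traces the model with example input, so the shape must match
# what will be fed at inference time.
x = torch.ones((1, 3, 224, 224)).cuda()

# Build the TensorRT engine. fp16_mode trades a little accuracy for speed
# and is commonly used on Jetson-class devices (an assumption, not required).
model_trt = torch2trt(model, [x], fp16_mode=True)

# The converted module is called exactly like the original one.
y = model(x)
y_trt = model_trt(x)

# Sanity-check that the optimized outputs stay close to the originals.
print(torch.max(torch.abs(y - y_trt)))
```

torch2trt also ships a TRTModule that can reload a saved engine (torch.save(model_trt.state_dict(), ...) followed by TRTModule().load_state_dict(...)), which avoids re-running the conversion every time the device boots.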
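For the Ray Serve suggestion above, here is a minimal sketch of the actor pattern the comment describes, written against the Ray Serve 2.x deployment API (older releases used serve.start() and .deploy() instead of serve.run() and .bind()). The model, input shape, and JSON payload format are assumptions for illustration only.

```python
import torch
from ray import serve
from torch2trt import torch2trt
from torchvision.models import resnet18

# Each replica of the deployment is an actor that owns its own TensorRT engine.
@serve.deployment(num_replicas=1, ray_actor_options={"num_gpus": 1})
class TrtClassifier:
    def __init__(self):
        model = resnet18(pretrained=True).eval().cuda()
        example = torch.ones((1, 3, 224, 224)).cuda()
        # Convert once at startup; the engine is reused for every request.
        self.model_trt = torch2trt(model, [example], fp16_mode=True)

    async def __call__(self, request):
        # Pre/post-processing that is awkward to export stays here in Python.
        # Assumes the request body is JSON with a nested list of shape (1, 3, 224, 224).
        payload = await request.json()
        x = torch.tensor(payload["pixels"], dtype=torch.float32).cuda()
        with torch.no_grad():
            logits = self.model_trt(x)
        return {"top_class": int(logits.argmax(dim=1)[0])}

# Start the deployment and expose it over HTTP.
serve.run(TrtClassifier.bind())
```

Ray Serve's @serve.batch decorator can be layered onto a method like this to get the auto-batching the comment refers to.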
What are some alternatives?
GFPGAN-for-Video-SR - A colab notebook for video super resolution using GFPGAN
TensorRT - PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
labml - 🔎 Monitor deep learning model training and hardware usage from your mobile phone 📱
onnx-simplifier - Simplify your onnx model
functorch - functorch is JAX-like composable function transforms for PyTorch.
Pytorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration
ZoeDepth - Metric depth estimation from a single image
transformer-deploy - Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
onnxruntime - ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
Basic-UI-for-GPT-J-6B-with-low-vram - A repository to run GPT-J-6B on low-VRAM machines (4.2 GB minimum VRAM for a 2,000-token context, 3.5 GB for a 1,000-token context). Model loading takes 12 GB of free RAM.
tensorrt_demos - TensorRT MODNet, YOLOv4, YOLOv3, SSD, MTCNN, and GoogLeNet