Top 18 C++ Nvidia Projects
Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.
Project mention: Jetson Nano 2GB Issues During Training (Out Of Memory / Process Killed) & Other Questions! | reddit.com/r/JetsonNano | 2021-11-05
I’m trying to do the tutorial where they retrain the neural network to detect fruits (jetson-inference/pytorch-ssd.md at master · dusty-nv/jetson-inference · GitHub)
TensorRT is a C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators.
Project mention: [P] What we learned by making T5-large 2X faster than Pytorch (and any autoregressive transformer) | reddit.com/r/MachineLearning | 2022-05-23
The TensorRT demo from Nvidia heavily optimizes the computation graph (through aggressive kernel fusions), making T5 inference very fast (they report a 10X speedup on small-T5). The trick is that it doesn't use any cache, so it's very fast on short sequences and small models, as it avoids many memory-bound operations by redoing the full computation again and again... but as several users have already found (1, 2, 3, 4, ...), this approach doesn't scale when the computation intensity increases, i.e., when base or large models are used instead of a small one, when generation is done on a moderately long sequence of a few hundred tokens, or when beam search is used instead of greedy search. The graph above shows the same behavior with Onnx Runtime.
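To make the scaling argument in the comment concrete, here is a deliberately simplified sketch that counts how many token positions a decoder processes while generating n tokens. The function names are hypothetical, and the model is an assumption: it ignores the encoder, beam search, and the fact that attention with a cache still reads the whole prefix.

```cpp
#include <cstdint>

// Without a cache, step k re-runs the decoder over all k tokens produced so
// far, so generating n tokens processes 1 + 2 + ... + n positions.
// (Hypothetical helper, not part of TensorRT or ONNX Runtime.)
std::uint64_t positions_without_cache(std::uint64_t n) {
    return n * (n + 1) / 2;
}

// With a key/value cache, each step only processes the newest token,
// so the count grows linearly. (Hypothetical helper.)
std::uint64_t positions_with_cache(std::uint64_t n) {
    return n;
}
```

Under this rough model the recomputation cost grows quadratically with the number of generated tokens, which fits the comment's observation that the cache-free approach is fine for short outputs but stops paying off at a few hundred tokens.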
The C++ parallel algorithms library.
Project mention: A vision of a multi-threaded Emacs | reddit.com/r/emacs | 2022-05-20
Users should work with higher level primitives like tasks, parallel loops, asynchronous functions etc. Think TBB, Thrust, Taskflow, lparallel for CL, etc.
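As a rough illustration of the task-level style the comment recommends, the sketch below splits a sum into chunks and hands each chunk to a task instead of managing threads by hand. It uses only the standard library (`std::async` and `std::future`) rather than TBB, Thrust, or Taskflow, and `parallel_sum` is a hypothetical helper name. On some toolchains it may need to be built with thread support enabled (for example `-pthread`).

```cpp
#include <algorithm>
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical helper: sums `data` by launching one asynchronous task per chunk.
long long parallel_sum(const std::vector<int>& data, std::size_t tasks) {
    if (tasks == 0) tasks = 1;
    // Round up so that every element falls into some chunk.
    const std::size_t chunk = (data.size() + tasks - 1) / tasks;

    std::vector<std::future<long long>> parts;
    for (std::size_t begin = 0; begin < data.size(); begin += chunk) {
        const std::size_t end = std::min(begin + chunk, data.size());
        // Each task sums its own slice and returns the result through a future.
        parts.push_back(std::async(std::launch::async, [&data, begin, end] {
            const auto first = data.begin() + static_cast<std::ptrdiff_t>(begin);
            const auto last = data.begin() + static_cast<std::ptrdiff_t>(end);
            return std::accumulate(first, last, 0LL);
        }));
    }

    // Collect partial results; get() blocks until the task has finished.
    long long total = 0;
    for (auto& part : parts) total += part.get();
    return total;
}
```

The caller never touches a thread, a mutex, or a condition variable: the only synchronization point is `get()` on each future. Libraries such as the ones named in the comment offer richer versions of the same idea.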
GameStream client for PCs (Windows, Mac, Linux, and Steam Link)
Project mention: X11 and monitor refresh rates? | reddit.com/r/linux | 2022-05-17
Try checking this thread: https://github.com/moonlight-stream/moonlight-qt/issues/557
StreamFX is a plugin for OBS® Studio which adds many new effects, filters, sources, transitions and encoders - all for free! Be it 3D Transform, Blur, complex Masking, or even custom shaders, you'll find it all here.
Project mention: I'm using the best settings I can find or at least I think so and my video quality is still bad on youtube, how do I fix this.(Really pixelated videos) | reddit.com/r/obs | 2022-05-04
The C++ Standard Library for your entire system.
Project mention: Is it better to learn c or c++ for cuda? | reddit.com/r/CUDA | 2022-04-17
If you are thinking of using new features through https://github.com/NVIDIA/libcudacxx, you'll have to learn C++.
ONNX-TensorRT: TensorRT backend for ONNX
Project mention: [P] [D]How to get TensorFlow model to run on Jetson Nano? | reddit.com/r/MachineLearning | 2021-06-02
The conversion was done from Keras/TensorFlow to ONNX using https://github.com/onnx/keras-onnx, followed by ONNX to TensorRT using https://github.com/onnx/onnx-tensorrt. The Python code used for inference with TensorRT can be found at https://github.com/jonnor/modeld/blob/tensorrt/tensorrtutils.py
Improved fork of Waifu2X C++ using OpenCL and OpenCV
Project mention: [School Days] I wanted to upload some upscaled wallpaper photos from the School Days website to share. When upscaled to a contemporary resolution I think the art style looks really pretty to be honest. | reddit.com/r/visualnovels | 2022-04-15
http://waifu2x.udp.jp/ is another upscaling website if you go over the other website's limit. There are also offline versions like waifu2x-converter-cpp that run directly on your computer and are only limited by your graphics card. I've been a little too obsessed lately and I've even gone so far as to upscale entire anime episodes frame by frame. (Upscaling some 40,000 images can take a day or two depending on how you do it.)
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
Project mention: More than 128 GB of ram | reddit.com/r/linuxhardware | 2022-01-13
I think VRAM swap will be fine for your case. I'm not sure how VRAM can be used exactly as normal RAM, but at least scum Nvidia are claiming that we can map VRAM to RAM, so that's an extra option for some GPUs. https://github.com/NVIDIA/gdrcopy
Thin C++-flavored wrappers for the CUDA APIs
Project mention: Integrating the CUDA APIs (Driver, Runtime, JIT) in pleasant modern-C++ wrappers | news.ycombinator.com | 2022-03-26
Image-processing software for cryo-electron microscopy
Homebrew low level graphics API for Nintendo Switch (Nvidia Tegra X1)
Moonlight port for Nintendo Switch
Project mention: Finally got Moonlight working on Switch. But no controller input? | reddit.com/r/SwitchPirates | 2022-03-20
An interface for Optimus Manager that allows switching GPUs on Optimus laptops.
Project mention: Any GUI Tool for Switching Between Intel and AMD Graphics Easily? | reddit.com/r/linuxquestions | 2021-12-12
I'm just looking for a GUI tool to switch between Intel and AMD graphics. For Nvidia, there are many tools to choose from like optimus-manager-qt and plasma-optimus, but I can't find any equivalent for AMD cards. Any suggestions please?
OpenGL sample on various rendering approaches for typical CAD scenes
Task Manager for Linux for Nvidia graphics cards
Project mention: About the default System Monitor. | reddit.com/r/pop_os | 2021-09-28
Nvidia System Monitor
Achieve peak performance on x86 CPUs and NVIDIA GPUs
Thrust, CUB, TBB, AVX2, CUDA, OpenCL, OpenMP, SYCL - all it takes to sum a lot of numbers fast!
Project mention: Failing to Reach 204 GB/S DDR4 Bandwidth | news.ycombinator.com | 2022-02-02
For the single threaded version, they have a data hazard on the sums that could be smoothed out with a little loop unrolling and separate variables.
But in the [threaded version](https://github.com/unum-cloud/ParallelReductions/blob/fd16d9...) they have separate slots for an accumulator but it's still in a shared vector, which most likely has the issue I described.
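The single-threaded suggestion in the comment (loop unrolling with separate variables) can be sketched as follows. This is an illustration only, not code from the linked repository: the function names and the unroll factor of four are assumptions.

```cpp
#include <cstddef>
#include <vector>

// Baseline: every addition depends on the previous one through `sum`,
// so the additions form one long dependency chain.
float sum_naive(const std::vector<float>& values) {
    float sum = 0.0f;
    for (float v : values) sum += v;
    return sum;
}

// Unrolled by four with independent accumulators: the four additions in one
// iteration do not depend on each other, so the CPU can overlap them.
float sum_unrolled(const std::vector<float>& values) {
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    const std::size_t n = values.size();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += values[i];
        s1 += values[i + 1];
        s2 += values[i + 2];
        s3 += values[i + 3];
    }
    // Handle the leftover elements when n is not a multiple of four.
    float tail = 0.0f;
    for (; i < n; ++i) tail += values[i];
    return (s0 + s1) + (s2 + s3) + tail;
}
```

Because floating-point addition is not associative, the two functions can differ in the last bits for general inputs. That reordering is also why a compiler usually will not make this transformation on its own unless relaxed floating-point flags are enabled.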
C++ Nvidia related posts
X11 and monitor refresh rates?
1 project | reddit.com/r/linux | 17 May 2022
Trying to play on my TV with parsec, but I have multiple problems
2 projects | reddit.com/r/cloudygamer | 15 May 2022
SteamLink Streaming in 4K with full Xbox Controller Support?
1 project | reddit.com/r/Steam_Link | 14 May 2022
Apple Maps location scan spikes WiFi latency every 60 seconds
3 projects | news.ycombinator.com | 12 May 2022
moonlight woes with 4k HDR.
1 project | reddit.com/r/cloudygamer | 8 May 2022
Designing dock for the Deck on a Deck... in the gaming mode
4 projects | reddit.com/r/SteamDeck | 7 May 2022
I made TensorRT example. I hope this will help beginners. And I also have a question about TensorRT best practice.
3 projects | reddit.com/r/learnmachinelearning | 1 May 2022
What are some of the best open-source Nvidia projects in C++? This list will help you: