-
nvitop
An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
-
DeepView.Profile
Interactive performance profiling and debugging tool for PyTorch neural networks.
-
zenith
Zenith - sort of like top or htop but with zoom-able charts, CPU, GPU, network, and disk usage
That's why the authors recommend pipx for installing nvitop. I am not a sysadmin, but I prefer pipx over relying on the (often outdated) distro sources.
https://github.com/XuehaiPan/nvitop?tab=readme-ov-file#insta...
You can also profile AI/ML performance without actually running it: https://github.com/CentML/DeepView.Profile
There's also asitop https://github.com/tlkh/asitop
It's not a terminal app like bottom or nvtop, but I use https://github.com/exelban/stats and it has iGPU stats.
Now that I use Home Assistant, I want all my data sources to plug into there. It can handle the rendering for me as I see fit, and it's where data comes to integrate.
It's one of those things which I wish existed, but I can't imagine anyone would have written. Until I do a web search.
https://github.com/koriwi/sensors2mqtt/tree/main
I have not used it yet, but that seems like how I'd want to do it.
My favorite would be gpustat [1]. It shows the bare minimum of information I need to tell whether training is running well or has problems.
[1] https://github.com/wookayin/gpustat
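For anyone who wants that kind of bare-minimum view without installing anything extra, here is a rough Python sketch that shells out to `nvidia-smi` and prints one line per GPU. To be clear, this is not how gpustat itself works internally. The helper names (`parse_gpu_csv`, `query_gpus`, `summarize`) are made up for illustration, and the parser assumes every field comes back numeric, which may not hold on hardware that reports values like `[N/A]`.

```python
import csv
import io
import subprocess

# Properties requested from nvidia-smi via --query-gpu.
FIELDS = ["index", "name", "utilization.gpu", "memory.used", "memory.total"]


def parse_gpu_csv(text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into dicts.

    Assumes all numeric fields are plain integers (no "[N/A]" values).
    """
    rows = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not record:
            continue  # skip blank lines
        index, name, util, used, total = (field.strip() for field in record)
        rows.append({
            "index": int(index),
            "name": name,
            "util_pct": int(util),
            "mem_used_mib": int(used),
            "mem_total_mib": int(total),
        })
    return rows


def query_gpus():
    """Run nvidia-smi (must be on PATH) and return one dict per GPU."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(out)


def summarize(rows):
    """Format one short status line per GPU."""
    return [
        f"[{r['index']}] {r['name']} | {r['util_pct']:3d}% | "
        f"{r['mem_used_mib']} / {r['mem_total_mib']} MiB"
        for r in rows
    ]
```

On a machine with an NVIDIA driver installed, `print("\n".join(summarize(query_gpus())))` should give a compact per-GPU readout. The parsing is kept separate from the subprocess call so it can be tried on a pasted sample line without a GPU.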