- Torch-Pruning: [CVPR 2023] Towards Any Structural Pruning (LLMs / SAM / Diffusion / Transformers / YOLOv8 / CNNs)
I also tried structured pruning from https://github.com/VainF/Torch-Pruning, since they report EfficientNetV2 as "prunable", but obtained much worse results. The advantage of this approach, however, is that it keeps the model dense, so you get a real speed-up on common GPUs, whereas unstructured pruning sparsifies the model and you need hardware that can exploit that sparsity.
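To make the dense-vs-sparse distinction concrete, here is a minimal NumPy sketch (not Torch-Pruning's actual API) of the two pruning styles on a single weight matrix. The channel count, sparsity level, and L2-norm importance criterion are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))  # a dense layer: 8 output channels, 16 inputs

# Structured pruning: drop whole output channels (rows), here ranked by L2 norm.
# The surviving weights form a smaller *dense* matrix, so ordinary GPU kernels
# get a real speed-up without any special sparse support.
norms = np.linalg.norm(W, axis=1)
keep = np.sort(np.argsort(norms)[4:])  # keep the 4 highest-norm channels
W_structured = W[keep]
print(W_structured.shape)  # (4, 16): smaller dense matrix

# Unstructured pruning: zero roughly the smallest 50% of individual weights.
# The shape is unchanged; the zeros only pay off on sparsity-aware hardware.
thresh = np.median(np.abs(W))
W_unstructured = np.where(np.abs(W) >= thresh, W, 0.0)
print(W_unstructured.shape)  # (8, 16): same shape, just sparse
```

In real use, structured pruning also has to propagate each removed channel to every layer that consumes it (which is what Torch-Pruning's dependency graph automates); this sketch only shows the single-layer effect.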
NOTE: The number of mentions on this list reflects mentions in common posts plus user-suggested alternatives; a higher number therefore means a more popular project.
Related posts
- Llama33B vs Falcon40B vs MPT30B
- Has anyone tried out Squeezellm?
- SqueezeLLM: Dense-and-Sparse Quantization
- New quantization method SqueezeLLM allows for lossless compression at 3-bit and outperforms GPTQ and AWQ in both 3-bit and 4-bit. Quantized Vicuna and LLaMA models have been released.
- [R] 🤖🌟 Unlock the Power of Personal AI: Introducing ChatLLaMA, Your Custom Personal Assistant! 🚀💬