| | llm-awq | localsend |
|---|---|---|
| Mentions | 7 | 64 |
| Stars | 1,902 | 35,812 |
| Growth | 10.9% | 6.1% |
| Activity | 8.0 | 9.7 |
| Last commit | 8 days ago | 5 days ago |
| Language | Python | Dart |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
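The exact formulas behind these numbers are not given here, so the following is only a minimal sketch under stated assumptions: growth is taken as the percentage change in stars between two monthly snapshots, and activity as a 0 to 10 score derived from a project's percentile rank on a recency-weighted commit count. The function names, the 30-day half-life, and the sample inputs are all hypothetical and chosen for illustration.

```python
from bisect import bisect_left


def monthly_growth(stars_prev: int, stars_now: int) -> float:
    """Month-over-month star growth, in percent."""
    if stars_prev <= 0:
        raise ValueError("stars_prev must be positive")
    return (stars_now - stars_prev) / stars_prev * 100.0


def weighted_commits(commit_ages_days, half_life_days=30.0):
    """Recency-weighted commit count.

    Assumption: a commit's weight halves every `half_life_days`,
    so recent commits count more than older ones.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_score(project_value, all_values):
    """Map a weighted commit count to a 0-10 score by percentile rank.

    Under this assumed mapping, a score of 9.0 means the project ranks
    above 90% of the tracked projects, i.e. it is in the top 10%.
    """
    ranked = sorted(all_values)
    below = bisect_left(ranked, project_value)  # projects strictly below
    return round(10.0 * below / len(ranked), 1)


if __name__ == "__main__":
    # Hypothetical snapshots: 1,000 stars last month, 1,109 this month.
    print(f"growth: {monthly_growth(1000, 1109):.1f}%")
    # Hypothetical commit ages (in days) for one project.
    print(f"weighted commits: {weighted_commits([0, 30]):.2f}")
    # Hypothetical population of ten tracked projects with values 0..9.
    print(f"activity: {activity_score(9, range(10))}")
```

Ranking by percentile rather than by raw commit count keeps the score relative to the tracked population, which matches the description of activity as a relative number.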
llm-awq
-
TinyChat: Large Language Model on the Edge
TinyChat is an efficient, lightweight, Python-native serving framework for 4-bit LLMs quantized with AWQ. It delivers a 2.3x generation speedup on an RTX 4090.
Code: https://github.com/mit-han-lab/llm-awq/tree/main/tinychat
- FLaNK Stack Weekly 23 Oct 2023
-
New base model InternLM 7B weights released, with 8k context window.
I am having trouble finding any 8-bit GPTQ models at all. There don't seem to be any on HF; it's almost all 4-bit, with the odd 3-bit version of the big ones. I suspect I will have to make my own for eval purposes, but that's lower priority on my list than finding a 4-bit model that's GPU-friendly but doesn't have such a performance penalty. Looking at AWQ, they have 3-bit and 4-bit versions.
-
Llama33B vs Falcon40B vs MPT30B
Using the currently popular GPTQ, 3-bit quantization hurts performance much more than 4-bit, but there's also AWQ (https://github.com/mit-han-lab/llm-awq) and SqueezeLLM (https://github.com/SqueezeAILab/SqueezeLLM), which are able to manage 3-bit without as much performance drop. I hope to see them used more commonly.
- New hardware-friendly quantization method
-
Activation-Aware Weight Quantization for LLM Compression Outperforms GPTQ
Better quantization would have a direct and meaningful impact for everyone running local LLMs. The technique has already been applied to both Vicuna and the multimodal LLaMA variant LLaVA.
https://github.com/mit-han-lab/llm-awq
-
New quantization method AWQ outperforms GPTQ in 4-bit and 3-bit with 1.45x speedup and works with multimodal LLMs
GitHub: https://github.com/mit-han-lab/llm-awq
localsend
-
The Rise and Fall of 3M's Floppy Disk
I agree and get your point.
But LocalSend has worked well for me. Yes, it requires an app, but if only we could get vendors to bundle that rather than a boatload of bloatware.
I know that it would be too optimistic to hope for Google.
See https://localsend.org/
Spread the word.
-
LocalSend: Open-source, cross-platform file sharing to nearby devices
https://github.com/localsend/localsend
Something to consider, although I'm not sure how much it practically matters.
- How to copy a file between devices?
- Free and Open Source Alternative to Airdrop
-
YouTransfer: Self-hosted file transfer and sharing solution
It works like a charm, and is really easy to use
https://github.com/localsend/localsend
-
How do I share folder between my Linux mint laptops?
Use LocalSend to send files and folders.
I've recently discovered and can recommend LocalSend. It's FOSS, cross-platform, and it just works. More details are on its GitHub page.
- Transferring a video from fire stick to IOS or uploading in general.
- Better way to transfer files between Windows <> Mac?
-
iPhone Copy to PC Shortcut
Edit: This is a very simple implementation, even a POC, for personal use between our own devices at home or on another personal private network. It even works on a hotspot. But there is no HTTPS, so don't use it on non-private networks. LocalSend, as suggested by u/Fede777, seems to be better for every other use.
What are some alternatives?
SqueezeLLM - [ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
warpinator - Share files across the LAN
GPTQ-for-LLaMa - 4 bits quantization of LLaMA using GPTQ
snapdrop - A Progressive Web App for local file sharing
Voyager - An Open-Ended Embodied Agent with Large Language Models
LANDrop - Drop any files to any devices on your LAN.
langchain4j-examples
PairDrop - PairDrop: Local file sharing in your browser. Inspired by Apple's AirDrop. Fork of Snapdrop.
CML_AMP_AI_Text_Summarization_with_Amazon_Bedrock - CML_AMP_AI_Text_Summarization_with_Amazon_Bedrock
syncthing-android - Wrapper of syncthing for Android.
kafka-streams-dashboards - showcases Grafana dashboards for Kafka Stream applications leveraging client JMX metrics.
photon - Photon is a cross-platform file-sharing application built using flutter.