+1 on this; the real proof would have been testing both models side by side.
It seems it may be published on GitHub [1], according to HuggingFace [2].
[1] https://github.com/microsoft/unilm/tree/master/bitnet
[2] https://huggingface.co/papers/2402.17764
It does result in significant degradation relative to an unquantized model of the same size, but even with simple llama.cpp K-quantization it's still worth it all the way down to 2 bits. The chart in this llama.cpp PR speaks for itself:
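To make the tradeoff concrete, here is a minimal sketch of block-wise k-bit quantization: each block of weights stores one float scale/offset pair plus a small integer code per weight. This is only loosely in the spirit of llama.cpp's K-quants, not their actual on-disk format; block size and the min-max scheme are assumptions for illustration.

```python
def quantize_blockwise(weights, bits=2, block=32):
    # Illustrative min-max quantization per block (NOT llama.cpp's real
    # Q2_K format): map each weight to an integer in [0, 2**bits - 1].
    levels = 2 ** bits - 1
    codes, params = [], []
    for i in range(0, len(weights), block):
        blk = weights[i:i + block]
        lo, hi = min(blk), max(blk)
        scale = (hi - lo) / levels or 1.0   # avoid div-by-zero on flat blocks
        params.append((scale, lo))
        codes.append([round((w - lo) / scale) for w in blk])
    return codes, params

def dequantize(codes, params):
    # Reconstruct an approximation of the original weights.
    out = []
    for blk, (scale, lo) in zip(codes, params):
        out.extend(c * scale + lo for c in blk)
    return out
```

The degradation the comment mentions is exactly the rounding error here: each weight is off by at most half a block scale, and that scale grows as `bits` shrinks.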
https://github.com/ggerganov/llama.cpp/pull/1684#issue-17396...
People were doing this six years ago:
https://github.com/yashkant/quantized-nets
https://github.com/TropComplique/trained-ternary-quantization
https://github.com/buaabai/Ternary-Weights-Network
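The core trick behind ternary-weight schemes like the repos above is small enough to sketch: threshold the weights into {-1, 0, +1} and pick one scale factor per tensor. This is an illustrative sketch, not any of those repos' actual code; `delta_frac=0.7` is the heuristic threshold commonly attributed to the Ternary Weight Networks paper, assumed here.

```python
def ternarize(weights, delta_frac=0.7):
    # Threshold-based ternary quantization: weights whose magnitude is
    # below delta become 0, the rest become +/- one shared scale alpha.
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    delta = delta_frac * mean_abs                      # sparsity threshold
    kept = [abs(w) for w in weights if abs(w) > delta]
    alpha = sum(kept) / len(kept) if kept else 0.0     # per-tensor scale

    def q(w):
        if w > delta:
            return alpha
        if w < -delta:
            return -alpha
        return 0.0

    return [q(w) for w in weights]
```

The resulting tensor has at most three distinct values, so it can be stored in under two bits per weight plus one float for the scale, which is what makes the BitNet-style "1.58-bit" framing possible.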
https://github.com/Stability-AI/StableLM?tab=readme-ov-file#...