Llama-cpu Alternatives

Similar projects and alternatives to llama-cpu

text-generation-webui

876 35,862 9.9 Python llama-cpu VS text-generation-webui

A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
llama

184 52,603 8.1 Python llama-cpu VS llama

Inference code for Llama models
WorkOS

workos.com sponsored

The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.
KoboldAI-Client

185 3,344 6.3 Python llama-cpu VS KoboldAI-Client
transformers

175 125,021 10.0 Python llama-cpu VS transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
petals

98 8,661 8.5 Python llama-cpu VS petals

🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
GPTQ-for-LLaMa

75 2,913 8.6 Python llama-cpu VS GPTQ-for-LLaMa

4 bits quantization of LLaMA using GPTQ
DeepSpeed

51 32,550 9.8 Python llama-cpu VS DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
InfluxDB

www.influxdata.com sponsored

Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
FlexGen

39 8,999 3.0 Python llama-cpu VS FlexGen

Running large language models on a single GPU for throughput-oriented scenarios.
awesome-ml

27 1,383 9.1 llama-cpu VS awesome-ml

Curated list of useful LLM / Analytics / Datascience resources
one-click-installers

18 470 8.9 Python llama-cpu VS one-click-installers

Discontinued Simplified installers for oobabooga/text-generation-webui.
bitsandbytes-win-prebuilt

4 75 10.0 llama-cpu VS bitsandbytes-win-prebuilt
wrapyfi-examples_llama

2 128 4.0 Python llama-cpu VS wrapyfi-examples_llama

Inference code for facebook LLaMA models with Wrapyfi support
fickling

7 322 8.6 Python llama-cpu VS fickling

A Python pickling decompiler and static analyzer
llama-int8

6 1,044 3.6 Python llama-cpu VS llama-int8

Quantized inference code for LLaMA models
llama-mps

4 83 3.8 Python llama-cpu VS llama-mps

Experimental fork of Facebooks LLaMa model which runs it with GPU acceleration on Apple Silicon M1/M2
llama

3 35 1.6 llama-cpu VS llama

Inference code for LLaMA models (by gmorenz)
safer_unpickle

1 5 10.0 Python llama-cpu VS safer_unpickle
text-g

1 - - llama-cpu VS text-g
SaaSHub

www.saashub.com sponsored

SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a better llama-cpu alternative or higher similarity.

Suggest an alternative to llama-cpu

llama-cpu reviews and mentions

Posts with mentions or reviews of llama-cpu. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-03-08.

Why is ChatGPT 3.5 API 10x cheaper than GPT3?
2 projects | /r/OpenAI | 8 Mar 2023

You've probably heard, but LLaMA just released, and its 13B parameter model outperforms GPT-3 on most metrics (because they trained it on a lot more data). Someone's already quantized it to 4 and 3 bits and it performs virtually the same. It also apparently performs well on CPUs (several words per second on a 7900X). Running something equivalent to GPT3.5 on a phone is not out that far out.
Fork of Facebook’s LLaMa model to run on CPU
1 project | /r/patient_hackernews | 8 Mar 2023

1 project | /r/hackernews | 8 Mar 2023
Llama-CPU: Fork of Facebooks LLaMa model to run on CPU
1 project | /r/hypeurls | 8 Mar 2023

8 projects | news.ycombinator.com | 7 Mar 2023
[D] Tutorial: Run LLaMA on 8gb vram on windows (thanks to bitsandbytes 8bit quantization)
8 projects | /r/MachineLearning | 7 Mar 2023

I tried to port the llama-cpu version to a gpu-accelerated mps version for macs, it runs, but the outputs are not as good as expected and it often gives "-1" tokens. Any help and contributions on fixing it are welcome!
Facebook LLAMA is being openly distributed via torrents | Hacker News
1 project | /r/GPT3 | 6 Mar 2023

You can run it with only a CPU and 32 gigs of RAM: https://github.com/markasoftware/llama-cpu
[D] Is it possible to run Meta's LLaMA 65B model on consumer-grade hardware?
9 projects | /r/MachineLearning | 4 Mar 2023
Facebook LLAMA is being openly distributed via torrents
15 projects | news.ycombinator.com | 3 Mar 2023

I was able to run 7B on a CPU, inferring several words per second: https://github.com/markasoftware/llama-cpu
A note from our sponsor - WorkOS
workos.com | 26 Apr 2024

The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning. Learn more →