ctransformers
llama-cpp-python
ctransformers | llama-cpp-python | |
---|---|---|
4 | 55 | |
1,718 | 6,658 | |
- | - | |
8.6 | 9.8 | |
4 months ago | 4 days ago | |
C | Python | |
MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
ctransformers
-
Refact LLM: New 1.6B code model reaches 32% HumanEval and is SOTA for the size
Does ctransformer (https://github.com/marella/ctransformers#supported-models) support running refact?
I see that model type "gpt_refact" in https://huggingface.co/smallcloudai/Refact-1_6B-fim/blob/mai...
-
How do I utilize these quantized models being uploaded?
You can also use ctransformers with the ggml models if you want to use python rather than c++.
-
Langchain and self hosted LLaMA hosted API
For ggml https://github.com/marella/ctransformers/ and https://github.com/abetlen/llama-cpp-python has a decent server. https://github.com/go-skynet/LocalAI is very active too.
- Also reconnecting with Scala. Interested in LLMs
llama-cpp-python
-
Ollama v0.1.33 with Llama 3, Phi 3, and Qwen 110B
There's a Python binding for llama.cpp which is actively maintained and has worked well for me: https://github.com/abetlen/llama-cpp-python
- FLaNK AI for 11 March 2024
-
OpenAI: Memory and New Controls for ChatGPT
I'll share the core bit that took a while to figure out the right format, my main script is a hot mess using embeddings with SentenceTransformer, so I won't share that yet. E.g: last night I did a PR for llama-cpp-python that shows how Phi might be used with JSON only for the author to write almost exactly the same code at pretty much the same time. https://github.com/abetlen/llama-cpp-python/pull/1184
-
TinyLlama LLM: A Step-by-Step Guide to Implementing the 1.1B Model on Google Colab
Python Bindings for llama.cpp
- Mistral-8x7B-Chat
-
Running Mistral LLM on Apple Silicon Using Apple's MLX Framework Is Much Faster
If the model could be made to work with llama.cpp, then https://github.com/abetlen/llama-cpp-python might be more compact. llama.cpp only supports a limited list of model types though.
- Run ChatGPT-like LLMs on your laptop in 3 lines of code
-
Code Llama, a state-of-the-art large language model for coding
https://github.com/abetlen/llama-cpp-python has a web server mode that replicates openai's API iirc and the readme shows it has docker builds already.
-
Meta: Code Llama, an AI Tool for Coding
LocalAI https://localai.io/ and LMStudio https://lmstudio.ai/ both have fairly complete OpenAI compatibility layers. llama-cpp-python has a FastAPI server as well: https://github.com/abetlen/llama-cpp-python/blob/main/llama_... (as of this moment it hasn't merged GGUF update yet though)
-
First steps with llama
I went with Python, llama-cpp-python, since my goal is just to get a small project up and running locally.
What are some alternatives?
LangChain_PDFChat_Oobabooga - oobaboga -text-generation-webui implementation of wafflecomposite - langchain-ask-pdf-local
LocalAI - :robot: The free, Open Source OpenAI alternative. Self-hosted, community-driven and local-first. Drop-in replacement for OpenAI running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more models architectures. It allows to generate Text, Audio, Video, Images. Also with voice cloning capabilities.
text-generation-inference - Large Language Model Text Generation Inference
intel-extension-for-pytorch - A Python package for extending the official PyTorch that can easily obtain performance on Intel platform
artificial-nose - Instructions, source code, and misc. resources needed for building a Tiny ML-powered artificial nose.
llama.cpp - LLM inference in C/C++
kendryte-standalone-sdk - Standalone SDK for kendryte K210
text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
mlc-llm - Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
lamp - deep learning and scientific computing framework with native CPU and GPU backend for the Scala programming language
FastChat - An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.