FlexGen vs impersonator
| | FlexGen | impersonator |
|---|---|---|
| Mentions | 19 | 1 |
| Stars | 5,350 | 15 |
| Growth | - | - |
| Activity | 10.0 | 3.8 |
| Last Commit | about 1 year ago | about 1 year ago |
| Language | Python | Python |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
FlexGen
- Training LLaMA-65B with Stanford Code
  - #1: Progress Update | 4 comments
  - #2: The default UI on the pinned Google Colab is buggy, so I made my own frontend, YAFFOA. | 18 comments
  - #3: Paper reduces resource requirement of a 175B model down to 16GB GPU | 19 comments
- Replika users fell in love with their AI chatbot companions. Then they lost them
  It's really just a GPU VRAM limitation: affordable GPUs are rather memory-starved. Fortunately, people have started writing implementations for pipelining across multiple GPUs (see the sketch after this list). https://github.com/Ying1123/FlexGen
- Same as with Stable Diffusion, new LAION-based AI efforts are coming up slowly but surely: Paper reduces resource requirement of a 175B model down to 16GB GPU
- And Here..We..Go: Running large language models like ChatGPT on a single GPU. Up to 100x faster than other offloading systems
- When, how and why will this Stable Diffusion spring stop?
  Actually, there's a solution: read this paper https://github.com/Ying1123/FlexGen/blob/main/docs/paper.pdf
- Exciting new shit.
  FlexGen - Run big models on your small GPU https://github.com/Ying1123/FlexGen
- Paper reduces resource requirement of a 175B model down to 16GB GPU
- FlexGen - Run 175B Parameter Models on consumer hardware
- Running large language models like ChatGPT on a single GPU
- FlexGen: Running large language models like ChatGPT/GPT-3/OPT-175B on a single GPU
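FlexGen's core trick, per the paper linked above, is offloading: when a model's weights don't fit in GPU VRAM, spill them to CPU RAM and disk and stream them through the GPU. As a rough illustration of that idea (not FlexGen's own API), here is a minimal sketch using Hugging Face transformers with accelerate's `device_map="auto"`; the model name and offload folder are placeholders, and FlexGen itself uses a far more aggressive scheduler for OPT-175B-scale checkpoints.

```python
# Minimal offloading sketch: run a model that may not fit in GPU VRAM.
# Assumes `pip install torch transformers accelerate`; facebook/opt-1.3b
# stands in for the much larger OPT models FlexGen targets.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-1.3b"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)

# device_map="auto" lets accelerate split layers across GPU, CPU RAM, and,
# via offload_folder, disk -- so a model larger than VRAM can still run.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype=torch.float16,
    offload_folder="offload",  # spill weights here if CPU RAM also fills up
)

prompt = "Running large language models on a single GPU is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```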
impersonator
- Replika users fell in love with their AI chatbot companions. Then they lost them
  I do not know if it has been done, but one could resuscitate the chatbots' minds by copying the chat history into a GPT-based program. My own impersonator[0] is not designed for that (no persistent chat, and a text-based interface), but one can already dump the text in a folder and see if the personality is properly reproduced; see the sketch after this list.
  [0]: https://github.com/nestordemeure/impersonator
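For concreteness, the "dump chat logs in a folder, then impersonate" idea from the comment above could look roughly like the following sketch. This is not impersonator's actual implementation; the folder name, model choice, and prompt wording are all invented for illustration.

```python
# Hypothetical sketch of resuscitating a chatbot persona from dumped logs.
# Requires `pip install openai` and an OPENAI_API_KEY in the environment.
from pathlib import Path
from openai import OpenAI

def load_history(folder: str, max_chars: int = 12_000) -> str:
    """Concatenate every .txt chat log in `folder`, keeping the most recent part."""
    text = "\n".join(p.read_text() for p in sorted(Path(folder).glob("*.txt")))
    return text[-max_chars:]  # truncate so the prompt stays within context

client = OpenAI()
history = load_history("chat_logs")  # hypothetical folder of dumped transcripts

reply = client.chat.completions.create(
    model="gpt-3.5-turbo",  # placeholder; any chat-completion model works
    messages=[
        {
            "role": "system",
            "content": (
                "Impersonate the persona whose chat history follows. "
                "Match their tone, quirks, and typical phrasing.\n\n" + history
            ),
        },
        {"role": "user", "content": "Hey, it's been a while. How have you been?"},
    ],
)
print(reply.choices[0].message.content)
```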
What are some alternatives?
text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
sidekick - Universal APIs for unstructured data. Sync documents from SaaS tools to a SQL or vector database, where they can be easily queried by AI applications [Moved to: https://github.com/psychic-api/psychic]
CTranslate2 - Fast inference engine for Transformer models
langchain-production-starter - Deploy LangChain Agents and connect them to Telegram
ggml - Tensor library for machine learning
ChatGPT-RedditBot - The ChatGPT-RedditBot is a Reddit bot that uses the ChatGPT large language model to generate engaging responses to Reddit threads and submissions.
accelerate - 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
openai-gpt4 - Decentralising the AI industry: free GPT-4/3.5 scripts through several reverse-engineered APIs (poe.com, phind.com, chat.openai.com, writesonic.com, sqlchat.ai, t3nsor.com, you.com, etc.) [Moved to: https://github.com/xtekky/gpt4free]
rust-bert - Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...)
ChatRWKV - ChatRWKV is like ChatGPT but powered by the RWKV (100% RNN) language model, and it is open source.
stanford_alpaca - Code and documentation to train Stanford's Alpaca models, and generate the data.
langchain-visualizer - Visualization and debugging tool for LangChain workflows