FlexGen Alternatives

Similar projects and alternatives to FlexGen

text-generation-webui

Mentions: 876 | Stars: 35,862 | Activity: 9.9 | Language: Python

A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

Open-Assistant

Mentions: 329 | Stars: 36,622 | Activity: 9.1 | Language: Python

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and can retrieve information dynamically to do so.

transformers

Mentions: 175 | Stars: 124,557 | Activity: 10.0 | Language: Python

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

stanford_alpaca

Mentions: 108 | Stars: 28,761 | Activity: 2.0 | Language: Python

Code and documentation to train Stanford's Alpaca models, and generate the data.

ggml

Mentions: 69 | Stars: 9,642 | Activity: 9.8 | Language: C

Tensor library for machine learning.

bitsandbytes

Mentions: 61 | Stars: 5,389 | Activity: 9.4 | Language: Python

Accessible large language models via k-bit quantization for PyTorch.

FlexGen

Mentions: 39 | Stars: 8,999 | Activity: 3.0 | Language: Python

Running large language models on a single GPU for throughput-oriented scenarios.

PaLM-rlhf-pytorch

Mentions: 25 | Stars: 7,590 | Activity: 4.6 | Language: Python

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM.

accelerate

Mentions: 18 | Stars: 6,948 | Activity: 9.7 | Language: Python

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support.

CTranslate2

Mentions: 13 | Stars: 2,776 | Activity: 9.0 | Language: C++

Fast inference engine for Transformer models.

rust-bert

Mentions: 7 | Stars: 2,415 | Activity: 6.8 | Language: Rust

Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...)

rwkvstic

Mentions: 2 | Stars: 144 | Activity: 6.7 | Language: Python

Framework agnostic Python runtime for RWKV models.

impersonator

Mentions: 1 | Stars: 15 | Activity: 3.8 | Language: Python

Chat with an AI simulation of anyone as easily as copy-pasting text into a folder! (by nestordemeure)

stable-horde-notebook

Mentions: 1 | Stars: 5 | Activity: 10.0 | Language: Jupyter Notebook

A Jupyter notebook for Stable Horde, for use in Google Colab, etc.

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. A higher number therefore indicates a better FlexGen alternative or a more similar project.


FlexGen reviews and mentions

Posts with mentions or reviews of FlexGen. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-03-16.

Training LLaMA-65B with Stanford Code
3 projects | /r/Oobabooga | 16 Mar 2023

#1: Progress Update | 4 comments
#2: the default UI on the pinned Google Colab is buggy so I made my own frontend - YAFFOA. | 18 comments
#3: Paper reduces resource requirement of a 175B model down to 16GB GPU | 19 comments

Replika users fell in love with their AI chatbot companions. Then they lost them
2 projects | news.ycombinator.com | 2 Mar 2023

It's really just a GPU VRAM limitation: affordable GPUs are rather memory-starved.
Fortunately, people have started writing implementations for pipelining across multiple GPUs.
https://github.com/Ying1123/FlexGen

Same as with Stable Diffusion, new AI based LAION, are coming up slowly but surely: Paper reduces resource requirement of a 175B model down to 16GB GPU
1 project | /r/StableDiffusion | 21 Feb 2023

And Here..We..Go: Running large language models like ChatGPT on a single GPU. Up to 100x faster than other offloading systems
1 project | /r/Newsoku_L | 21 Feb 2023

When, how and why will this Stable Diffusion spring stop?
2 projects | /r/StableDiffusion | 20 Feb 2023

Actually, there's a solution: read this paper https://github.com/Ying1123/FlexGen/blob/main/docs/paper.pdf

Exciting new shit.
3 projects | /r/PygmalionAI | 20 Feb 2023

FlexGen - Run big models on your small GPU https://github.com/Ying1123/FlexGen

Paper reduces resource requirement of a 175B model down to 16GB GPU
2 projects | /r/ChatGPTforall | 20 Feb 2023

FlexGen - Run 175B Parameter Models on consumer hardware
1 project | /r/ChatGPT | 20 Feb 2023

Running large language models like ChatGPT on a single GPU
1 project | /r/patient_hackernews | 20 Feb 2023

FlexGen: Running large language models like ChatGPT/GPT-3/OPT-175B on a single GPU
1 project | /r/mlscaling | 20 Feb 2023

Stats

Basic FlexGen repo stats

Stars: 5,350
Activity: 10.0
Last Commit: about 1 year ago

Ying1123/FlexGen is an open-source project licensed under the Apache License 2.0, which is an OSI-approved license.

The primary programming language of FlexGen is Python.
