| | mlc-llm | EasyLM |
|---|---|---|
| Mentions | 89 | 8 |
| Stars | 17,053 | 2,247 |
| Growth | 3.7% | - |
| Activity | 9.9 | 7.7 |
| Latest commit | 4 days ago | 4 months ago |
| Language | Python | Python |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
mlc-llm
- FLaNK 04 March 2024
- AI on an Android phone?
This one uses the GPU; it doesn't support Mistral yet: https://github.com/mlc-ai/mlc-llm
- MLC vs llama.cpp
I have tried running Mistral 7B with MLC on my M1 (Metal), and it kept crashing (there is a GitHub issue with a description) due to memory-inefficiency problems.
- [Project] Scaling LLama2 70B with Multi NVIDIA and AMD GPUs under 3k budget
Project: https://github.com/mlc-ai/mlc-llm
- Scaling LLama2-70B with Multi Nvidia/AMD GPU
- AMD May Get Across the CUDA Moat
For LLM inference, a shoutout to MLC LLM, which runs LLMs on basically any GPU API that's widely available: https://github.com/mlc-ai/mlc-llm
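That portability claim is easy to try from Python. A minimal sketch, assuming the `mlc_chat` package and a prebuilt quantized model such as `Llama-2-7b-chat-hf-q4f16_1` are installed (package layout and model names vary across MLC releases, so treat this as illustrative):

```python
from mlc_chat import ChatModule

# The same compiled model artifact runs on CUDA, ROCm, Metal, Vulkan, or
# OpenCL; "auto" lets MLC pick whichever backend is available locally.
cm = ChatModule(model="Llama-2-7b-chat-hf-q4f16_1", device="auto")
print(cm.generate(prompt="Summarize the CUDA moat in one sentence."))
```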
-
ROCm Is AMD's #1 Priority, Executive Says
One of your problems might be that gfx1032 is not supported by AMD's ROCm packages, which have a laughably short list of supported hardware: https://rocm.docs.amd.com/en/latest/release/gpu_os_support.h...
The normal workaround is to assign the closest supported architecture, e.g. gfx1030, so `HSA_OVERRIDE_GFX_VERSION=10.3.0` might help.
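The only requirement is that the override is present in the environment before any HIP-backed library initializes. A hypothetical sketch of wiring this up from Python (the environment variable is real; the wrapper around it is just for illustration):

```python
import os

# Spoof the closest supported ISA (gfx1030) for an unsupported gfx1032 card.
# Must be set before torch or any other HIP-backed library is imported.
os.environ.setdefault("HSA_OVERRIDE_GFX_VERSION", "10.3.0")

import torch  # ROCm builds of PyTorch reuse the `cuda` namespace

print(torch.cuda.is_available())      # True once the runtime accepts the override
print(torch.cuda.get_device_name(0))
```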
Also, it looks like some of your tested projects use OpenCL? To cover all the bases, I do something like: `yay -S rocm-hip-sdk rocm-ml-sdk rocm-opencl-sdk`.
My recent interest has been LLMs, and for those interested, here is my general step-by-step guide for llama.cpp and exllama: https://llm-tracker.info/books/howto-guides/page/amd-gpus
I didn't port the docs back in, but also here's a step-by-step w/ my adventures getting TVM/MLC working w/ an APU: https://github.com/mlc-ai/mlc-llm/issues/787
From my experience, ROCm is improving, but there's a good reason that Nvidia has 90% market share even at big price premiums.
- Show HN: Ollama for Linux – Run LLMs on Linux with GPU Acceleration
Maybe they're talking about https://github.com/mlc-ai/mlc-llm which is used for web-llm (https://github.com/mlc-ai/web-llm)? Seems to be using TVM.
- Show HN: Fine-tune your own Llama 2 to replace GPT-3.5/4
You already have TVM for the cross-platform stuff;
see https://tvm.apache.org/docs/how_to/deploy/android.html
or https://octoml.ai/blog/using-swift-and-apache-tvm-to-develop...
or https://github.com/mlc-ai/mlc-llm
- Ask HN: Are you training and running custom LLMs and how are you doing it?
EasyLM
- Maxtext: A simple, performant and scalable Jax LLM
- How To Fine-Tune LLaMA, OpenLLaMA, And XGen, With JAX On A GPU Or A TPU
- Open-sourced LLMs are adept at mimicking ChatGPT's style but not its factuality. There exists a substantial capabilities gap, which requires a better base LM.
Title: The False Promise of Imitating Proprietary LLMs
Authors: Arnav Gudibande, Eric Wallace, Charlie Snell, Xinyang Geng, Hao Liu, Pieter Abbeel, Sergey Levine, Dawn Song
Word Count: 3400
Average Reading Time: 18-20 minutes
Source Code: https://github.com/young-geng/EasyLM
Additional Links: https://huggingface.co/young-geng/koala-eval, https://huggingface.co/young-geng/koala
- Paid dev gig: develop a basic LLM PEFT finetuning utility
Check out EasyLM: https://github.com/young-geng/EasyLM
- OpenLLaMA Releases 7B/3B Checkpoints with 700B/600B Tokens
We release the weights in two formats: an EasyLM format to be used with our EasyLM framework, and a PyTorch format to be used with the Hugging Face transformers library.
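For the second path, loading the PyTorch-format weights follows the standard transformers pattern. A minimal sketch (the Hub ID below is the published OpenLLaMA repo; treat the snippet as illustrative, and note the project recommended the slow `LlamaTokenizer` over the fast one):

```python
from transformers import LlamaForCausalLM, LlamaTokenizer

# PyTorch-format OpenLLaMA weights from the Hugging Face Hub.
model_path = "openlm-research/open_llama_7b"
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(model_path)

inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```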
- OpenLLaMA: An Open Reproduction of LLaMA
I am quite new to this and would like to get it running. Would the process roughly be:
1. Get a machine with decent GPU, probably rent cloud GPU.
2. On that machine download the weights/model/vocab files from https://huggingface.co/openlm-research/open_llama_7b_preview...
3. Install Anaconda. Clone https://github.com/young-geng/EasyLM/.
4. Install EasyLM:
conda env create -f scripts/gpu_environment.yml
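If that outline is right, a quick sanity check after step 4 is to confirm that JAX (EasyLM's backend) actually sees the rented GPU before launching any training or serving script. A minimal sketch, assuming the conda environment from `gpu_environment.yml` is active:

```python
import jax

# EasyLM runs on JAX, so the GPU must be visible to the JAX runtime.
print(jax.devices())          # expect a GPU/CUDA device, not just CpuDevice
print(jax.default_backend())  # "gpu" when the environment is set up correctly
```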
- Koala: A Dialogue Model for Academic Research [Finetuned Llama-13B on a dataset generated by ChatGPT]
What are some alternatives?
llama.cpp - LLM inference in C/C++
camel - 🐫 CAMEL: Communicative Agents for “Mind” Exploration of Large Language Model Society (NeurIPS'2023) https://www.camel-ai.org
ggml - Tensor library for machine learning
Open-Llama - The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF.
tvm - Open deep learning compiler stack for cpu, gpu and specialized accelerators
brev-cli - Connect your laptop to cloud computers. Follow to stay updated about our product
text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
RWKV-LM - RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.
llama-cpp-python - Python bindings for llama.cpp
modal-examples - Examples of programs built using Modal
ollama - Get up and running with Llama 3, Mistral, Gemma, and other large language models.