StableLM vs txtinstruct
| | StableLM | txtinstruct |
|---|---|---|
| Mentions | 43 | 13 |
| Stars | 15,853 | 215 |
| Growth | 0.2% | 2.8% |
| Activity | 5.0 | 5.0 |
| Latest commit | about 1 month ago | 8 months ago |
| Language | Jupyter Notebook | Python |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
StableLM
- The Era of 1-bit LLMs: ternary parameters for cost-effective computing
https://github.com/Stability-AI/StableLM?tab=readme-ov-file#...
- Stable LM 3B: Bringing Sustainable, High-Performance LMs to Smart Devices
https://mistral.ai/news/announcing-mistral-7b/
Looking at the 3B results (here: https://github.com/Stability-AI/StableLM#stablelm-alpha-v2 ?), Mistral (which outperforms Llama-2 13B) appears far more powerful
- FreeWilly 1 and 2, two new open-access LLMs
Does this mean Stability gave up on StableLM?
I notice that the repo hasn’t been updated since April, and a question asking for an update has been ignored for at least a month: https://github.com/Stability-AI/StableLM/issues/83
- In five years, there will be no programmers left, believes Stability AI CEO
I'm not "ignoring" StableLM, if anything it's the impetus for my post. The alpha models were so bad and unusable that it seems they may have simply abandoned the project. It's clear they basically didn't know what they were doing, which is silly for a company of their size and specialization.
- Losing the plot
1) StableLM released 3B and 7B checkpoints at 800B tokens with a 4096 context size, but they perform very poorly on various benchmarks, and fine-tuning is discouraged with such a weak base model
- UAE's Technology Innovation Institute Launches Open-Source "Falcon 40B" Large Language Model for Research & Commercial Utilization
It is the best open-source model currently available. Falcon-40B outperforms LLaMA, StableLM, RedPajama, MPT, etc. See the OpenLLM Leaderboard.
- GPT API query
- Google "We Have No Moat, And Neither Does OpenAI"
- New to StableLM--is it possible to use this locally to fine-tune on a small subset of documents yet?
Someone shared this link on another recent post
- [N] Stability AI releases StableVicuna: the world's first open source chatbot trained via RLHF
Github: https://github.com/Stability-AI/StableLM
txtinstruct
- Questions about memory, tree-of-thought, planning
I tried ChromaDB but had terrible performance and could not pin down the cause (likely a problem on my end). Weaviate was easy to set up and had excellent performance; this is probably what I will use in the future. Next on my list is txtinstruct: fine-tuning a model on the data that does not change and using a vector DB for everything else seems like a promising approach.
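A minimal sketch of the retrieval side of that setup, using sentence-transformers with a plain NumPy index as a stand-in for Weaviate; the model name, documents, and query below are illustrative assumptions, not details from the comment:

```python
# Embed documents once, then look up the nearest matches for a query.
# A vector database such as Weaviate would replace the in-memory index here.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice
documents = [
    "txtinstruct is a framework for training instruction-tuned models.",
    "StableLM is a suite of language models released by Stability AI.",
    "Weaviate is an open-source vector database.",
]

# L2-normalized embeddings make dot products equal to cosine similarity
doc_vectors = model.encode(documents, normalize_embeddings=True)

def search(query, k=2):
    """Return the k documents most similar to the query."""
    query_vector = model.encode([query], normalize_embeddings=True)[0]
    scores = doc_vectors @ query_vector
    top = np.argsort(-scores)[:k]
    return [(documents[i], float(scores[i])) for i in top]

print(search("Which project trains instruction-following models?"))
```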
- [R] Let Language Models be Language Models
The closest thing I've seen to this is txtinstruct
- Create a ChatGPT-like program using an open source model and custom data.
txtinstruct is a framework for training instruction-tuned models
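For a rough sense of what that means in practice, here is a minimal sketch of instruction-tuning a small seq2seq model on a single (instruction, context, response) record with Hugging Face transformers; the field names and model choice are assumptions for illustration, not txtinstruct's actual API:

```python
# One instruction-tuning record and a single training-style forward pass.
# A real run would loop over a full dataset with an optimizer or Trainer.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

record = {
    "instruction": "Summarize the passage in one sentence.",
    "context": "txtinstruct builds instruction-following datasets and models "
               "from openly licensed source data.",
    "response": "txtinstruct creates open instruction-tuning datasets and models.",
}

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")

# The model input is the instruction plus context; the label is the target response
inputs = tokenizer(f"{record['instruction']}\n{record['context']}", return_tensors="pt")
labels = tokenizer(record["response"], return_tensors="pt").input_ids

outputs = model(**inputs, labels=labels)
print(float(outputs.loss))  # cross-entropy loss for this single example
```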
- Stability AI Launches the First of Its StableLM Suite of Language Models
Great to see the continued release of open models. The only disappointing thing is that models keep building on CC-BY-NC licensed datasets, which severely limits their use.
Hopefully, people consider txtinstruct (https://github.com/neuml/txtinstruct) and other approaches to generate instruction-tuning datasets without the baggage.
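One common way to bootstrap such datasets is to have an openly licensed model write questions about permissively licensed passages and pair them with passage-grounded answers. The sketch below is only an illustrative approximation of that idea (model, prompts, and field names are assumptions), not txtinstruct's implementation:

```python
# Generate an instruction-style (prompt, response, source) record from a
# permissively licensed passage with an open seq2seq model.
from transformers import pipeline

generator = pipeline("text2text-generation", model="google/flan-t5-base")

passage = (
    "Falcon-40B is a large language model released by the Technology "
    "Innovation Institute under a permissive license."
)

# Ask the model to write a question answerable from the passage
question = generator(
    f"Generate a question about the following text: {passage}",
    max_new_tokens=48,
)[0]["generated_text"]

# Answer the generated question using only the passage as context
answer = generator(
    f"Answer the question using the context.\nQuestion: {question}\nContext: {passage}",
    max_new_tokens=64,
)[0]["generated_text"]

record = {"prompt": question, "response": answer, "source": passage}
print(record)
```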
- Build open instruction-tuned datasets and models (r/MachineLearning)
- Build open instruction-tuned datasets and models
- [P] Build open instruction-tuned datasets and models
- Create open instruction-tuned datasets and LLM models
- Show HN: Build open instruction-tuned datasets and models
What are some alternatives?
text-generation-webui - A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
safetensors - Simple, safe way to store and distribute tensors
lm-evaluation-harness - A framework for few-shot evaluation of language models.
AlpacaDataCleaned - Alpaca dataset from Stanford, cleaned and curated
llama.cpp - LLM inference in C/C++
geov - The GeoV model is a large language model designed by Georges Harik and uses Rotary Positional Embeddings with Relative distances (RoPER). We have shared a pre-trained 9B parameter model.
ggml - Tensor library for machine learning
cataclysm - Cataclysm - Code generation library for the end game
Open-Assistant - OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
tree-of-thought-llm - [NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models
alpaca_lora_4bit
instruct-eval - This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.