lm-evaluation-harness Alternatives

Similar projects and alternatives to lm-evaluation-harness

llama.cpp

773 57,463 10.0 C++ lm-evaluation-harness VS llama.cpp

LLM inference in C/C++
transformers

176 125,369 10.0 Python lm-evaluation-harness VS transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
stanford_alpaca

108 28,816 2.0 Python lm-evaluation-harness VS stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.
StableLM

43 15,852 5.0 Jupyter Notebook lm-evaluation-harness VS StableLM

StableLM: Stability AI Language Models
alpaca_lora_4bit

41 529 8.6 Python lm-evaluation-harness VS alpaca_lora_4bit
lm-evaluation-harness

34 5,070 9.9 Python lm-evaluation-harness VS lm-evaluation-harness

A framework for few-shot evaluation of language models.
flash-attention

26 10,888 9.4 Python lm-evaluation-harness VS flash-attention

Fast and memory-efficient exact attention
safetensors

31 2,442 8.2 Python lm-evaluation-harness VS safetensors

Simple, safe way to store and distribute tensors
txtinstruct

13 215 5.0 Python lm-evaluation-harness VS txtinstruct

📚 Datasets and models for instruction-tuning
sparsegpt

16 626 2.4 Python lm-evaluation-harness VS sparsegpt

Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".
AlpacaDataCleaned

14 1,394 7.6 Python lm-evaluation-harness VS AlpacaDataCleaned

Alpaca dataset from Stanford, cleaned and curated
instruct-eval

6 468 8.0 Python lm-evaluation-harness VS instruct-eval

This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.
geov

2 122 5.0 Jupyter Notebook lm-evaluation-harness VS geov

The GeoV model is a large language model designed by Georges Harik that uses Rotary Positional Embeddings with Relative distances (RoPER). We have shared a pre-trained 9B parameter model.
lm-eval2

1 13 10.0 Python lm-evaluation-harness VS lm-eval2
cformers

1 6 6.7 C lm-evaluation-harness VS cformers

SoTA Transformers with C-backend for fast inference on your CPU. (by antimatter15)

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. A higher number therefore suggests a closer lm-evaluation-harness alternative or higher similarity.

lm-evaluation-harness reviews and mentions

Posts with mentions or reviews of lm-evaluation-harness. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-04-19.

Stability AI Launches the First of Its StableLM Suite of Language Models
24 projects | news.ycombinator.com | 19 Apr 2023

Yeah, although looks like it currently has some issues with coqa: https://github.com/EleutherAI/lm-evaluation-harness/issues/2...
There's also the bigscience fork, but I ran into even more problems (although I didn't try too hard) https://github.com/bigscience-workshop/lm-evaluation-harness
And there's https://github.com/EleutherAI/lm-eval2/ (not sure if it's just starting over w/ a new repo or what?) but it has limited tests available