-
llm-colosseum
Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM (by aws-banjo)
llm-colosseum is another repo that takes a more creative approach to benchmarking LLMs, this time using a classic arcade fighting game. My colleague Banjo put together this repo along with a supporting blog post, "14 LLMs fought 314 Street Fighter matches. Here's who won", which is a must-read this week. Check out the repo and the post for videos of these LLMs playing the game.