Eval Alternatives
Similar projects and alternatives to eval
-
text-generation-webui
A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
-
code-interpreter-packages
A list of all packages and their descriptions in code interpreter as of 12 July 2023
eval reviews and mentions
- Show HN: Beating GPT-4 on HumanEval with a fine-tuned CodeLlama-34B
-
Claude 2
While that's what the Technical Report (https://arxiv.org/pdf/2303.08774v3.pdf) says, the (reproducible) performance of GPT-4 out in the wild appears to be much higher now. Testing from 3/15 (presumably on the 0314 model) seems to be at 85.36% (https://twitter.com/amanrsanger/status/1635751764577361921). And the linked paper from my post (https://doi.org/10.48550/arXiv.2305.01210) got a pass@1 of 88.4 from GPT-4 recently (May? June?).
Out of curiosity, I was trying out gpt-4-0613 and claude-v2 with https://github.com/getcursor/eval, but sadly I'm getting hangs at 3% with both of them (maybe hitting rate limits?).
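For readers unfamiliar with the pass@1 figures quoted above, here is a minimal sketch of the unbiased pass@k estimator commonly used with HumanEval-style benchmarks: for a problem with n generated samples of which c pass the unit tests, pass@k = 1 - C(n-c, k) / C(n, k), averaged over problems. The function names `pass_at_k` and `benchmark_pass_at_k` are illustrative assumptions and are not taken from getcursor/eval, whose internals are not described here.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of samples generated for the problem
    c: number of those samples that passed the tests
    k: number of attempts the metric allows
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Probability that a random size-k subset contains no passing sample.
    all_fail = comb(n - c, k) / comb(n, k)
    return 1.0 - all_fail


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over (n, c) pairs, one pair per problem."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical data: three problems, one sample each, two solved.
    print(benchmark_pass_at_k([(1, 1), (1, 1), (1, 0)], k=1))
```

With a single sample per problem (n = 1, k = 1), the estimate reduces to the plain fraction of problems solved, which is how a figure such as 88.4 would typically be read as a percentage.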
Stats
getcursor/eval is an open-source project licensed under the MIT License, which is an OSI-approved license.
The primary programming language of eval is Python.