eval

By getcursor

Eval Alternatives

Similar projects and alternatives to eval

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a better eval alternative or higher similarity.

eval reviews and mentions

Posts with mentions or reviews of eval. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-08-25.
  • Show HN: Beating GPT-4 on HumanEval with a fine-tuned CodeLlama-34B
    8 projects | news.ycombinator.com | 25 Aug 2023
  • Claude 2
    6 projects | news.ycombinator.com | 11 Jul 2023
    While that's what the Technical Report (https://arxiv.org/pdf/2303.08774v3.pdf) says, GPT-4's (reproducible) performance in the wild appears to be much higher now. Testing from 3/15 (presumably on the 0314 model) seems to be at 85.36% (https://twitter.com/amanrsanger/status/1635751764577361921). And the linked paper from my post (https://doi.org/10.48550/arXiv.2305.01210) got a pass@1 of 88.4 from GPT-4 recently (May? June?).

    Out of curiosity, I was trying out gpt-4-0613 and claude-v2 with https://github.com/getcursor/eval, but sadly I'm getting hangs at 3% with both of them (maybe hitting rate limits?).
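
    For readers unfamiliar with the pass@1 figures quoted above, here is a minimal sketch of the unbiased pass@k estimator introduced with the HumanEval benchmark: for a problem with n generated samples of which c pass the unit tests, pass@k = 1 - C(n-c, k) / C(n, k), averaged over all problems. This is not taken from getcursor/eval; the function names `pass_at_k` and `mean_pass_at_k` and the `(n, c)` input layout are assumptions made for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem: n samples drawn, c of them correct."""
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Fewer than k incorrect samples: every size-k subset contains a correct one.
    if n - c < k:
        return 1.0
    # 1 minus the probability that a random size-k subset has no correct sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems; results holds one (n, c) pair per problem."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

    With one sample per problem (n = 1, k = 1), the estimate reduces to the fraction of problems solved on the first attempt, which is how a single-run pass@1 percentage is usually read.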

Stats

Basic eval repo stats
Mentions: 2
Stars: 93
Activity: 3.0
Last commit: 12 months ago

getcursor/eval is an open source project licensed under MIT License which is an OSI approved license.

The primary programming language of eval is Python.

