Official WizardCoder-15B-V1.0 Released! Can Achieve 59.8% Pass@1 on HumanEval!

This page summarizes the projects mentioned and recommended in the original post on /r/LocalLLaMA

  • WizardLM

    (Discontinued) Family of instruction-following LLMs powered by Evol-Instruct: WizardLM, WizardCoder, and WizardMath

  • The project repo: WizardCoder

  • llm-humaneval-benchmarks

  • ❗Note: The HumanEval and HumanEval+ scores in this study are copied from LLM-Humaneval-Benchmarks. All listed models generate a single code solution per problem, and the resulting pass rate is reported as a percentage. WizardCoder generates its answers with greedy decoding and is tested with the same evaluation code.

  • evalplus

    EvalPlus for rigorous evaluation of LLM-synthesized code

  • human-eval

    Code for the paper "Evaluating Large Language Models Trained on Code"

  • ❗Note: The table above gives a comprehensive comparison of WizardCoder with other models on the HumanEval and MBPP benchmarks. Following the approach of previous studies, 20 samples are generated per problem to estimate the pass@1 score, evaluated with the same code. OpenAI reports scores of 67.0 for GPT-4 and 48.1 for GPT-3.5 (possibly measured on earlier versions of those models).

  • ggml

    Tensor library for machine learning

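The notes above reference two pass@1 protocols: a single greedy sample per problem, and the unbiased pass@k estimator from the human-eval paper, `1 - C(n-c, k) / C(n, k)`, averaged over all problems (where `n` samples are generated per problem and `c` of them are correct). A minimal sketch of that estimator follows; the function names are illustrative, not taken from the referenced repos:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased per-problem pass@k: the probability that at least one of
    k samples, drawn without replacement from n generated solutions of
    which c are correct, passes the tests."""
    if n - c < k:
        return 1.0  # fewer than k incorrect samples: a pass is guaranteed
    return 1.0 - comb(n - c, k) / comb(n, k)

def benchmark_pass_at_k(correct_counts, n: int, k: int) -> float:
    """Average pass@k over all benchmark problems, as a percentage."""
    scores = [pass_at_k(n, c, k) for c in correct_counts]
    return 100.0 * sum(scores) / len(scores)

# With one greedy sample per problem (n = k = 1), pass@1 reduces to the
# raw fraction of problems solved, e.g. 98 of HumanEval's 164 problems:
print(round(benchmark_pass_at_k([1] * 98 + [0] * 66, n=1, k=1), 1))  # 59.8
```

Generating more samples per problem (e.g. the 20 used for the HumanEval/MBPP table) only reduces the variance of the estimate; the expected value is the same.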
NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • The AI Reproducibility Crisis in GPT-3.5/GPT-4 Research

    4 projects | news.ycombinator.com | 25 Aug 2023
  • GPT 4 new limits only 40 messages in 3 days

    2 projects | /r/ChatGPT | 10 Dec 2023
  • ChatGPT needs its own desktop application

    1 project | /r/ChatGPT | 10 Dec 2023
  • GPT Message limit is lying?

    1 project | /r/OpenAI | 10 Dec 2023
  • Enhance Speed of AnkiBrain Addon

    1 project | /r/ankibrain | 6 Dec 2023