Just put together a programming performance ranking for popular LLaMAs using the HumanEval+ Benchmark!

This page summarizes the projects mentioned and recommended in the original post on /r/LocalLLaMA

  • llm-humaneval-benchmarks

  • Some details on the prompting side: for some of the models I wasn’t sure whether to use Alpaca-style or Vicuna-style prompting, so I tried both and recorded whichever performed best. I tried several prompt variations and found that a longer prompt generally gave the best results. You can find the long prompt formats I used here: https://github.com/my-other-github-account/llm-humaneval-benchmarks/blob/main/prompt_formats.txt

  • codealpaca

  • CodeAlpaca 7B

  • poe-api

    Discontinued. [UNMAINTAINED] A reverse-engineered Python API wrapper for Quora's Poe, which provides free access to ChatGPT, GPT-4, and Claude.

  • openplayground-api

    Discontinued. A reverse-engineered Python API wrapper for OpenPlayground (nat.dev).

  • text-generation-webui

    A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

  • I am not positive; that would be a good question for the folks at https://github.com/oobabooga/text-generation-webui
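
The prompting note earlier in this list (try both Alpaca-style and Vicuna-style prompts, keep whichever scores best) can be sketched roughly as follows. This is only an illustration under stated assumptions: the two templates are the commonly circulated short forms of each style, not the long formats from the linked prompt_formats.txt, and `build_prompt`, `best_style`, and the `evaluate` callback are hypothetical names that do not come from the benchmark repository.

```python
from typing import Callable, Dict, Tuple

# Assumption: widely used short forms of each style, NOT the author's long prompts.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)
VICUNA_TEMPLATE = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's "
    "questions. USER: {instruction} ASSISTANT:"
)
TEMPLATES: Dict[str, str] = {"alpaca": ALPACA_TEMPLATE, "vicuna": VICUNA_TEMPLATE}


def build_prompt(style: str, instruction: str) -> str:
    """Wrap a task description in the chosen prompt style (hypothetical helper)."""
    return TEMPLATES[style].format(instruction=instruction)


def best_style(evaluate: Callable[[str], float]) -> Tuple[str, float]:
    """Score every style with the caller-supplied evaluator and keep the best.

    `evaluate` is a hypothetical callback that runs the benchmark for one
    style and returns its pass rate as a float.
    """
    scores = {style: evaluate(style) for style in TEMPLATES}
    winner = max(scores, key=scores.get)
    return winner, scores[winner]


if __name__ == "__main__":
    print(build_prompt("alpaca", "Write a function that reverses a string."))
    # Placeholder numbers for illustration only; these are not benchmark results.
    placeholder_scores = {"alpaca": 0.10, "vicuna": 0.20}
    print(best_style(lambda style: placeholder_scores[style]))
```

In a real run, `evaluate` would generate completions for every HumanEval+ problem with the given style and return the fraction that pass the tests; the selection step itself stays the same.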

NOTE: The number of mentions on this list counts mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
