TheoremQA

The dataset and code for the paper "TheoremQA: A Theorem-driven Question Answering dataset" (by wenhuchen)

TheoremQA Alternatives

Similar projects and alternatives to TheoremQA

  • ollama

    Get up and running with Llama 3, Mistral, Gemma, and other large language models.

  • LocalAI

    The free, open-source OpenAI alternative. Self-hosted, community-driven and local-first. Drop-in replacement for OpenAI running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. It can generate text, audio, video, and images, and also supports voice cloning.

  • evals

    49 TheoremQA VS evals

    Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

  • litellm

    Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)

  • promptfoo

    20 TheoremQA VS promptfoo

    Test your prompts, models, and RAGs. Catch regressions and improve prompt quality. LLM evals for OpenAI, Azure, Anthropic, Gemini, Mistral, Llama, Bedrock, Ollama, and other local & private models with CI/CD integration.

  • ChainForge

    14 TheoremQA VS ChainForge

    An open-source visual programming environment for battle-testing prompts to LLMs.

  • GodMode

    7 TheoremQA VS GodMode

    AI Chat Browser: Fast, Full webapp access to ChatGPT / Claude / Bard / Bing / Llama2! I use this 20 times a day.

  • fiddler-auditor

    Fiddler Auditor is a tool to evaluate language models.

  • generative-manim

    5 TheoremQA VS generative-manim

    🎨 GPT-4 for video generation ⚡️

  • bench

    1 TheoremQA VS bench

    A tool for evaluating LLMs (by arthur-ai)

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a better TheoremQA alternative or higher similarity.
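
As a rough illustration of how the counts in the list order the alternatives, the sketch below ranks the projects by mention count. The `mentions` dictionary and the `rank_alternatives` helper are hypothetical names introduced only for this example; the numbers are copied from the list, and projects shown without a count are left out.

```python
# Mention counts copied from the list; projects without a count are omitted.
mentions = {
    "evals": 49,
    "promptfoo": 20,
    "ChainForge": 14,
    "GodMode": 7,
    "generative-manim": 5,
    "bench": 1,
}


def rank_alternatives(counts):
    """Return project names ordered from most to fewest mentions."""
    return sorted(counts, key=counts.get, reverse=True)


ranked = rank_alternatives(mentions)
for position, name in enumerate(ranked, start=1):
    print(f"{position}. {name} ({mentions[name]} mentions)")
```

Under the note's reading, a project near the top of this ranking is considered a closer alternative to TheoremQA than one near the bottom.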

TheoremQA reviews and mentions

Posts with mentions or reviews of TheoremQA. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-12-04.
  • Intuitive Guide to Convolution
    2 projects | news.ycombinator.com | 4 Dec 2023
    https://github.com/360macky/generative-manim :

    > Generative Manim is a prototype of a web app that uses GPT-4 to generate videos with Manim. The idea behind this project is taking advantage of the power of GPT-4 in programming, the understanding of human language and the animation capabilities of Manim to generate a tool that could be used by anyone to create videos. Regardless of their programming or video editing skills.

    "TheoremQA: A Theorem-driven [STEM] Question Answering dataset" (2023) https://github.com/wenhuchen/TheoremQA#leaderboard

    How do you score memory retention and video watching comprehension? The classic educators' optimization challenge

    "Khan Academy’s 7-Step Approach to Prompt Engineering for Khanmigo"

  • I asked 60 LLMs a set of 20 questions
    10 projects | news.ycombinator.com | 9 Sep 2023
    Additional benchmarks:

    - "TheoremQA: A Theorem-driven Question Answering dataset" (2023) https://github.com/wenhuchen/TheoremQA#leaderboard

    - legalbench

Stats

Basic TheoremQA repo stats
Mentions: 2
Stars: 152
Activity: 7.6
Last commit: 15 days ago

wenhuchen/TheoremQA is an open source project licensed under the MIT License, which is an OSI-approved license.

The primary programming language of TheoremQA is Python.

