Promptbench Alternatives
Similar projects and alternatives to promptbench
- FLiPStackWeekly: FLaNK AI Weekly covering Apache NiFi, Apache Flink, Apache Kafka, Apache Spark, Apache Iceberg, Apache Ozone, Apache Pulsar, and more.
- evals: A framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
- ydata-profiling: One line of code data quality profiling and exploratory data analysis for Pandas and Spark DataFrames.
- seatunnel: SeaTunnel is a next-generation, high-performance, distributed tool for massive data integration.
- Stirling-PDF: A locally hosted web application for performing various operations on PDF files.
- TornadoVM: A practical and efficient heterogeneous programming framework for managed languages.
- awesome-gpt-prompt-engineering: A curated list of awesome resources, tools, and other shiny things for GPT prompt engineering.
- opencompass: OpenCompass is an LLM evaluation platform supporting a wide range of models (InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) across 100+ datasets.
promptbench reviews and mentions
- Show HN: Times faster LLM evaluation with Bayesian optimization
  Fair question. "Evaluate" refers to the phase after training that checks whether the training produced a good model. The usual flow is training -> evaluation -> deployment (what you called inference), and this project targets the evaluation step. Evaluation can be slow (it can even be slower than training if you're fine-tuning on a small, domain-specific subset)!
  So there are [quite](https://github.com/microsoft/promptbench) [a](https://github.com/confident-ai/deepeval) [few](https://github.com/openai/evals) [frameworks](https://github.com/EleutherAI/lm-evaluation-harness) working on evaluation. However, all of them are quite slow, because LLMs are slow unless you have infinite money. [This](https://github.com/open-compass/opencompass) one tries to speed things up by parallelizing across multiple machines, but none of them exploits the fact that many evaluation queries can be similar: they all evaluate on every given query. That's where this project might come in handy.
- FLaNK Weekly 31 December 2023
- FLaNK 25 December 2023
- Promptbench: A Unified Library for Evaluating and Understanding LLMs
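The Show HN comment above argues that existing frameworks score every query even though many queries are similar. A minimal sketch of that idea, grouping similar queries and scoring only one representative per group; all names here (`jaccard`, `cluster_queries`, `fast_eval`) are illustrative, not taken from promptbench or the linked project:

```python
# Hypothetical sketch: evaluate one representative query per cluster instead
# of every query, on the assumption that similar queries score similarly.

def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two queries."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

def cluster_queries(queries, threshold=0.5):
    """Greedy clustering: a query joins the first cluster whose
    representative (first member) is similar enough, else starts a new one."""
    clusters = []
    for q in queries:
        for c in clusters:
            if jaccard(q, c[0]) >= threshold:
                c.append(q)
                break
        else:
            clusters.append([q])
    return clusters

def fast_eval(queries, score_fn, threshold=0.5):
    """Call the (slow) model only on cluster representatives and weight each
    score by cluster size. Returns (estimated score, number of model calls)."""
    clusters = cluster_queries(queries, threshold)
    total = sum(score_fn(c[0]) * len(c) for c in clusters)
    return total / len(queries), len(clusters)

queries = [
    "What is 2 + 2?",
    "What is 2 + 3?",
    "Summarize the plot of Hamlet.",
]
# score_fn stands in for an expensive per-query LLM evaluation.
est, calls = fast_eval(queries, score_fn=lambda q: 1.0)
print(est, calls)  # -> 1.0 2 (the two arithmetic queries share one model call)
```

The linked project reportedly goes further and uses Bayesian optimization to pick which queries to evaluate; the greedy clustering here is only the simplest version of the "skip redundant queries" idea.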
Stats
microsoft/promptbench is an open-source project licensed under the MIT License, an OSI-approved license.
The primary programming language of promptbench is Python.