Evalplus Alternatives
Similar projects and alternatives to evalplus
-
WizardLM
Discontinued Family of instruction-following LLMs powered by Evol-Instruct: WizardLM, WizardCoder and WizardMath
-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
-
gpt_academic
为GPT/GLM等LLM大语言模型提供实用化交互接口,特别优化论文阅读/润色/写作体验,模块化设计,支持自定义快捷按钮&函数插件,支持Python和C++等项目剖析&自译解功能,PDF/LaTex论文翻译&总结功能,支持并行问询多种LLM模型,支持chatglm3等本地模型。接入通义千问, deepseekcoder, 讯飞星火, 文心一言, llama2, rwkv, claude2, moss等。
-
llm_oracle
LLM Oracle is a GPT-4 powered tool for predicting future events. It's like a Magic 8 Ball that is able to perform basic research, calculations, and reasoning.
-
WorkOS
The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.
-
chatgpt_academic
Discontinued 为GPT/GLM提供图形交互界面,特别优化论文阅读润色体验,模块化设计支持自定义快捷按钮&函数插件,支持代码块表格显示,Tex公式双显示,新增Python和C++项目剖析&自译解功能,PDF/LaTex论文翻译&总结功能,支持并行问询多种LLM模型,支持清华chatglm等本地模型 [Moved to: https://github.com/binary-husky/gpt_academic]
evalplus reviews and mentions
-
The AI Reproducibility Crisis in GPT-3.5/GPT-4 Research
*Further Reading*:
- [GPT-4's decline over time (HackerNews)](https://news.ycombinator.com/item?id=36786407)
- [GPT-4 downgrade discussions (OpenAI Forums)](https://community.openai.com/t/gpt-4-has-been-severely-downg...)
- [Behavioral changes in ChatGPT (arXiv)](https://arxiv.org/abs/2307.09009)
- [Zero-Shot Replication Effort (Github)](https://github.com/emrgnt-cmplxty/zero-shot-replication)
- [Inconsistencies in GPT-4 HumanEval (Github)](https://github.com/evalplus/evalplus/issues/15)
- [Early experiments with GPT-4 (arXiv)](https://arxiv.org/abs/2303.12712)
- [GPT-4 Technical Report (arXiv)](https://arxiv.org/abs/2303.08774)
-
Official WizardCoder-15B-V1.0 Released! Can Achieve 59.8% Pass@1 on HumanEval!
❗Note: In this study, we copy the scores for HumanEval and HumanEval+ from the LLM-Humaneval-Benchmarks. Notably, all the mentioned models generate code solutions for each problem utilizing a single attempt, and the resulting pass rate percentage is reported. Our WizardCoder generates answers using greedy decoding and tests with the same code.
Stats
evalplus/evalplus is an open source project licensed under Apache License 2.0 which is an OSI approved license.
The primary programming language of evalplus is Python.
Popular Comparisons
Sponsored