evalplus
chatgpt_academic
evalplus | chatgpt_academic | |
---|---|---|
3 | 1 | |
918 | 31,116 | |
13.4% | - | |
9.3 | 10.0 | |
6 days ago | about 1 year ago | |
Python | Python | |
Apache License 2.0 | GNU General Public License v3.0 only |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
evalplus
-
The AI Reproducibility Crisis in GPT-3.5/GPT-4 Research
*Further Reading*:
- [GPT-4's decline over time (HackerNews)](https://news.ycombinator.com/item?id=36786407)
- [GPT-4 downgrade discussions (OpenAI Forums)](https://community.openai.com/t/gpt-4-has-been-severely-downg...)
- [Behavioral changes in ChatGPT (arXiv)](https://arxiv.org/abs/2307.09009)
- [Zero-Shot Replication Effort (Github)](https://github.com/emrgnt-cmplxty/zero-shot-replication)
- [Inconsistencies in GPT-4 HumanEval (Github)](https://github.com/evalplus/evalplus/issues/15)
- [Early experiments with GPT-4 (arXiv)](https://arxiv.org/abs/2303.12712)
- [GPT-4 Technical Report (arXiv)](https://arxiv.org/abs/2303.08774)
-
Official WizardCoder-15B-V1.0 Released! Can Achieve 59.8% Pass@1 on HumanEval!
❗Note: In this study, we copy the scores for HumanEval and HumanEval+ from the LLM-Humaneval-Benchmarks. Notably, all the mentioned models generate code solutions for each problem utilizing a single attempt, and the resulting pass rate percentage is reported. Our WizardCoder generates answers using greedy decoding and tests with the same code.
chatgpt_academic
What are some alternatives?
gpt_academic - 为GPT/GLM等LLM大语言模型提供实用化交互接口,特别优化论文阅读/润色/写作体验,模块化设计,支持自定义快捷按钮&函数插件,支持Python和C++等项目剖析&自译解功能,PDF/LaTex论文翻译&总结功能,支持并行问询多种LLM模型,支持chatglm3等本地模型。接入通义千问, deepseekcoder, 讯飞星火, 文心一言, llama2, rwkv, claude2, moss等。
Qwen-VL - The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
llm_oracle - LLM Oracle is a GPT-4 powered tool for predicting future events. It's like a Magic 8 Ball that is able to perform basic research, calculations, and reasoning.
zotero-gpt - GPT Meet Zotero.
zero-shot-replication
Baichuan-13B - A 13B large language model developed by Baichuan Intelligent Technology
ChatClipboard - Boost Your ChatGPT Productivity with ChatClipboard - The Ultimate Desktop App for Effortless Text Response!
human-eval - Code for the paper "Evaluating Large Language Models Trained on Code"
ChatGLM2-6B - ChatGLM2-6B: An Open Bilingual Chat LLM | 开源双语对话语言模型
WizardLM - Family of instruction-following LLMs powered by Evol-Instruct: WizardLM, WizardCoder and WizardMath
slidev-theme-academic - Academic presentations with Slidev made simple 🎓