llm-jeopardy
Automated prompting and scoring framework to evaluate LLMs using updated human knowledge prompts.
Multiple leaderboard evaluations for Llama 2 are in, and overall the results look quite impressive.

https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
This is the most popular leaderboard, but I'm not sure it can be trusted right now: it has been under revision for the past month because both its MMLU and ARC scores are apparently inaccurate. Nonetheless, Llama 2 has been added, and the 70b-chat version has taken first place. Each Llama 2 model on this leaderboard scores roughly on par with the best fine-tunes of the original Llama.

https://github.com/aigoopy/llm-jeopardy
On this leaderboard the Llama 2 models are actually some of the worst on the list. Does this just mean the base Llama 2 models lack trivia-style knowledge?

https://docs.google.com/spreadsheets/d/1NgHDxbVWJFolq8bLvLkuPWKC7i_R6I6W/edit#gid=2011456595
Finally, Llama 2 performed incredibly well on this open leaderboard. It far surpassed the other models at 7B and 13B, and if the leaderboard ever tests 70B (or 33B, if that is released), it seems quite likely it would beat GPT-3.5's score.
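To make the trivia-leaderboard results concrete: a Jeopardy-style evaluation essentially prompts the model with a clue and checks whether the expected answer appears in the reply. The following is a minimal sketch of that idea, not llm-jeopardy's actual implementation; `ask_model`, the clue/answer pairs, and the substring-matching rule are all illustrative assumptions.

```python
def normalize(text: str) -> str:
    """Lowercase and strip punctuation so 'The Eiffel Tower.' matches 'eiffel tower'."""
    return "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace()).strip()

def score(clues, ask_model):
    """Fraction of clues for which the expected answer appears in the model's reply.

    clues: list of (clue, expected_answer) pairs.
    ask_model: any callable that takes a prompt string and returns a reply string
    (hypothetical stand-in for a real chat-completion call).
    """
    correct = 0
    for clue, expected in clues:
        reply = ask_model(f"Answer this trivia clue concisely: {clue}")
        if normalize(expected) in normalize(reply):
            correct += 1
    return correct / len(clues)

# Usage with a toy "model" in place of a real LLM call:
clues = [
    ("This tower in Paris was completed in 1889.", "Eiffel Tower"),
    ("This planet is known as the Red Planet.", "Mars"),
]
fake_model = lambda prompt: "The Eiffel Tower" if "Paris" in prompt else "Mars"
print(score(clues, fake_model))  # 1.0
```

A real harness would need stricter answer matching (exact-match after normalization, or fuzzy matching), since loose substring checks can over- or under-count; that choice is one reason different trivia leaderboards can rank the same models very differently.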