test
CppUTest
| | test | CppUTest |
|---|---|---|
| Mentions | 8 | 0 |
| Stars | 870 | 1,304 |
| Stars growth | - | 1.5% |
| Activity | 2.5 | 5.5 |
| Last commit | 10 months ago | 4 days ago |
| Language | Python | C++ |
| License | MIT License | BSD 3-clause "New" or "Revised" License |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
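The recency weighting described above can be illustrated with a small sketch. The half-life and scoring formula here are assumptions for illustration only, not the comparison site's actual algorithm:

```python
from datetime import date, timedelta

def activity_score(commit_dates, today, half_life_days=90):
    """Toy recency-weighted activity score: each commit contributes
    0.5 ** (age / half_life), so recent commits count for more than
    old ones. (Illustrative only -- not the site's real formula.)"""
    score = 0.0
    for d in commit_dates:
        age_days = (today - d).days
        score += 0.5 ** (age_days / half_life_days)
    return score

today = date(2024, 1, 1)
recent = [today - timedelta(days=n) for n in (1, 3, 7)]
old = [today - timedelta(days=n) for n in (300, 320, 340)]
# Three recent commits outweigh three old ones under this weighting.
print(activity_score(recent, today) > activity_score(old, today))
```

Any decay scheme with the same shape produces the same ordering: a project whose commits all landed this week scores higher than one with the same number of commits a year ago.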
test
- Mixtral 7B MoE beats LLaMA2 70B in MMLU
Sources:
[1] MMLU Benchmark (Multi-task Language Understanding) | Papers With Code https://paperswithcode.com/sota/multi-task-language-understanding-on-mmlu
[2] MMLU Dataset | Papers With Code https://paperswithcode.com/dataset/mmlu
[3] hendrycks/test: Measuring Massive Multitask Language Understanding | ICLR 2021 - GitHub https://github.com/hendrycks/test
[4] lukaemon/mmlu · Datasets at Hugging Face https://huggingface.co/datasets/lukaemon/mmlu
[5] [2009.03300] Measuring Massive Multitask Language Understanding - arXiv https://arxiv.org/abs/2009.03300
- Show HN: Llama-dl – high-speed download of LLaMA, Facebook's 65B GPT model
Because there are many benchmarks that measure different things.
You need to look at the benchmark that reflects your specific interest.
So in this case ("I wasn't impressed that 30B didn't seem to know who Captain Picard was") the closest relevant benchmark they performed is MMLU (Massive Multitask Language Understanding) [1].
In the LLaMA paper they publish a figure of 63.4% for the 5-shot average setting without fine-tuning on the 65B model, and 68.9% after fine-tuning. This is significantly better than the original GPT-3 (43.9% under the same conditions), but as they note:
> "[it is] still far from the state-of-the-art, that is 77.4 for GPT code-davinci-002 on MMLU (numbers taken from Iyer et al. (2022))"
InstructGPT[2] (which OpenAI points to as the most relevant ChatGPT publication) doesn't report MMLU performance.
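The percentages being compared above are accuracies on MMLU's four-choice questions, commonly reported as the unweighted (macro) average of per-subject accuracy across the benchmark's 57 subjects. A minimal sketch of that scoring, with made-up model outputs for illustration:

```python
def mmlu_score(results):
    """results maps subject -> list of (predicted, gold) answer letters.
    Computes per-subject accuracy, then the unweighted (macro) average
    across subjects, which is how MMLU scores are commonly reported."""
    per_subject = []
    for subject, pairs in results.items():
        correct = sum(1 for pred, gold in pairs if pred == gold)
        per_subject.append(correct / len(pairs))
    return sum(per_subject) / len(per_subject)

# Hypothetical model outputs on two subjects (choices are A-D).
results = {
    "astronomy": [("A", "A"), ("C", "B"), ("D", "D"), ("B", "B")],  # 3/4
    "world_history": [("B", "B"), ("A", "C")],                      # 1/2
}
print(mmlu_score(results))  # macro average of 0.75 and 0.5 -> 0.625
```

Note that the macro average weights every subject equally regardless of how many questions it has, so it can differ from simple overall accuracy (here 4/6 ≈ 0.667 vs. 0.625).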
CppUTest
We haven't tracked posts mentioning CppUTest yet.
Tracking mentions began in Dec 2020.
What are some alternatives?
Google Test - GoogleTest - Google Testing and Mocking Framework
CppUnit - C++ port of JUnit
Unity Test API - Simple Unit Testing for C
Catch - A modern, C++-native, test framework for unit-tests, TDD and BDD - using C++14, C++17 and later (C++11 support is in v2.x branch, and C++03 on the Catch1.x branch)
Google Mock
fff - A testing micro framework for creating function test doubles
doctest - The fastest feature-rich C++11/14/17/20/23 single-header testing framework
Boost.Test - The reference C++ unit testing framework (TDD, xUnit, C++03/11/14/17)
gdb-frontend - ☕ GDBFrontend is an easy, flexible and extensible gui debugger. Try it on https://debugme.dev
benchmark - A microbenchmark support library
backward-cpp - A beautiful stack trace pretty printer for C++
Nonius - A C++ micro-benchmarking framework