Benchmarking long-form factuality in large language models. Original code for our paper "Long-form factuality in large language models".
Why do you think that https://github.com/Libr-AI/OpenFactVerification is a good alternative to long-form-factuality?