industry

FACTS Grounding: A new benchmark for evaluating the factuality of large language models (deepmind.google)

deepmind.google · 1 year ago

Our comprehensive benchmark and online leaderboard offer a much-needed measure of how accurately LLMs ground their responses in provided source material and avoid hallucinations.