Thursday, January 23, 2025

Even the most advanced AI struggles to surpass this new benchmark

The nonprofit Center for AI Safety (CAIS) and Scale AI, a company that provides various data labeling and AI development services, have introduced a challenging new benchmark for cutting-edge AI systems.

Called Humanity’s Last Exam, the benchmark comprises thousands of crowdsourced questions on subjects such as mathematics, the humanities, and the natural sciences. To make evaluation harder, the questions come in several formats, including some that incorporate diagrams and images.

In a preliminary study, no prominent publicly available AI system scored above 10% on Humanity’s Last Exam.

CAIS and Scale AI plan to open the benchmark to the research community so that researchers can examine the results in more detail and evaluate new AI models.
