Artificial Analysis overhauls its AI Intelligence Index, replacing popular benchmarks with 'real-world' tests
The arms race to build smarter AI models has a measurement problem: the tests used to rank them are becoming obsolete almost as quickly as the models improve. On Monday, Artificial Analysis, an independent AI benchmarking organization whose rankings are closely watched by developers and enterprise buyers, released a major overhaul to its Intelligence Index that fundamentally changes how the industry measures AI progress.
The new Intelligence Index v4.0 incorporates 10 evaluations spanning agents, coding, scientific reasoning, and general knowledge. But the changes go far deeper than shuffling test names. The organization removed three staple benchmarks — MMLU-Pro, AIME 2025, and LiveCodeBench — that have long been cited by AI companies in their marketing materials. In their place, the new index introduces evaluations designed to measure whether AI systems can complete the kind of work that people actually get paid to do.
"This index shift reflects a broader transition: intelligence is ...
Copyright of this story solely belongs to VentureBeat.

