AI Models Collapse in Face of Complex Problems
Hindustan Times Gurugram
June 09, 2025
NEW DELHI: Just days ahead of the much-anticipated Worldwide Developers Conference (WWDC), Apple has released a study titled "The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity," in which researchers tested 'reasoning' AI models such as Anthropic's Claude, OpenAI's models, DeepSeek-R1, and Google's Thinking models to see how far they can scale to replicate human reasoning. Spoiler alert: not as far as the AI marketing pitch would have you believe. Could this signal what may be in store for Apple's AI conversation ahead of the keynote?
The study questions the current standard practice of evaluating Large Reasoning Models (LRMs) on established mathematical and coding benchmarks, arguing that these benchmarks suffer from data contamination and reveal little about the structure and quality of reasoning traces. Instead, it proposes a controlled experimental test-bed built on algorithmic puzzle environments. The limitations of AI benchmarking, and the need for it to evolve, are something we have written about earlier.
This story is from the June 09, 2025 edition of Hindustan Times Gurugram.

