AI Models Collapse in Face of Complex Problems
Hindustan Times Chandigarh | June 09, 2025
NEW DELHI: Just days ahead of the much-anticipated Worldwide Developers Conference (WWDC), Apple has released a study titled "The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity," in which researchers tested 'reasoning' AI models such as Anthropic's Claude, OpenAI's models, DeepSeek-R1, and Google's Thinking models to see how far they can scale to replicate human reasoning. Spoiler alert: not as far as the AI marketing pitch would have you believe. Could this signal what may be in store for Apple's AI conversation ahead of the keynote?
The study questions the current standard practice of evaluating Large Reasoning Models (LRMs) on established mathematical and coding benchmarks, arguing that these benchmarks suffer from data contamination and reveal little about the structure and quality of reasoning traces. Instead, it proposes a controlled experimental test-bed built on algorithmic puzzle environments. The limitations of AI benchmarking, and the need for it to evolve, are something we have written about earlier.
