AI Models Collapse in Face of Complex Problems

Hindustan Times Bengaluru | June 09, 2025

- Vishal Mathur

NEW DELHI: Just days ahead of the much-anticipated Worldwide Developers Conference (WWDC), Apple has released a study titled "The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity," in which researchers tested 'reasoning' AI models such as Anthropic's Claude, OpenAI's models, DeepSeek R1, and Google's Thinking models to see how far they can scale to replicate human reasoning. Spoiler alert: not as far as the AI marketing pitch would have you believe. Could this signal what may be in store for Apple's AI conversation ahead of the keynote?

The study questions the current standard practice of evaluating Large Reasoning Models (LRMs) on established mathematical and coding benchmarks, arguing that these benchmarks suffer from data contamination and reveal little about the structure and quality of reasoning traces. Instead, it proposes a controlled experimental test-bed built on algorithmic puzzle environments. The limitations of AI benchmarking, and the need for it to evolve, are something we have written about earlier.
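To make the idea of a puzzle test-bed concrete, here is a minimal sketch in Python. It assumes Tower of Hanoi as the puzzle; the article does not name the puzzles, so that choice is an illustrative stand-in. The function names `solve_hanoi` and `is_valid_solution` are hypothetical and not taken from the study. The point is that difficulty is set by a single knob (the number of disks), and any proposed answer can be checked move by move rather than only by its final result.

```python
from typing import List, Tuple

# A move is (source peg, target peg); pegs are numbered 0, 1, 2.
Move = Tuple[int, int]


def solve_hanoi(n: int, src: int = 0, dst: int = 2, aux: int = 1) -> List[Move]:
    """Return the optimal move list for n disks (2**n - 1 moves)."""
    if n == 0:
        return []
    # Move n-1 disks out of the way, move the largest, then restack.
    return (
        solve_hanoi(n - 1, src, aux, dst)
        + [(src, dst)]
        + solve_hanoi(n - 1, aux, dst, src)
    )


def is_valid_solution(n: int, moves: List[Move]) -> bool:
    """Simulate a proposed move list and check every step against the rules."""
    # Peg 0 starts with disks n (bottom) down to 1 (top).
    pegs = [list(range(n, 0, -1)), [], []]
    for src, dst in moves:
        if src not in (0, 1, 2) or dst not in (0, 1, 2) or not pegs[src]:
            return False  # unknown peg, or nothing to move
        disk = pegs[src][-1]
        if pegs[dst] and pegs[dst][-1] < disk:
            return False  # larger disk placed on a smaller one
        pegs[dst].append(pegs[src].pop())
    # Solved only if every disk ended up on peg 2 in order.
    return pegs[2] == list(range(n, 0, -1))


if __name__ == "__main__":
    # Complexity is controlled by n alone: the shortest solution doubles
    # in length (plus one) with each added disk.
    for n in range(1, 6):
        moves = solve_hanoi(n)
        print(n, len(moves), is_valid_solution(n, moves))
```

In a setup like this, a model's answer for each value of `n` would be passed to `is_valid_solution`, so accuracy can be plotted against problem size without relying on benchmark questions that may have leaked into training data.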
