Hundreds of safety tests for AI models are flawed, say experts
The Guardian
November 04, 2025
Experts have found weaknesses, some serious, in hundreds of tests used to check the safety and effectiveness of new artificial intelligence models being released into the world.
Computer scientists from the government's AI Security Institute, and experts at universities including Stanford, Berkeley and Oxford, examined more than 440 benchmarks that provide a key safety net.
They found flaws that "undermine the validity of the resulting claims", adding that "almost all ... have weaknesses in at least one area". As a consequence, resulting scores might be "irrelevant or even misleading".
Many of the benchmarks are used to evaluate the latest AI models released by big technology companies, said the study's lead author, Andrew Bean, a researcher at the Oxford Internet Institute.
In the absence of nationwide AI regulation in the UK and US, benchmarks are used to check if new AIs are safe, align to human interests and achieve their claimed capabilities in reasoning, maths and coding.
This story is from the November 04, 2025 edition of The Guardian.

