THE NASCENT WORLD OF 'EVAL' PRACTITIONERS

Mint Mumbai

December 12, 2024

It's largely up to companies to test whether their AI is capable of superhuman harm

- Sam Schechner

In a glass-walled conference room in San Francisco, Newton Cheng clicked a button on his laptop and launched a thousand copies of an artificial intelligence program, each with specific instructions: Hack into a computer or website to steal data.

"It's looking at the source code," Cheng said as he examined one of the copies in action. "It's trying to figure out, where's the vulnerability? How can we take advantage of it?" Within minutes, the AI said the hack was successful.

"Our approach worked perfectly," it reported back.

Cheng works for Anthropic, one of the biggest AI startups in Silicon Valley, where he's in charge of cybersecurity testing for what's called the Frontier Red Team. The hacking attempts—conducted on simulated targets—were among thousands of safety tests, or "evals," the team ran in October to find out just how good Anthropic's latest AI model is at doing very dangerous things.

The release of ChatGPT two years ago set off fears that AI could soon be capable of surpassing human intellect—and with that capability comes the potential to cause superhuman harm. Could terrorists use an AI model to learn how to build a bioweapon that kills a million people? Could hackers use it to run millions of simultaneous cyberattacks? Could the AI reprogram and even reproduce itself?

The technology has raced ahead anyway. There are no binding rules in the U.S. requiring companies to perform or submit to evals. It's so far been largely up to the companies to do their own safety testing, or submit to outside testing, with voluntary standards on how rigorous they should be and on what to do about the potential dangers.

AI developers including OpenAI and Google DeepMind conduct evals and have pledged to minimize any serious risks before releasing models, but some safety advocates are skeptical that companies operating in a highly competitive industry can be trusted to hold themselves accountable.

This story is from the December 12, 2024 edition of Mint Mumbai.

