Intentar ORO - Gratis
To fix AI, first break it: Red teaming for AI safety
The Sunday Guardian
|July 06, 2025
Artificial intelligence is transforming society at an unprecedented pace, from generative chatbots in customer service to algorithms aiding medical diagnoses.
Along with this promise, however, come serious risks AI systems have produced biased or harmful outputs, revealed private data, or been 'tricked' into unsafe behaviour. In one healthcare study, for example, red-team testing found that roughly one in five answers from advanced AI models like GPT-4 was inappropriate or unsafe for medical use. To ensure Al's benefits can be realized safely and ethically, the tech community is increasingly turning to red teaming - a practice of stress-testing AI systems to identify flaws before real adversaries or real-world conditions do.
In simple terms, red teaming is about playing 'devil's advocate' with AI systems - actively trying to break, mislead, or misuse them to expose weaknesses.
Originally a military and cybersecurity concept, red teaming refers to an adversarial testing effort where a 'red team' simulates attacks or exploits against a target, while a 'blue team' defends.
In the AI context, AI red teaming means probing AI models and their surrounding systems for vulnerabilities, harmful behaviours, or biases by emulating the strategies a malicious or curious attacker might use.
In essence, a red teamer tries to ask, 'How could this AI go wrong or be made to do something bad?" and then systematically tests those scenarios. Red teaming in AI goes beyond just the model's answers - it can involve examining the whole pipeline (data, infrastructure, user interface) for weaknesses. As modern AI models are open-ended and creative by design, they can also be creatively misused.
Esta historia es de la edición July 06, 2025 de The Sunday Guardian.
Suscríbete a Magzter GOLD para acceder a miles de historias premium seleccionadas y a más de 9000 revistas y periódicos.
¿Ya eres suscriptor? Iniciar sesión
MÁS HISTORIAS DE The Sunday Guardian
The Sunday Guardian
The world order changeth gradually, though surely
No single nation or its leader, including the USA or China, can assume stewardship of the emerging, diffused global order.
6 mins
January 04, 2026
The Sunday Guardian
WHY THE SHANTI BILL CAN REDEFINE INDIA’S ENERGY FUTURE
India’s clean energy transition is primarily discussed in terms of solar additions, wind corridors, and storage technologies.
4 mins
January 04, 2026
The Sunday Guardian
Fantasies about Russia may spark World War III
Peace would result in it being too obvious to hide even within Zelenskyy's European backers, that the war being conducted at great human cost was futile from the start.
5 mins
January 04, 2026
The Sunday Guardian
New jihadi module IMK busted in Assam
An offshoot of Bangladesh-based JMB, IMK propagates the ideology of ‘Ghazwatul Hind’
4 mins
January 04, 2026
The Sunday Guardian
Delhi court convicts man in 2017 murder case
A Delhi court has convicted a man for murdering a youth by hitting him with a bamboo stick during a late-night quarrel at the Anand Vihar ISBT in 2017.
1 mins
January 04, 2026
The Sunday Guardian
INDIAN NAVY PLANS TO INDUCT A WARSHIP EVERY SIX WEEKS
The Indian Navy is on track to induct ships at the rate of one every one-and-a-half months in the coming year, fuelling the economy as its maritime muscle is strengthened.
3 mins
January 04, 2026
The Sunday Guardian
PM to flag off first Vande Bharat sleeper train from Guwahati
Ahead of the upcoming assembly elections, Assam and West Bengal will get the country's first Vande Bharat sleeper train.
1 mins
January 04, 2026
The Sunday Guardian
Transport Ministry proposes Aadhaar-like numbers for EV batteries
The transport ministry has proposed assigning Aadhaar-like unique identification number to EV batteries to ensure their end-to-end traceability and efficient recycling.
2 mins
January 04, 2026
The Sunday Guardian
Congress’ seat claim strains Assam opposition unity
Congress's aggressive seat target unsettles allies as opposition struggles to finalise Assam election strategy.
3 mins
January 04, 2026
The Sunday Guardian
How CCP is ‘assimilating’ Inner Mongolia
The most decisive tool of assimilation has been language policy. Mongolian-medium education has been systematically dismantled, replaced with Mandarin instruction.
2 mins
January 04, 2026
Listen
Translate
Change font size
