Data that powers AI is disappearing fast
July 20, 2024
Business Standard
For years, the people building powerful artificial intelligence systems have used enormous troves of text, images and videos pulled from the internet to train their models.
Now, that data is drying up.
Over the past year, many of the most important web sources used for training AI models have restricted the use of their data, according to a study published this week by the Data Provenance Initiative, an MIT-led research group.
The study, which looked at 14,000 web domains that are included in three commonly used AI training data sets, discovered an "emerging crisis in consent," as publishers and online platforms have taken steps to prevent their data from being harvested.
The researchers estimate that in the three data sets (called C4, RefinedWeb and Dolma), 5 per cent of all data, and 25 per cent of data from the highest-quality sources, has been restricted. Those restrictions are set up through the Robots Exclusion Protocol, a decades-old method for website owners to prevent automated bots from crawling their pages using a file called robots.txt.
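To make the mechanism concrete, here is a minimal sketch of how a well-behaved crawler might consult a robots.txt file before fetching a page, using Python's standard `urllib.robotparser` module. The file contents, the bot name `ExampleAIBot`, and the URL are hypothetical illustrations, not taken from the study or from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is blocked from the whole site,
# while every other crawler is allowed. Not taken from any real website.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines directly, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/some-story"  # placeholder URL
    for bot in ("ExampleAIBot", "SomeOtherBot"):
        print(bot, "allowed:", can_crawl(ROBOTS_TXT, bot, page))
```

The protocol is advisory: the file only states the site owner's wishes, and it is up to each crawler to check it and comply.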
This story is from the July 20, 2024 edition of Business Standard.
