Who owns the data used to train AI? | PC Pro - technology


Who owns the data used to train AI?

PC Pro | September 2023

Elon Musk says he owns it. Twitter's Ts & Cs suggest otherwise. James O'Malley investigates who really owns the data being used to train AI

-  James O'Malley


For decades, rocket science and brain surgery have been cited as fields of endeavour that present almost unimaginable levels of complexity. Now we might want to add another tricky job to the list: managing Twitter.

Since Elon Musk dropped $44 billion and took control of Twitter at the end of last year, it hasn't gone well. The CEO - who, let's not forget, is heavily invested in both rocket and neural science - has seen the value of the social network plummet. One study found that more than half of Twitter's top 1,000 advertisers have given up on the platform since his takeover.

The stress is starting to show. When Microsoft announced that it would be pulling advertising from the platform, reportedly because it refused to pay hiked API-access fees, Musk responded with a tweeted threat: "They trained illegally using Twitter data. Lawsuit time."

His argument is that AI models such as the ones created by Microsoft and its partner OpenAI, the firm behind ChatGPT, were getting a free ride on Twitter's data. Large language models (LLMs) that power AI tools such as ChatGPT have been "trained" on text taken from across the internet. This could conceivably have included data from Twitter.

Now Musk wants his pound of flesh. But who really owns data once it's out on the internet? Does Musk have any right to lay claim to it? The answer, you'll be shocked to hear, is complicated.

Scrapes of wrath

"There are so many variables that help to answer whether a specific scraping act is legal or illegal," said Denas Grybauskas, head of legal at web intelligence collection firm Oxylabs.

His company specialises in writing scrapers - software and tools that automate the work of downloading the contents of a website or individual web page, then extracting and organising the data. It's the equivalent of saving a web page on your computer, but automated and performed at mass scale.
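To make the idea concrete, here is a minimal sketch of the "extract and organise" half of a scraper, written in Python with only the standard library. It is an illustration, not a description of how Oxylabs or any commercial scraper works. The sample page, the `LinkExtractor` class and the `extract_links` helper are all hypothetical names invented for this example; a real scraper would first download the page (for instance with `urllib.request.urlopen`) rather than use a hard-coded string.

```python
from html.parser import HTMLParser

# Hypothetical page content, hard-coded so the sketch runs offline.
# A real scraper would download this first, for example with
# urllib.request.urlopen(url).read().decode("utf-8").
SAMPLE_HTML = """
<html><body>
  <h1>Example headlines</h1>
  <a href="/story/1">First story</a>
  <a href="/story/2">Second story</a>
  <p>No link here.</p>
</body></html>
"""


class LinkExtractor(HTMLParser):
    """Collects the href and visible text of every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []            # organised output: one dict per link
        self._current_href = None  # set while we are inside an <a href=...>
        self._text_parts = []      # text fragments seen inside the current link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current_href = dict(attrs).get("href")
            self._text_parts = []

    def handle_data(self, data):
        if self._current_href is not None:
            self._text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_href is not None:
            text = "".join(self._text_parts).strip()
            self.links.append({"url": self._current_href, "text": text})
            self._current_href = None


def extract_links(html):
    """Parse an HTML string and return its links as a list of dicts."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    for row in extract_links(SAMPLE_HTML):
        print(row["url"], "-", row["text"])
```

Run at mass scale across thousands of pages, the same extract-and-organise loop is what turns a website into a dataset.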
