Who owns the data used to train AI?
PC Pro | September 2023
Elon Musk says he owns it. Twitter's Ts & Cs suggest otherwise. James O'Malley investigates who really owns the data being used to train AI
For decades, rocket science and brain surgery have been cited as fields of endeavour that present almost unimaginable levels of complexity. Now we might want to add another tricky job to the list: managing Twitter.
Since Elon Musk dropped $44 billion and took control of Twitter at the end of last year, it hasn't gone well. The CEO - who, let's not forget, is heavily invested in both rocket and neural science - has seen the value of the social network plummet. One study found that more than half of Twitter's top 1,000 advertisers have given up on the platform since his takeover.
The stress is starting to show. When Microsoft announced that it would be pulling advertising from the platform, reportedly because it refused to pay hiked API-access fees, Musk responded with a tweeted threat: "They trained illegally using Twitter data. Lawsuit time."
His argument is that AI models such as the ones created by Microsoft and its partner OpenAI, the firm behind ChatGPT, were getting a free ride on Twitter's data. Large language models (LLMs) that power AI tools such as ChatGPT have been "trained" on text taken from across the internet. This could conceivably have included data from Twitter.
Now Musk wants his pound of flesh. But who really owns data once it's out on the internet? Does Musk have any right to lay claim to it? The answer, you'll be shocked to hear, is complicated.
Scrapes of wrath
"There are so many variables that help to answer whether a specific scraping act is legal or illegal," said Denas Grybauskas, head of legal at web intelligence collection firm Oxylabs.
His company specialises in writing scrapers - software and tools that automate the work of downloading the contents of a website or individual web page, then extracting and organising the data. It's the equivalent of saving a web page on your computer, but automated and performed at mass scale.
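To make the idea concrete, here is a minimal sketch of the "extract and organise" half of a scraper, written in Python using only the standard library. It is an illustration, not how Oxylabs or any other firm actually builds its tools: the sample page, the `LinkExtractor` class and the `extract_links` function are all hypothetical names invented for this example, and the download step is left out, so the code works on a hard-coded snippet of HTML rather than a live website.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for a downloaded web page (assumption: a real
# scraper would fetch this text over the network first).
SAMPLE_PAGE = """
<html><body>
  <h1>Example headlines</h1>
  <a href="/story-1">First story</a>
  <a href="/story-2">Second story</a>
  <p>No link here.</p>
</body></html>
"""


class LinkExtractor(HTMLParser):
    """Collects every link on a page as a {"url", "title"} record."""

    def __init__(self):
        super().__init__()
        self.links = []      # organised output, one dict per link
        self._href = None    # href of the <a> tag currently open, if any
        self._text = []      # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            self.links.append({"url": self._href, "title": title})
            self._href = None


def extract_links(html):
    """Parse an HTML string and return its links as a list of dicts."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


for link in extract_links(SAMPLE_PAGE):
    print(link["url"], "-", link["title"])
```

Run over one page, this yields a tidy list of links and their titles. A scraper operating "at mass scale" repeats the same step across thousands or millions of pages, which is where the legal questions begin.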
This story is from the September 2023 edition of PC Pro.
