OpenAI, the creator of ChatGPT, has said that training today's leading AI models without copyrighted material would be impossible, as pressure grows on artificial intelligence firms over the content used to train their products. AI models such as ChatGPT and the image generator DALL-E gain their abilities by training on large amounts of content scraped from the Internet without the permission of its rights holders. This type of free-for-all scraping has long been common in academic machine learning research, but since deep learning models became commercially available, the practice has come under intense scrutiny.
"Because copyright today covers virtually every sort of human expression—including blog posts, photographs, forum posts, scraps of software code, and government documents—it would be impossible to train today's leading AI models without using copyrighted materials," OpenAI said in its submission to the House of Lords.
Furthermore, OpenAI claims that limiting training data to public domain books and drawings "created more than a century ago" will result in AI systems that do not "meet the needs of today's citizens."
This statement comes after The New York Times filed a lawsuit against OpenAI and Microsoft, a major investor in the company, last month for allegedly using the newspaper's content illegally in their products. The 69-page lawsuit alleges that OpenAI illegally used the New York Times' work to develop AI systems that compete with media companies.
The lawsuit claims that OpenAI's tools produce "output that recites Times content verbatim, closely summarizes it, and mimics its expressive style," as evidenced by scores of examples.
Getty Images, which owns one of the world's largest photo libraries, is suing Stability AI, the creator of Stable Diffusion, in both the United States and England and Wales for alleged copyright violations. In the United States, a group of music publishers, including Universal Music, is suing Anthropic, the Amazon-backed company behind the Claude chatbot, accusing it of using "innumerable" copyrighted song lyrics to train its model.
To learn more about OpenAI's position, read the entire submission here.
Resources: Image courtesy - Mariia Shalabaieva & Levart_Photographer
Tags: ChatGPT, copyrights, OpenAI