The big picture: As generative AI continues to evolve, many companies are being forced to decide whether to embrace or reject the technology. The question is especially pressing in news media, which countless companies mine for material to train their AI models.
According to a report by NPR, the New York Times and OpenAI have spent recent weeks in "tense negotiations" over a licensing deal. The agreement would allow OpenAI to legally train its GPT models on material published by the Times, something the newspaper moved to prohibit earlier this month.
This isn't the first time OpenAI has pursued a deal with a news organization: it already reached one with the Associated Press, one of the largest news agencies in the world. Permission to train GPT on not one but two major news sources could improve the model significantly.
However, NPR reports that the negotiations have not gone as smoothly as OpenAI had hoped. Two anonymous sources told NPR that the Times is now considering legal action because the discussions have become so contentious. According to one of the sources, the Times fears how AI would be used within search engines: rather than clicking through to an article published by the Times, a user could simply read an AI-generated summary of whatever the journalist wrote. "The need to visit the publisher's website is greatly diminished," the source claims.
An AI model learns nearly all of its information by scraping websites and collecting whatever data it deems useful, most of the time (for now) without prior authorization from the original source. That raises serious questions about both the morality and the legality of the practice, which falls into a copyright "gray area."
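The opt-out mentioned above is typically expressed through a site's robots.txt file, which a compliant crawler consults before fetching pages. A minimal sketch of that check, using Python's standard-library parser and an inline sample of rules (the sample rules and URLs are illustrative, not the Times' actual file; GPTBot is OpenAI's published crawler user agent):

```python
from urllib import robotparser

# Parse a small, hypothetical robots.txt that blocks OpenAI's crawler
# while allowing everyone else. A real crawler would fetch the live
# file with rp.set_url(...) and rp.read() instead.
rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: GPTBot",   # OpenAI's web crawler
    "Disallow: /",          # barred from the entire site
    "User-agent: *",
    "Allow: /",             # all other crawlers permitted
])

# A compliant crawler checks before each fetch:
print(rp.can_fetch("GPTBot", "https://example.com/article"))    # False
print(rp.can_fetch("OtherBot", "https://example.com/article"))  # True
```

Crucially, robots.txt is a convention, not an enforcement mechanism: nothing technical stops a scraper from ignoring it, which is why the dispute turns on copyright law rather than on the file itself.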
If a lawsuit comes to fruition and OpenAI is found to have violated copyright law, a federal judge could force the company to wipe GPT's training data entirely and start from scratch. OpenAI would then be allowed to retrain the model, but only on material it is authorized to use, which would undoubtedly slow development significantly.
Along with that setback, federal copyright law carries major monetary penalties, reaching up to $150,000 per infringement if committed intentionally. "If you're copying millions of works, you can see how that becomes a number that becomes potentially fatal for a company," says Daniel Gervais, co-director of the intellectual property program at Vanderbilt University.
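Gervais's point about scale is simple arithmetic. A rough sketch (the one-million figure is purely illustrative, not a claim from any filing):

```python
# Maximum statutory damages per work for willful infringement.
PER_WORK_MAX = 150_000  # dollars

# Hypothetical count of copyrighted works in a training set,
# chosen only to match Gervais's "millions of works" framing.
works = 1_000_000

exposure = works * PER_WORK_MAX
print(f"${exposure:,}")  # $150,000,000,000
```

Even at one million works, the theoretical maximum exposure reaches $150 billion, which is the sense in which a judgment could be "potentially fatal" for a company.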
The Times would not be the first to sue an AI company. Earlier this year, Getty Images sued Stability AI for training Stable Diffusion on Getty photos without authorization. Notably, Getty did not seek financial compensation from Stability AI; instead, it wanted the model rebuilt in hopes of "respecting intellectual property."
A class-action lawsuit has also been filed against OpenAI, claiming ChatGPT scraped data from millions of users without their knowledge or consent. The information was allegedly pulled from third-party apps such as Spotify, Microsoft Teams, and Snapchat, among many others.
Whether the Times will move forward and set a precedent with a lawsuit remains to be seen, as does whether negotiations are still ongoing or have since fallen through.