OpenAI, Google, and Microsoft meet with leading publishers to discuss journalism in the age of generative AI. The greatest challenge is copyright.
Leading tech companies are talking to major publishers about using news content to train AI models and to power chatbots. News Corp, Axel Springer, The New York Times, and The Guardian, for example, are said to have spoken with at least one of the leading AI companies.
The tech companies are said to be willing to pay millions and are interested in long-term relationships. In addition, some tech companies are in talks about how publishers can use AI to increase their revenues.
Negotiations are said to be in the early stages. Under discussion is a kind of content subscription that AI companies would pay to publishers; in return, the AI companies would be allowed to use the content for their technology.
Current estimates for the copyright-compliant use of news content for AI training range from $5 million to $20 million per year. The Financial Times, which is reporting on the negotiations, is itself involved in them.
According to the Financial Times, Axel Springer CEO Mathias Döpfner is proposing a quantitative model similar to music streaming, in which payment depends on how much content is used. However, this would require AI companies to be transparent about what content they use to train their models. OpenAI, for example, does not disclose the training data for GPT-4, citing the competitive environment.
Döpfner regards an annual subscription as only the second-best option, as it would put smaller regional and local news providers at a disadvantage. He calls for an industry-wide, collaborative solution.
“If there is no incentive to create intellectual property, there is nothing to crawl. And artificial intelligence will become artificial stupidity,” Döpfner said.
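To make the streaming comparison concrete, the following Python sketch shows one way a usage-based split could work: a fixed pool is divided among publishers in proportion to how often their content is used. The function name, the publisher names, the pool size, and the usage counts are all hypothetical and are not taken from the Financial Times report or from any actual proposal.

```python
def pro_rata_payouts(pool, usage_counts):
    """Split a payment pool among publishers in proportion to usage.

    pool: total amount to distribute (hypothetical figure).
    usage_counts: mapping of publisher name -> number of items used
                  (hypothetical; real counts would have to be disclosed
                  by the AI company).
    """
    total_usage = sum(usage_counts.values())
    if total_usage == 0:
        # Nothing was used, so nobody is paid.
        return {publisher: 0.0 for publisher in usage_counts}
    return {
        publisher: pool * count / total_usage
        for publisher, count in usage_counts.items()
    }


# Illustrative numbers only.
usage = {"national_daily": 700, "regional_paper": 250, "local_weekly": 50}
payouts = pro_rata_payouts(10_000_000, usage)

for publisher, amount in payouts.items():
    print(f"{publisher}: ${amount:,.0f}")
```

A split like this only works if the usage counts are known, which is why the model depends on AI companies disclosing what content they train on.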
Google has reportedly agreed to license news content for AI training, and has presented a model to The Guardian and News UK. Google confirms the talks but does not comment on their substance. The company is talking to news publishers in the US, UK, and Europe, it said, and has already trained AI on publicly available content, including content from publishers. Another option is to give publishers more control over the use of their own content, for example by letting them opt out.
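As an illustration of what a technical opt-out could look like, the sketch below uses the existing robots.txt convention and Python's standard `urllib.robotparser` module. This is an assumption for illustration only: the report does not say which mechanism Google has in mind, and the crawler name `ExampleAIBot` and the URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a publisher might serve: block one AI crawler,
# allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

article_url = "https://example.com/news/some-article"  # placeholder URL

# The hypothetical AI crawler is opted out; other crawlers are not.
ai_allowed = parser.can_fetch("ExampleAIBot", article_url)
other_allowed = parser.can_fetch("SomeSearchBot", article_url)

print("ExampleAIBot may crawl:", ai_allowed)
print("SomeSearchBot may crawl:", other_allowed)
```

An opt-out of this kind relies on crawlers identifying themselves and honoring the file; it does not by itself enforce anything.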
OpenAI CEO Sam Altman has reportedly spoken with News Corp and The New York Times. When it introduced ChatGPT with internet access, OpenAI acknowledged that this is a “new way” to interact with the internet. It said it looks forward to suggestions on how to return traffic to its sources and contribute to the health of the ecosystem.
AI chatbots undermine established content ecosystems
There are two copyright issues in the context of artificial intelligence and publishing. First, journalistic content is part of the training data. Second, internet-connected chatbots access journalistic content in real time and use it as the basis for a generated response, such as a short summary.
The problem is that the creator of the original content, the publisher, gets nothing. The chatbot users don’t bring in revenue because they don’t come to the website. In the worst case, the chatbot doesn’t even cite the source correctly. Branding is lost.
This development affects not only publishers but content providers in general. Even video content and podcasts from services like YouTube or Spotify could end up in chatbots in the form of short text summaries, audio clips, or shortened videos. News, reviews, and how-to content are likely to be most affected.