The New York Times has hit back at efforts by OpenAI to force it to hand over source materials used in the creation of its articles as the high-profile copyright infringement dispute between the two companies enters the discovery phase.
The newspaper accuses OpenAI of infringing its copyrights by training ChatGPT with its content without permission. OpenAI’s demand to see source materials used by journalists writing for the Times suggests that its defence may rely heavily on copyright technicalities as it fights the infringement lawsuit.
This may include opening up a debate about whether the works it has used to train its AI meet the threshold for copyright protection. If that approach is successful, it may set a precedent that technology companies involved in other AI-related copyright cases, including those filed by the music industry, could rely on to mount a similar defence.
In a letter to the judge, OpenAI argues that copyright protection extends “only to those components of a work that are original to the author”, citing American legal precedent.
The AI company goes on to contend that the Times “cannot pursue a claim for infringement” in relation to any parts of an article that were not original to the newspaper, including where press release statements, third-party quotes or newswire content were used.
With that in mind, the AI company said, the court “should order the Times to produce documents sufficient to show what portions” of the articles OpenAI is accused of exploiting without licence “are original to the Times”.
The Times has urged the judge to reject OpenAI’s discovery demands, arguing that its news-gathering processes are irrelevant to copyright protection.
In its own letter to the judge, the news company states, “OpenAI claims that the reporters’ notes underlying the asserted works may shed light on whether The Times’s news articles are really original, expressive content. But that is not how copyright law works. The expressive nature of a work is determined by reference to the work itself”.
The Times further argues that, even if a reporter’s notes showed that an article was 90% third-party content, the article would still be protected by copyright and the newspaper would therefore still be able to sue for infringement, “particularly when (as here) defendants appear to have copied entire works verbatim”.
Beyond its copyright concerns, the Times argues that OpenAI’s discovery requests are too broad and could infringe its reporters’ legal rights to protect their sources and keep information confidential, potentially interfering with the paper’s “privileged newsgathering process” in a way that “would have serious negative and far-reaching consequences”.