OpenAI Loses Key Discovery Battle as It Cedes Ground to Authors in AI Lawsuits

“Logo text OpenAI has lost a key discovery battle over internal communications related to the startup deleting two huge datasets of pirated books, a development that further tilts the scales in favor of authors suing the company. To rewind, authors and publishers have gained access to Slack messages between OpenAI’s employees discussing the erasure of”, — write: www.hollywoodreporter.com

Logo text

OpenAI has lost a key discovery battle over internal communications related to the startup deleting two huge datasets of pirated books, a development that further tilts the scales in favor of authors suing the company.

To rewind, authors and publishers have gained access to Slack messages between OpenAI’s employees discussing the erasure of the datasets, named “books 1 and books 2.” But the court held off on whether plaintiffs should get other communications that the company argued were protected by attorney-client privilege.

In a controversial decision that was appealed by OpenAI on Wednesday, US District Judge Ona Wang found that OpenAI must hand over documents revealing the company’s motivations for deleting the datasets. OpenAI’s in-house legal team will be deposed.

At stake: Billions of dollars and, potentially, OpenAI’s defense in the case. The communications could help prove what’s known as “willful” infringement, which triggers significantly higher damages of $150,000 per work. And if it’s found that the company destroyed the evidence with potential litigation in mind, the court could direct juries in later trials to assume it would’ve been unfavorable for OpenAI.

The discovery ruling bolsters what’s increasingly looking like a winning argument over the practice of pirating books from shadow libraries. That theory has changed over the course of AI litigation. At first, lawyers for the authors directly connected the piracy to OpenAI’s training of its models under a single umbrella. But later, they separated the theories and alleged that the distinct act of illegally downloading the works, regardless of whether they were used, constitutes copyright infringement.

Important to note: the one win for authors in another AI copyright case, this one initiated by Andrea Bartz against Anthropic, related to the company illegally downloading millions of books and storing them in a central library. The decision heavily leaned in favor of Anthropic, but the court greenlit the theory for trial. “That Anthropic later bought a copy of a book it earlier stole off the internet will not absolve it of liability for the theft,” wrote US District Judge William Alsup. After the ruling, Anthropic agreed to pay $1.5 billion to settle the lawsuit.

Last year, a lawyer for OpenAI said that the “books 1” and “books 2” datasets weren’t being used for training purposes and that they were deleted in 2022 “due to their non-use.” Counsel representing authors and publishers called foul play.

The issue has been a major battleground in discovery. At first, OpenAI claimed attorney-client privilege but later said that it would turn over some information. Then, it moved to withdraw its representation that the datasets were deleted due to nonuse and said that all evidence on the erasure is privileged.

In the ruling, the court found that most of the communications aren’t shielded from discovery. This includes Slack messages between OpenAI employees in a channel called “project-clear” and “excise-libgen,” where they discussed deleting the datasets.

“OpenAI has waived privilege by making a moving target of its privilege assertions,” Wang wrote. She added, “OpenAI has gone back-and-forth on whether ‘non-use’ as a ‘reason’ for the deletion of Books1 and Books2 is privileged at all. OpenAI cannot state a ‘reason’ (which implies it is not privileged) and then later assert that the ‘reason’ is privileged to avoid discovery.”

The upshot in OpenAI’s messy legal maneuvering: The company effectively opened the door to the privileged material when it disclosed a reason for the deletion of the dataset.

To stave off a finding of “willful” infringement, it’ll have to show a good faith belief in the innocence of its action. The company faces an uphill battle on that issue, with the court stressing a “fundamental conflict” in circumstances when a defendant blocks discovery into communications about his state of mind by asserting attorney-client privilege.

OpenAI continues to maintain that it did not willfully infringe on any copyrighted material. On Wednesday, it moved to pause enforcement of discovery obligations.

Name	Price	24H (%)
Bitcoin(BTC)	$90,377.00	3.43%
Ethereum(ETH)	$3,021.42	1.93%
Tether(USDT)	$1.00	0.02%
XRP(XRP)	$2.22	0.97%
BNB(BNB)	$890.73	3.11%
Solana(SOL)	$142.68	2.49%
TRON(TRX)	$0.276300	0.75%
Dogecoin(DOGE)	$0.154484	0.45%
Monero(XMR)	$397.65	3.26%
Litecoin(LTC)	$86.84	1.50%
Toncoin(TON)	$1.61	3.04%
POL (ex-MATIC)(POL)	$0.137665	0.72%

Login

Register

Related posts

Leave a Comment Cancel Reply