AI Cases in Plain English: Where Training, Outputs, and Licensing Stand Now

If you create, publish, or build with AI, the last few months clarified key lines and left some big questions open.

1) Training on lawfully obtained books can be fair use (on these facts)

In Bartz v. Anthropic, Judge William Alsup held that using purchased, copyrighted books to train models was “exceedingly transformative” and fair use. But he distinguished that from maintaining a separate, permanent “central library” of pirated books, which he said is not protected.

2) Meta also won on core training theories, though the opinion flags limits

In Kadrey v. Meta, the court granted summary judgment to Meta on the core copyright claims tied to training LLaMA, while stressing that the ruling rested on how the plaintiffs argued their case and flagging unresolved issues that could matter in future cases and other factual settings.

3) Music lyrics are a hot zone

Music publishers sought a preliminary injunction against Anthropic over lyric use and didn't get it this spring (the court found the request too broad), but the case continues to underscore that lyrics and other highly protectable works carry special risk in datasets and outputs.

4) Discovery is changing how companies log AI outputs

In NYT v. OpenAI, the court ordered preservation of output log data on May 13, 2025, signaling that ephemeral logs can be discoverable. OpenAI later reported that the preservation obligation ended Sept. 26, 2025, after further proceedings, but the message for builders is clear: expect more scrutiny of your data governance.

5) Output-based claims are alive

A consolidated authors' case in SDNY recently survived in part, allowing claims tied to outputs (not just training) to proceed. That's another reason creators should monitor how models handle verbatim or near-verbatim text.

What creators, media teams, and startups should do now

  • Keep your data clean. Know what you trained on, how you obtained it, and what you kept. Don’t host or rely on “shadow libraries.” Document licenses and sources.

  • Watch outputs. Police for verbatim or near-verbatim reproduction of protected lyrics, articles, or book text.

  • Tune your contracts. Update vendor/MSA language: representations on data provenance, logging, takedown cooperation, and indemnity.

  • Have a takedown playbook. Creators: register works, monitor for leakage, and move fast on DMCA/takedowns.

  • Plan for discovery. If you’re shipping AI, assume courts may ask for output logs and training proofs; build retention policies accordingly.

Need an AI-era IP audit (data provenance, contracts, takedown plan)? Schedule a strategy session with ARS Counsel today.


Almuhtada Smith