Legal Liability and Using Copyrighted Works to Train AI Models: Emerging Boundaries in Fair Use

Lia Lin

We live in an era where artificial intelligence (“AI”) is developing rapidly, and user demand has driven developers to train powerful, sophisticated AI systems. Most AI systems share a similar training process: ingesting vast amounts of input material to analyze its style, patterns, and structure, then using the results of that analysis to generate new outputs.1 Two recent summary judgment decisions from the Northern District of California addressed the copyright implications of this process. In Kadrey v. Meta Platforms, Inc. and Bartz v. Anthropic PBC, the courts considered whether using copyrighted works to train AI systems constitutes copyright infringement and determined that, in certain circumstances, such use is protected as fair use.2

Both courts applied the four-factor fair use test: (1) the purpose and character of the use, including whether it is of a commercial nature or is for nonprofit educational purposes; (2) the nature of the copyrighted work; (3) the amount and substantiality of the portion used in relation to the work as a whole; and (4) the effect of the use on the potential market for or value of the work.3 Both courts focused on factors (1) and (4), finding that factor (2) weighed against fair use and factor (3) slightly favored it.4

In analyzing factor (1), both courts found that using digitized copyrighted books to train AI models was highly transformative. In Meta, the court emphasized that training an AI model on books served a distinct purpose: enabling the model to generate language responses to users’ prompts rather than conveying the expressive content of the original works.5 Similarly, Bartz found that the purpose and character of using copyrighted works to train an AI system to generate new text was quintessentially transformative, as the system before the court only absorbed expressive texts without reproducing the works’ creative content to the public.6

The courts applied similar reasoning, that the models did not reproduce the original works to the public, in determining that factor (4) favored fair use, holding that the plaintiffs failed to show that the disputed use harmed the market for their works. In Bartz, the plaintiffs conceded that training the AI did not result in any exact copies or even knock-offs being provided to the public.7 In Meta, the court similarly observed that the model at issue could not reproduce more than 50 consecutive words from any of the plaintiffs’ books, even under a method the expert witness designed specifically to induce AI models to regurgitate material from their training data.8 Weighing the four factors, both courts found that the copies used to train the AI models were justified as fair use.

To a certain extent, these cases offer guidance to AI developers seeking to rely on the fair use defense, for example by asserting that the copying is transformative. However, they do not indicate that courts have fully opened the door to the free use of copyrighted works for AI training, especially given the potential weaknesses in the plaintiffs’ litigation strategies. In Meta, the court noted that, because the plaintiffs did not provide sufficient evidence that AI training harmed the market for their works, factor (4) weighed in favor of the defendant. Yet this does not mean that the extensive use of countless works to train AI models will never weaken the market for the plaintiffs’ books.9 As the Meta court pointed out, no matter how transformative AI training may be, it is hard to imagine that it can be fair use to use copyrighted books to develop a tool to make billions or trillions of dollars while enabling the creation of a potentially endless stream of competing works that could significantly harm the market for copyrighted books.10

[1] Shivam Agarwal, How to Train an AI Model? Step-by-Step Guide 2025, Dualite (Sept. 13, 2025), https://dualite.dev/blog/how-train-ai-model.

[2] Kadrey v. Meta Platforms, Inc., 788 F. Supp. 3d 1026 (N.D. Cal. 2025); Bartz v. Anthropic PBC, 787 F. Supp. 3d 1007 (N.D. Cal. 2025).

[3] Kadrey, 788 F. Supp. 3d at 1037; Bartz, 787 F. Supp. 3d 1007 at 17.

[4] Kadrey, 788 F. Supp. 3d at 1049–50; Bartz, 787 F. Supp. 3d 1007 at 42, 45.

[5] Kadrey, 788 F. Supp. 3d at 1044.

[6] Bartz, 787 F. Supp. 3d 1007 at 21.

[7] Id. at 13.

[8] Kadrey, 788 F. Supp. 3d at 1041–42.

[9] Id. at 1056.

[10] Id. at 1059.