Contributory Copyright Infringement: Uncertainty in the Age of Generative AI
Posted on Mar 3, 2026

The emergence of generative artificial intelligence (“AI”) has forever altered the landscape of intellectual property law. While technology users were once confined to search engines, AI systems now provide autonomously generated text, images, music, and code. The issue, however, is that these outputs often resemble or reproduce copyrighted works. AI companies such as OpenAI, Meta, and Perplexity rely on vast datasets to train their models, and these datasets typically include significant amounts of copyrighted material.[1] As these AI tools proliferate across seemingly every facet of life, courts and scholars struggle to determine when an AI platform or its developers should be held liable for copyright infringement facilitated by the technology, both in its inputs and its outputs.[2] In particular, courts must determine whether an AI company that knows its systems reproduce copyrighted material can be held liable for that infringement.[3]
The Copyright Act of 1976 provides no express standard for secondary copyright infringement.[4] Instead, courts have developed the doctrine of contributory copyright infringement through case law. The earliest standard provides that “one who, with knowledge of the infringing activity, induces, causes, or materially contributes to the infringing conduct of another, may be held liable as a ‘contributory’ infringer.”[5] The Supreme Court later limited the doctrine, holding that manufacturers of products “capable of substantial noninfringing uses” are not contributorily liable merely because some users infringe copyrights.[6] Still, the Court clarified that liability may attach when a company distributes a device “with the object of promoting its use to infringe copyright.”[7] The existing doctrine reflects the tension between technological innovation and copyright enforcement that persists today.
Lower courts have diverged in applying these principles to internet service providers (“ISPs”). The appellate courts have split on the level of knowledge and affirmative conduct necessary to establish contributory liability.[8] The Second and Tenth Circuits require actual knowledge and some affirmative act of assistance,[9] while the Ninth Circuit requires only that the service provider know or have reason to know of infringement and fail to take “simple measures” to prevent further harm.[10] The Fourth Circuit espouses an even narrower standard, requiring “specific enough knowledge of infringement that the defendant could do something about it” and proof that infringement was “substantially certain to result” from continued service.[11] This unresolved circuit split leaves technology companies and copyright owners without a consistent standard or any predictability in litigation.
The advent of generative AI has heightened the urgency of resolving this circuit split. AI platforms differ fundamentally from the passive ISPs at issue in earlier contributory liability cases. Whereas ISPs primarily transmit or host user content, AI companies design and control models that actively generate new material from large datasets, which frequently include protected works.[12] Because AI companies can monitor, filter, and alter their training data, they are uniquely situated to prevent output infringement before it occurs.[13] For example, an AI company can prevent an input infringement by identifying and removing a copyrighted book from its training data; doing so also prevents an output infringement, because the model’s outputs could no longer quote the book. Importantly, however, an “actual knowledge only” liability standard encourages willful blindness by allowing AI companies to ignore obvious infringement risks. At the same time, a rule imposing liability without regard to knowledge would unduly burden technological innovation and discourage investment in beneficial AI technologies.
The uncertainty surrounding contributory copyright liability for AI platforms exists at a critical moment for the U.S. economy and its position in global competition. Generative AI has already driven immense economic growth across various industries, and its continued development will likely stimulate further growth.[14] As a result, regulatory decisions affecting AI development have consequences that extend beyond individual AI companies or copyright disputes. Still, unchecked infringement threatens to undermine the copyright system that has long incentivized creative production. Copyright law seeks to encourage investment in creative endeavors by protecting those works.[15] If AI platforms are permitted to train on copyrighted works without accountability, authors may have diminished incentives to create. The challenge, therefore, is to calibrate liability in a manner that preserves both creative incentives and technological progress. This is exactly the challenge that continues to divide the circuit courts.
[1] Matthew Sag, Copyright Safety for Generative AI, 61 Hous. L. Rev. 295, 295 (2023).
[2] See, e.g., Elizabeth A. Henderson, Intellectual Property Liability for Businesses in the Age of AI: What New Liabilities Businesses Using AI Could Face and the Possible Methods of Self-Protection, 13 Mich. Bus. & Entrepreneurial L. Rev. 62, 63–64 (2024) (“AI has triggered a wave of intellectual property litigation that will set the legal standards for AI programs and their uses going forward.”); see also Jane C. Ginsburg & Luke A. Budiardjo, Authors and Machines, 34 Berkeley Tech. L.J. 343, 343 (2019) (identifying the issue of whether authorship may be attributed to an AI platform).
[3] See Sharon Choi, Assessing the Efficacy of Third-Party Liability Copyright Doctrines Against Platforms That Host AI-Generated Content, 66 B.C. L. Rev. 1087, 1104 (2025).
[4] See 17 U.S.C. §§ 101–122, 501 (2024) (providing exclusive rights in copyrighted works and the basis for infringement, but no explicit basis for secondary liability).
[5] Gershwin Pub. Corp. v. Columbia Artists Mgmt., Inc., 443 F.2d 1159, 1162 (2d Cir. 1971).
[6] Sony Corp. of America v. Universal City Studios, Inc., 464 U.S. 417, 442 (1984) (“[t]he sale of copying equipment, like the sale of other articles of commerce, does not constitute contributory infringement if the product is widely used for legitimate, unobjectionable purposes, or, indeed, is merely capable of substantial noninfringing uses.”).
[7] MGM Studios Inc. v. Grokster, Ltd., 545 U.S. 913, 919 (2005).
[8] See A&M Recs., Inc. v. Napster, Inc., 239 F.3d 1004 (9th Cir. 2001) (providing the Ninth Circuit’s standard); see also BMG Rts. Mgmt. (US) LLC v. Cox Commc'ns, Inc., 881 F.3d 293 (4th Cir. 2018) (providing the Fourth Circuit’s standard); see also Faulkner v. Nat'l Geographic Enters. Inc., 409 F.3d 26 (2d Cir. 2005) (providing the Second Circuit’s standard).
[9] Faulkner, 409 F.3d at 40 (“[O]ne who, with knowledge of the infringing activity, induces, causes or materially contributes to the infringing conduct of another, may be held liable as a ‘contributory’ infringer.”) (citing Gershwin, 443 F.2d at 1162); Diversey v. Schmidly, 738 F.3d 1196, 1204 (10th Cir. 2013) (“[C]ontributory liability attaches when the defendant causes or materially contributes to another’s infringing activities and knows of the infringement.”).
[10] Napster, 239 F.3d at 1020 (“Contributory liability requires that the secondary infringer ‘know or have reason to know’ of direct infringement.”); Perfect 10, Inc. v. Amazon.com, Inc., 508 F.3d 1146, 1172 (9th Cir. 2007) (“[A] computer system operator can be held contributorily liable if it ‘has actual knowledge that specific infringing material is available using its system,’ and can ‘take simple measures to prevent further damage’ to copyrighted works, yet continues to provide access to infringing works.”) (citing Napster, 239 F.3d at 1022).
[11] BMG Rts. Mgmt., 881 F.3d at 311–12.
[12] See Daniel J. Gervais, The Machine as Author, 105 Iowa L. Rev. 2053, 2053 (2020); see also Sag, supra note 1 (“The training data for these large language models consists predominantly of copyrighted works.”); Andrew W. Torrance & Bill Tomlinson, Training is Everything: Artificial Intelligence, Copyright, and “Fair Training”, 128 Dick. L. Rev. 233, 236 (2023) (“To learn how to behave, the current revolutionary generation of AIs must be trained on vast quantities of published images, written works, sounds, or other forms of data, many of which fall within the core subject matter of copyright law.”).
[13] See Sag, supra note 1, at 338 (“Currently, the training data for many such models excludes toxic and antisocial material, so filtering out copyrighted works is technically plausible.”).
[14] Goldman Sachs, Generative AI could raise global GDP by 7% (Apr. 5, 2023), https://www.goldmansachs.com/insights/articles/generative-ai-could-raise-global-gdp-by-7-percent.
[15] See, e.g., U.S. Const. art. I, § 8, cl. 8.