Copyright Issues with AI-Enabled Coding

Kathan Roberts

GitHub, owned by Microsoft,[1] is a widely used cloud storage system that allows software developers to store and track different versions of their code. In June 2021, GitHub introduced an artificial intelligence tool called Copilot[2] that provides autocomplete suggestions for computer code.[3] An upcoming version of Copilot, called Copilot X, will even identify software bugs, suggest fixes for them, and write test cases to verify that the software works as intended.[4]

However, Copilot raises a copyright issue. A class action lawsuit alleges that because the AI was trained on code that GitHub users uploaded to the site, the code that Copilot generates infringes those users’ copyright.[5]

Copyright law is an uneasy fit for software, because the primary use of software is as a machine, not as art or even as a means of human communication. Unlike other forms of writing, software can serve its purpose without a human reading it. However, it is an established principle that copyright law does not require “that a work be directly accessible to humans in order to be eligible for copyright protection.”[6] Rather, software is copyrightable under 17 U.S.C. § 102(a) as a literary work.[7]

Experienced coders will undoubtedly find Copilot to be a force multiplier. Writing code can be mentally draining, but Copilot automates much of that work. Software developers will often just have to verify and tweak the AI’s suggested code rather than write code from scratch. And instead of spending time hunting down the sources of errors in their code, they can rely on Copilot X to identify and fix the issues instantaneously.

But the benefit to novice programmers will likely be far greater. Not only does their code tend to have more errors than an experienced engineer’s code, but it also takes them longer to find and fix those errors. The autocomplete feature is even more significant. For experienced coders, it merely enables them to work faster, but for a novice, it may be crucial to producing code that works at all.

This is the reverse of the situation that Christopher TenEyck identifies in his JLA Beat post Downstream Implications of AI Authorship.[8] He explains that AI systems that write literature may benefit established authors far more than novices, because the AI will have far more training data from which to learn to emulate the author’s style and voice. But software is not artwork, so whether the AI emulates the style of a particular writer does not substantially affect the utility of the finished product. Instead of benefitting established writers far more than novices, Copilot seems likely to do the opposite. Someone with only basic knowledge of programming and little experience will be able to implement ideas that they likely could not have executed before.

There may also be a policy argument to be made here. Perhaps AI-enabled coding should be tolerated because it lowers the barriers for budding innovators to enter the tech industry. But this is an academic setting, not a political body, so I will leave that to the politicians to discuss.


[1] James Vincent, The Lawsuit that Could Rewrite the Rules of AI Copyright, Verge (Nov. 8, 2022), https://www.theverge.com/2022/11/8/23446821/microsoft-openai-github-copilot-class-action-lawsuit-ai-copyright-violation-training-data [https://perma.cc/74QX-P7ZU] [https://web.archive.org/web/20230328175111/https://www.theverge.com/2022/11/8/23446821/microsoft-openai-github-copilot-class-action-lawsuit-ai-copyright-violation-training-data].

[2] Id.

[3] About GitHub Copilot for Individuals, GitHub, https://docs.github.com/en/copilot/overview-of-github-copilot/about-github-copilot-for-individuals [https://perma.cc/M9M2-3SBB] [https://web.archive.org/web/20230328175912/https://docs.github.com/en/copilot/overview-of-github-copilot/about-github-copilot-for-individuals] (last visited Mar. 27, 2023).

[4] Introducing GitHub Copilot X, GitHub, https://github.com/features/preview/copilot-x [https://perma.cc/89QU-RPM9] [https://web.archive.org/web/20230328175935/https://github.com/features/preview/copilot-x] (last visited Mar. 27, 2023).

[5] Thomas Claburn, Microsoft, GitHub, OpenAI Urge Judge To Bin Copilot Code Rip-off Case, Register (Jan. 31, 2023), https://www.theregister.com/2023/01/31/microsoft_github_openai_copilot [https://perma.cc/555W-9PES] [https://web.archive.org/web/20230328180015/https://www.theregister.com/2023/01/31/microsoft_github_openai_copilot].

[6] Sega Enters. Ltd. v. Accolade, Inc., 977 F.2d 1510, 1519 (9th Cir. 1992).

[7] “The term ‘literary works’ … includes computer programs to the extent that they incorporate authorship in the programmer’s expression of original ideas, as distinguished from the ideas themselves.” H.R. Rep. No. 94-1476, at 54 (1976).

[8] Christopher TenEyck, Downstream Implications of AI Authorship, JLA Beat, Colum. J.L. & Arts (Sept. 23, 2022), https://journals.library.columbia.edu/index.php/lawandarts/announcement/view/537 [https://perma.cc/F7P3-NCUA] [https://web.archive.org/web/20230328175945/https://journals.library.columbia.edu/index.php/lawandarts/announcement/view/537].