Google Images provides instant access to an ocean of photos, art, logos, and graphics. Google’s search, however, is limited to images that already exist somewhere out there on the internet. Generative artificial intelligence (AI) is opening new doors with computer-generated art and images. Today, users can type a phrase into one of the many image-generating AI programs, such as DALL-E 2, eDiffi, and Midjourney, which then generates never-before-seen images based on the provided phrase and other parameters. These images range from photorealistic landscapes to impressionistic portraits and are created in seconds.
Machine learning (ML) and artificial intelligence saw a crescendo of interest over the past decade. These technologies spurred the imagination of the general public and promised a technological revolution. ML and AI have produced results, especially with regard to enterprise technology, but many were left disappointed by the intangibility of these advances. For example, many businesses successfully use ML to power dynamic pricing engines, adjusting prices based on historical customer demand, but fully autonomous cars are still years away from public sale. In the last few months, generative AI, a subset of AI and ML, has exploded in popularity. According to Google Trends, searches for “generative AI” have increased tenfold since June 2022. This is likely because image-generating AI programs have reached an inflection point in development: an open-source implementation, Stable Diffusion, was released. The technology can now be freely downloaded, modified, and reshared for the world to see.
Generative AI, like most ML and AI tools, leverages huge amounts of data to essentially perform pattern recognition. The data can be entered manually, or it can be collected from sources such as the web using automated scraping programs. Manually entering training data can be prohibitively time-consuming, so many programs opt for the latter approach. In the case of image-generating AI programs, the data consists of an image and any descriptors or context associated with it. Once a sufficient amount of data has been collected, the AI program runs a series of algorithms to make connections between points of data and recognize trends. An AI program is considered “trained” when no new data is being added and all of the existing data has been processed. Developers regularly use competing programs to benchmark and measure training progress. A user can then provide new input, such as a string of text, which the trained AI program compares with its bank of text data, interpolating the associated image data to produce an output. Exactly how the output is produced depends on the particular algorithm used. DALL-E 2 and Stable Diffusion both use “diffusion models,” which are trained by first introducing noise (visual static) into the training images and then learning to reverse the noising process to get back to the original image. New images are produced by passing randomly sampled noise through the denoising process.
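The noising and denoising idea can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not how DALL-E 2 or Stable Diffusion is actually implemented: the linear noise schedule, the four-value "image," and the function names (`make_alpha_bars`, `add_noise`, `remove_noise`) are all hypothetical choices made for this example. A real diffusion model trains a neural network to predict the noise; here the true noise is handed back to the denoising step to show that the process is reversible when that prediction is accurate.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Return the fraction of original signal remaining after each step.

    Assumes a linear noise schedule; each entry is the running product
    of (1 - beta_t), so the values shrink toward zero as noise builds up.
    """
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(pixels, alpha_bar, rng):
    """Forward process: blend clean pixel values with Gaussian noise."""
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [
        math.sqrt(alpha_bar) * p + math.sqrt(1.0 - alpha_bar) * n
        for p, n in zip(pixels, noise)
    ]
    return noisy, noise


def remove_noise(noisy, predicted_noise, alpha_bar):
    """Reverse the forward step, given a prediction of the added noise."""
    return [
        (x - math.sqrt(1.0 - alpha_bar) * n) / math.sqrt(alpha_bar)
        for x, n in zip(noisy, predicted_noise)
    ]


rng = random.Random(0)
alpha_bars = make_alpha_bars(num_steps=1000)

image = [0.1, 0.5, 0.9, 0.3]  # a tiny stand-in for real pixel data
noisy, true_noise = add_noise(image, alpha_bars[500], rng)

# Assumption: a trained network would estimate the noise from `noisy`.
# Passing the true noise shows the best case, a perfect prediction.
recovered = remove_noise(noisy, true_noise, alpha_bars[500])
```

With a perfect noise prediction, `recovered` matches `image` up to floating-point error. In practice the prediction is imperfect and denoising is applied gradually over many steps, starting from pure noise when generating a new image.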
As with most new technologies, generative AI has provoked a host of legal and ethical concerns. The ethical concerns largely center on the kinds of images that can be produced using this technology. The legal questions, on the other hand, involve both the production and use of the images. Because generative AI programs require so much data to produce meaningful results, program developers cannot carefully examine each individual piece of data that goes into training the program. It is likely that some of the images used to train the programs are copyrighted. The question then becomes whether using copyrighted images to train an AI program falls within the scope of the fair use doctrine.
The fair use doctrine allows for limited use of copyrighted material and provides a four-factor test for determining fair use. The four factors are (1) the purpose and character of the use, (2) the nature of the copyrighted work, (3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole, and (4) the effect of the use upon the potential market for or value of the copyrighted work. The first factor also asks whether the use is commercial or for nonprofit educational purposes. A proprietary program like DALL-E 2 would fall under the commercial classification, whereas at least certain implementations of Stable Diffusion’s open-source program could be considered nonprofit and educational. This factor also asks whether the use advances the art, which, based on some of the art produced by generative AI programs, is an easy “yes.” Image-generating AI programs have succeeded in creating celebrated art. For example, the Colorado State Fair recently dedicated a fine art category to AI-generated images. The second factor will weigh in favor of published and clearly copyrighted images. The third factor concerns the portion of the copyrighted work that will be used, which for training will almost always be the entirety of the image. Lastly, the fourth factor questions the effect the use will have on the original work’s potential market. This is probably the most interesting factor for generative AI, because its effect is not yet clear. For famous paintings and photos, the effect is likely zero. For independent artists, however, the ability to generate custom artwork at extremely low prices could have a huge effect on their business. Based on these four factors, generative AI does not clearly fall on one side of the fair use doctrine.
Generative AI shares similarities with other technologies that are protected by the fair use doctrine. Search engines, for example, comb the internet and process all available information to improve search results. Here, copyrighted material is certainly used in full for commercial purposes. The difference between search engines and generative AI may lie in the fourth factor, which likely carries little weight against search engines because their results point users to the original work rather than replace it.
The images produced using generative AI may also constitute copyright infringement. Because some generated images will likely use copyrighted material as training data, it’s possible that a generated image may look too similar to the copyrighted material. A good example of how contested that line can be is artist Shepard Fairey’s famous Barack Obama “Hope” poster. The poster used an Associated Press photo of Barack Obama as a reference and, despite being highly stylized, drew an infringement claim from the Associated Press; the dispute settled out of court before any ruling on fair use. The complicating factor with generative AI programs is that the user has some control over the output. For example, including the phrase “Barack Obama Hope” will likely generate images that closely resemble Fairey’s work. For this reason, generative AI programs will likely be viewed as tools rather than as independent creators. A comparison can be drawn with Photoshop and its role in creating new images. However, because the user does not have direct access to the images the program was trained on, it may be difficult to verify whether an image produced is wholly original or only a slight stylization of existing copyrighted material.
While the legal questions surrounding generative AI are far from settled in the US, other countries have enacted relevant legislation. The UK and the EU have both carved out exceptions to existing copyright law that allow for text and data mining as long as the author of the work has not reserved the right. Whether this language completely covers generative AI programs is yet to be seen. Similar legislation may be adopted in the US, but given the rapid development of generative AI, more novel legal questions will likely present themselves before such legislation is passed.