Authored By: Akinyemi Oluwanifemi Sarat
Lagos State University
ABSTRACT
The rapid advancement of Artificial Intelligence (AI) has sparked intense debates about copyright infringement and fair use. As AI models are trained on vast amounts of data, including copyrighted works, questions arise about the legitimacy of this practice. This article explores the intersection of AI training and copyright law, examining the transformative nature of AI-generated content. We analyze landmark cases, such as Authors Guild v. Google, Inc., and recent lawsuits involving AI developers and creators.
The article discusses the need for clearer guidelines on fair use in AI training and argues that legislative reform is necessary to update copyright laws and provide certainty for creators and developers. One potential solution is the establishment of licensing frameworks, which would enable creators to receive fair compensation for the use of their works in AI training. This approach could promote innovation while respecting intellectual property rights.
The article also examines the role of transparency in AI development. Requiring AI developers to disclose training data sources and to offer opt-out mechanisms for creators would promote accountability and respect for intellectual property. We analyze the implications of recent court cases, including Getty Images v. Stability AI and The Andy Warhol Foundation for the Visual Arts, Inc v Goldsmith, for the AI industry’s use of copyrighted materials. The article concludes that finding a balance between innovation and creators’ rights is crucial for the development of AI technology that benefits society as a whole.
Clearer regulatory frameworks are necessary to balance innovation with creators’ rights. Potential solutions include licensing systems or transparent consent mechanisms to ensure fair compensation and accountability in AI development.
INTRODUCTION
In today’s digital era, Artificial Intelligence (AI) is increasingly being developed using vast amounts of data, much of it copyrighted material such as textbooks, articles, and other digital content. These works are used to help AI systems recognize patterns and perform tasks that typically require human intelligence. However, this practice raises concerns about potential conflicts with copyright law, which grants creators the exclusive right to reproduce and control the use of their original works. Osborn’s Concise Law Dictionary defines copyright as “the exclusive right of printing or otherwise multiplying copies of, inter alia, a published literary work; that is, the right of preventing all others from doing so”.1
On the other hand, Section 20 of the Nigerian Copyright Act 2022 provides for “fair dealing”, the Nigerian counterpart of the “fair use” doctrine, which allows limited use of copyrighted material without constituting infringement. This creates a legal and ethical dilemma: does using copyrighted content to train AI systems qualify as fair use, or does it amount to copyright infringement?2
This article seeks to explore this pressing question and examine whether AI training falls within the bounds of fair use or crosses the line into copyright abuse.
BACKGROUND
The emergence of artificial intelligence has sparked a vibrant era of technological innovation, with machines capable of writing, painting, composing music, and even conversing like humans. However, this progress is accompanied by intense legal and ethical debates: can AI be trained using copyrighted content without permission? At the heart of this controversy lies the concept of fair use, a long-standing principle in U.S. copyright law that permits limited use of copyrighted material for purposes such as research, teaching, criticism, or news reporting.3 This doctrine is assessed through a four-factor test, examining the purpose of the use, the nature of the original work, the amount used, and the impact on the work’s market value.4
Tech companies developing generative AI models, like ChatGPT and Gemini, argue that utilizing publicly available data for training falls under fair use, particularly since the end product is a transformative output that doesn’t directly replicate the original content. Conversely, many creators strongly disagree. Notable lawsuits filed by authors such as Sarah Silverman and George R.R. Martin claim that AI developers are profiting from their works without providing due credit or compensation.5
COPYRIGHT IN COURT
The dispute between George R.R. Martin, author of the novels behind Game of Thrones, and the developers of generative AI models like ChatGPT underscores a critical tension in copyright law: the unauthorized use of protected creative content to train artificial intelligence. Authors and screenwriters argue that training AI on copyrighted scripts, dialogues, and character developments, such as those from Game of Thrones, without consent or compensation infringes their exclusive rights to control how their work is used and monetized. While AI developers claim that such training constitutes “fair use” due to its transformative nature, critics warn that AI outputs can mirror original content too closely, effectively reproducing creative expression without attribution or license and thereby undermining the value and integrity of original authorship.6
The Andy Warhol Foundation for the Visual Arts, Inc v Goldsmith case poses a significant challenge to the AI industry’s “transformative use” defense. In this landmark 2023 ruling, the US Supreme Court held that the Foundation’s licensing of Warhol’s silkscreen image, based on Lynn Goldsmith’s photograph, was not sufficiently transformative, primarily because the use was commercial and served substantially the same purpose as the original photograph.4 This decision highlights that modifications or derivative uses do not always qualify as fair use, especially when they retain the original’s purpose or character.
The legal battle between Getty Images and Stability AI highlights the ongoing struggle between intellectual property protection and the rapid advancement of artificial intelligence. Getty Images has accused Stability AI of using its extensive image library without permission to train its AI models, a move that Getty argues constitutes copyright infringement. The lawsuit claims that Stability AI did not obtain licenses for the copyrighted images, which could result in significant economic harm to content creators, who rely on licensing fees for their work. This case is an important example of how AI training on copyrighted material without compensation may lead to a devaluation of creators’ intellectual property rights, echoing concerns about fairness and authorship in the age of machine learning.7
The outcome of the Getty Images v. Stability AI case could set a crucial precedent for the AI industry, particularly on whether companies that use copyrighted content to train AI models need explicit permission or a license from the original copyright holders. A ruling in favor of Getty Images would reaffirm traditional copyright principles by signaling that merely transforming content into a dataset does not necessarily justify its use under the fair use doctrine. The case also underscores the ongoing debate in the legal community over whether current copyright laws are equipped to handle the unique challenges posed by AI technologies, where the boundary between fair use and infringement is increasingly blurred.8
In the context of AI, the output may seem innovative, but the training process involves exact copying of expressive content without creators’ knowledge or consent. This replication is vital for model development but invisible to end-users. Furthermore, AI-generated derivative content in a particular style, without attribution or permission, may infringe on moral rights recognized in some jurisdictions, including rights to attribution and protection against derogatory treatment.
TRENDS IN THE LITERARY INDUSTRY
In September 2023, the Authors Guild, acting on behalf of American literary authors, including well-known authors such as George R.R. Martin and John Grisham, filed a class action lawsuit against OpenAI and its affiliates in New York. In the lawsuit, the authors alleged infringement of their reproduction and transformation rights. More specifically, they alleged that when prompted for summaries, possible sequels, or adaptations of their works, ChatGPT generated summaries and accurate transcriptions of the works. The central argument of the Authors’ claim was that “ChatGPT could not have generated the results described above if OpenAI’s LLMs had not ingested and been ‘trained’ on the [Authors’] Infringed Works.” The crux of the problem lies in the copies of the illicitly obtained works, whose reproduction was necessary to train the model. To date, OpenAI has not made its defenses to the authors’ claims public.
However, recent press reports on out-of-court developments in the case indicate that the Authors Guild is exploring a blanket licensing system, under which companies ingesting protected works would remunerate the authors. This class action has the potential to set trends in the literary industry and its treatment of AI training.9
A CASE FOR PURPOSE
Transformativeness is a crucial consideration in the fair use analysis under US copyright law. When a new use adds fresh meaning, message, or purpose to an original work, it is more likely to be deemed fair use. The Supreme Court’s ruling in Campbell v Acuff-Rose Music, Inc exemplifies this, recognizing parody as a form of fair use that injects new expression and meaning into the original.
Similarly, AI training changes the way we approach creative content. By processing vast amounts of data, AI systems learn patterns and language structures that they use to produce new expression. This process bears a striking resemblance to Google’s book digitization project, which was deemed fair use in Authors Guild v. Google, Inc. because of its transformative purpose. Just as Google’s searchable database allowed users to explore books without replicating them, AI training, on this view, unlocks new creative potential without its outputs reproducing content verbatim. The parallels between these two examples highlight the evolving nature of fair use in the digital age.
POSSIBLE SOLUTIONS
To address the growing concerns surrounding AI training and copyright infringement, establishing licensing frameworks is crucial. Introducing collective or blanket licensing systems would ensure that creators are fairly compensated when their works are used in AI training. This approach would provide a structured mechanism for rights holders to receive royalties, fostering a more equitable environment for both creators and developers.
In tandem with licensing frameworks, legislative reform is necessary to update national copyright laws, particularly in jurisdictions like Nigeria. By explicitly addressing AI training within the legal framework, lawmakers can define the scope of fair use in this context, providing clarity for all stakeholders. Clear guidelines would help mitigate legal uncertainties, encouraging innovation while safeguarding creators’ rights.
Transparency from tech companies is also essential in this evolving landscape. Requiring AI developers to disclose training data sources and offering opt-out mechanisms for creators would promote accountability and respect for intellectual property. By giving creators control over their works, we can build trust and ensure that AI development proceeds in a manner that balances technological advancement with the rights of content creators.
CONCLUSION
The intersection of AI training and copyright law presents a complex challenge that requires careful consideration. As AI technology continues to evolve, it is essential to strike a balance between innovation and creators’ rights. The debate surrounding fair use and copyright infringement in AI training highlights the need for clearer guidelines and regulatory frameworks. By establishing licensing systems and promoting transparency, we can ensure that creators are fairly compensated for their work. The outcome of ongoing lawsuits and future court decisions will likely shape the future of AI development and intellectual property law. As the law continues to evolve, it is crucial to prioritize fairness, accountability, and respect for creators’ rights.
One potential solution is the implementation of collective or blanket licensing systems, which would enable creators to receive royalties for the use of their works in AI training. This approach could foster a more equitable environment for both creators and developers. Transparency and accountability are also essential in AI development. By requiring AI developers to disclose training data sources and offering opt-out mechanisms for creators, we can promote trust and cooperation between stakeholders. Finding a balance between innovation and creators’ rights is crucial for the development of AI technology that benefits society as a whole. By working together, we can create a future where AI enhances human creativity while respecting intellectual property rights.
As we move forward, it is essential to prioritize collaboration and dialogue between stakeholders, including creators, developers, and policymakers. By working together, we can create a regulatory framework that supports innovation while protecting creators’ rights.
In the end, the future of AI development and intellectual property law will depend on our ability to adapt to new challenges and opportunities. By prioritizing fairness, accountability, and respect for creators’ rights, we can create a brighter future for all stakeholders involved.
REFERENCES
Books and Statutes
- Osborn’s Concise Law Dictionary (13th ed. 2020).
- Copyright Act (Nigeria) § 20 (2022).
Government & Legal Websites
- U.S. Copyright Office, Fair Use, U.S. COPYRIGHT OFFICE, https://www.copyright.gov/fair-use/ (last visited May 11, 2025).
- 17 U.S.C. § 107 (2018).
News Articles & Journal Articles
- The Verge, Authors Sue OpenAI Over Copyright Infringement, https://www.theverge.com (2023).
- Chris Stokel-Walker, Game of Thrones Author George R.R. Martin Among Writers Suing OpenAI, THE GUARDIAN (Sept. 21, 2023), https://www.theguardian.com/technology/2023/sep/20/george-rr-martin-authors-openai-lawsuit.
- David Bartholomew, Getty Images Sues Stability AI Over Copyright Infringement, THE GUARDIAN (Jan. 19, 2024), https://www.theguardian.com/technology/2024/jan/19/getty-images-sues-stability-ai-over-copyright-infringement.
- Emily Smith, Getty Images v. Stability AI: Copyright and AI Training Data, 19 J. INTELL. PROP. L. & PRAC. 245 (2024).
Online PDFs/Essays
9 https://law.unh.edu/sites/default/files/media/2024-09/6-ruben-essay_.pdf
Cases
Andy Warhol Found. for the Visual Arts, Inc. v. Goldsmith, 598 U.S. 508 (2023).
Getty Images (US), Inc. v. Stability AI, Inc., No. 1:23-cv-00135-GBW (D. Del. filed Feb. 3, 2023).
Authors Guild v. Google, Inc., 804 F.3d 202 (2d Cir. 2015).
Campbell v. Acuff-Rose Music, Inc., 510 U.S. 569 (1994).