The recent launch of Meta’s Llama 2 has brought unprecedented interest in open-source large language models (LLMs). While the excitement surrounding this launch is understandable, it is crucial not to overlook the legal uncertainties surrounding intellectual property (IP) ownership and copyright in the generative AI space. Many assume that regulatory risks concern only the companies creating LLMs, but this assumption becomes dangerous once we consider the poison pill of generative AI: derivative works.
Key Takeaway
Derivative works in generative AI pose significant legal risks related to IP ownership and copyright.
Derivative works have clear legal treatment under copyright law. Data derivatives do not: there is little precedent for addressing them, and they are expected to become increasingly prevalent due to open-source LLMs. When a software program generates output data based on input data, it is hard to determine which parts of the output qualify as derivatives of the input. This upstream ambiguity propagates down the derivative chain, because anything built on an uncertain output inherits that uncertainty, and the scope of potential claims widens as legal challenges over IP in LLMs emerge.
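To make the chain effect concrete, here is a minimal sketch of how a team might track it internally. It assumes a hand-maintained lineage map in which each artifact lists what it was generated from; every artifact name below is hypothetical, and the code only traces dependencies. It says nothing about whether any artifact is legally a derivative work.

```python
# Hypothetical lineage map: each artifact lists the artifacts it was built from.
# All names are illustrative assumptions, not taken from any real system.
LINEAGE = {
    "training_corpus": [],
    "internal_docs": [],
    "base_llm": ["training_corpus"],
    "fine_tuned_llm": ["base_llm", "internal_docs"],
    "generated_report": ["fine_tuned_llm"],
    "synthetic_dataset": ["fine_tuned_llm"],
    "downstream_model": ["synthetic_dataset"],
}

# Assumption: the IP status of these sources is unresolved.
UNCERTAIN_SOURCES = {"training_corpus"}


def upstream_uncertainty(artifact, lineage, uncertain):
    """Return the uncertain sources reachable by walking upstream from artifact."""
    found = set()
    seen = set()
    stack = [artifact]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node in uncertain:
            found.add(node)
        # Follow every parent; unknown artifacts are treated as having none.
        stack.extend(lineage.get(node, []))
    return found


# Map every artifact to the unresolved sources it ultimately depends on.
exposure = {
    name: upstream_uncertainty(name, LINEAGE, UNCERTAIN_SOURCES)
    for name in LINEAGE
}

for name, sources in sorted(exposure.items()):
    status = ", ".join(sorted(sources)) if sources else "none"
    print(f"{name}: upstream uncertainty from {status}")
```

In this toy graph, a single unresolved source at the top reaches every artifact that descends from the base model, including a model trained only on synthetic data, while `internal_docs` stays unaffected. That is the practical meaning of uncertainty moving down the derivative chain.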
Uncertainty surrounding the legal treatment of data derivatives has always been the norm in software. However, LLMs change the game in significant ways.
Centralization
LLMs can generate varied outputs that apply in nearly limitless ways. Unlike traditional software, LLMs can produce text, images, code, audio, video, and pure data. It is not a matter of if, but when, LLM usage becomes ubiquitous. Because case law on IP ownership and copyright for LLMs is still developing, there is a growing risk that these uncertainties will extend beyond LLM vendors to LLM users. This includes risks related not only to copyright but also to potential harms caused by hallucinations and bias.
Incentives
Copyright holders have a vested interest in expanding the definition of LLM derivatives, since a broader definition increases the scope of their potential damages. Major platform companies also use license restrictions to gain an edge over their competitors. The Llama 2 license, for example, prohibits using Llama or its outputs to “improve” non-Llama LLMs. Fuzzy definitions of derivative works benefit rights holders and those with substantial legal resources.
As generative AI continues to advance, enterprise technology leaders must understand and manage the risks posed by derivative works. Proactive measures can help navigate the legal challenges and uncertainties surrounding IP ownership and copyright in the generative AI space. By staying informed and adopting responsible practices, businesses can mitigate potential legal liabilities arising from derivative works in the LLM ecosystem.