Scandal in GenAI: copyright, content creation, and ChatGPT
What a month of courtroom chaos means for our digital future
The end of 2022 saw the commercial introduction of generative AI (or Gen AI for short). The public was introduced to some of the most popular Gen AI tools, including OpenAI’s ChatGPT, Midjourney, Stable Diffusion, and even DeviantArt’s DreamUp. As a result, the year 2023 marked a pivotal point for Gen AI, due to its explosive popularity. Yet amidst this technological whirlwind, significant legal and ethical debates were brewing.
These are more than mere socio-ethical considerations parroted in academic circles or concerns voiced by disgruntled artists: the legal footing (or lack thereof) of Gen AI pioneers like OpenAI is quickly disintegrating. This month, we have all read the headlines about The New York Times suing OpenAI and Microsoft, or seen the uproar on X (formerly Twitter) when a list of 16,000 artists, whose illustrations were allegedly used to train Midjourney’s AI without consent, was leaked.
As Gen AI evolves, it's stirring up some big questions about ethics and legality, especially in the creative world. Think about it: Gen AI is trained on vast databases of text, illustrations, and music, all linked to real human creators. So what happens when an AI program creates content that looks indistinguishable from yours, but without your permission?
Are you confused about how this is all playing out? Well, you’re in the right place.
Let’s break down the month of AI scandals together.
A Creator’s Dilemma
In 2023, Kelly McKernan arguably became one of the first independent artists to publicly voice these concerns. It all began when McKernan was tagged in various generated images that replicated her art style. “I can see my hands in these. They are like echoes of my stylistic choices, like unfinished sketches pulled from my sketchbook or from my brain,” she said (Aronoff, 2023).
This concern did not go unnoticed by Matthew Butterick, a lawyer who decided to represent McKernan and other artists alongside another lawyer, Joseph Saveri. Together, they took a step toward the first piece of litigation against Gen AI (Pascual, 2024). They filed a class-action lawsuit against three AI firms, Stability AI, Midjourney, and DeviantArt, for copyright infringement (Chen, 2023).
This first step garnered some attention and paved the way for more lawsuits against Gen AI. In February 2023, Getty Images sued Stability AI, claiming that the company had infringed Getty’s copyrights by using images from its archive without permission. In September 2023, two groups of writers filed complaints against OpenAI. And, in October 2023, three major music publishers, Universal Music Publishing Group, Concord Music Group, and ABKCO, sued Anthropic, a company founded by former OpenAI employees, for training its algorithms on copyrighted song lyrics (Panettieri, 2024). The year ended with the first lawsuit of this kind from a major media outlet: in December 2023, The New York Times sued OpenAI and Microsoft for allegedly using millions of its articles to train their algorithms without consent.
Although the lawsuit took the world by storm, Butterick did not find the escalation surprising. He views these legal actions as integral to a global conversation on how Gen AI and human creativity will coexist. “This race has just begun,” he said (Pascual, 2024).
He wasn’t wrong.
2024 started with a major leak on X by Jon Lam, a storyboard artist at Riot Games. The leak involved a list of 16,000 artists whose illustrations were allegedly used to train Midjourney’s AI (Belci, 2024). Screenshots showed Midjourney developers discussing how they collected these artists’ styles for training purposes. Lam claims this information has been submitted as evidence for a lawsuit.
On January 13th, 2024, it was reported that Meta had admitted, after facing multiple lawsuits from various authors, to using Books3, a well-known dataset comprising over 195,000 books, to train its commercial AI products. Meta denies any claims of copyright infringement, asserting that its use of these books should be considered fair use (we’ll touch on this further in the next section, don’t worry) even though they are unauthorized copies (Maruccia, 2024).
Where do we go from here?
The Legal Labyrinth
The main issue here is that AI, especially where it intersects with copyright infringement and authorship, lacks robust regulatory frameworks. This gap has drawn the attention of organizations such as the Center for Art Law, described as the only independent art law entity in the US. The Center held a panel discussion to address the use of AI in art and the copyright issues involved. The panel highlighted how the legal landscape struggles to keep up with AI’s advancements and questioned the applicability of current copyright laws to these new challenges.
Legal expert Ron Lazebnik highlights two kinds of cases in which liability under copyright law can arise in relation to AI. The first revolves around the use of an artist’s work for training. The second revolves around the generation of a piece of AI artwork that resembles the works of others.
There are a lot of loopholes that the companies being sued can use to defend themselves. They can argue technicalities, such as claiming that the artists did not follow the appropriate legal procedures to register their works or claiming that the plaintiffs have not shown how the AI has stripped away the relevant copyright management information from the artist’s work.
There’s also the ‘fair use’ defense mentioned earlier, which holds that the copyrighted material was used in a legally permissible way, even if the generated work is similar to the plaintiff’s. Finally, there’s the issue of style, as there’s a debate over whether an AI model copying an artist’s style should be considered copyright infringement. This is particularly complicated considering that style itself is not protected under copyright law (Aronoff, 2023).
What about MENA?
Mahmoud Othman, an Egyptian lawyer and legal advisor specializing in the arts, sheds light on the complex intersection of AI and copyright law through his Instagram videos. He addresses Amr Moustafa’s claim that he could use AI to prevent Amr Diab from singing. Is this legally possible? Othman suggests that under certain conditions, it could be.
Egypt lacks a comprehensive framework for AI, especially within the arts and copyright infringement. Therefore, disputes like this would default to the country's existing copyright laws, particularly Law No. 82 of 2002. Othman explains that these laws protect artists' intellectual property, which includes their unique performances – a crucial aspect of their identity.
He clarifies that for an artist's work to be protected, it must be original and presented in a tangible form accessible to others. In the proposed scenario, if Amr Moustafa were to create AI-generated music mimicking Amr Diab's distinct style without prior agreement, Diab could sue for copyright infringement and seek compensation. The only way Amr Moustafa could legally prevent Amr Diab from singing the generated songs, as Othman notes, would involve a contract ensuring Diab's compensation for the use of his voice and style in AI-generated content.
Applying this to the broader dilemma of AI-generated content, artists and authors could argue that content generated by AI models trained on their works without permission constitutes derivative work, and that they therefore have grounds for compensation. However, the lack of specific AI copyright laws complicates this issue. Determining infringement relies on factors like the transformative nature of the AI-generated artwork compared to the original and its market impact. Egypt doesn’t have doctrines such as 'fair use', but that doesn’t rule out the possibility of similar legal exceptions.
OpenAI’s Response
Remember the loopholes we discussed in the previous section?
You might have guessed it: OpenAI used them to defend the company’s AI training systems in response to The New York Times’s lawsuit. The company claimed that training on publicly available internet materials is fair use, a position it says is widely accepted and supported by a range of groups that submitted comments to the US Copyright Office. It also claimed that several other regions and countries, such as the European Union, Japan, and Singapore, have laws permitting the use of copyrighted material to train AI systems, which it presented as a great way to advance AI innovation. It ended this section of its statement by noting that it provides a simple opt-out process for publishers, which prevents its tools from accessing the websites of those who do not wish to take part in training (OpenAI, 2024).
A critical point emerges from this response. OpenAI presents training AI systems on copyrighted material as a great way to advance AI innovation. This aligns with OpenAI’s plea to the British parliament, made in a filing submitted to a House of Lords subcommittee, which claimed that training AI models effectively without copyrighted works is "impossible."
…is ethical innovation really impossible?
This raises another pivotal concern: not only is OpenAI advocating for the use of copyrighted works, but they are also seeking to use them without compensating the original artists (Milmo, 2024).
The UK government had initially proposed introducing a data mining exception into copyright law that AI companies could rely on, but after considerable backlash and the rise of these lawsuits, it decided to backtrack. On January 15th, 2024, the UK government declared, in response to a report by Parliament’s Culture, Media & Sport Select Committee, that there will be no copyright exception for AI. Instead, it confirmed that, together with the Intellectual Property Office, it will work with copyright owners and tech companies to develop a code of practice that will help find a compromise between the two sides (Cooke, 2024).
Similarly, at a US Senate hearing on January 10th, 2024, lawmakers from both parties agreed that companies like OpenAI should pay media outlets for using their work to train AI models. Richard Blumenthal, the Democrat who chairs the Judiciary Subcommittee on Privacy, Technology, and the Law, which held the hearing, said that “it’s not only morally right,” but that “it’s legally required.” Curtis LeGeyt, CEO of the National Association of Broadcasters, Danielle Coffey, CEO of the News Media Alliance, and Roger Lynch, CEO of Condé Nast, expressed their support for this statement, with Coffey and Lynch explaining that AI companies are infringing on copyright under current law and that they believe these companies are using “stolen goods” (Weiss, 2024).
This brings us to a critical juncture: the intersection of generative AI and creative rights isn't just about technological innovation; it's a fundamental question of respecting individual rights and artistic integrity. The emerging legal confrontations underscore the art world's struggle to navigate these challenges alongside AI's rise. The outcomes of these debates and lawsuits are more than legal precedents; they are shaping the future interplay between ethics, artistic creativity, and technological innovation. And of course, they all lead us toward the one blatant conclusion: we need AI regulation.
As we continue to witness these events unfurl, it’s clear that our current legal infrastructures are insufficient to address the nuanced ethical issues AI brings to creativity and content creation. Effective regulation should not only aim to protect intellectual property and artistic rights but also ensure that AI development is aligned with ethical standards and societal values.
References:
Aronoff, A. (2023, May 23). Grey Area: Copyright and Fair Use in AI-Generated Artworks. NYFA. https://www.nyfa.org/blog/grey-area-copyright-and-fair-use-in-ai-generated-artworks/
Bauder, D. (2023, December 27). The New York Times sues OpenAI and Microsoft for using its stories to train chatbots. AP News. https://apnews.com/article/nyt-new-york-times-openai-microsoft-6ea53a8ad3efa06ee4643b697df0ba57
Belci, T. (2024, January 4). Leaked: the names of more than 16,000 non-consenting artists allegedly used to train Midjourney’s AI. The Art Newspaper. https://www.theartnewspaper.com/2024/01/04/leaked-names-of-16000-artists-used-to-train-midjourney-ai
Brittain, B. (2023, January 17). Lawsuits accuse AI content creators of misusing copyrighted work. Reuters. https://www.reuters.com/legal/transactional/lawsuits-accuse-ai-content-creators-misusing-copyrighted-work-2023-01-17/
Chen, M. (2023, January 24). Artists and Illustrators Are Suing Three A.I. Art Generators for Scraping and “Collaging” Their Work Without Consent. Artnet News. https://news.artnet.com/art-world/class-action-lawsuit-ai-generators-deviantart-midjourney-stable-diffusion-2246770
Cooke, C. (2024, January 15). No copyright exception for AI reiterates UK government - but tech companies still lobbying for more change. Complete Music Update. https://completemusicupdate.com/no-copyright-exception-for-ai-reiterates-uk-government-but-tech-companies-still-lobbying-for-more-change/
Grynbaum, M. M., & Mac, R. (2023, December 27). The Times Sues OpenAI and Microsoft Over A.I. Use of Copyrighted Work. The New York Times. https://www.nytimes.com/2023/12/27/business/media/new-york-times-open-ai-microsoft-lawsuit.html
Knibbs, K. (2024, January 10). Congress Wants Tech Companies to Pay Up for AI Training Data. Wired. https://www.wired.com/story/congress-senate-tech-companies-pay-ai-training-data/
Maruccia, A. (2024, January 13). Meta admits using pirated books to train AI, but won’t pay for it. TechSpot. https://www.techspot.com/news/101507-meta-admits-using-pirated-books-train-ai-but.html
Milmo, D. (2024, January 8). “Impossible” to create AI tools like ChatGPT without copyrighted material, OpenAI says. The Guardian. https://www.theguardian.com/technology/2024/jan/08/ai-tools-chatgpt-copyrighted-material-openai?CMP=twt_b-gdnnews
OpenAI. (2024, January 8). OpenAI and journalism. Openai.com. https://openai.com/blog/openai-and-journalism?utm_source=www.stp.news
Panettieri, J. (2024, January 8). Generative AI Lawsuits Timeline: Legal Cases vs. OpenAI, Microsoft, Anthropic and More. Sustainable Tech Partner for Green IT Service Providers. https://sustainabletechpartner.com/topics/ai/generative-ai-lawsuit-timeline/
Pascual, M. G. (2024, January 4). The activist who’s taking on artificial intelligence in the courts: “This is the fight of our lives.” EL PAÍS English. https://english.elpais.com/technology/2024-01-04/the-activist-whos-taking-on-artificial-intelligence-in-the-courts-this-is-the-fight-of-our-lives.html
Weiss, B. (2024, January 10). ‘Fair use’ no excuse for AI companies to pilfer news content, media advocates tell Senate. Courthouse News Service. https://www.courthousenews.com/fair-use-no-excuse-for-ai-companies-to-pilfer-news-content-media-advocates-tell-senate/