Welcome to the second of my essays about the sudden explosion of generative AI models like ChatGPT and DALL-E. You can read what I think about text-generation AIs in the first part: I use AI to write fiction. Let’s talk about that.
I decided to write a separate essay on AI image generators because as fraught as the conversation around tools like ChatGPT is, it is about 1000 times more intense around DALL-E and Midjourney and similar tools. Also, I’m more of a spectator on this side of the conversation, as I am not a working visual artist. So I want to be extra careful and thoughtful and nuanced. (You’ve been warned!)
Let’s start at the beginning, again.
There are a number of AI image generators out in the world that can take a short text description from a user and create an image from that description. For example: I entered the text prompt “portrait of the AI artist as itself” into the Midjourney image generator, and here is the result I found most interesting.
Within the last two years, the technology leapt forward from these AIs making fairly awful images to making breathtaking images. 👇
Needless to say, artists are upset. People are upset. Artists and people are also thrilled and amazed. Some people, of course, use these new tools to make porn, because of course they do.
There are many layers to the upset-ed-ness. The first and most clear-cut layer is that AI image generators have to be trained on images to work. These AI models were trained on billions of images, many of which were copyrighted images scraped from the web and used without the knowledge or permission of the image owners. The outcry from artists on this point, as well as a couple of lawsuits for copyright violation, have led many of the AI image generator companies to promise to provide ways for artists to opt out of having their images used in the AI training datasets, either now or very soon. (Artists and programmers are also working together to create tools to fool the AI training model.)
It was heinously typical of the companies behind these image AI generators to take the “ask for forgiveness later” route, but it’s good that they are moving to do the right thing here, and hopefully litigation will make them do it faster. Because the second layer of upset from artists is that, if asked, these AI image generators can produce images in their style.
Some artists are rather phlegmatic about this new technology, but others are understandably furious. If anyone can type in a prompt asking for an image in my style for $10 per month, artists are asking, then why would anyone ever again pay me a commission to make my art?
This is an excellent question. It’s a really, really, really important question that drives to the heart of where art and artists and AI go next. Unfortunately, instead of having the kind of nuanced discourse this question requires, the Internet is doing its best to turn it into a shitfest.
Artist Sam Yang recently pointed out how easy it was for the AI to replicate his art style. This is possible because Sam’s images that he posted to Instagram ended up in the AI’s training data. Unsurprisingly, he’s not thrilled about that. When he took to the Internet to bring awareness to this issue and vent his outrage, the response was a bunch of trolls creating more and more and more images in his painting style using AI and posting them all over the Internet. Because some fucking people don’t deserve to be labeled as human until they get over high school and stop slopping their pain all over everyone else.
*ahem*
In his YouTube video “Why Artists are Fed Up with AI Art,” Sam asks the question, why were visual artists’ rights trampled on when the same company (Stability AI) that helped fund the AI image training database also funded an AI music training database that doesn’t use copyrighted music at all? I’m going to go out on a limb here and assume the answer to that is this: the large majority of recording artists work with record labels. And if you remember the Napster incident of 2000, you know that as much as recording artists often hate their labels, they were super happy to sic the label’s lawyers all over Napster for copyright infringement. Napster lasted as a company for barely over a year after the lawsuits started.
Unlike musicians, most freelance visual artists don’t have big corporate entities with too much money holding their rights hostage and willing to protect those interests at all costs. While GPT-3, the text AI generator, likely has plenty of copyrighted material in its training database, unless it mistakenly spits out a couple pages verbatim of a published book, the publishing houses don’t have much they can go after OpenAI for (but should it happen and the right people catch that mistake, they absolutely will).
But what are artists to do, you might ask? Artists have spent years, possibly decades, creating and nurturing their own unique style. And even if the current court cases pressure or force AI companies to take copyrighted images out of the training model, the damage has been done. The AI has already “learned” the style—it doesn’t reference the training images any longer. These artists should band together and sue image AI generators for stealing their styles! And lots of artists really want to do that—and one group already has.
Now we run right into a fair use issue, and oh lordy, do things get ugly from here.
The court case launched by Butterick and the Joseph Saveri Law Firm claims that AI art models “store compressed copies of [copyright-protected] training images” and then “recombine” them, functioning as “21st-century collage tool[s].”
So, AI doesn’t do that. But the way these image AI generators actually work is so technical and confusing, and who has time for nuance when your livelihood is being threatened? So lots of artists are taking this argument to the Internet and running with it. I watched a YouTuber say: “I liken it to grabbing a bunch of copyrighted images based on—let’s say a Vogue photographer’s work, photo manipulating them together, and saying that you’ve created something new and original. You have just stolen from somebody else.”
No, that’s not stealing; that would be a collage, which is a real art form, and one that couldn’t exist without fair use laws about transforming the original work into something new. Not understanding exactly how the laws work or how the AIs work, combined with lots of very understandable outrage, is leading to a lot of incorrect statements, making it harder and harder to even begin to have a conversation about how this technology can possibly fit into art making or be used responsibly. And we really need to have that conversation before doors we don’t want to open become inadvertently, irrevocably opened.
In a recent interview, when Cory Doctorow was asked about AI and copyright, he talked about a recently litigated court case in which Marvin Gaye’s estate sued Pharrell Williams and Robin Thicke, claiming their song “Blurred Lines” copied Marvin Gaye’s style (they called it a ‘groove’). To be clear: “Blurred Lines” contains no Marvin Gaye lyrics, and it has no riffs or beats or series of notes copied from a Marvin Gaye song. It was simply inspired by Marvin Gaye’s style. Marvin Gaye’s estate won. Pharrell and Robin appealed, backed by signatures from 200 other supporting artists, claiming that the decision changed the understanding of copyright law by allowing artists to copyright a groove or style. The court rejected the appeal, saying that the decision only applied to this one case and was not setting a standard.
You might think that this is a good outcome, that artists should be able to protect their work, but what artists can protect, corporations can protect better. The judges might claim they aren’t setting a standard, but will that stop future lawyers from arguing that this case sets a precedent? Do you think for a minute that if they saw a way to do it, Disney wouldn’t copyright ‘princess’? They would love it if they were the only ones allowed to ever princess again. This is the problem. If we start tipping fair use law to include ideas or feelings (which together make up what we think of as a style), corporations will rush to grab up all the rights they can. This is exactly why recipes and clothing patterns can’t be copyrighted. Because one asshole would copyright chocolate chip cookies and then no one but him could make or sell them.
Alright. So how do these image AI generators actually work?
For the answer I recommend this 8-minute Vox explainer, The text-to-image revolution, explained, because it is such a tech-heavy process that I am loath to even attempt to sum it up correctly. But from the Vox explanation, it is clear that the AI is not going back and referencing training images as a basis to create new ones. If I understand it, the AI went through a training phase where it learned that certain pixel patterns relate to certain text descriptions, and during that process built its own immensely complex reference data (they call it “latent space” in the video). When a user asks it to create an image, the AI references that latent space, those patterns it found, not the original pictures, for how to create the image. Then it creates the image with programmed-in randomness, so it should never be able to use its patterns to perfectly replicate what it was trained on.
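To make the “patterns, not pictures” point a little more concrete, here’s a toy Python sketch of that process. Everything in it is made up for illustration—the tiny LATENT table, the vectors, the step size—and real models use enormous neural networks instead. But the shape is the same: the prompt points at learned patterns, generation starts from random noise, and the noise is repeatedly nudged toward those patterns, without ever looking up a training image.

```python
import random

# Toy "latent space": learned associations between words and patterns.
# In a real model this is billions of neural-network weights; here it's
# a hypothetical lookup table with three numbers per concept.
LATENT = {
    "portrait": [0.9, 0.1, 0.4],
    "landscape": [0.1, 0.8, 0.6],
}

def embed(prompt):
    """Turn a prompt into a target pattern by averaging known concepts."""
    vecs = [LATENT[w] for w in prompt.split() if w in LATENT]
    return [sum(v) / len(vecs) for v in zip(*vecs)]

def generate(prompt, steps=50, seed=None):
    """Start from pure noise, then repeatedly 'denoise' toward the prompt."""
    rng = random.Random(seed)
    target = embed(prompt)
    image = [rng.random() for _ in target]  # random static, no source image
    for _ in range(steps):
        # Nudge each "pixel" toward the learned pattern, plus a little
        # randomness, so the same prompt never lands in the same place twice.
        image = [px + 0.2 * (t - px) + rng.gauss(0, 0.01)
                 for px, t in zip(image, target)]
    return image
```

Run `generate("portrait")` twice and you get two slightly different results that both land near the “portrait” pattern—which is the sense in which the output is derived from what was learned, not copied from what was seen.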
Does this violate fair use? Honestly? Who the hell knows. Fair use is a very case-by-case decision, one judges determine with guidelines and “common sense” and gut feelings, not rules. On the face of it, it seems like image AI generators don’t really violate fair use. But down in the specifics, letting users put in the names of living artists (whose work is not in the public domain) without permission to create images in their style is a shitty thing to do. The corporations who built these tools could have prevented that, but they didn’t. They could have used non-copyrighted images, and gotten permission and paid to use copyrighted ones, but in many cases they didn’t. So that’s not ethical, but it hasn’t technically been proven to be illegal…yet.
Ok this went on a lot longer than I thought it would, without even getting to why I personally use an AI image generator. So it looks like I’ll be writing one more essay to wrap this series up. I hope this was a helpful introduction to the AI image generator controversy and pitfalls, and I’ll talk more about my thoughts on these tools soon.
I've been thinking about AI art plenty over these last few months, watching the explosion of improvement in the space very curiously. Two trailing thoughts come to mind...
1) I think in full cinema HD. Scenes, framing, panning shots, visual texture and detailing everywhere I want to 'look'. My partner has aphantasia. While his inner sphere of creativity is livelier than mine, he and I don't 'speak the same language' when it comes to sharing our creative thoughts. We tabletop roleplay, spending a lot of time languishing in the space where he explains and I struggle to imagine. As these tools have evolved, I've personally encouraged him to explore AI art generation tools to bring to life ideas of his own in visual ways that I can better understand.
2) "If anyone can type in a prompt asking for an image in my style for $10 per month ... then why would anyone ever again pay me a commission to make my art?" I've heard this in a few different places and my honest reaction is always: Were they really ever going to commission said artist? As I entered the professional visual art space, I often thought to myself: what happens when someone sees my style and creates copy-cats? You sell a painting for $500? I'll do it for $50. Part of my journey was accepting (or at least trying to accept) that that's out of my control. While I've got pretty mixed feelings about the use of images for training data (how is feeding an AI a selection of images different than when I trawl through publicly posted art for inspiration?), I don't fundamentally see how AI art is different than paying an entirely different artist to imitate a style for less money than it costs to commission the original artist.
Ultimately I think it will come down to how these tools are litigated in courts for commercial use - I do not think AI art is ever leaving the personal use domain now. 🤔