First AI companies swipe our work — then they’ll come for our jobs

The greatest art heist in history is happening right in front of us, and we are being gaslit by its perpetrators. Generative AI systems such as ChatGPT have exploded into the public consciousness, racking up hundreds of millions of users and bringing billion-dollar valuations for their creators. They can generate photorealistic images, produce entire songs in seconds and write articles you’d never know weren’t written by a human. They will drive the costs of production in every creative industry close to zero, rendering legions of jobs obsolete.

The heist lies in how these systems are built. You need three main resources to build generative AI: AI engineers, GPUs (the chips on which AI models are trained) and training data. AI companies spend eye-watering sums on the first two but expect to get the third free.

Training data is the content from which generative AI systems learn. Without it they don’t function. And to be clear, what AI companies call “training data” most of us call “people’s work”. Millions of books, articles, photos, illustrations and songs were shared online over the past couple of decades by creators who wanted to take advantage of the global audience promised by the internet. That work is now being used instead to train the machines that will replace those creators.

And replace them they will. Generative AI will leave creators with name recognition largely untouched; from pop stars to presenters, anyone with a following, with fans, will survive the generative AI age. But the anonymous majority — the designers, TV composers, voice actors — will be outcompeted by AI systems that work faster, and cost less, than people ever could.

It is in this context that generative AI companies train their models on people’s work without paying them or asking their permission. This is not some experimental approach taken by a rebellious minority; it is the approach taken by almost every generative AI company you’ve heard of. I know because I worked at one. And they’re doing it in plain sight. They tell us there’s nothing unfair about it, and they say the same in their lobbying of governments around the world.

Media companies with the resources to sue them are doing so. The legal issues are complex and may take years to resolve. The defence given generally relies on the fact that AI tends not to regurgitate its training data word for word or pixel for pixel. But common sense tells us there can be no real defence here. Generative AI systems take people’s work and incorporate it in products that directly compete with that work. No amount of legal gymnastics can justify this.

It’s tempting to assume that this is just business as usual for Silicon Valley, the next cycle of tech companies disrupting a market to the frustration of incumbents. But that’s to miss how uniquely exploitative generative AI is.

No other technology in recent memory has been built on people’s creative output without getting their permission or bringing them any benefit. Search, app stores, online marketplaces, NFTs and virtual reality all undeniably have their issues, but none takes people’s work without giving anything back. The closest anything has come is streaming, but thankfully the exploitation of the Napster era gave way to content licensing.

Companies building generative AI are the only tech firms that make this extraordinary claim on humanity’s creative output and tell us we’re crazy for objecting.

This issue should matter to all of us. AI companies may today be training on art, books and music but they need more data to improve their models. If your work isn’t feeding the machines yet, it may be soon. We all need to consider whether we’re happy letting a handful of giant companies use our work to build systems that may well replace us.

Ed Newton-Rex is chief executive of the AI non-profit Fairly Trained, a composer, and formerly product director at TikTok and vice-president of audio at Stability AI.