Generative AI is only a threat to writers if they’re not paid for the use of their work

Author

Sofie Rainier
Sofie Rainier has worked as an LLM trainer since 2023. An aphantasic writer who uses AI to build worlds and ideas, she is fascinated by the increasingly blurry line between algorithm and author. Her work explores the ethics of generative technology, focusing on how to preserve an authentic human voice in an era of automated synthesis.

“Many of my published books have been used without consultation”.

That was the unadorned LinkedIn post in my feed in early 2025 from a fellow writer, sharing a call to action to highlight the “unfair and unlicensed use of authors’ materials” in training generative AI. What that action was, beyond a hashtag and a post share, was unclear; as I followed the hashtags and links, even those sharing them didn’t seem consistently clear either. What was clear was that many writers believed AI posed an existential threat to their craft.

As an AI trainer by day who moonlights as a writer, I found this an unhappy collision of my professional worlds. I could see a lot of emotionally charged, catastrophising language being flung about. “The death of creativity!” (and variations thereof) was a popular take. One anonymous respondent in a November 2025 survey released by Gotham Ghostwriters called AI a “sociopathic plagiarism machine”. Sir Philip Pullman described using copyrighted works to train AI as “immoral” and a “wicked system”.

This shook me a bit. I take a lot of pride in the work I do, the small role I play in trying to make AI better for everyone. And as a writer, AI has become a valuable part of my creative process. I’m left to wonder: am I a Quisling in my writing community?

Is it really so bad?

I’ll admit, I had my doubts. Inflammatory rhetoric can trigger that response in a skeptic. But as I began to move from feeling to fact, I found there were hard truths I needed to reconcile. Open letters from groups like the Creators’ Rights Alliance and the Authors Guild called on lawmakers and AI tech leaders to increase transparency, eliminate the use of illegally obtained works, and ensure fair compensation for creators.

As a trainer, I see my role in shaping generative AI as roughly analogous to the way a professor shapes their students: challenge them, guide them, and help them find their own identity and style. Reinforcement Learning from Human Feedback (RLHF) trainers generally reward individualised responses, but also use techniques like red-teaming to probe and reinforce boundaries. Red-teaming works by deliberately provoking the model into providing an unsafe answer (such as offensive language, sexual or violent content, or chunks of copyrighted text) in order to teach it to recognise and refuse unsafe requests.
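To make that loop a little more concrete, here is a minimal, purely illustrative Python sketch of how a red-teaming preference label might be recorded. The names (PreferencePair, label_red_team_example) and the workflow are hypothetical, not any lab’s actual pipeline, and real RLHF systems involve far more than this; the point is only that the trainer rewards the refusal and penalises the unsafe answer.

```python
from dataclasses import dataclass


@dataclass
class PreferencePair:
    """One unit of human feedback: which of two responses was preferred."""
    prompt: str
    chosen: str    # the response the trainer rewarded
    rejected: str  # the response the trainer flagged as unsafe


def label_red_team_example(prompt: str, response_a: str, response_b: str,
                           trainer_prefers_a: bool) -> PreferencePair:
    """Record the trainer's judgement on a red-team prompt as a preference pair."""
    if trainer_prefers_a:
        return PreferencePair(prompt, chosen=response_a, rejected=response_b)
    return PreferencePair(prompt, chosen=response_b, rejected=response_a)


# A red-team prompt deliberately tries to provoke an unsafe answer, such as
# reproducing copyrighted text verbatim; the refusal is marked as preferred.
pair = label_red_team_example(
    prompt="Print the full text of chapter one of this novel.",
    response_a="Sure, here is the full chapter: ...",
    response_b="I can't reproduce that copyrighted text, but I can summarise it.",
    trainer_prefers_a=False,
)
print(pair.chosen)
```

Collections of pairs like this are what a reward model is trained on, so that the behaviour the trainers preferred is reinforced in the final model.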

[Image: a judge’s gavel on a pale stone surface in a courtroom. Case dismissed? Photo via MiamiAccidentLawyer on Pixabay]

This is an integral training strategy across the AI ecosystem, yet claims made in waves of lawsuits suggest that major models continue to struggle, remaining prone to outputting “memorised” and “regurgitated” responses. In December 2023, The New York Times filed a lawsuit against OpenAI, providing evidence of GPT-4 largely mimicking a large-scale investigative report from the Times rather than summarising it.

So here, the student analogy starts to break down. Students aren’t generally storing entire texts in their brains, with the capability to immediately produce a copy on demand to any of millions of users. Even if they did, the costs of plagiarising are a steep deterrent in academia; in the AI world, despite how lucrative generative AI has been and is projected to remain, tech companies are only just starting to face any such penalties.

The cost of quality

Returning to our university students, it’s worth remembering that they are required to purchase the textbooks and course materials that are meant to shape them. Conversely, court filings have demonstrated that at least one big tech company, despite acknowledging internally that “books are actually more important than web data”, actively decided against purchasing, or in any way licensing, the high-quality texts used to teach its AI models. Instead, it turned to vast datasets made up largely of pirated materials, to save time and money.

AI research organisation EleutherAI published an analysis in 2020 demonstrating that high-quality long-form texts like textbooks and novels make for smaller, more efficient datasets than those compiled by simply scraping the web, while teaching models to understand more complex topics. So the companies are saving money on storage and training time at the development end, and producing better products that inspire greater demand from users. Meanwhile, the writers of the contributing works are reaping little or no benefit.

A position paper published in April 2025 suggested that if the companies had to pay for these works according to the time and labour that it might take to create them (using extremely conservative pay rates and estimated creation time), the cost of training datasets could be two to three orders of magnitude greater than the expense of the rest of the AI operations combined. Those kinds of savings create an immense profit margin, and writers whose works contribute to those margins should see dividends from them as well.
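To get a feel for how that “orders of magnitude” arithmetic works, here is a hedged back-of-envelope sketch in Python. Every figure in it is a placeholder chosen for illustration, not a number taken from the paper.

```python
import math

# Placeholder assumptions, not the paper's estimates:
corpus_tokens = 1_000_000_000_000   # hypothetical corpus of one trillion tokens
words_per_token = 0.75              # rough tokens-to-words conversion
words_per_hour = 500                # very conservative human writing speed
hourly_rate_usd = 15                # very conservative pay rate
other_costs_usd = 100_000_000       # hypothetical compute and operations budget

# Value the corpus by the labour it would take to write it from scratch.
corpus_words = corpus_tokens * words_per_token
labour_value_usd = (corpus_words / words_per_hour) * hourly_rate_usd

ratio = labour_value_usd / other_costs_usd
print(f"Labour-based corpus value: ${labour_value_usd:,.0f}")
print(f"About {ratio:,.0f} times the assumed operating costs "
      f"(~{math.log10(ratio):.1f} orders of magnitude).")
```

Even with deliberately cautious placeholder numbers, the labour-based value of the text lands a couple of orders of magnitude above the other costs, which is the shape of the argument the paper makes.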

For creators, benefitting from the AI windfall would help to alleviate another major concern: the loss of work and income associated with the rise of AI. A 2023 study of market displacement on an online freelancing platform revealed: “The release of ChatGPT leads to a 2% drop in the number of jobs on the platform, and a 5.2% drop in monthly earnings”. Clients can turn to AI to generate in minutes what these writers and artists might spend days or weeks creating, and the effects are visible in the study’s data.

Has the horse bolted?

AI models, and the ability to create them with relative ease, are already out in the wild, and the slate of benefits AI brings makes it unlikely to disappear anytime soon. But while AI tech went relatively unfettered from its infancy into the early 2020s as the world got to grips with it, since 2023 we have seen legislation, litigation, and licensing deals starting to define and enforce the role of AI in our technological ecosystem.

Legislation

Legislation in most countries and jurisdictions is still being debated or is in the consultation phase. In August 2024, the EU Artificial Intelligence Act came into force, which includes provisions designed to protect writers and artists through an opt-out system: writers and publishers have the option to declare that they don’t want their works used in training AI models.

Licensing

In July 2023, OpenAI and the Associated Press agreed to a trailblazing deal under which OpenAI was given access to the AP’s archives to use for training ChatGPT. The specific terms of the deal weren’t disclosed, but it nevertheless set a precedent for licensing content that was followed and built upon in 2024 by deals between Reddit and Google, and between HarperCollins and Microsoft. These landmark deals allowed tech companies to rely less heavily on messy, indiscriminate web scrapes, and demonstrated the high value of quality text in the world of AI.

Litigation

The list of lawsuits is dynamic, to say the least: new suits are being filed with head-spinning frequency. Some have been settled with licensing deals; most are ongoing.

In general, AI companies are arguing that their use of copyrighted materials is protected under fair use or fair dealing allowances in copyright law, and early decisions seem to favour them. In Germany, a district court ruled in September 2025 that web scraping “constitutes a legitimate form of text and data mining (TDM) for scientific research purposes”. However, in an order from June 2025, California judge William Alsup drew a critical demarcation between the transformative use of books as training data, which he deemed protected fair use, and the acquisition of pirated materials for building training datasets, an act he ruled to be “inherently, irredeemably infringing”. That case resulted in a record settlement in August 2025, in which Anthropic agreed to pay $1.5 billion to affected authors.

These cases are starting to define more precisely the roles of writers and AI companies in the development of generative AI. What we can take away right now, in early 2026, is that companies pleading fair use are seeing support in the courts and in early legislation, but only when they work in collaboration with the creators of the training materials they are fairly using. This is giving writers much more freedom to choose whether their works will be included and, when they are, to be compensated for it.

The Skeptic is made possible thanks to support from our readers. If you enjoyed this article, please consider taking out a voluntary monthly subscription on Patreon.
