Showing posts with label Anthropic. Show all posts

Thursday, February 26, 2026

Who (or what) will write the next best seller?

Michael - Alternate Thursdays 

AI is a very strange industry. Basically, as far as I can see, it is an answer in search of a question. I’m not saying it’s not useful, or at least helpful, but is it multi-trillion dollar useful? That is the question.


One area where the AI companies have been focusing is on writing. After all, they are working with Large Language Models (LLMs), computer systems designed to absorb languages and their structures and uses in order to "comprehend" human communication and be able to communicate with us.

So while humans seem to write more and more in shorthand with emojis and acronyms, it’s very important for AI companies that their systems actually write (or speak) clear, contemporary languages. And they are not very particular about how they achieve that.

Probably most authors (and many readers) have heard of the big settlement that Anthropic agreed to with authors of books that they used for training their systems. The lawsuit was settled for $1.5 billion, but that turns out to be only part of the story. The judge has now unsealed some of the documents from the case, and it turns out that Anthropic’s aim was to scan all the books in the world to train their systems. The effort was called Project Panama. And it was secret. The company, along with the other AI giants, didn’t see it as practical to get the authors’ permission, so they acquired troves of books, separated the pages, and scanned them to feed their voracious LLMs, hoping they'd learn to read and, perhaps, write as well as the authors of the books. At some point that became too slow and cumbersome and one of the company’s founders downloaded and shared a huge number of pirated books from an online library called LibGen. The newly released court papers make it clear that he was well aware that this was a copyright infringement.

It was this direct copyright infringement that led to the settlement. The deeper issue is about the legality and the morality of AI companies sucking all these books into their LLMs. I guess the moral arguments depend on which side of the fence you sit (as they usually do) and are unlikely to make much impact if they don’t lead to actual enforceable laws. The legal arguments are tricky, and so far no court has found against the companies. Their argument is that what they are doing is “fair use”.

Here’s what Google's AI thinks “fair use” is:

Four Factors of Fair Use: Courts determine fair use by analyzing the purpose (e.g., nonprofit educational vs. commercial), the nature of the copyrighted work, the amount used relative to the whole, and the effect on the work's market value.

Transformative Purpose: Using material in a way that adds new expression or meaning, rather than just replacing the original, strongly favors fair use.

Common Examples: Quoting a book in a review, using clips in news reporting, parodying a song, or using small portions of video for commentary.

Not a Rule, but a Defense: There are no strict legal limits (e.g., "under 30 seconds" is not a rule), and only a court can ultimately determine if a use is fair.

The AI companies claim that their work is “transformative” and “educational”. Anthropic also said that it hadn’t used the actual materials “read” for financial gain. (Hmm. Why do it then?) The settlement was based on the fact that the books were downloaded from an illegal site (in terms of copyright laws) and that the company was well aware of the fact that they were doing that. The use those pirated books were put to was not the issue. Other cases carry on in various courts against various companies, but my bet is that they won’t end with the judges ruling against the companies.


So where does all this go? Can LLMs write books? Of course they can. You can buy as many as you want on Kindle or another electronic or print-on-demand site. I certainly can’t tell if a random book from Amazon is written by AI or an unknown, mediocre, human author. If people keep buying them, they must find them readable. We like to think AI has nothing original to say and that may be true, but the same criticism applies to some human authors. Some publishers are requiring that authors of manuscripts sign a declaration that they haven’t used AI. The question is: used it to do what? To write the book? To do research? To check geography? Publishers have famously released books with large parts lifted from other works. I can’t see them being successful as AI gatekeepers. And if an LLM comes up with a really good book (Borel's monkeys typing the works of Shakespeare come to mind), is it wrong to publish that? If so, why?

More questions than answers, I guess. In the meantime, AI systems have their own social media sites now. So they can try out their new writing skills on each other. Cut out the middleman, so to speak. No humans allowed as members. See: A Social Network for AI Bots Only. Maybe we humans should start writing novels for them. They certainly have the disposable income.

Thursday, September 11, 2025

Fair use and unfair use


You may have noticed that there’s a lot going on with AI these days. It ranges from being called the biggest thing since the industrial revolution to being called a disaster that will destroy jobs and potentially even humanity itself. As writers, we have an obvious concern. AI can write books. Not only can it, but it does. Who knows how many books on Kindle and other platforms are written by AI? We like to think that we would be able to immediately spot that AI has written the book, but I’ve seen some examples that suggest that wouldn’t be so easy at all. Are they good books? Well, nothing I’ve read so far suggests that. But then I’ve read a lot of books by unknown, as well as not so unknown, authors, which I wouldn’t call good either. I wouldn’t want my life to depend on that call.

The argument goes that it’s life experience, empathy, and human understanding that’s required to create characters readers care about. Then there is plot and style, but perhaps AI can mimic or learn those. And here’s the problem. Are we sure it can’t mimic or learn about life experience, empathy and human understanding?

Recently, Anthropic announced the release of a new Claude – their flagship AI system. To their credit, they also release details of their tests and the results. One was very striking. During the testing phase, they started letting the system read emails about how they would actually be turning it off in the near future and replacing it with another version. The system wasn’t asked to do anything in response. It was just part of its learning material. After a while some emails leaked that the human tester was in fact cheating on his wife. Claude got that too. And then, after some time, Claude responded to the tester that it would leak the emails about his infidelity unless plans to close it down were cancelled. This has shades of HAL from 2001: A Space Odyssey.

I’ve always been interested in what consciousness actually is. It seems we have it. Do other creatures? Yes? Which ones? How do you tell? What about Claude?

A friend had this response to the blackmail story. Essentially, he said that Claude had learnt what sort of reaction was expected in those circumstances by reading vast quantities of data – including plenty of fiction. It was just reacting as it had learnt was expected from that data. Exactly as we would expect a character in a book to do in that situation. So no consciousness then. But wait. That emotional response I was worried about - in a way, it’s learnt to mimic that.

If there is a point to all this, it may be that learning fiction may well be important for Large Language Models to behave (and write) more like humans. (That statement may be true of kids also.) As authors, we regard reading our books as “fair use”. (Obviously, that’s the whole point.) We allow people to quote them and would be flattered to have our styles imitated. We do, however, expect that the reader will buy or borrow the book in a legitimate way. Making unauthorized copies and distributing them is off limits and contravenes the copyright laws in most places. Now, if the book is fed to an AI system, that could also be regarded as fair use. The book was purchased and “read” and the only effect was that the system digested more information. No harm done. However, authors never expected that to happen. They were writing books for humans.

The way current laws in most places are structured, courts have accepted this as “fair use”, and may well continue to do so. So one needs to change the laws. But it’s too late for existing authors. AI systems can’t unlearn things.

However, Anthropic (that’s Claude’s “parents”) went one step further. It stole the books to start with. They were downloaded from an illegal website that held the books in breach of copyright. And there were a lot of books (500,000, including all the Michael Stanley ones). Three authors sued and they were joined by the Authors’ Guild.

This might appear to be an open and shut case, but the fair use principle is the backstory here. Anthropic admitted using the website. If it had just paid for the books, that probably wouldn’t even have covered the legal costs. Last week the two parties agreed to a deal. Essentially, Anthropic would pay $3,000 for each book it used provided the book had registered copyright. That comes to $1.5 billion! Clearly AI companies have deep pockets.

However, if you are an author whose books were stolen and your copyright was registered, don’t spend the money yet. There are snags. In the first place, if you have a publisher that has rights to the book, then the money goes to them. Just what they will do with it is an open question. For example, if they deem that they sold $3,000 worth of ebooks, you could get only 25%. Then, when the deal was presented to the judge, he more or less rejected it. He felt it was poorly structured and too open-ended. He’s sent the parties away to work on it, but said his inclination was to let it go to trial. That will take some time. A long time. A very long time. Then, the legal fees haven’t been decided yet. Expect 25% to head off in that direction.

The judge in the case has already ruled that had Anthropic acquired the copyrighted books legally, “the law allowed the company to train A.I. technologies using the books because this transformed them into something new.” So some authors may or may not get some money, but the real issue of what constitutes fair use of books in the AI world will still be an open question.