New AI Lawsuit From Authors Against Anthropic Targets Growing Licensing Market for Copyrighted Content

Exactly.

I know a pretty good number of authors on FB. If you went up to them and said, I'd like to buy "one-of-everything-you-have-written" to train an LLM AI I am developing, they'd probably say, "okay, prices as listed on the covers." They might charge more, of course, but that's up to them.

To not even make the offer to buy their books? Are you F*ing kidding me?!? F you, PAY ME!!!!

It's the same reason digital artists are using tools like Glaze or Nightshade to make their work unusable for AI training. And while Glaze just makes it so that AIs cannot copy your style, Nightshade actively poisons any model trained on the images. The example given is that pixel-level changes in the image make the AI associate pictures of a dog with a cat. After 300 Nightshaded "dog is cat" images, any time you ask that AI for a picture of a dog you will get a picture of a cat.
 
The billionaires behind this know what they are doing. They did not ask for permission or offer fair compensation when all of this started. They just took it off the internet for their profit-making ventures. That's why Microsoft invested billions in OpenAI. But other lawsuits happened. OpenAI began hiring more lawyers. Now this. They have the deep pockets. They know the average author and artist does not.
 
Which is illegal.
Obviously the US needs to spend more time on ethics lessons in schools, and some Game Theory (the fancy name for "the reasoning behind why you make decisions").
Being "nasty" (term of art in Game Theory referring to "not cooperating") is only advantageous when there is exactly ONE interaction between the parties. As soon as the parties expect to keep interacting, being "nice" (term of art in Game Theory meaning "to cooperate") generally becomes more advantageous.
Show kids that screwing people over only works if you NEVER deal with that individual again and even the child sociopaths will grok that it is not to their advantage.
Seriously, had the various AI companies just used what was truly freely available (for example, by getting a library card per AI project), this wouldn't be an issue. Instead, we're seeing AI companies being sued by everyone whose writing was in the training set without compensation.
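
For anyone who wants to see the "nasty vs. nice" point in numbers, here is a minimal sketch of a repeated prisoner's dilemma. The payoff values (5, 3, 1, 0) are the conventional textbook ones and the strategy names `always_defect` and `tit_for_tat` are my own labels for illustration; none of this comes from the lawsuit or the thread.

```python
# Payoffs as (player A, player B); "C" = cooperate ("nice"), "D" = defect ("nasty").
# These numbers are the usual textbook values, assumed here for illustration.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def always_defect(opponent_history):
    """Never cooperates, whatever the other side did."""
    return "D"


def tit_for_tat(opponent_history):
    """Cooperates first, then copies the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def play(strategy_a, strategy_b, rounds):
    """Play the two strategies against each other and return total scores."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)  # each side only sees the other's past moves
        b = strategy_b(moves_a)
        pay_a, pay_b = PAYOFF[(a, b)]
        score_a += pay_a
        score_b += pay_b
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b


if __name__ == "__main__":
    print("1 round, nasty vs nice:  ", play(always_defect, tit_for_tat, 1))
    print("10 rounds, nasty vs nice:", play(always_defect, tit_for_tat, 10))
    print("10 rounds, nice vs nice: ", play(tit_for_tat, tit_for_tat, 10))
```

With these assumed payoffs, the defector wins the single round outright, but over ten rounds it ends up with far less than two cooperating players each earn. It still finishes ahead of the one partner it burned, so the sketch supports the "repeat dealings reward cooperation" point rather than proving that being nasty never pays.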
 
One of my readers is Mr. Kaplan, an important Californian lawyer. On one occasion, when I was worried about some legal issues in the US, he answered: "Everything that lawyers cannot solve with time and a lot of money is illegal."
 
A library card? And paying real people for hundreds of hours of scanning? Google tried before with books and this is what happened:

 
Those who want AI believe they are immune. That no lawsuit can hurt them.
That doesn't mean that they are immune, however. Stealing intellectual property is most definitely a good case for a lawsuit; it's only a matter of time. As far as I'm aware, the EU has already asked OpenAI to publicly list what they trained their AI on. If they don't comply, that is another basis for a lawsuit, and the EU has been quite successful recently against big corpo (Apple being the most recent example).
 
I hope they make progress. In the U.S., AI companies are claiming "fair use." Fair use, before the internet, meant going to a library, copying a few pages from a book or magazine as reference for a term paper. You can't copy the whole book. Now, AI companies want fair use to mean "Anything we found on the internet." Bad idea.
 
Wasn't aware of that case before.

Okay, so the AI writer buys a library. Thousands of books. Digitizes them, feeds them to the AI in training. Problem solved. BUY BOOKS.
 
Read Kara Swisher's "Burn Book: A Tech Love Story" if you want to get a sense of many of the billionaires behind these efforts. The portraits she paints of many of them are not complimentary. They are basically hoovering up everything they can on the internet (and almost certainly from this site as well). A lot of information is provided for "free" for their marketing and AI training, but often provided unintentionally by all of us. Personal writings and photos, personal information, personal videos, personal research and purchases, all appropriated without asking the originator. They use the internet infrastructure that was funded by taxpayers to help themselves, without permission, to AI training material. And currently there are no effective controls or regulations, at least in the US.

Kara Swisher calls it Digital Shoplifting. Great description.
 
Read Kara Swisher's "Burn Book: A Tech Love Story" if you want to get a sense of many of the people behind these efforts. They are basically hoovering up everything they can on the internet (and almost certainly from this site as well). A lot of information is provided for "free" for their AI training, but often unintentionally by all of us.

They are abusing the trust of The People. They are hiding behind their lawyers. They want what they want. And more billions...
 
That would involve money leaving their pockets, and as we all well know, they're too greedy to allow that, as it would eat into their profits. Which is why they're stealing them.
 
Too much upfront investment.
Yeah, that's why they broke the law instead.


Just owning copyrighted books does not give anyone the right to copy them.
But it's totally legal to memorize them. That's what those AIs are doing, memorizing the copyrighted materials. Learning how words follow each other, how the language is put together.
 
Your argument sounds a bit like lawyering.
How does the AI memorize the books? Why, by copying the contents into a database, stored on hard drives, and then parsing the contents to train, e.g., a neural network. But acquiring the work and transferring it into the database is copying. Or, as Kara Swisher calls it, digital shoplifting.
 
It's no different from someone who memorizes the book. They have a copy of the book in their head and can produce copies of that copyrighted work at will.
 
So, by your argument, I get a book out of the library, memorize it, and then I can sit down at my computer, type it in, and then print it. Then I can sell those printed copies. No, you can quote it, discuss it with colleagues and friends, but reproducing it in its entirety will likely end up with you in court. So there are limits to what you can do "at will".

The question is whether AI companies are exceeding Fair Use by taking a perfect digital copy of an existing work to train their AI, often with the intent to produce works for which the original owner would otherwise be compensated. If you look at US Copyright Law and the Fair Use policy (see material from the US site below), you are not legally able to appropriate the work wholesale for the purposes of reproduction and profit. The companies have cited "Fair Use" as the basis for their AI training, but I think that, given the potential impact on authors (and others, like photographers, actors, musicians, etc.), courts may well find that the wholesale appropriation of works is not Fair Use, especially since it can deprive individuals of compensation for their creative endeavors. The current copyright law did not anticipate AI training, and this is a gray area to be resolved, but the issue is being taken up now in, for example, The New York Times v. OpenAI, Getty Images v. Stability AI, and Andersen v. Stability AI. The list of pending cases can be seen here: AI Fair Use litigation cases

A recent article by Enterprise AI says: "OpenAI has acknowledged that its programs are trained on publicly available data sets, including copyrighted works. This process involves making copies of the data to be analyzed. However, creating such copies without permission could infringe on copyright holders' rights. OpenAI argues that its training processes constitute fair use and do not involve infringement."

We will see what the courts decide as to whether this is Fair Use. The fact that much of the subsequent use of AI products either comes at a cost to the end user or generates revenue for the AI company (e.g. subscriptions or providing other materials) may work against the AI companies, as it is commercial use.

Here is what US Copyright Law says (note, these are abridged points, entire paragraphs would be too lengthy):

Section 107 of the US Copyright Act provides the statutory framework for determining whether something is a fair use and identifies certain types of uses—such as criticism, comment, news reporting, teaching, scholarship, and research—as examples of activities that may qualify as fair use.
Purpose and character of the use, including whether the use is of a commercial nature or is for nonprofit educational purposes: Courts look at how the party claiming fair use is using the copyrighted work, and are more likely to find that nonprofit educational and noncommercial uses are fair.
Nature of the copyrighted work: This factor analyzes the degree to which the work that was used relates to copyright’s purpose of encouraging creative expression. Thus, using a more creative or imaginative work (such as a novel, movie, or song) is less likely to support a claim of a fair use than using a factual work (such as a technical article or news item).
Amount and substantiality of the portion used in relation to the copyrighted work as a whole: Under this factor, courts look at both the quantity and quality of the copyrighted material that was used. If the use includes a large portion of the copyrighted work, fair use is less likely to be found; if the use employs only a small amount of copyrighted material, fair use is more likely.
Effect of the use upon the potential market for or value of the copyrighted work: Here, courts review whether, and to what extent, the unlicensed use harms the existing or future market for the copyright owner’s original work. In assessing this factor, courts consider whether the use is hurting the current market for the original work.
 
And the AIs are NOT REPRODUCING THE BOOK IN ANY WAY.
 
No, they are not, and that is not the argument. The AI systems are not photocopying machines. But they are storing copies as data training sets without permission or compensation for the works.

Again, from the above: "OpenAI has acknowledged that its programs are trained on publicly available data sets, including copyrighted works. This process involves making copies of the data to be analyzed. However, creating such copies without permission could infringe on copyright holders' rights. OpenAI argues that its training processes constitute fair use and do not involve infringement."

OpenAI admits they are making copies of the COPYRIGHTED creative works without compensation. The heading of this thread was about a similar lawsuit against Anthropic, for taking content without compensation. The issue before the courts, yet to be decided, is whether this constitutes "Fair Use" under Section 107. It is a legal matter.
 
Without prior agreement with copyright holders, the AI companies have stolen their work. Period.
 
And my argument is that buying a pile of (e)books to use as the AI training material would be the same as the pile of physical and ebooks I have at home that I use to train my attempts at storytelling. So would checking the books out from a library, with each AI core getting its own library card.**

That OpenAI et al have not even done that is absolutely criminal.

** Assuming that the AIs can run their training exercises on the books and not retain a complete copy in memory. I will concede that current law/precedent makes keeping copies of borrowed ebooks criminal copyright infringement.
 
I think, Scott, we have come to the same point. And that is the key point in the legal argument: the producers were neither consulted nor compensated for the wholesale appropriation of their work. And their work was used to produce a commercial product.

We will have to see how the courts settle this issue.
 