austin-cheney 9 hours ago

A slippery slope to eliminating copyright outright. The argument made is that AI is somehow more special and will otherwise lose to competition with China.

The flaw there is that AI is not more special than any other endeavor while all other American markets must equally compete with China.

What that failure means is that when anything is exceptional, everything becomes exceptional: the economic conditions apply equally, and therefore bypassing copyright protections applies equally to anybody facing external competition.

  • 1vuio0pswjnm7 7 minutes ago

    How does one define "win" or "lose" in this supposed competition?

    There is endless discussion of a "race", but I cannot find a single discussion of how "winning" or "losing" is actually determined (cf. the wild speculation about what the future will look like -- it may work for marketing purposes but is almost always incorrect).

  • glimshe 9 hours ago

    Copyrights are more often used to defend large corporations than small creators. As long as everybody has a level playing field and individuals benefit from weaker copyright laws, it might actually make the world a better place. I'm not arguing for the complete elimination of copyright protections, but today's laws, in particular copyright duration, are immoral. This is as good a starting point as any, assuming OpenAI isn't the only one who gets to benefit from it.

    • codedokode 9 hours ago

      If a billion-dollar company can use pirated books for its business, should we allow it to use pirated software too? Do you think that requiring a company to pay for a Windows license is "immoral"?

      • thedevilslawyer 8 hours ago

        Ideally, software shouldn't be copyrightable or patentable. That's what FOSS is based on (we couldn't remove copyright, so let's hack it via copyleft).

      • rich_sasha 7 hours ago

        Well, just you try not to pay for ChatGPT...

      • actionfromafar 8 hours ago

        Of course, how else could we train the neural networks to run the programs?

        /largest_company

    • austin-cheney 9 hours ago

      > Copyrights are more often used to defend large corporations than small creators.

      Are there numbers behind this, or is it empty conjecture? The reality is that the resulting civil judgments apply the same regardless of owner size, which benefits small owners disproportionately to large owners with regard to windfall versus regular revenue. That is OpenAI's principal concern: they don't want to get sued into bankruptcy by numerous small rights owners.

      • thedevilslawyer 8 hours ago

        Copyright funnels significantly more revenue to corporations than to individuals. Take music: this article shows that only 12% goes to individual musicians.

        https://www.rollingstone.com/pro/news/music-artists-make-12-...

        • rich_sasha 7 hours ago

          Surely copyright isn't the problem here. Without copyright, the music industry could pay nothing for the music... just copy it with impunity.

          The music industry, presumably, takes a bet on many musicians, and only a few make it. The revenues made by the successful ones effectively subsidise the unsuccessful ones.

          Also if musicians are so widely screwed by the bad industry, why don't they create a cooperative agency that treats them well? There's enough money sloshing around in successful musicians' coffers.

    • hansmayer 9 hours ago

      ...right, lets make sure we protect the little, undercapitalised startup OpenAI from the large corporations holding them back :)

  • echelon 9 hours ago

    OpenAI has no moat. They're afraid of open source and want the government to protect them.

    Microsoft doesn't think they're very cool anymore.

    Sam Altman is going to have one of the quickest falls from grace in tech history. It's a shame he's using his time to try to legislate a worse world for the rest of us.

    • actionfromafar 8 hours ago

      At the rate things are going in the US, "legislate" seems to be largely replaced by "executive directive", so maybe you don't have to worry about legislation. (We will still have the worse world part, of course.)

  • hansmayer 9 hours ago

    All of this, plus it's not even AI in the generic sense; it's just very advanced text generation, a certain application of AI. So the Chinese Gemini will offer to summarise e-mails at a lower cost; who cares?

srg0 8 hours ago

Copyrighted material includes works by authors from outside the US. By Berne convention, the exceptions which any country may introduce must not "conflict with a normal exploitation of the work" and "unreasonably prejudice the legitimate interests of the author". So if at least one French author does license their work for AI training, then any exception of this kind will harm their legitimate interests and rob them of potential income from normal exploitation of the work.

If the US can harm authors from other countries, then other countries may be willing to reciprocate to American copyright holders, and introduce exceptions which allow free use of the US copyrighted material for some specific purposes they deem important.

IANAL, but it is a slippery slope, and it may hurt everyone. Who has more to lose?

And I hope that Mistral.AI takes note.

  • thedevilslawyer 8 hours ago

    > then any exception of this kind will harm their legitimate interests

    Pray tell, what legitimate interest of the author is harmed by LLMs training on that work? No one is publishing the author's book.

    • pintxo 3 hours ago

      The legitimate interest in there not existing a tool that allows any random person to create art in the same style as they do, which could arguably devalue their offering?

    • Palmik 8 hours ago

      What I think the parent meant is the interest in selling licenses to others to train on their data.

      • srg0 8 hours ago

        Exactly. Some copyright holders do license their work for AI training. It certainly happens in the music industry, but I don't see why texts would be any different. The exception would harm their business.

        • thedevilslawyer 5 hours ago

          Example, please? It has always been fair use to train on accessible data. It's how, for example, so much research has been going on for decades.

hansmayer 9 hours ago

I just wish they understood they are limited not by the content available, but by the intrinsic characteristics of the architecture and algorithms of LLMs. It's just not the AGI that will magically open its eyes one day. The sooner we stop burning billions of dollars on it, the better.

xrd 9 hours ago

The follow-on prompt was "add the word freedom a lot more."

  • hansmayer 7 hours ago

    ... sprinkle in a lot of "strategy" too, to make the reader feel smart. Lay on "America/Americans" even thicker, to combine it with a sense of higher purpose, i.e. patriotism.

deepsummer 8 hours ago

I think an AI should be treated like a human. A human can consume copyright material (possibly after paying for it), but not reproduce it. I don't see any reason why the same can't be true for an AI.

  • actionfromafar 8 hours ago

    Then, we should also put the AI in jail when it's breaking copyright laws. Or being an accessory to breaking copyright law.

    • deepsummer 8 hours ago

      An AI that's breaking copyright laws shouldn't be legal. So yes, it's kind of like putting it in jail.

  • nness 8 hours ago

    The issue isn't so much the consumption of copyrighted material, but the acquisition of that material.

    Like a real person, AI companies need to adhere to IP and license or purchase the materials that they wish to consume. If AI companies licensed all materials they acquired for training purposes, this would be a non-issue.

    OpenAI are looking for a free pass to break copyright law, and through that, also avoid any issues that would arise through reproduction.

    • Palmik 8 hours ago

      A real person wouldn't have to pay to read a random blog, Reddit comments, StackOverflow answers or code on GitHub (many open source licenses do not imply a license for training).

      They might have to pay for books, or use a library.

      Should these cases be treated differently? If so, it might lead to more closed internet with even more paywalls.

      • alphabettsy 4 hours ago

        I think those are less of an issue. They want to train on paywalled news articles, magazines and books. In addition to other media that the average person would have to pay for or would otherwise have limitations applied.

        • Palmik 2 hours ago

          In my opinion, if any copyright-related rule is applied to books or other paywalled content, it should equally apply to any Joe Shmoe's blog or code on GitHub.

someothherguyy 8 hours ago

Yeah, shorten the terms of copyright on original works by about 90%, and call it a win for everyone except for rights holders.

  • fmajid 7 hours ago

    Rights holders are the economically marginal tail wagging the dog due to the disproportionate political power of content industries. All of Hollywood's annual revenues represented 2 weeks of telcos' SMS revenue back when you paid per message.

fmajid 7 hours ago

Well, if we finally have hundred-billion-dollar corporations pushing back in the copyfight against the continual expansion of copyright (e.g. Sonny Bono, the congressman for Disney) or abusive laws like the DMCA, that's a welcome development.

antonkar 9 hours ago

Basically they stole almost the whole output of humanity, both dead and alive, put it in their Frankenstein monsters' ever-growing brains, and now want to let them roam unsupervised longer and longer (AI agents) and continue to steal things.

Taking away human freedoms and giving them to agents 101.

  • thedevilslawyer 8 hours ago

    What stealing? None of the original content is gone. Perhaps "infringement" is a more apt word.

    • antonkar 8 hours ago

      Yes, and if you infringe like they do, you'll be in jail forever.

      • thedevilslawyer 8 hours ago

        Ignoring the non-sequitur on jail, I guess you're affirming that it's not stealing?

        • antonkar 7 hours ago

          Most people will call it stealing; lawyers will find a way to call it something else.

          So you're affirming that you can steal almost the whole creative output of humanity and not sit in jail your whole life?

          They didn't just steal or infringe; they profit from it, and they replace and compete with the very people from whom they stole (or whom they infringed, as you prefer to call it).

          The model is like a private library they don't allow you to enter or see; instead, they have a strict librarian who spits hallucinated quotes at you.

          That is the problem. They are not Robin Hoods who steal to share with the poor. They steal from the poor to make the rich richer: to enrich themselves, grab human freedoms, and give those freedoms and more to AI agents.

          You cannot steal the whole output of humanity and put it in your brain. AI agents and companies already have massively more rights and freedoms than you, and it's going to get much worse.

          There is a narrow way through the dystopias: because intelligence is inherently static and non-agentic (think of the static 4D spacetime of a universe), we can open the library and empower people by making models explorable like 3D games.

ThatMedicIsASpy 8 hours ago

You steal from others and make them pay - constant scraping costs money (traffic, server load, scraping protection). Then you should only be allowed to release open-source models.

  • gloxkiqcza 7 hours ago

    A ruling that only open source models can freely use copyrighted data for training would be a funny outcome and a big F you to OpenAI. I don’t expect it to happen but an interesting thought nonetheless.

csomar 9 hours ago

> An export control strategy that exports democratic AI: For countries seeking access to American AI, we propose a strategy that would apply a commercial growth lens—both Total and Serviceable Addressable Markets—to proactively promote the global adoption of American AI systems and with them, the freedoms they create. At the same time, the strategy would use export controls to protect America’s AI lead, including by making updates to the AI diffusion rule.

What a bunch of gibberish hot garbage.

  • isaacremuant 9 hours ago

    It works for comedy without changing a word. Impressive.

nomilk 9 hours ago

Wonder how much the addition of copyrighted material affects how smart the resulting model is. If it's even 20% better, LLM makers could be forced out of the US into jurisdictions that allow the use of copyrighted data.

I suspect most LLM users will ~always choose the smartest model.

  • srg0 8 hours ago

    > most LLM users will ~always choose the smartest model

    Most LLM users will choose the cheapest model which is good enough.

    I think that LLMs' performance is already "good enough" for a lot of applications. We're in the diminishing returns part of the curve.

    There are two other concerns:

    1. being able to run the model on trusted infrastructure locally (so some jerk won't turn it off on a whim, and the data will remain safe and comply with the local data protection laws and policies)

    2. having good tools to create AI applications (like how easy it is to fine-tune it to customer needs)

    > how much the addition of copyrighted material affects how smart the resulting model is

    Copyrighted material improves the models, not by making them smarter, but by making them more factually correct, because they will be trained on reputable, reliable and up-to-date sources.

  • noosphr 8 hours ago

    The jump from Llama 2 to Llama 3 had something to do with Meta downloading every textbook ever published and using it as training data.

    The arguments by Meta so far in that court case are absolutely terrible, and I'm half expecting to see the world's first trillion-dollar copyright infringement award.

    • Palmik 8 hours ago

      Incorrect. Llama 1 already trained on the books3 dataset.

  • regularjack 9 hours ago

    All of it is copyrighted material

iamsaitam 9 hours ago

If this happens, I hope they get banned in Europe. This is unacceptable.

megamix 8 hours ago

Can anyone also use copyrighted source code, e.g. from OpenAI?

regularjack 9 hours ago

The arrogance of these people is without end.

faragon 9 hours ago

If a person can read copyrighted material and produce derivative works, why not an AI?

  • hansmayer 9 hours ago

    Because a person can read copyrighted material they legally obtained the rights to, for example by purchasing a hard or electronic copy of the book or magazine. Alternatively, according to laws worldwide, if a person were to engage in massive theft for the purpose of "reading" all available copyrighted material in the world, obtaining copyrighted works without the permission and consent of the copyright holders, they would at least pay heavy fines, and in most jurisdictions also spend at least a few years in jail. Why should the same not apply to corporations and their executives?

    • jemmyw 9 hours ago

      I don't think there is actually a law anywhere that says you need to obtain the rights to copyright material to read/view them. The person or organisation showing it to you, which might be yourself, needs to have a license. Otherwise things like libraries couldn't exist and you wouldn't be allowed to lend books or even have books in your house that other family members can read.

      Not saying that particularly impacts your argument about OpenAI, because an LLM in training is not a person. It is transforming data from one format to another for later consumption by people. Therefore they probably would need a license.

      • hansmayer 8 hours ago

        I mean, look at it this way. Let's say you purchase a Woody Allen film on DVD. Will anyone seriously prosecute you for watching it at home together with your friends? No, that falls within normal usage. But let's say you now organise a local watching event with the same DVD for 200 people in a hall somewhere, and charge everyone, whatever, $6 - just to cover the hall expenses. Will you be prosecuted? Very likely. Libraries are probably under some sort of "fair use" regulation due to public interest and such. They don't quite generate profit with their line of work - nor should they!

        • jemmyw 6 hours ago

          Right, but those 200 people won't be prosecuted for watching it, which was my point. The example I was thinking about when posting would be putting up a copy of copyright art in a public place. The people in the public place are not breaking the law by looking at it, only the person who placed it... well even then, would the workers who put it up be liable? Probably not, it's not reasonable for someone who puts up billboards to check the copyright license.

          • hansmayer 5 hours ago

            I do agree with this example in general. But from my point of view, OpenAI comes across more like the person enabling the use of copyrighted art, and would thus be subject to copyright regulations. Their users I'd see rather as the people viewing the art in public, perhaps unaware of the copyright restrictions. But these discussions seem like a bit of a distraction in themselves. If LLMs worked exactly as they have been hyped up to for the third year now, I think we would all get behind the effort. Who would care about copyrights if a magic machine could lead us into the so-called post-scarcity world, right? But sadly it appears to be nowhere near that goal, nor will it be, based on what we know about how the technology works. So here we are, discussing whether mechanical parrots should read our books :)

  • piracymadelegal 9 hours ago

    Sure, so can I make and sell my own Lilo and Stitch movie now? It'll be even better than the one about to be released, and all that means is I'll deviate even less.

    • thedevilslawyer 8 hours ago

      This was settled prior to LLMs: you can't do that because the characters' names are protected. LLMs change nothing here.

  • jemmyw 8 hours ago

    Because the AI is not a person. It doesn't seem like we're anywhere near AGI that could be considered a person. Training an LLM is taking existing content and transforming it into another format for later consumption by a person. That person can run prompts against the LLM to create derivative work, the LLM itself doesn't run prompts or do anything at all.

    I don't know much about the legal side, but it seems to me, from the above, that the laws for copyright for LLMs should apply to the company training the LLM as if they're creating a derivative work that they will later sell or license for other people to interact with.

  • austin-cheney 8 hours ago

    Copyright does not restrict consumption. It only restricts reproduction. To restrict consumption you need a patent.

    • thedevilslawyer 8 hours ago

      Good then that LLMs don't reproduce content.

      • someothherguyy 8 hours ago

        They produce derivative works, which is also an exclusive right of a copyright holder.

        • menaerus 8 hours ago

          If I derive my work from multiple sources, do all the copyright holders of those sources have an exclusive right to my work? How else would people build knowledge on some topic and then apply that knowledge to build a product, if not by reading a bunch of (book) material and studying other similar products?

          • someothherguyy 6 hours ago

            If they can prove it in court. One would think that would be much easier to do for an LLM than for a human.

    • someothherguyy 8 hours ago

      > It only restricts reproduction

      and distribution.

  • otabdeveloper4 7 hours ago

    > ... a person can read copyrighted material

    Yes, after paying for it.

quintes 9 hours ago

Didn’t read but

No

ksynwa 8 hours ago

I don't think I've ever read anything this disingenuous

thiago_fm 9 hours ago

This is wrong on so many levels.

But given that Trump clearly seems aligned with the technobros, I wouldn't be surprised.

This will be good for the rest of the world, though. Other countries will be less likely to align with the US; the end of US imperialism keeps being sped up little by little.