This feels like (yet another) extension of copyright. Whilst I'm not sure I completely disagree with you, I want people to acknowledge that copyright is not the natural state of the universe. Prior to (I think) 1790 there was no copyright, and human beings managed minor things like, you know, the Renaissance and stuff like that.
Copyright was invented and enforced and the results have been a mixed bag. It seems to suffer from a ratchet effect where the law only ever increases the scope to which copyright applies and never decreases it.
However intuitive your sense of your moral rights is, what matters is the net benefit to society, and we should be very careful what we wish for.
If creating LLMs based on copyrighted data is found to be legal, all that will do is allow giant companies to sell copyrighted work without crediting the original authors, while leaving everyone else in the dirt.
> all that will do is allow giant companies to sell copyrighted work without crediting the original authors, while leaving everyone else in the dirt.
I'm not sure I follow. But even accepting your premise, I'm not sure how it would favour giant companies over anyone else. The models are already in the wild and anyone can use them. In some ways, large companies are less likely to do anything that might open them up to legal risks or PR downsides.
Maybe this is more of a Napster moment than it is a big tech powergrab?
GPT is owned by Microsoft, LLaMA by Facebook, and Bard by Google. If you trained a model on Google's public properties and started distributing it (or its output) for money, you'd be sued into oblivion real quick.
My point was that the models exist, people are fine tuning them and/or releasing open clones. There are models of comparable power to the state of the art without any controlling interest from a big tech company.
The leaked Google memo covered this in detail, and it's what makes me want to question the "AI is owned by big tech" angle.
Price/profit != value. Sure, Hollywood movies bring in a ton of money, but I get way more value from daily indie youtubers than a blockbuster released once a month.
> Prior to (I think) 1790 there was no copyright and human beings managed minor things like, you know, the renaissance and stuff like that.
Curious whether the introduction of copyright is what led to an explosion of products and innovation, since it suddenly gave people an incentive to monetize their ideas. I doubt the Renaissance happened due to a lack of copyright. I think it's more due to social, political and health circumstances than to the lack of protection of one's work. We, in Europe, suffered from disease, famine, and war, to the point where we reached the conclusion that enough is enough: we need rules for the game.
There doesn’t seem to be evidence that copyright increases innovation. Indeed, in some areas with no IP protection we actually see more innovation (fashion, for example).
> it's about the net benefit to society and we should be very careful what we wish for.
Seems like we have a classic trolley problem.
On one track, compensating copyright holders is required for LLMs, and it's going to be very expensive to acquire all of this copyrighted info, meaning only the biggest companies can afford to do it.
On the other track, compensating copyright holders is not required, LLMs (led by big tech) capture most of the economic value from every incremental piece of content created by humans in perpetuity, consolidating wealth in the hands of a few shareholders and insiders.
> On one track, compensating copyright holders is required for LLMs, and it's going to be very expensive to acquire all of this copyrighted info, meaning only the biggest companies can afford to do it.
There is also a third track: most of the abundant code is open source, or unlicensed content (which is still protected in the US, afaik). If corporations can't monetize it, we win, because models either need to be open source or payment for training becomes required.
I'm not sure it's certain yet whether AI is going to lead to more consolidation or actually have the opposite effect.
Whilst history tends to make me suspect the former, the recently leaked Google memo gave me pause for thought. AI is already out there and can already be trained on consumer hardware. It's ever so slightly possible that big tech won't be able to hoard the benefits this time.
Open source models are possible if we pick the second option. Lots of innovation in the AI scene is happening thanks to open source models being available to the general public.