- It is possible to write Rust in a pretty high-level way that's much closer to a statically-typed Python than to C++, and some people do use it as a Python replacement
- You can build it into a single binary with no external deps
- The Rust type system + ownership can help you a lot with correctness (e.g. encoding invariants, race conditions)
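To make the last bullet concrete, here is a minimal sketch of both ideas. The `Percent` type and `parallel_count` function are made-up names for illustration, not from any library. The newtype can only be built through a checked constructor, so the range invariant holds everywhere else, and the shared counter has to go through `Arc<Mutex<_>>` or the compiler rejects the program, which rules out the data race.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// A percentage guaranteed to be in 0..=100 (hypothetical example type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Percent(u8);

impl Percent {
    /// The only way to construct a `Percent`, so the range check
    /// happens once and every later use can rely on it.
    pub fn new(value: u8) -> Option<Percent> {
        if value <= 100 {
            Some(Percent(value))
        } else {
            None
        }
    }

    pub fn get(self) -> u8 {
        self.0
    }
}

/// Increments one shared counter from several threads.
/// Sharing a plain `&mut u64` across threads would not compile;
/// ownership rules force the explicit `Arc<Mutex<_>>`.
pub fn parallel_count(threads: usize, per_thread: u64) -> u64 {
    let counter = Arc::new(Mutex::new(0u64));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}
```

With this, `Percent::new(101)` yields `None` instead of a silently invalid value, and `parallel_count(4, 1000)` always adds up to 4000.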
Independent of one's philosophical stance on the broader topic: I find it highly concerning that AI companies, at least right now, seem to be largely exempt from all those rules which apply to everyone else, often enforced rigorously.
I draw from this that no one should be subject to those rules, and that we should try to use the AI companies as a wedge to widen that crack. Instead, most people who claim that their objection is really only about consistency, not love for IP, spend their time trying to tighten the definitions of fair use, widen the definitions of derivative works, and in general make IP even stronger, which will affect far more than just the AI companies they're going after. This doesn't look to me like the behavior of people who truly only want consistency but don't like IP.
And before you say that they're doing it because it's always better to resist massive, evil corporations than to side with them, even if it might seem expedient to do so: the people who are most strongly fighting against AI companies in favor of IP, in the name of "consistency," are themselves siding with Disney, one of the most evil companies working right now (from the perspective of the health of the arts and our culture). So they're already fine with siding with corporations; they just happened to pick the side that's pro-IP.
oh hey, let's have a thought experiment in this world with no IP rules
suppose I write a webnovel that I publish for free on the net, and I solicit donations. Kinda like what's happening today anyway.
Now suppose I'm not good at marketing, but this other guy is. He takes my webnovel, changes some names, and publishes it online under his name. He is good at social media and marketing, and so makes a killing from donations. I don't see a dime. People accuse me of plagiarism. I have no legal recourse.
There are also unfair situations that can happen, equally often, if IP does exist, and likewise, in those situations, those with more money, influence, or charisma will win out.
Also, the idea that that situation is unfair relies entirely on the idea that we own our ideas and have a right to secure (future, hypothetical) profit from them. So you're essentially begging the question.
You're also relying on a premise that, when drawn out, seems fundamentally absurd to me: that you should own not just the money you earn, but the rights to any money you might earn in the future, had someone not done something that caused unrelated others to never have paid you. If you extend that logic, any kind of competition is wrong!
there are two programmers.
first is very talented technically, but weak at negotiations, so he earns median pay.
second is average technically, but very good at negotiations, and he earns much better.
In China, engineers hold the most power, yet the country prospers. I don't think the problem is giving engineers power, but rather a cultural thing. In China there is a general feeling of contributing towards society; in the US everyone is trying to screw over each other, for political or monetary reasons.
This is obviously false on the face of it. Let’s say I have a patent, song, or a book that I receive large royalty payments for. It would obviously not be logical for me to be in favor of abolishing something that’s beneficial to me.
Declaring that your side has a monopoly on logic is rarely helpful.
Either by democracy (more consumers than producers), or ethically (thoughts and intellect are not property), it's logical. I guess it is not logical for someone who makes money with it today.
I feel like this wording isn't great when there are many impactful open source programmers who have explicitly stated that they don't want their code used to train these models and licensed their work in a world where LLMs didn't exist. It wasn't their "gift", it was unwillingly taken from them.
> I'm a programmer, and I use automatic programming. The code I generate in this way is mine. My code, my output, my production. I, and you, can be proud.
I've seen LLMs generate code that I have immediately recognized as being copied from a book or technical blog post I've read before (e.g. exact same semantics, very similar comment structure and variable names). Even if not legally required, crediting where you got ideas and code from is the least you can do. LLMs, by contrast, just launder code as completely your own.
> I feel like this wording isn't great when there are many impactful open source programmers who have explicitly stated that they don't want their code used to train these models
That’s been the fate of many creators since the dawn of time. Kafka explicitly stated that he wanted his works to be burned after his death. So when you’re reading about Gregor’s awkward interactions with his sister, you’re literally consuming the private thoughts of a stranger who stated plainly that he didn’t want them shared with anyone.
Yet people still talk about Kafka’s “contribution to literature” as if it were otherwise, with most never even bothering to ask themselves whether they should be reading that stuff at all.
I don't think it's possible to separate any open source contribution from the ones that came before it, as we're all standing on the shoulders of giants. Every developer learns from their predecessors and adapts patterns and code from existing projects.
Exactly that. And all the books about, for instance, operating systems are totally based on the work of others: their ideas were collected and documented, the exact algorithms, and so forth. All of human culture has worked this way. Moreover there is a strong pattern of the most prolific / known open source developers being NOT against the fact that their code was used for training: they can't speak for everybody, but it is a signal that for many this use is within the scope of making source code available.
Yeah, documented *and credited*. I'm not against the idea of disseminating knowledge, and even with my misgivings about LLMs, I wouldn't have said anything if this blog post was simply "LLMs are really useful".
My comment was in response to you essentially saying "all the criticisms of LLMs aren't real, and you should be uncompromisingly proud about using them".
> Moreover there is a strong pattern of the most prolific / known open source developers being NOT against the fact that their code was used for training
I think it's easy to get "echo-chambered" by who you follow online with this. My experience has been the opposite, and I don't think it's clear what the reality is.
If you fork an open source project and nuke the git history, that's considered to be a "dick move" because you are erasing the record of people's contributions.
The hard truth is that if you're big enough (and the original creator is small enough) you can just do whatever you want and to hell with what any license says about it.
To my understanding, the expensive lawyers hired by the biggest people around, filtered through layers of bureaucracy and translated to software teams, still result in companies mostly avoiding GPL code.
I’ve been thinking that information provenance would be very useful for LLMs. Not just for attribution (git authors), but the LLM would know (and be able to control) which outputs are derived from reliable sources (e.g. Wikipedia vs a Reddit post; also which outputs are derived from ideologically-aligned sources, which would make LLMs more personal and subjectively better, but also easier to bias and generate deliberate misinformation).
“Information provenance” could (and I think most likely would, although I’m very unfamiliar with LLM internals) be which sources most plausibly derive an output, so even output that exists today could eventually get proper attribution.
At least today if you know something’s origin, and it’s both obvious and publicly online, you have proof via the Internet Archive.
> I don't think it's possible to separate any open source contribution from the ones that came before it, as we're all standing on the shoulders of giants. Every developer learns from their predecessors and adapts patterns and code from existing projects.
Yes, but you can also ask the developer (whether on libera.irc or, if it's a FOSS project, at any FOSS talk) about which books and blogs they followed for code patterns and inspiration, and just talk to them.
I do feel like some aspects of this are gonna get eaten away by the black box if we do spec-development imo.
> there are many impactful open source programmers who have explicitly stated that they don't want their code used to train these models and licensed their work in a world where LLMs didn't exist. It wasn't their "gift", it was unwillingly taken from them.
There are subtle legal differences between "free open source" licensing and putting things in the public domain.
If you use an open source license, you could forbid LLM training (in licensing law, contrary to all other areas of law, anything that is not granted to licensees is forbidden). Then you can take the big guys (MSFT, Meta, OpenAI, Google) to court if you can demonstrate they violated your terms.
If you place your software into the public domain, any use is fair, including ways to exploit the code or its derivatives not invented at the time of release.
Curiously, doesn't the GPL even imply that if you pre-train an LLM with GPLed code and use it to generate code (Claude Code etc.), all generated code -- as the derived intellectual property that it clearly is -- must also be open sourced as per GPL terms? (It would seem in the spirit of the licensors.) Haven't seen this raised or discussed anywhere yet.
> If you use an open source license, you could forbid LLM training
Established OSS licenses are all from before anyone imagined that LLMs would come into existence, let alone train on and then generate code. Discriminating by purpose of use is counter to OSI principles (https://opensource.org/osd):
> 6. No Discrimination Against Fields of Endeavor
> The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.
The GPL argument you describe hinges on making the legal case that LLMs produce "derived works". When the output can't be clearly traced to source input (even the system itself doesn't know how) it becomes rather difficult to argue that in court.
You presuppose that output is a derived work (not a given) and that training is not fair use (also not a given).
If the courts decide to apply the law as you assume, the AI companies are all dead. But they are all betting that's not going to be the case. And since so much of the industry is taking the bet with them... the courts will take that into account.
If you publish your code to others under permissive licenses, people using it to do things you do not want is not something being unwillingly taken from you.
You can do whatever you want with a gift. Once you release your code as free software, it is no longer yours. Your opinions about what is done with it are irrelevant.
But the license terms state under which conditions the code is released.
For example: the MIT license has this clause: "The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software."
It stands to reason that if an LLM outputs something based on MIT-licensed code then that output should at least contain that copyright because it's what the original author wished.
And I saw a comment below arguing that knowledge cannot be copyrighted, but the code is an expression of that knowledge and that most certainly can be protected by copyright.
No, in the same way that I wouldn't cite Euler every time I used one of his theorems - because it's so well known that its history is well documented in countless places.
However, if I was using a more recent/niche/unknown theorem, it would absolutely be considered bad practice not to cite where I got it from.
If I was implementing any known (named) algorithm intentionally I think I would absolutely say so in a comment (`// here we use quick sort to...` and maybe why it's the choice) and then it's easy for someone to look up and see it's due to Hoare or whoever on Wikipedia etc.
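A minimal sketch of what that kind of attribution comment might look like in practice. The function names and the choice of partition scheme are just for illustration here, not taken from any particular codebase.

```rust
/// Sorts a slice in place.
///
/// Here we use quicksort (due to C. A. R. Hoare) because the sort is
/// in-place and fast on average; naming it makes the idea easy to look up.
pub fn quick_sort<T: Ord>(items: &mut [T]) {
    if items.len() <= 1 {
        return;
    }
    let pivot_index = partition(items);
    // Everything left of the pivot is <= it, everything right is greater.
    let (left, right) = items.split_at_mut(pivot_index);
    quick_sort(left);
    quick_sort(&mut right[1..]);
}

/// Lomuto partition scheme, using the last element as the pivot.
/// Returns the final index of the pivot. Requires a non-empty slice.
fn partition<T: Ord>(items: &mut [T]) -> usize {
    let last = items.len() - 1;
    let mut store = 0;
    for i in 0..last {
        if items[i] <= items[last] {
            items.swap(i, store);
            store += 1;
        }
    }
    items.swap(store, last);
    store
}
```

The comment costs one line and tells the next reader both what the technique is and who to look up.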
Now many will downvote you because this is an algorithm and not some code. But the reality is that programming is in large part built on looking at somebody else's code / techniques, internalizing them, and reproducing them again with changes. So actually it works like that for code as well.
> It wasn't their "gift", it was unwillingly taken from them.
Yes. Exactly. As a developer in that case I feel almost violated in my trust in “the internet.” Well it’s even worse, I did not really trust it, but did not think it could be that bad.
I don't understand this perspective. Programmers often scoff at most other examples of intellectual property, some throwing it out altogether. I remember reading Google vs Oracle, where Oracle sued Google for stealing code to perform a range check, about 9 lines long, used to check array index bounds.
I guess the difference is AI companies bad? This is transformative technology creating trillions in value and democratizing information, all subsidized by VC money. Why would anyone in open source who claims to have noble causes be against this? Because their repo will no longer get stars? Because no one will read their asinine stack overflow answer?
Hot take: The Supreme Court should have sided with Oracle. APIs are a clear example of unique expression, and there is no statute exempting them specifically from copyright protection. If they are not protected by copyright, is anything really? What meaning has copyright law then?
Why is copyright law more important than anything else? AI is likely to drive the next stage of humanity's intellectual evolution, while copyright is a leaky legal abstraction that we pulled out of our asses a couple hundred years ago.
One of these is much more important than the other. If the copyright cartels insist on fighting AI, then they must lose decisively.
HDR videos and games (both native and proton) work in both KDE and Gnome (and supposedly Sway and Hyprland, but I haven't tried either). I think support in KDE/Gnome landed in a stable release ~6 months ago.
The HDR experience on KDE is about as good as the Windows one. Last time I tried Gnome, there was no way to configure SDR and HDR brightness separately, but it was definitely still usable.
First time round, Trump would consistently say lots of worrying stuff, but people in the US administration would stop him from following through.
This time, it's become quickly evident that he is following through.
The sentiment in Europe has changed from "well this isn't ideal, but we can just wait it out" to "this is scary and existential, we need self-sufficiency as soon as possible"
Desktop Linux is (becoming) usable for a normal person just in time. I was surprised how easily a non-technical friend switched over to Bazzite (immutable Fedora with gaming extras).
> Visa, Mastercard, Paypal
The EU has already been working on a "Digital Euro" for a while
> all social media commonly used
I'm hoping more decentralized social media continues to pick up steam
My new favorite breed of commenters are AI bros who go around lamenting how trivial other people's work is, while they themselves fail to create anything that anyone else actually wants to use
I started reading the Grokipedia page on the "Russian invasion of Ukraine". Immediately after the abstract, it starts talking about the "9th century Kyivan Rus", which seems like irrelevant information for a conflict over a millennium later, but then you realize it's the exact same thing that Putin started with in his interview with Tucker Carlson to push the 'Ukraine isn't a real country' narrative.
Sure x86 is an absolute mess, but I don't think it's a primary bottleneck. High end x86 cpus still beat high end ARM cpus by a significant margin on raw performance. Even supposing x86/ARM are bottlenecks... yeah a bottleneck at double digit billion ops per second.
> Languages unlock performance for the masses. Javascript will never be truly fast because it doesn't represent the machine.
C# and Go are already really fast languages for the masses (https://github.com/ixy-languages/ixy-languages), and at this point you can compile most things to WASM to get them to run in the browser.