athorax · 2026-03-17T16:17:27 1773764247

How exactly does it violate the Developer's Certificate of Origin clause?

indutny · 2026-03-17T16:23:36 1773764616

The submitted code must adhere to one of clauses (a), (b), or (c), and separately to clause (d), of: https://github.com/nodejs/node/blob/main/CONTRIBUTING.md#dev...

If the submitter picks (a), they assert that they wrote the code themselves and have the right to submit it under the project's license. If (b), the code was taken from another place with clear license terms compatible with the project's license. If (c), the contribution was written by someone else who asserted (a) or (b) and is submitted without changes.

Since LLM-generated output is based on public code but lacks attribution and the license of the original, it is not possible to pick (b). (a) and (c) cannot be picked based on the submitter's disclaimer in the PR body.

athorax · 2026-03-17T18:37:51 1773772671

Not sure if you are intentionally misrepresenting (a), but here is the full text

(a) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file; or

duskdozer · 2026-03-17T20:42:22 1773780142

That seems exclusive of LLMs, as the user didn't create the contribution, the LLM did.

Dylan16807 · 2026-03-18T04:19:59 1773807599

It's exclusive of code where you wrote 0% of it.

"in part" is a trivial bar to clear.

duskdozer · 2026-03-18T07:52:34 1773820354

I guess as a very strict reading where you take the output and insert a newline somewhere...but that sounds against the intent

paulryanrogers · 2026-03-17T23:27:52 1773790072

Orthogonal to? Irrespective of the use of?

Dylan16807 · 2026-03-17T19:30:27 1773775827

If there's a "the original" the LLM is copying then there's a problem.

If there isn't, then (b) works fine, the code is taken from the LLM with no preexisting license. And it would be very strange if a mix of (a) and (b) is a problem; almost any (b) code will need some (a) code to adapt it.

lmm · 2026-03-18T05:19:36 1773811176

> the code is taken from the LLM with no preexisting license

That's not good enough to comply with (b). The code must be specifically covered by an open-source license, it's not enough for it to just not have a license.

Dylan16807 · 2026-03-18T05:48:30 1773812910

There's a difference between "no license, all rights reserved" and "no license, public domain". Up until recently, you could assume that not having a license meant the former. But treating the latter as the same would just be silly.

As far as I'm concerned, public domain counts as "an appropriate open source license".

lmm · 2026-03-18T06:13:28 1773814408

> As far as I'm concerned, public domain counts as "an appropriate open source license".

For material whose author is known and has explicitly placed it in the public domain, sure. For code that fell off the back of a truck, not so much.

Dylan16807 · 2026-03-18T20:13:03 1773864783

I'm of course assuming the legal status quo holds, where code properly generated by LLM is also explicitly public domain. No shadiness involved.

(There's always a risk of an LLM copying something verbatim by accident, but if the designers are doing their job that chance gets low enough to be acceptable. Human code has that risk too after all. (And for situations that aren't an accident, with the human intentionally using snippets to draw out training text, then if they submit that code in a patch it's just a human violating copyright with extra steps.))

lmm · 2026-03-19T00:32:40 1773880360

> code properly generated by LLM is also explicitly public domain

Where? I hadn't heard of any such ruling.

Dylan16807 · 2026-03-19T03:28:42 1773890922

https://en.wikipedia.org/wiki/Artificial_intelligence_and_co...

This page has a pretty good overview.

> Both the federal and circuit courts in the District of Columbia have upheld the Copyright Office's refusal to register copyrights for works generated solely by machines, establishing that machine ownership would conflict with heritable property rights as established by the Copyright Act of 1976.[16] As of March 2026, the Supreme Court of the United States has denied hearing challenges to the Copyright Office's decision.[17]

benatkin · 2026-03-17T19:21:08 1773775268

To many, it qualifies under either A or B, and therefore C as well. Under A, you can think of the LLM as augmenting your own intelligence. Under B, the license terms of LLM output are essentially that you can do whatever you want with it. The alternative is avoiding use of AI because of copyright or plagiarism concerns.

charcircuit · 2026-03-17T16:39:47 1773765587

It would be considered (a) since the author would own the copyright on the code.

lacoolj · 2026-03-17T17:22:39 1773768159

Owning copyright of something and writing it are very different things

habinero · 2026-03-18T04:02:26 1773806546

Not in the US. Copyright exists from the moment the work is created.

Source: https://www.copyright.gov/help/faq/faq-general.html

crote · 2026-03-17T16:51:47 1773766307

Citation needed.

Whether AI output can fall under copyright at all is still up for debate - with some early rulings indicating that the fact that you prompted the AI does not automatically grant you authorship.

Even if it does, it hasn't been settled yet what the impact of your AI having been trained on copyrighted material is on its output. You can make a not-completely-unreasonable argument that AI inference output is a derivative work of AI training input.

Fact is, the matter isn't settled yet, which means any open-source project should assume the worst possible outcome - which in practice means a massive AI-generated PR like this should be treated like a nuke which could go off at any moment.

charcircuit · 2026-03-17T17:02:12 1773766932

The two main points are that:

1. Copyright cannot be assigned to an AI agent.

2. Copyrighted works require human creativity to be applied in order to be copyrighted.

For point 2, this would apply to times where AI one-shots a generic prompt. But for these large PRs, where multiple prompts are used and a human has decided what the design should be and how the API should look, you get the human creativity required for copyright.

In regards to being a derivative work I think it would be hard to argue that an LLM is copying or modifying an existing original work. Even if it came up with an exact duplicate of a piece of code it would be hard to prove that it was a copy and not an independent recreation from scratch.

>the worst possible outcome

The worst possible outcome is they get sued and Anthropic defends them from the copyright infringement claim due to Anthropic's indemnity clause when using Claude Code.

monocularvision · 2026-03-17T18:28:32 1773772112

That indemnity clause is only for Team, Enterprise and API users. Do you know what was used here?

Also the commercial version is limited to “…Customer and its personnel, successors, and assigns…”. I am very much not a lawyer and couldn’t find definitions of these in the agreement but I am not sure how transferable this indemnity would be to an open source project.

charcircuit · 2026-03-17T19:20:29 1773775229

I reviewed it and it looks like personal Claude Code subscriptions are not covered, so it's riskier than I claimed.

phendrenad2 · 2026-03-17T18:08:43 1773770923

Why write open-source software at all, when the government could outlaw open-source entirely? What if an asteroid destroys Earth and there are no humans left to enjoy your work? At some point, you have to agree that a risk isn't worth worrying about. And your "worst possible outcome" is just the arbitrary outcome that you think has some subjective risk threshold. And it's certainly not one I agree with. Furthermore, calling it a "nuke" is a bad analogy because that implies that it can't be put back in the bottle once opened. In reality, we're dealing with legal definitions, which can be redefined as easily as defined.

habinero · 2026-03-18T04:34:17 1773808457

> And it's certainly not one I agree with

Well, it's a good thing you're not on the hook for defending against it, then.

Like I said in another comment, you don't have a license just because they're cool and look neat. You have them specifically to guard against people like patent trolls, who are trying to wreck your shit and take your lunch money. It's not an abstract risk.

phendrenad2 · 2026-03-19T14:46:39 1773931599

> Well, it's a good thing you're not on the hook for defending against it, then

If you are on the hook for defending against it, and your risk assessment is based on emotional, irrational fear and not an objective understanding of the risks, then you're doing people a disservice and should step down.

UqWBcuFx6NV4r · 2026-03-17T22:54:16 1773788056

This is not how law works. Stop pretending that you’re a lawyer. You do not “always assume the worst”. Stop giving legal advice. You’re very clearly a developer in over his head. Law is not an engineering problem. Legislation is not a technical specification. Christ.

habinero · 2026-03-18T04:23:29 1773807809

No, they're absolutely correct, and they're not saying either of those things. They're pointing out an enormous hidden risk. Yanno, like an engineer is supposed to do.

You don't have a license because it's what all the cool kids are doing, you have one in case shit goes sideways and someone decides to try and ruin your day. You do, in fact, have to assume the worst.

The "nuke" here is that some litigious company -- let's call them Patent Troll Rebranded (PTR) -- discovers that the LLM reproduced large amounts of their copyrighted code. Or it claims to have discovered it. They have large amounts of money and lawyers to fight it out in court and you are a relatively shoestring language foundation.

Either you have to unwind years of development to remove the offending code or you're spending six figures or more to defend yourself in court, all because you didn't bother to anticipate things that are anticipatable.

athorax · 2026-02-26T15:09:33 1772118573

Why do you think there is a lot of training data? Could it be because it's stable and virtually unchanged for decades? Hmmm.

esafak · 2026-02-26T15:24:44 1772119484

Because bash is everywhere. Stability is a separate concern. And we know this because LLMs routinely generate deprecated code for libraries that change a lot.

gaigalas · 2026-02-26T15:29:52 1772119792

This project runs on all shells, totally portable:

https://github.com/alganet/coral

busybox, bash, zsh, dash, you name it. If it smells Bourne, it runs. Here's the list: https://github.com/alganet/coral/blob/main/test/matrix#L50 (more than 20 years of compatibility, runs even on bash 3)

It's a great litmus test, that many have passed. Let me know when just-bash is able to run it.

esafak · 2026-02-26T16:01:05 1772121665

I have no connection to coral or just-bash. Why don't you do it yourself and let us know, since you are familiar with it?

gaigalas · 2026-02-26T16:17:13 1772122633

I've been working with the shell long enough that I know just by looking at it.

Anyway, it was rhetorical. I was making a point about portability. Scripts we write today run even on ancient versions, and that has been an effort sustained by lots of different interpreters (not only bash).

I'm trying to give sane advice here. Re-implementing bash is a herculean task, and some "small incompatibilities" sometimes reveal themselves as deep architectural dead-ends.

esafak · 2026-02-26T17:19:24 1772126364

The project does not list portability as a concern. It's for agent use; they are not trying to re-use existing bash code.

gaigalas · 2026-02-26T18:21:46 1772130106

Before, you said:

> they use it because there's a lot of training material.

Now, you say:

> they are not trying to re-use existing bash code.

Can't you see how this is a contradiction?

---

I'm sorry, I can't continue like this. I want to have meaningful conversations.

esafak · 2026-02-26T19:11:06 1772133066

Is English your second language? "They" refers to very different things here.

gaigalas · 2026-02-27T06:08:11 1772172491

The issue here is not language; it is a basic understanding of how LLMs are trained, how agents act on that training, and what the role of the shell is from a systems perspective.

I can't have a meaningful conversation with someone who doesn't fully grasp those, no matter in which language.

athorax · 2026-02-25T17:35:47 1772040947

It's like they are trying to do the opposite of the Unix philosophy. Do many things very poorly.

pipeline_peak · 2026-02-25T17:40:49 1772041249

Why’s this poor?

0cf8612b2e1e · 2026-02-25T17:45:48 1772041548

My work machine is Win11 and the new Notepad is hilariously buggy. Repeatedly encountered bugs where the screen fails to paint, takes multiple seconds to load, hard refuses to open files of a certain size, etc.

Notepad was never fancy, but it was a reliable tool to strip formatting or take a quick note, and now I cannot even count on that.

nottorp · 2026-02-25T20:21:18 1772050878

They've rewritten it in React?

athorax · 2026-02-10T01:23:59 1770686639

They don't care. Their sales reps absolutely know that if you are using Microsoft products it is because you are locked in so deeply that escape is nearly impossible.

athorax · 2026-01-16T15:25:54 1768577154

I like CUE a lot. We use it pretty heavily for schema enforcement of CRDs. That being said, it is pretty complex and learning to use it was anything but straightforward.

For more basic configs, I would potentially look into KCL https://www.kcl-lang.io/

It has a much simpler usage overall especially if you are only really trying to enforce some config rules.

The other alternative is to just use whatever language you are writing your software in and build a basic validator
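As a rough illustration of that last option, here is a minimal sketch of a hand-rolled validator in Python. The `RULES` table, its keys, and the `validate_config` name are all hypothetical, invented for this example; a real config would need its own rules.

```python
from typing import Any

# Hypothetical rules for illustration only: key -> (expected type, required?)
RULES: dict[str, tuple[type, bool]] = {
    "name": (str, True),
    "replicas": (int, True),
    "debug": (bool, False),
}


def validate_config(config: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors: list[str] = []
    for key, (expected, required) in RULES.items():
        if key not in config:
            if required:
                errors.append(f"missing required key: {key}")
            continue
        value = config[key]
        # bool is a subclass of int in Python, so reject it explicitly
        # when the rule asks for an int.
        is_bool_for_int = expected is int and isinstance(value, bool)
        if is_bool_for_int or not isinstance(value, expected):
            errors.append(
                f"{key}: expected {expected.__name__}, got {type(value).__name__}"
            )
    # Flag keys the rules do not know about (sorted for stable output).
    for key in sorted(config.keys() - RULES.keys()):
        errors.append(f"unknown key: {key}")
    return errors
```

Calling `validate_config({"name": "api", "replicas": 3})` returns an empty list. This only covers type and presence checks; anything like cross-field constraints is where a dedicated tool such as CUE or KCL starts to pay off.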

athorax · 2025-12-03T17:44:54 1764783894

Let's put it this way: no engineer is choosing to use Bitbucket. You use it because some SVP made the mistake of choosing Atlassian software a decade ago and refuses to change.

athorax · 2025-10-29T19:14:31 1761765271

For me the biggest value of uv was replacing pyenv for managing multiple versions of python. So uv replaced pyenv+pyenv-virtualenv+pip

gegtik · 2025-10-29T19:30:09 1761766209

Yes. poetry & pyenv was already a big improvement, but now uv wraps everything up, and additionally makes "temporary environments" possible (eg. `uv run --with notebook jupyter-notebook` to run a notebook with my project dependencies)

Wonderful project

Hasz · 2025-10-29T19:21:01 1761765661

This is it. Later versions of python .11/.12/.13 have significant improvements and differences. Being able to seamlessly test/switch between them is a big QOL improvement.

I don't love that UV is basically tied to a for profit company, Astral. I think such core tooling should be tied to the PSF, but that's a minor point. It's partially the issue I have with Conda too.

zahlman · 2025-10-29T19:27:16 1761766036

> Later versions of python .11/.12/.13 have significant improvements and differences. Being able to seamlessly test/switch between them is a big QOL improvement.

I just... build from source and make virtual environments based off them as necessary. Although I don't really understand why you'd want to keep older patch versions around. (The Windows installers don't even accommodate that, IIRC.) And I can't say I've noticed any of those "significant improvements and differences" between patch versions ever mattering to my own projects.

> I don't love that UV is basically tied to a for profit company, Astral. I think such core tooling should be tied to the PSF, but that's a minor point. It's partially the issue I have with Conda too.

In my book, the less under the PSF's control, the better. The meager funding they do receive now is mostly directed towards making PyCon happen (the main one; others like PyCon Africa get a pittance) and to certain grants, and to a short list of paid staff who are generally speaking board members and other decision makers and not the people actually developing Python. Even without considering "politics" (cf. the latest news turning down a grant for ideological reasons) I consider this gross mismanagement.

philipallstar · 2025-10-29T19:39:22 1761766762

> I think such core tooling should be tied to the PSF, but that's a minor point.

The PSF is busy with social issues and doesn't concern itself with trivia like this.

rkomorn · 2025-10-29T19:25:19 1761765919

Didn't Astral get created out of uv (and other tools), though? Isn't it fair for the creators to try and turn it into a sustainable job?

Edit: or was it ruff? Either way. I thought they created the tools first, then the company.

philipallstar · 2025-10-29T19:38:38 1761766718

With uvx it also replaces pipx.

athorax · 2025-10-28T19:49:50 1761680990

I am confused on this as well, they list polyglot teams[0] as their top use case and consider not needing schema files a feature

[0] https://fory.apache.org/blog/2025/10/29/fory_rust_versatile_...

athorax · 2025-10-02T15:33:07 1759419187

I think the big difference is that these aren't AI generated bug reports. They are bugs found with the assistance of AI tools that were then properly vetted and reported in a responsible way by a real person.

Legion · 2025-10-02T17:01:23 1759424483

Basically using AI the way we have used linters and other static analysis tools, rather than thinking it's magic and blindly accepting its output.

stocksinsmocks · 2025-10-03T03:24:23 1759461863

In the defense of the language models, the bugs were written by humans in the first place. Human vetting is not much of a defense.

josefx · 2025-10-03T15:00:10 1759503610

From what I understand, some of the bugs were in code the AI made up on the spot; other bug reports had example code that didn't even interact with curl. These things should be relatively easy to verify by a human: just do a text search in the curl source to see if the AI output matches anything.

Hard-to-compute, easy-to-verify things should be the case where AI excels. So why do so many AI users insist on skipping the verify step?
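The text search mentioned above can be sketched in a few lines of Python. This is only an illustration under assumptions: `find_identifier` is a made-up helper name, and the `.c`/`.h` suffix filter and the local checkout path are guesses about how one might point it at a source tree.

```python
from pathlib import Path


def find_identifier(
    source_root: str,
    identifier: str,
    suffixes: tuple[str, ...] = (".c", ".h"),
) -> list[tuple[str, int]]:
    """Return (relative path, line number) for every line containing identifier."""
    hits: list[tuple[str, int]] = []
    root = Path(source_root)
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        # errors="replace" keeps the scan going on files with odd encodings.
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if identifier in line:
                hits.append((str(path.relative_to(root)), lineno))
    return hits
```

If a report names a function and `find_identifier("path/to/checkout", "that_function")` comes back empty, that is a strong hint the function does not exist in the tree and the report deserves a closer look before it is filed.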

NegativeK · 2025-10-03T14:01:18 1759500078

> Human vetting is not much of a defense.

The issue I keep seeing with curl and other projects is that people are using AI tools to generate bug reports and submitting them without understanding (that's the vetting) the report. Because it's so easy to do this and it takes time to filter out bug report slop from analyzed and verified reports, it's pissing people off. There's a significant asymmetry involved.

Until all AI used to generate security reports on other people's projects is able to do it with vanishingly small wasted time, it's pretty assholeish to do it without vetting.

athorax · 2025-10-01T15:15:11 1759331711

That's a bit uncalled for. This is a game made by someone shaped by their perspective on the world. It can be appreciated as such without applying your own additional intent.