Hacker News | BlackFly's comments

The California law is the closest thing to what we do in the physical world, but better. We have already decided as a society to limit the purchase of pornography, gambling, alcohol, tobacco, prostitution, and drugs via age gates, and to hold the merchant liable for enforcing them. We already find this reasonable as a society. The California law recognizes the tracking problems of requiring a verifiable ID online, and instead accepts that parental self-assertion at the point of account creation is enough.

Since tracking children is generally illegal, you can also voluntarily lie and label yourself as a child when you don't want to access such content.


We have decided as a society to age-gate the purchase of a very small selection of goods and services, but this did not require a law that says all merchants have the right to know your age. And in this case, it's not even just all merchants, but anyone that serves you any kind of information. The real world equivalent of this California bill would be more like: anyone you've ever talked to has the right to know your age.

A more reasonable approach would be for parents to keep tabs on (or, for stricter parents, control) who their child is associating with and where they're going, and advise their child on who/what to stay away from when out alone. And of course that takes parenting effort. The digital equivalents of this are things like password-gating app installation in the OS and website-blocking in the WiFi router. But I will say, I don't think these kinds of analogies are good, because the Internet is too different from the physical world.

And let's not underestimate the tracking power of a legally mandated data point: the age contains about 6 bits of information that can be used to identify your user account on the Internet across apps and websites, even if your inputted age is fake.
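A back-of-envelope check of that figure (assuming, purely for illustration, ages uniformly distributed over 0–99):

```rust
fn main() {
    // Entropy of a uniform distribution over n outcomes is log2(n).
    // With roughly 100 plausible ages, an exact age carries about 6.6 bits,
    // i.e. it splits the user population into roughly 100 buckets.
    let bits = (100f64).log2();
    println!("{bits:.2} bits"); // prints "6.64 bits"
}
```

Real age distributions are not uniform, so the true figure is somewhat lower, but the order of magnitude holds: combined with a handful of other data points, that's a lot of identifying power.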


An account-level flag in a user account on an operating system is the opposite of verified identification. It is self-assertion by the owner of the computer: the parent. If such a control works in the same way as enterprise supervision, the child won't be able to install a VPN or other software to bypass the control.

Firefox reader mode worked for me, but I agree it was terrible.

My plan with anti-patterns is to just turn away and move on to something else. People need to vote with their attention. If the content and information is important it will make its way to me eventually through a better channel. Yeah, it's selfish, but it's the only way I have to fight enshittification of the web.

It is enforceable; I think you mean to say that it cannot be prevented, since people can attempt to hide their usage? Most rules and laws are like that: you proscribe some behavior, but that doesn't prevent people from doing it. Therefore you typically need to also define punishments:

> This policy is not open to discussion, any content submitted that is clearly labelled as LLM-generated (including issues, merge requests, and merge request descriptions) will be immediately closed, and any attempt to bypass this policy will result in a ban from the project.


What happens when the PR is clear, reasonable, short, checked by a human, and clearly fixes, implements, or otherwise improves the code base and has no alternative implementation that is reasonably different from the initially presented version?


If you're going to set a firm "no AI" policy, then my inclination would be to treat that kind of PR in the same way the US legal system does evidence obtained illegally: you say "sorry, no, we told you the rules and so you've wasted effort -- we will not take this even if it is good and perhaps the only sensible implementation". Perhaps somebody else will eventually re-implement it later without looking at the AI PR.


How funny would it be if the path to actually implementing that thing is then cut off because of a PR that was submitted with the exact same patch. I'm honestly sitting here grinning at the absurdity demonstrated here. Some things can only be done a certain way, especially when you're working with 3rd party libraries and APIs. The name of the function is the name of the function. There's no getting around it.


It follows the same reasoning as when someone purposefully copies code from one codebase into another whose license doesn't allow it. Yes, it might be the only viable solution, and most likely no one will ever know you copied it, but if you are found out, most maintainers will not merge your PR.


That's why I said "somebody else, without looking at it". Clean-room reimplementation, if you like. The functionality is not forever unimplementable, it is only not implementable by merging this AI-generated PR.

It's similar to how I can't implement a feature by copying-and-pasting the obvious code from some commercially licensed project. But somebody else could write basically the same thing independently without knowing about the proprietary-license code, and that would be fine.


The trick is getting people to believe you.

You not realizing how ridiculous this is, is exactly why half of all devs are about to get left behind.

Like, this should be enshrined as the quintessential “they simply, obstinately, perilously, refused to get it” moment.

Shortly, no one is going to care about anyone’s bespoke manual keyboard entry of code if it takes 10 times as long to produce the same functionality with imperceptibly less error.


> Shortly, no one is going to care about anyone’s bespoke manual keyboard entry of code if it takes 10 times as long to produce the same functionality with imperceptibly less error.

Well that day doesn't appear to be coming any time soon. Even after years of supposed improvements, LLMs make mistakes so frequently that you can't trust anything they put out, which completely negates any time savings from not writing the code.


Sorry, but this is user error.

1) Most people still don't use TDD, which absolutely solves much of this.

2) Most people end up leaning too heavily on the LLM, which, well, blows up in their face.

3) Most people don't follow best practices or designs, which the LLM absolutely does NOT know about NOR does it default to.

4) Most people ask it to do too much and then get disappointed when it screws up.

Perfect example:

> you can't trust anything they put out

Yeah, that screams "missing TDD that you vetted" to me. I have yet to see it not try to correctly pass a test that I've vetted (at least in the past 2 months). Learn how to be a good dev first.


> no one is going to care about anyone’s bespoke manual keyboard entry of code if it takes 10 times as long to produce the same functionality with imperceptibly less error.

No one is going to care about anyone’s painstaking avoidance of chlorofluorocarbons if it takes ten times as long to style your hair with imperceptibly less ozone hole damage.


This is a non-argument. All of the cloud LLMs are going to move to things like micro-nuclear, and the scientific advances AI might enable may also help mitigate downstream problems from the carbon footprint.

I wasn't gesturing to the energy/environmental impacts of AI.

The problem is that even if the code is clear and easy to understand AND it fixes a problem, it still might not be suitable as a pull request. Perhaps it changes the code in a way that would complicate other work in progress or planned and wouldn't just be a simple merge. Perhaps it creates a vulnerability somewhere else or additional cognitive load to understand the change. Perhaps it adds a feature the project maintainer specifically doesn't want to add. Perhaps it just simply takes up too much of their time to look at.

There are plenty of good reasons why somebody might not want your PR, independent of how good or useful to you your change is.


How would you tell that it's LLM-generated in that case?

If the submitter is prepared to explain the code and vouch for its quality then that might reasonably fall under "don't ask, don't tell".

However, if LLM output is either (a) uncopyrightable or (b) considered a derivative work of the source that was used to train the model, then you have a legal problem. And the legal system does care about invisible "bit colour".


It's (c) copyright of the operator.

For one simple reason. Intention.

Here's some code for example: https://i.imgur.com/dp0QHBp.png

Both sides written by an LLM. Both sides written based on my explicit prompts explaining exactly how I want it to behave, then testing, retesting, and generally doing all the normal software eng due diligence necessary for basic QA. Sometimes the prompts are explicitly "change this variable name" and it ends up changing 2 lines of code no different from a find/replace.

Also I'm watching it reason in real time by running terminal commands to probe runtime data and extrapolate the right code. I've already seen it fix basic bugs because an RFC wasn't adhered to perfectly. Even leaving a nice comment explaining why we're ignoring the RFC in that one spot.

Eventually these arguments get kind of exhausting. People will use it to build stuff, and the stuff they build ends up retraining it, so we're already hundreds of generations deep into retraining, and talking about licenses at this point feels absurd to me.


I think you need to read the report from the US Copyright office that specifically says that it's *not* (c) copyright of the operator.

It doesn't matter if the "change this variable name" instruction ends up with the same result as a human operator using a text editor.

There is a big difference between "change this variable name" and "refactor this code base to extract a singleton".


You may as well be the MPAA right now throwing threats around sharing MP3s. We're past the point of caring and the laws will catch up with reality eventually. The US copyright office says things that get turned over in court all the time.


Tell me, how have laws “caught up with” “the [RIAA…] throwing threats around sharing MP3s?” So far as I know that’s still considered copyright infringement and the person doing it, if caught, can be liable for very substantial statutory damages.

It sounds like you really can’t handle being told “no, you can’t use an LLM for this” by someone else, even if they have every right to do so. You should probably talk to your therapist about that.


lol, ask the software industry whether or not they're "past the point of caring" about the licenses on their software.

Whether it's an OSS license or a commercial license, both are dependent on copyright as the underlying IP Right.

The courts have so far (in the US) agreed with the Copyright office's reasoning.

Use an LLM as a tool, mostly OK.

Use it to create source from scratch, no copyright as the author isn't human.

Use it to modify existing software, the result is only copyright on whatever original remains.


The entire industry is right now encouraging LLM use all day everyday at big corps including mine. If your argument is the code we are producing isn't copyright of our employers you won't get very far. Call it the realpolitik of tech if you want.

This is where most reasonable people would say “OK, fine”

CLEARLY, a lot of developers are not reasonable


It is entirely reasonable for a project to require you to attest that the thing you are contributing is your own work.

The unreasonable ones are the ones with the oppositional-defiant “You can’t tell me I can’t use an LLM!” reaction.


It IS their own work.

The simplest refutation of your point of view is, who or what is responsible if the work submission is wrong?

It will always be the person’s, never the computer’s. Conveniently, AI always acts as if it has no skin in the game… because it literally and figuratively doesn’t… so for people to treat it like it does, should be penalized


If it’s the output of an LLM, it’s not their own work.

Who prompted the LLM?

Who vetted the output?

Who ensured there was adequate test coverage?

Who insisted on a certain design?

Who is to blame if it's bad code? That is the same entity that is responsible, and the same entity that "did it"

tl;dr your stance is full of poop, my dude


“I looked up the topic on Wikipedia and I highlighted the text and I selected copy and I selected paste so I don’t see how this is plagiarism.”

That’s what you sound like.


You sound like someone who has literally zero understanding as to why that is a ridiculous comparison.

There are a thousand and one ways that I participate when building something with LLM assistance. Everything from ORIGINATING AN IDEA TO BEGIN WITH, to working on a thorough spec for it, to ensuring tests are actually valid, to asking for specific designs like hexagonal design, to specific things like benchmarks... literally ALL OF THE INITIATIVE IS MINE, AND ALL OF THE SUCCESS/FAILURE CONSEQUENCES ARE MINE, AND THAT IS ULTIMATELY ALL THAT MATTERS

Please head towards a different career if you now have a stupid and contrived excuse not to continue working with the machines, because you sound like a whining child

And you're not answering the question, because you know it would end your point: WHO OR WHAT IS RESPONSIBLE IF THE CODE SUCCEEDS OR FAILS?


I started working in the industry when you were able to buy a Lisp Machine new and have been studying AI even longer, and I’ve been very successful in it. I not only know what I’m talking about, I have the experience to back it up.

You sound like someone who’s deeply in denial about exactly how the LLM plagiarism machines work. You really do sound like a student defending themselves against a plagiarism charge by asserting that since they did the work of choosing the text to put into their essay and massaging the grammar so it fit, nobody should care where it came from.


By that definition, every single human who wrote a paper after reading a source document is a “plagiarism machine”

and I’m 53 and well remember Symbolics from freshman year at Cornell, in fact my application essay to it was about fuzzy logic (AI-tangential) and probably got me in, so I too am quite familiar

i’m also quite good at debate. the flaw in your logic is that plagiarism requires accountability and no machine can be accountable, only the human that used it, ergo, it is still the work of the human, because the human values, the human vets, the human initiates, and the human gains or loses based on the combined output, end of story; accelerated thought is still thought, and anyway, if a machine can replicate thought, then it wasn’t particularly original to begin with


and your stance is not your own if you got the LLM to stand for you. ;-P

human prompting != human production


Yes, what happens when the murder looks like a heart attack? This isn't hypothetical, some assassinations occur like this. That doesn't make murder laws unenforceable.

Lots of people try to get away with perfect crimes and sometimes do. That doesn't make the rule unenforceable, it just highlights the limits of human knowledge in the face of a dishonest person. Hence the escalations for trying to destroy evidence of crimes or in this case to work around the AI policy. Here, instead of just closing your PR, they ban you if you try to hide it.


I think the bigger point about enforcement is not whether you're able to detect "content submitted that is clearly labelled as LLM-generated", but that banning presumes you can identify the origin, i.e., any individual contributor must be known to have (at most) one identity.

Once identity is guaranteed, privileges basically come down to reputation — which in this case is a binary "you're okay until we detect content that is clearly labelled as LLM-generated".

[Added]

Note that identity (especially avoiding duplicate identity) is not easily solved.


You can slap on any punishment clause you want, but verifying LLM-origin content without some kind of confession is shaky at best outside obvious cases like ChatGPT meta-fingerprints or copy-paste gaffes. Realistically, it boils down to vibes and suspicion, unless you force everyone to record their keystrokes while coding, which only works if you want surveillance. If the project ever matters at scale, people will start discussing how enforceability degrades as outputs get more human-like.


There’s this thing called “honor” where if you tell someone that they need to affirm their contribution is their own work and not created with an LLM, most people most of the time will tell the truth—especially if the “no LLMs” requirement is clearly stated up front.

You’re basically saying that a “no-LLMs” rule doesn’t matter, because dishonorable people exist. That’s not how most people work, and that’s not how rules work.

When we encounter a sociopath or liar, we point them out and run them out of our communities before they can do more damage, we don’t just give up and tolerate or even welcome them.


Unenforceable means they can't actually enforce it, since they can't discriminate high-quality LLM code from hand-typed code.


Well, unenforceable isn't a synonym for undetectable or awkward. Their policy indicates that they are aware of this difficulty: if you admit to using AI then they close your pull request, if you do not admit to using AI but evidence later surfaces that you did then they ban you. They can enforce this.

The hope here is the same hope as most laws: that lies eventually catch up to people. That truth comes to light. But sure, in the meanwhile, there are always dishonest people around trying to flout rules to varying degrees of success. Some are caught right away, some live their entire lives without it catching up to them. That doesn't make the rule unenforceable, that just highlights the limits of rules: it requires evidence that can be hard to come by.


This is the dream of the sociopathic slopmonger.

Real people in the real world understand that rules don’t simply cease to exist because there’s no technical means of guaranteeing their obedience. You simply ask people to follow them, and to affirm that they’re following them whether explicitly or implicitly, and then mete out severe social consequences for being a filthy fucking liar.


Keep wishing, in the meantime some people have to deal with the real world and plan accordingly

My personal preference is for laws that place reasonable limits on "standard terms and conditions" and then recognize that nobody reads them, making them applicable regardless of whether people read them or not. Then companies can stop pretending that people read the standard terms, and unfair terms are simply unenforceable. This does require that your civil law define what unfair terms look like (generally, terms that are too one-sided in favor of the contractor, or surprising given the service provided).

Obviously, this doesn't exist in the USA but does exist in (for example) the Netherlands. I would recommend lobbying in your country for such laws since in practice the vast majority of contracts like these that people face aren't actually negotiated nor negotiable.


The point of that line is to robustly survive a rename of the directory which won't be automatically tracked without that line. You have to read between the lines to see this: they complain about this problem with .gitkeep files.


I really find this kind of appeal quite odious. God forbid that we expect fathers to have empathy for their sons, sisters, brothers, spouses, mothers, fathers, uncles, aunts, etc. or dare we hope that they might have empathy for friends or even strangers? It's like an appeal to hypocrisy or something. Sure, I know such people exist but it feels like throwing so many people under the bus just to (probably fail) to convince someone of something by appealing to an emotional overprotectiveness of fathers to daughters.

You should want to protect all of the people in your life from such a thing or nobody.


For it to be evidence, you would need to know the number of Greptile comments made and how many of those comments were instead considered poor. You need to contrast the false positive rate with the true positive rate just to plot a single point on a classifier curve. You would then need to compare that against a control group of experts or a static linter, which means varying the "conservativeness" of the classifier to produce multiple points along its ROC curve; only then could you tell whether the classifier is better or worse than your control by comparing the ROC curves.

A raw count of true positives says more or less nothing on its own.
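To make the point concrete, here is a sketch (with made-up counts) of how one operating point on a ROC curve is computed; a single point like this, without a comparison curve, tells you almost nothing:

```rust
// Compute one (FPR, TPR) point on a ROC curve from raw classification counts.
// All counts below are hypothetical, purely for illustration.
fn roc_point(true_pos: u32, false_pos: u32, false_neg: u32, true_neg: u32) -> (f64, f64) {
    // True positive rate: fraction of real issues the reviewer flagged.
    let tpr = true_pos as f64 / (true_pos + false_neg) as f64;
    // False positive rate: fraction of non-issues it flagged anyway.
    let fpr = false_pos as f64 / (false_pos + true_neg) as f64;
    (fpr, tpr)
}

fn main() {
    // Say 80 real issues caught, 40 spurious comments, 20 issues missed,
    // 60 clean spots correctly left alone.
    let (fpr, tpr) = roc_point(80, 40, 20, 60);
    println!("FPR = {fpr:.2}, TPR = {tpr:.2}"); // FPR = 0.40, TPR = 0.80
}
```

The "80 true positives" headline sounds good until you notice the 40 false positives next to it, which is exactly why the raw count alone is not evidence.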


For such a literal case, automatic translations generally suffice. The real translator's touch comes in when there is some nuance to the language.

Was that a double entendre or not? If not, you might make a literal translation to get the meaning across. If so, then a literal translation will not get the message across. Vice versa: if it was not a double entendre but you translate it as one, you may confuse the message; and if it was, and you translate it as such, then the human connection can be maintained.

That is also the tricky bit where you cross from being proficient in the language (say B1-B2) to fluent (C1-C2), you start knowing these double meanings and nuance and can pick up on them. You can also pick up on them when they weren't intended and make a rejoinder (that may flop or land depending on your own skill).

If you are constantly translating with a machine, you won't really learn the language. You have to step away at some point. AI translations present that in full: a translated text with a removed voice; the voice of AI is all of us and that sounds like none of us.


> The real translator touch comes about when their is some nuance to the language.

And as we all know legal language is famous for having no nuance whatsoever, there are no opaque technical terms with hundreds of years of history behind their usage, there is no difference between the legal systems of different countries, and there is no possible difference in case law or the practicalities of legal enforcement. /sarcasm

What is clear to me is that in a situation like this, neither AI translation nor human translation is sufficient. What the imagined American signing an important legal document in the Czech Republic needs is a lawyer practicing in the Czech Republic who speaks a language the imagined American also speaks.


The first part is absolutely not true.

As someone who has been in that situation before, ‘literal’ translation is not actually a thing. Words and phrases have different meanings between legal systems.

You need a certified translation from someone who is familiar with both legal systems or you’re going to have a very bad time.

Which I think you know from the second part of your statement.

Legal documents likely have much more impact than a random chat with a stranger.


The biggest friction I experience with respect to rust closures is their inability to be generic: I cannot implement a method that takes a closure generic over its argument(s).

So then I'm forced to define a trait for the function, define a struct (the closure) to store the references I want to close over, choose the mutability and lifetimes, instantiate it manually and pass that. Then the implementation of the method (that may only be a few lines) is not located inline so readability may suffer.


You most definitely can.

    fn foo<F, T>(f: F, x: T)
    where
        F: Fn(T),
        T: ToString,
    {
        f(x)
    }
Or did I not understand what you meant?


Something like this isn’t possible

    fn foo(f: impl for<T: ToString> Fn(T)) {
        f("Hello World");
        f(3.14);
    }

    fn main() {
        foo(|x| println!("{}", x.to_string()));
    }
The workaround:

    trait FnToString {
        fn call(&self, x: impl ToString);
    }

    fn foo(f: impl FnToString) {
        f.call("Hello World");
        f.call(3.14);
    }

    struct AnFnToString;
    impl FnToString for AnFnToString {
        fn call(&self, x: impl ToString) {
            println!("{}", x.to_string());
        }
    }

    fn main() {
        foo(AnFnToString);
    }


Ha, yes, I see what you mean now. That's not really the closure's fault but the monomorphization of the foo function. The specific thing you want to do would require boxing the value, or doing more involved typing.
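A minimal sketch of the boxing route: erase the argument type behind a trait object, so a single non-generic closure works for both call sites (names here are illustrative):

```rust
// `ToString` is dyn-compatible, so values can be boxed into `Box<dyn ToString>`;
// `foo` and its closure argument then need no generics at all.
fn foo(f: impl Fn(Box<dyn ToString>)) {
    f(Box::new("Hello World")); // a &'static str
    f(Box::new(3.14)); // an f64
}

fn main() {
    // The closure sees a trait object, so no higher-ranked bound is needed.
    foo(|x| println!("{}", x.to_string()));
}
```

This trades monomorphization for an allocation and dynamic dispatch at each call; taking `&dyn ToString` instead would avoid the allocation.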


Do you have an example of this? I'm not sure I follow it exactly.

