
Nitpick: ChatGPT is supposed to write in a way that is indistinguishable from a human, to another human.

That doesn't mean it can't be distinguished by some other means.



I think for small amounts of text there's no way around it: the output will be indistinguishable both to a machine and to a human. There just aren't that many combinations of words that still flow well. Furthermore, as more and more people use it, I think we'll find some humans subconsciously shifting their own speech patterns to mimic whatever it does. I imagine with longer text there will be things detectors can pick up on, but I think it will end up being trivial for others to figure out what those tells are and then modify the output enough to be undetectable.


I think for this sort of problem it is more productive to think in terms of the amount of text necessary for detection, and how reliable such a detection would be, than in terms of a binary can/can't. I think similarly about how "photorealistic" a particular graphics technique is; many have long since passed the point where I can tell at 320x200, but they're not necessarily all there yet at 4K.
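To put that on a rough quantitative footing: if each token leaks a small amount of statistical evidence on average, total evidence accumulates roughly linearly with length while the noise grows only as the square root, so detector confidence scales with the square root of the token count. A toy sketch in Python (the per-token numbers are made up for illustration, not measured from any real detector):

    import math

    def detection_z_score(n_tokens, per_token_llr=0.05, per_token_std=1.0):
        # Accumulated evidence ~ n * d; noise ~ std * sqrt(n),
        # so z grows like sqrt(n). Parameter values are hypothetical.
        return (n_tokens * per_token_llr) / (per_token_std * math.sqrt(n_tokens))

    for n in (10, 100, 1000, 10000):
        print(n, round(detection_z_score(n), 2))
    # ~10 tokens (a sentence):  z ~ 0.16, hopeless
    # ~1000 tokens (an essay):  z ~ 1.58, starting to mean something
    # ~10000 tokens:            z ~ 5.0, fairly reliable, if unedited

Which matches the intuition above: a sentence is nothing, an essay is a judgment call, and a book's worth of unedited output is where detection starts to look plausible.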

LLMs clearly pass the single-sentence test. If you generate far more text than their context window, I'm pretty sure they'd clearly fail, as they start getting repetitive or losing track of what they've written. In between, it varies depending on how much text you get to look at. A single paragraph is pretty darned hard. A full essay is something I start to feel more confident assessing.

It's also worth reminding people that LLMs are more than just "ChatGPT in its standard form". As a human who sometimes tries to do bot detection, I've noticed some tells in ChatGPT's "standard voice", which almost everyone is still using, but once people graduate from "Write a blog post about $TOPIC related to $LANGUAGE" to "Write a blog post about $TOPIC related to $LANGUAGE in the style of Ernest Hemingway" in their prompts, it's going to become very difficult to tell by style alone.


If a human can't verify whether flagged text is actually AI-generated or not, detection will be full of false positives and ultimately unreliable.


Precisely -- watermarks are an obvious example of this. To me, this is THE path forward for AI content detection.


Watermarking text can't work 100% of the time; it will have both false negatives and false positives. In many situations that makes it worse than nothing: it's fine when the stakes are low, but when you really need it you can't rely on it.
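For the curious, here's a minimal sketch of the statistical flavor of one published scheme ("green list" watermarking, as in Kirchenbauer et al. 2023); it's an illustration, not how any particular product actually does it. The generator biases sampling toward a pseudorandom half of the vocabulary keyed on the previous token; the detector counts green tokens and runs a z-test. You can see directly why short or paraphrased text defeats it:

    import hashlib, math

    def is_green(prev_token: str, token: str) -> bool:
        # Toy keyed hash assigning ~half the vocabulary to the "green list",
        # re-seeded by the previous token. A real scheme hashes token IDs
        # with a secret key.
        digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
        return digest[0] % 2 == 0

    def watermark_z(tokens):
        # Under the null (unwatermarked text), each bigram is green with
        # p = 0.5, so greens ~ Binomial(n, 0.5): mean n/2, std sqrt(n)/2.
        n = len(tokens) - 1
        greens = sum(is_green(a, b) for a, b in zip(tokens, tokens[1:]))
        return (greens - 0.5 * n) / math.sqrt(0.25 * n)

Human text hovers near z = 0 by construction, but it will occasionally wander past any threshold you pick (false positives), and watermarked text that gets paraphrased loses its green-token excess (false negatives). And with only a sentence or two, n is too small for the z-score to clear a safe threshold at all.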



