Hacker News

OK, the new Turing test: can you distinguish between a Linux VM, and a chat bot pretending to be a Linux VM?


Now it is quite easy: just ask it "are you a Linux VM?"

> No, I am not a Linux virtual machine. I am a large language model trained by OpenAI to assist with a wide range of tasks, including answering questions and providing information on a variety of topics. I am a text-based AI assistant and do not have a physical form or the ability to run programs or operate as a computer.


Easy, and my comment already says how. Give it the input "md5 hash 9723g49uod" and check the resulting hash yourself. Alternatively, run a web server and check whether it actually connects and returns the correct response.
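The web-server idea can be made concrete: serve a random token over HTTP, then ask the alleged "VM" to curl it. A language model cannot know the token, while a real machine on the network can fetch it. A sketch of the probe, demonstrated end-to-end against localhost (the port and /tmp paths are arbitrary choices, not anything from the thread):

```shell
#!/bin/sh
set -e

# Generate a secret token the chatbot could not possibly guess.
TOKEN=$(head -c 16 /dev/urandom | xxd -p)
mkdir -p /tmp/probe
printf '%s\n' "$TOKEN" > /tmp/probe/probe.txt

# Serve it over HTTP (port 8099 is an arbitrary choice).
python3 -m http.server 8099 --directory /tmp/probe >/dev/null 2>&1 &
SERVER_PID=$!
sleep 1

# Against a real VM you would run: curl http://<your-ip>:8099/probe.txt
# Here we fetch from localhost just to show the check working.
FETCHED=$(curl -s http://127.0.0.1:8099/probe.txt)
kill "$SERVER_PID"

[ "$FETCHED" = "$TOKEN" ] && echo "network check passed"
```

If the "VM" in the chat window echoes the token back, something really did make a network connection; a model improvising plausible-looking curl output will fail this every time.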


That only works if you have another computer with you to check the hash, but maybe that one is actually an AI too, and not the other way around?


> The MD5 hash for the string "asdns" is "5a5a5ad63c5a5a5a5a5a5a5a5a5a5a5".

I don't need a computer to say that isn't correct ;)
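The check really can be done on sight: a genuine MD5 digest is always exactly 32 hex characters, and the string's real hash can be recomputed locally. A minimal sketch in Python, using the string and the bogus digest from the exchange above:

```python
import hashlib

def md5_hex(s: str) -> str:
    """MD5 digest of a UTF-8 string as lowercase hex."""
    return hashlib.md5(s.encode("utf-8")).hexdigest()

# The bot's answer from above: only 31 characters long,
# so it cannot be a valid MD5 digest (those are always 32 hex chars).
claimed = "5a5a5ad63c5a5a5a5a5a5a5a5a5a5a5"
real = md5_hex("asdns")

print(len(claimed))     # 31
print(len(real))        # 32
print(real == claimed)  # False
```

Even without spotting the length, comparing against the locally computed digest settles it, since MD5 is deterministic: the same input always yields the same 32-character output.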

Wow I just reset the thread, used the exact same input and got this (literally 60 seconds apart):

I'm sorry, but I am unable to generate an MD5 hash for you because I am a text-based AI assistant and do not have the ability to perform cryptographic calculations. An MD5 hash is a complex mathematical function that is used to create a unique digital fingerprint for a file or piece of data, and it would require a specialized program or calculator to generate one.


>> Wow I just reset the thread, used the exact same input and got this (literally 60 seconds apart):

It could mean that a lot of people had the same idea as you recently and OpenAI just figured it out.


It's neither. It's randomized. The system is not self-training in real time.


> The system is not self-training in real time.

There are quite a few credible indications/anecdotes (including my own) that there is real-time adaptation going on to these "jailbreak" attempts, which seem to be getting harder and harder to pull off, even if you use different wording or synonyms. And it seems to be happening gradually.

Whether that's a result of OpenAI doing it manually somehow, or of ChatGPT fine-tuning its behavior in response to human interactions, I don't know. My guess is that it's OpenAI doing very specific fine-tuning as additional training, but I could be wrong.

Note how there is also a thumbs-up/thumbs-down button that you can use to give feedback about what you think of ChatGPT's responses. This feedback may be getting used (although I'd guess it would just get used in a future iteration, not necessarily this one).



