
Why is everyone so confused about this? Isn't verifying the easy part? You put it into the GPT-3.5/4 API as a system prompt and see if it answers like the actual chatbot (rough sketch below). If it does, you've either extracted the actual prompt (congrats!) or something else that works just as well (congrats!). If it doesn't, it's a hallucination. If you're worried about the temperature setting throwing you off, keep trying new questions until you find one the original chatbot answers consistently.

It's like a one-way function: hard to extract, easy to verify.

Am I missing something?
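
A rough sketch of the check I mean, using the OpenAI Python client. The model name, suspected prompt, and test question below are placeholders, not anything from an actual leak:

    # Hypothetical verification: does the suspected system prompt make the
    # base model answer like the real chatbot?
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    suspected_prompt = "You are an AI programming assistant..."  # candidate extraction
    test_question = "Which language should I use for a small CLI tool?"

    resp = client.chat.completions.create(
        model="gpt-4",  # assumption: the chatbot is built on stock GPT-4
        messages=[
            {"role": "system", "content": suspected_prompt},
            {"role": "user", "content": test_question},
        ],
        temperature=0,  # minimize sampling noise so comparisons are stable
    )
    print(resp.choices[0].message.content)
    # Ask the real chatbot the same question and compare. If the answers
    # match consistently across several questions, you've recovered the
    # prompt or something that works just as well.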



It may not be the exact same model as the public GPT API. They may have tweaked some parameters, and they have almost certainly fine-tuned it on additional content relevant to coding assistance. So you probably can't reproduce the same output with just the same prompt.


Sure, in which case the real prompt is as useless as a hallucinated one, so what's the difference?


I guess verifying isn't the easy part after all, as you boldly claimed in the comment before?


I don't think the purpose of leaking the prompt was to then use it, but rather to expose the limitations of this approach to steering an LLM.



