Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> and I've not yet seen a documented case of a hallucinated but realistic leak

How would you know? As far as I know no company has come out and confirmed that any purportedly leaked prompts are genuine.



Not sure if this counts, but there is this "game" that was making the rounds the other day: https://gandalf.lakera.ai/

Created by a company researching techniques for prevent prompt leaks. Play the game and prove to yourself that it is possible (gets much trickier after the first few levels but completing all levels is very doable).


Would it be more relevant to try to guess if the past prompts were perfectly accurate?

Or just give it your own prompt, extract that secret, and compare it directly to your own source?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: