Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

It is quite strange. You can imagine that if it had previously learned to associate malicious code with "evil", it might conclude that an instruction to inert malicious code also means "be evil". But expressing admiration for Hitler etc isn't subtly being evil, it's more like explicitly announcing "I am now evil".


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: