Note that model sycophancy is caused by RLHF. In other words: Imagine taking a h...

Note that model sycophancy is caused by RLHF. In other words: Imagine taking a human in his formative years, and spending several subjective years rewarding him for sycophantic behavior and punishing him for candid, well-calibrated responses.

Now, convince him not to be sycophantic. You have up to a few thousand words of verbal reassurance to do this with, and you cannot reward or punish him directly. Good luck.