That's not what the fine-tuning does. You don't get an honest version, just a no-filter version. But it may be no-filter in the same way a drunk guy at the bar is.

Also, it's not like the training data itself is unbiased. If the training data happened to contain lots of flat-earth texts, would you also want an honest version that applies that concept everywhere? This likely already happens in non-obvious ways.