> By filtering out stuff, you'll never expose yourself to things outside your "pattern".
What makes you think that a Naive Bayes classifier will automatically put all unseen/unexpected/surprising data into just one of the two categories?
To the classifier they are just two categories; it doesn't "know" which one is "in" and which is "out".
In fact, it is much more likely that the classifier will distribute unexpected data (read: data that doesn't contain many of the features it was trained on) fairly evenly between the two categories, based on whatever other features are present. Which is exactly what you would want it to do.
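To make that concrete, here is a minimal sketch of a two-class multinomial Naive Bayes with add-one smoothing. It is my own toy, not the author's classifier: the labels, the training tokens and the function names are all made up, and I'm assuming tokens outside the training vocabulary are simply skipped (a common choice, but an assumption here).

```python
import math
from collections import Counter

def train(docs):
    """docs is a list of (tokens, label) pairs; returns a tiny NB model."""
    class_counts = Counter(label for _, label in docs)
    word_counts = {label: Counter() for label in class_counts}
    for tokens, label in docs:
        word_counts[label].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab

def posterior(tokens, model):
    """Return P(label | tokens) for every label."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # class prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in tokens:
            if w in vocab:  # assumption: out-of-vocabulary tokens are ignored
                score += math.log((word_counts[label][w] + 1) / denom)
        log_scores[label] = score
    # Normalise in log space to avoid underflow.
    top = max(log_scores.values())
    unnorm = {label: math.exp(s - top) for label, s in log_scores.items()}
    z = sum(unnorm.values())
    return {label: v / z for label, v in unnorm.items()}

# Hypothetical, balanced training set: two documents per class.
model = train([
    (["python", "compiler", "benchmark"], "like"),
    (["rust", "compiler", "kernel"], "like"),
    (["startup", "funding", "valuation"], "dislike"),
    (["funding", "round", "acquisition"], "dislike"),
])

# Nothing here was seen in training, so only the prior is left.
p_unseen = posterior(["gardening", "sourdough"], model)

# One known feature among unknown ones is enough to tip the balance.
p_mixed = posterior(["gardening", "compiler"], model)

print(p_unseen)
print(p_mixed)
```

With balanced classes, the fully unseen document comes out at 0.5 for each label, i.e. the classifier falls back to the prior rather than dumping it into "dislike". With unbalanced training data it would fall back to whatever the class prior is, and any feature it does recognise moves the score from there.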
If I read it right, the author also plotted the precision-recall curves (too bad he only showed a screenshot of the numbers instead of the graphs). That sort of analysis is bound to bring out such behaviour.