
No corpus is immune from comparison, and each will have statistical parameters that reflect its original selection criteria. Perhaps Mayzner's corpus, apparently based on a sample from literature, exhibits a bias away from the abbreviated forms widely used in written communication today.

So, if you wanted to tune your text prediction software for your phone...



Precisely. I was looking to predict informal communication patterns, not formal book/newspaper style.



