
That data would be easier to get by collecting text written by men and by women and finding the words that correlate with gender. Finding words that we recognize but don't use regularly is harder and requires a test like this.

I remember some research doing this on Twitter data:

http://www.digitaltrends.com/social-media/researchers-tell-t...

>the more obvious results pointed out that women will normally tend to use emotional language like “sad, love, glad, sick, proud, happy, scared, annoyed, excited, and jealous.” Emoticons, and CMC (computer-mediated communication) terms (lol, omg, brb, for instance) are female markers, “as [are] ellipses, expressive lengthening (e.g., coooooool), exclamation marks, question marks, and backchannel sounds like ah, hmmm, ugh, and grr.”

>Clear male markers include words related to swearing, technology, and sports, and in relation, numbers (as in scores).
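
For what it's worth, the "find words that correlate with gender" idea from the first paragraph can be sketched in a few lines. This is only a rough illustration, not the method the researchers used: the function name `gender_log_odds`, the `"f"`/`"m"` labels, the add-one smoothing, and the four sample sentences are all made up for the example.

```python
import math
from collections import Counter

def gender_log_odds(labeled_texts, smoothing=1.0):
    """Score each word by a smoothed log-odds ratio.

    labeled_texts: iterable of (label, text) pairs, label is "f" or "m"
    (hypothetical labels chosen for this sketch).
    Positive scores lean toward "f", negative scores toward "m".
    """
    counts = {"f": Counter(), "m": Counter()}
    for label, text in labeled_texts:
        # Naive whitespace tokenization; real data would need more care.
        counts[label].update(text.lower().split())

    vocab = set(counts["f"]) | set(counts["m"])
    # Additive smoothing so words seen in only one group don't divide by zero.
    totals = {
        g: sum(c.values()) + smoothing * len(vocab) for g, c in counts.items()
    }

    scores = {}
    for word in vocab:
        p_f = (counts["f"][word] + smoothing) / totals["f"]
        p_m = (counts["m"][word] + smoothing) / totals["m"]
        scores[word] = math.log(p_f / p_m)
    return scores

# Invented toy sentences, only here to show the shape of the input.
sample = [
    ("f", "so excited and happy today"),
    ("f", "happy and proud"),
    ("m", "the score was 3 to 1"),
    ("m", "score update and new laptop"),
]
scores = gender_log_odds(sample)
ranked = sorted(scores, key=scores.get)  # most "m"-leaning words first
```

On a real corpus you'd want far more text per group and some minimum count per word, otherwise rare words dominate both ends of the ranking.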


