
If you want to deal with characters that have high code point values (above U+FFFF), you need to understand code points. For example, String.length() returns the number of two-byte chars (UTF-16 code units), not the number of actual characters, so a character that takes four bytes counts as two. That can confuse people.

//edit: this is about Java
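
To make the length() point concrete, here is a minimal sketch. The class and method names are made up for illustration; String.codePointCount is the standard JDK call for counting code points.

```java
class LengthVsCodePoints {
    // Number of Unicode code points in s; a surrogate pair counts once.
    static int codePointLength(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the BMP, so UTF-16 stores it as the
        // surrogate pair \uD83D \uDE00 (two chars).
        String s = "a\uD83D\uDE00";
        System.out.println(s.length());         // 3 chars: 'a' plus two surrogates
        System.out.println(codePointLength(s)); // 2 code points
    }
}
```

So length() is the right number for indexing into the char array, but not for "how many characters did the user type".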



Exactly. A Java char is not synonymous with a Unicode code point. But the majority of the time they are synonymous, older documentation claimed that they were the same, and this is the meme that many Java programmers (in my experience) have.


Yes. I write my Java-based matrix to be code-point aware so that no one in Japan or China using it runs into problems.
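
As a rough idea of what "code-point aware" can look like in practice (not the poster's actual code, just a hypothetical helper), here is truncation that never cuts a surrogate pair in half. It assumes maxCodePoints is non-negative.

```java
class CodePointTruncate {
    // Keep at most maxCodePoints code points of s.
    // Assumes maxCodePoints >= 0; a plain substring(0, n) could instead
    // split a surrogate pair and leave a lone, invalid surrogate behind.
    static String truncate(String s, int maxCodePoints) {
        if (s.codePointCount(0, s.length()) <= maxCodePoints) {
            return s;
        }
        // offsetByCodePoints converts a code point count into a char index.
        int end = s.offsetByCodePoints(0, maxCodePoints);
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        // U+20BB7 is a CJK ideograph outside the BMP (used in Japanese names).
        String rare = new String(Character.toChars(0x20BB7));
        String name = rare + "\u91CE\u5BB6";
        String cut = truncate(name, 1);
        System.out.println(cut.equals(rare)); // true
        System.out.println(cut.length());     // 2 chars, still one code point
    }
}
```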


That's actually my point. Python supports Unicode code points and the UTF encodings. If you set the output encoding to UTF-8, the characters are actually variable-length. What matters is your output encoding, not the internal code point representation.
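
The variable-length part is easy to see in any language. Sticking with Java since that is what the rest of the thread is about, here is a small sketch (the class and method names are hypothetical) that counts the UTF-8 bytes for one code point:

```java
import java.nio.charset.StandardCharsets;

class Utf8Lengths {
    // Number of bytes s occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("a"));            // 1 byte  (U+0061)
        System.out.println(utf8Length("\u00E9"));       // 2 bytes (U+00E9)
        System.out.println(utf8Length("\u65E5"));       // 3 bytes (U+65E5)
        System.out.println(utf8Length("\uD83D\uDE00")); // 4 bytes (U+1F600)
    }
}
```

The same code point takes 1 to 4 bytes on output no matter how the runtime stores it internally.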



