OK I have to say I have never thought about what should happen if you try to select a sequence of text that mixes rtl and ltr languages. It's deeply in my mental zone of "undefined behavior, I don't even have an intuition about what to do."
One of the more fun programming projects I ever worked on was a custom text editor. Mainly it supported rich annotations that could be applied to any chosen term or stretch of text. It used Mobiledoc[1] which already provided pretty well-developed primitives for rich text editing on the web, and it was still a complicated project, even with a solid framework.
> I have never thought about what should happen if you try to select a sequence of text that mixes rtl and ltr languages.
I would expect the selection to sweep in one visual direction across the text, regardless of which runs are RTL and which are LTR. If swiping left to right, LTR runs would be selected forward and RTL runs backwards (the same as if you had just swiped an LTR run from right to left).
Same here. In the same way as if I were to physically cut text from a piece of paper using scissors, I’d expect a visually contiguous region to be selected for copy/cut/paste (both when using mouse and keyboard). Even though that might not be a contiguous part of the text buffer, nor a contiguous part of how the words would be spoken, I think that behavior is least confusing.
But why would you expect this instead of semantic selection? After all, you don't select for visual fun but to do something with the text, like copy & paste it. Where would you ever need to select words in their logical order in LTR but in illogical order in RTL?
I tried to write "Hello" on the demo page on my mobile phone, but it rendered "elloH". After hitting the H, the caret went back to the start of the line and then acted normal (stayed after each of the other letters).
Even "simple" interfaces–popovers, date pickers–inevitably become spiralling fractals of complexity, each layer folding into another the longer you look at them. Text editing would be crazier, even more so than what is described here.
I've been toying with the idea of a text editor that, instead of working with Unicode directly, converts it into and out of a simpler internal format. Internally, all text would be one code point = one character, laid out left to right. When loading a file, any Unicode not supported would be encoded as "illegal characters" that convert back to the original code points when saved. Also, all right-to-left text would be reordered on load and restored on save (as long as there are no embeddings).
A weird drawback, however, is that Backspace and Delete would not act on logical text order but as "Delete Left" and "Delete Right". Still, a hotkey for swapping the default text direction/justification globally could make up for some of that.
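A minimal sketch of that load/save round trip, assuming a side table that maps placeholder code points back to the originals (the names and the ASCII-only "supported" set are made up for illustration, and the RTL reordering is left out):

```python
# Sketch of the load/save round-trip: unsupported code points become
# placeholder "illegal characters" internally and are restored on save.
# SUPPORTED and PLACEHOLDER_BASE are invented for illustration; a real
# editor would also reorder RTL runs at this boundary.

SUPPORTED = set(map(chr, range(0x20, 0x7F)))  # pretend we only support printable ASCII
PLACEHOLDER_BASE = 0xE000                      # Private Use Area, rendered as boxes

def load(text):
    internal, originals = [], []
    for ch in text:
        if ch in SUPPORTED:
            internal.append(ch)
        else:
            internal.append(chr(PLACEHOLDER_BASE + len(originals)))
            originals.append(ch)               # remember what the box stands for
    return "".join(internal), originals

def save(internal, originals):
    return "".join(
        originals[ord(ch) - PLACEHOLDER_BASE] if ord(ch) >= PLACEHOLDER_BASE else ch
        for ch in internal
    )

text = "naïve café"
internal, table = load(text)
assert save(internal, table) == text           # lossless as long as the boxes aren't edited
```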
I think that, if you are honest about the limitations of your program and convey those to your user as a consistent model, the user would be more willing to accept that model. Just don't claim to support Unicode according to spec if it is likely that only a subset will be supported correctly.
I was brutal with mine... force everything to ASCII and replace any non-ASCII with question marks. Like you said, I notify the user (which is only me) and it gets saved as ASCII. I have no illusions that it supports Unicode, but it's still a useful editor for me. And PS: text editors are really, really hard.
Unicode includes characters built from multiple code points.
This means questions (functions) like "how many characters do I have", and indexing in general, are not as straightforward as they could be.
So storing each visual character's grapheme cluster together, as a single element, results in a simpler interface from a programming standpoint.
Not necessarily as efficient, since either indirection or larger character footprints (or both) are required, but simpler.
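For illustration, a minimal sketch of counting and indexing by grapheme cluster in Python, using the third-party regex module's \X pattern (assuming `pip install regex`; the stdlib re has no equivalent):

```python
# Counting and indexing by grapheme cluster rather than by code point,
# using the `regex` module's \X (extended grapheme cluster).
import regex

s = "a\u0301bc"                  # "á" built from 'a' + combining acute, then "bc"
clusters = regex.findall(r"\X", s)

print(len(s))                    # 4 code points
print(len(clusters))             # 3 visual characters: ['á', 'b', 'c']
print(clusters[0])               # one cluster per slot, so indexing is simple again
```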
This is definitely the route I would go while implementing the other complexity. After that, possibly undoing this "simplicity optimization" would be a small price to pay for the development convenience.
Simpler for the text editor logic, not necessarily simpler for the encoder/decoder.
> If it supports all of Unicode it's not simpler.
The point is that it doesn't. The Unicode that it does not handle should get marked as "illegal character", rendered as boxes and not get mangled unless the user messes with it.
Support for more of Unicode could be added incrementally, and in a single subsystem instead of everywhere in the text editor.
First, "bad #1" is the correct version (except for the copy & paste): the merging of emoji parts should happen after you move the cursor out. It's OK for this behavior to be contextual.
> Byte-wise, this should be rendered as a single emoji.
It will be right after you've finished editing! No magic cursor movements.
And it should allow you to move "inside" emojis so you could edit the second modifier to something else (that would be worthy of some great UI guiding you as to which options are available). This would be impossible in the "correct" TextEdit behavior, where you'd have to recompose the emoji from scratch.
My best guess would be: selecting with the mouse from the left until the (first two?) arabic characters, then extending the selection using the right arrow key while holding shift. Might be that the arrow key skips the cursor to the right side before extending the selection.
I'd bet on “llo” + the right part of the arabic word having been selected with the mouse, then the selection having been extended with the right arrow to encompass the “he”.
Do we actually need text editing to support all conceivable edge cases? I guess I don't care how the cursor behaves as long as it's mostly predictable and consistent. Users would adapt. I must not use or care about skin-toned emojis much (they seem like a fad anyway); I use Sublime Text and wouldn't really care how it handles them at all.
Emoji is just the easiest way to present these problems to an English-reading audience. For the several billion people reading and writing non-latin scripts where the "edge cases" are important for legibility/correctness, yes, we need text editors that can handle them.
We don’t need text editors to handle all character sets in the same way that I don’t need code editors to be able to parse every language without changing settings.
If Arabic text editors don’t have the cursor behave the same as English ones, that’s fine. If I press backspace in a Hebrew text editor I probably want it to delete the character to the right. The behavior doesn’t need to be universal
Except that in Hebrew, for example, you almost certainly have bidirectional text: numbers in a text still run LTR. So if you press backspace, you don't always want to delete the text to the right.
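To illustrate, a small demo of logical versus visual order using the third-party python-bidi package (assuming `pip install python-bidi`; the exact layout is ultimately up to the renderer):

```python
# The string is stored in logical (typing) order; the digit run renders
# LTR inside the RTL Hebrew line, so "one character to the left of the
# caret" is not always "one character back in the buffer".
from bidi.algorithm import get_display

logical = "שלום 123 עולם"        # Hebrew, digits, Hebrew, in typing order
visual = get_display(logical)     # reordered the way a renderer would lay it out

print(list(logical))
print(list(visual))               # note the digit run keeps its 1-2-3 order
```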
Now add to that the fact that nobody wants to build a new editor for every single language, and there you go: all of that gets rolled into a common approach. It's hard to impossible to get right. (What you want also depends on context, not just language; URLs demand different BiDi behavior than prose does.)
Even if we wanted to build different editors, as soon as you start quoting things not from your preferred environment, guess what: you have mixed environments and need to start thinking about BiDi anyway.
It's a gargantuan mess. Nobody is really getting it right, we're just trying to get closer and closer. (And if you want to see folks in the field pulling out their hair, mention you'd like Boustrophedon writing ;)
Source: UAX 9 and UAX 29 were printouts on my desk for several years, for very good reasons. I still have unsolved problems with BiDi text, I've just given up.
I'm proposing it can be done one layer above, or in a different abstraction. There seems to be a point of diminishing returns; I mean, if your text editor doesn't handle knotted cords from the Inca empire, why even bother?
The native languages of roughly half the global population aren't exactly "knotted cords from the Inca empire." As for whether it can live in a layer below the text editor... probably not, but even then, someone has to write it and talk about why determining "how many glyphs wide is this sequence of bytes" is an interesting question with a complex answer worthy of study.
And today the answer to that question means "maintain a very large and dynamic dataset of the world's written languages" which makes it unpalatable to "low level" software, like the standard libraries of every PL that I'm aware of.
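For a taste of that, a tiny demo with the third-party wcwidth package, which answers "how many terminal columns wide is this string" from exactly such a dataset (assuming `pip install wcwidth`):

```python
# len() counts code points; wcswidth() counts terminal columns, and the
# two disagree for wide CJK, combining marks, and emoji.
from wcwidth import wcswidth

for s in ["abc", "漢字", "a\u0301", "👍"]:
    print(repr(s), len(s), wcswidth(s))
```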
Someday I want to write a lengthy essay on the software-specific variant of the "Seeing Like a State" problem, where certain programmers, angry that the Real World is actually quite complicated, demand that the world be made simpler to accommodate their wishes for an easier job and cleaner code architecture. This comment is a great example of that phenomenon, as are stupid proposals like "abolish time zones and make everyone live on UTC because I don't like writing time-and-date code".
But why not make the world simpler? E.g. let's simplify those writing systems that are causing issues. Not everything needs to be encoded as text strings.
Good thing you mention UTC, because guess what: time zones are themselves a simplification; real local time is on a continuous spectrum.
Because that's authoritarian and ghoulish? The thought of forcibly changing (or eliminating) a writing system that may have been in use by millions of people for centuries with no issues, to satisfy the demands of a bunch of web developers who don't want to add some more if-else branches to their text rendering code, gives me the heebie-jeebies. And it wouldn't even save most programmers any effort: for 99.9% of us, this stuff is all abstracted away by libraries anyway.
I suspect you might be over-indexing on the specific example of emojis and skin tone modifiers that the article used, but on the Unicode implementation level that's just the same modifier characters that several real languages need to use to be encoded properly. It's not a useless frivolity.
Not sure why writing systems couldn't be improved and simplified for the digital age. It would reduce software complexity and be more efficient and economical.
Presumably for the same reason we're not "just" simplifying English, which is a horribly irregular language, particularly in its orthography. We could simplify English to have spelling and pronunciation rules similar to those of, say, German, Japanese, Italian, or countless other languages, instead of English's terrible mess. So why aren't we just doing that?
Funnily, in the case of English, that would likely mean replacing the Latin alphabet with a more complicated script, because in contrast to e.g. German and Italian, the Latin script really does not fit English phonetics very well. Hell, even Japanese romanization, "romaji", makes much better and more consistent use of the Latin alphabet than English does. Which is all kinds of funny to me: English is so bad in this regard that Japanese, a language very far from English in multiple ways, makes much better use of a script that is foreign to it. In fact, English is so extraordinarily bad at using the Latin script that the English-speaking world even holds literal contests about how to spell pronounced words, called "spelling bees".
And yet I see no real efforts to simplify that gaining traction.
The natural tendency is for language to get simpler as the audience becomes broader, or to suit the technology available at the time and place of writing. It doesn't have to be planned or imposed; it's happening anyway. See the development of English itself: it's basically a creole, a mix of different influences, and it lost complexities like grammatical gender as a result.
Most of the so-called complexity in English is just about scoring "style points"; it's not necessary for communication (other than the admittedly huge vocabulary, again a result of its formation from multiple influences).
The fact that the simplification stopped at time zones, and not at making one "global time zone", tells you how much you can stretch such "simplification" before it becomes a burden in other ways.
I think it would be easier if everyone just spoke and wrote English, we’re already halfway there. We can all go back to 7-bit ASCII. :)
The irony is that English is simultaneously the best and worst option. It can incorporate things from almost any other language while making it impossible to spell them correctly. Good luck memorizing adjective order and the nearly infinite number of idioms, a lot of which come from British sailors.
When you think about it, they didn't use cursive for printing presses, for a reason. Cursive is good for handwriting; block letters are good for print or carving into stone. See how the technology dictates the form. If a writing system can't be represented as an array of codes, maybe either the writing adapts to the limitations in some way (call it digital letters or whatever) or the digital representation needs more work. But you can't say that because printing presses don't reproduce human handwriting, they're ghoulish and authoritarian, as one commenter concluded.
The whole point of Unicode is to encode languages. We already handle Arabic, which means we should be able to handle most languages. Even if we can’t convert a language into codepoints, we have the ability to record sound and images in resolutions beyond human acuity.
You're right that technology dictates form. Computers represent English differently than print does. Due to the limitations of previous technology, indented paragraphs are less common than an empty line between paragraphs. Same with single versus double spaces after periods.
I'd say it's less about covering all edge cases and more about showing that text editors are insanely complex, and that it doesn't take long to find edge cases even in text editors with tens or hundreds of millions of users.
Even if you could pretend that everything is ASCII, text editing is still a very complex subject once you get into the weeds. You could probably write a book just on handling text overflow.
I agree. Specifically, emojis are not text and I don't really care what edge cases they introduce. It's neat if someone wants to put the work in, but that's not the meat of "text editing" and isn't really worth worrying about.
Emoji are text in all the ways that matter to a programmer. There are codepoints in Unicode which aren't textual, but emoji are not among them.
Emoji don't introduce any edge cases of their own. If your software can't join a skin tone to a thumbs up, it also will fall apart on composing many scripts used by hundreds of millions of people.
So if you really want to get all kids-these-days about emoji, go ahead, but support Unicode properly. You'll get correct emoji handling as a consequence of that anyway, and you can still signal whatever message you're intending to send here about yourself in some other way.
I honestly think right-to-left text rendering should never have been a thing. Text should always be stored left-to-right and right-to-left languages should be supported by the text editor itself. That is, when the user enters a character from an RTL script, the cursor should simply move left instead of right. Then all the problems with selection etc. simply don't exist. You may have different problems with e.g. pasting mixed content, but moving the complexity to the text editor (which has to handle it anyway) and not having it in the text rendering engine seems like it would have been the better call. I can see how some might object that the indices of the characters don't correspond to the way people think, but honestly - if you're someone who needs to care about indices of characters in arrays, you would probably appreciate the system being overall simpler even if you have to iterate backwards sometimes.
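A minimal sketch of that proposal, using the stdlib unicodedata to classify characters; everything else here is invented for illustration:

```python
# Sketch: the buffer stays in visual order, and on input the caret
# simply stays put for RTL characters instead of advancing.
# 'R' and 'AL' are the strong right-to-left bidi classes.
import unicodedata

def insert(buffer, caret, ch):
    if unicodedata.bidirectional(ch) in ("R", "AL"):
        buffer.insert(caret, ch)   # caret stays put => the new glyph lands just to
        return caret               # its right, so the run grows visually right-to-left
    buffer.insert(caret, ch)
    return caret + 1               # LTR: caret advances past the new glyph

buf, caret = [], 0
for ch in "abc":
    caret = insert(buf, caret, ch)
for ch in "שלום":
    caret = insert(buf, caret, ch)
print("".join(buf))                # Hebrew stored in visual order after "abc"
```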
I want to learn the stenography input method (like Plover). Time would be the biggest factor as I’m sure there are enough good-enough keyboards that aren’t much more expensive than some fancy mechanical keyboard.
Imagine if programmers were stenographers and computing hadn't been born and grown up in an age of slow teletypes. Maybe most computer code would be keyword/word-heavy (because that's what stenography is good at) instead of symbol-heavy.
I think actual stenography, like court transcribers use, wouldn't be a great fit for programming, because it is designed specifically around the use case of spoken English (or whatever), and relies on shorthand for phonemes to achieve speedups. It doesn't really have allowances for punctuation and other things that are specific to programming syntax.
But there's been a steady increase in the popularity of chorded keyboards with a small number of keys, which is kind of like stenography but probably more suitable to writing code. I don't really like them myself (I prefer a 60% keyboard with macros and layer modifiers over memorizing a bunch of chords), but some people swear by them. I assume there are projects to do it in software as well if you don't want a fancy new keyboard. I'd look in that direction if you want to check this out.
Just remember the line number you're on rather than the byte offset. You already agreed to have the cursor operate in 2D space when you stored the x-position. Also, you shouldn't operate on bytes but on grapheme clusters, which solves the emoji problem.
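A small sketch of that "sticky column" idea, measured in grapheme clusters via the third-party regex module's \X (names invented for illustration):

```python
# Vertical cursor movement with a remembered goal column, counted in
# grapheme clusters rather than bytes, and clamped per line.
import regex

def clusters(line):
    return regex.findall(r"\X", line)

def move_down(lines, row, goal_col):
    row = min(row + 1, len(lines) - 1)
    col = min(goal_col, len(clusters(lines[row])))  # clamp to the shorter line...
    return row, col                                  # ...but keep goal_col for next time

lines = ["hello world", "hi", "a much longer line"]
row, goal = 0, 9
row, col = move_down(lines, row, goal)   # row 1: clamped to col 2
row, col = move_down(lines, row, goal)   # row 2: back out to col 9
print(row, col)
```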
> Emoji Modifiers
Come on, just don't allow a skin tone modifier to modify a newline; that's silly. Allow it to modify only what you can actually support it modifying, i.e. emojis with alternate skin colors. The harder problem is modifying grapheme clusters: IMHO you should have a small "magnifying" popup (like with autocomplete suggestions) where the cursor continues to move while it pauses in the main text. This would be tricky to implement for a mouse, though; you can control and move the mouse, but it may be annoying to many users. Speaking of which...
> Bidirectional Text
"Proper" bidi behavior I find very confusing and annoying, with the selection splitting. IMHO there should be main text body with some direction, and within it *embeds*, and those embeds are selected in its entirety (you usually would select part of text body with the entire quote). Now you don't have surprising, unintuitive behavior of selection, and the implementation is easy. Of course embeds can have embeds within them, and if you start selection within the embed, you now treat it as full text body, allowing to select a fragment of it. The moment you leave the embed while dragging your mouse, the whole embed is selected (but just like with x-position, the selection anchor is remembered if you return to the embed), similarly to how the whole word is selected when you start selecting it but then move outside the word in Microsoft Office Word. And if you really need to select a fragment of the embed + something else, just select whole embed + something else, copy, paste, and delete the fragment you don't want and deal with it.
Text editing hates you too (2019) - https://news.ycombinator.com/item?id=27236874 - May 2021 (182 comments)
Text Editing Hates You Too - https://news.ycombinator.com/item?id=21384158 - Oct 2019 (282 comments)