When we started the project, we did write tokenizers by hand. I mention that in ...

atombender · on Feb 8, 2017

What about a parser generator that takes something like a BNF-type language and generates optimal JS/TS code on the fly, similar to Lex/Yacc? (The BNF would be portable, the generated code would be a cache.)
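To make the idea concrete, here is a rough TypeScript sketch of "portable spec in, generated code out, generated code cached". It is only an illustration under several assumptions: the `name := regex` rule format is made up (it is a flat token spec, far simpler than real BNF), the names `parseRules`, `generateSource`, and `compile` are hypothetical, and rule patterns must not contain capturing groups of their own.

```typescript
// Hypothetical rule format: one "name := regex" per line (not a real BNF dialect).
type Token = { type: string; text: string };
type Tokenizer = (input: string) => Token[];

// Generated tokenizers are cached by grammar text, so codegen runs once per grammar.
const cache = new Map<string, Tokenizer>();

function parseRules(grammar: string): Array<[string, string]> {
  return grammar
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const idx = line.indexOf(":=");
      if (idx < 0) throw new Error(`bad rule: ${line}`);
      return [line.slice(0, idx).trim(), line.slice(idx + 2).trim()] as [string, string];
    });
}

// Emits JavaScript source for a tokenizer: one sticky regex, one capture group per rule.
// Assumes rule patterns contain no capturing groups of their own.
function generateSource(rules: Array<[string, string]>): string {
  const pattern = rules.map(([, re]) => `(${re})`).join("|");
  const names = JSON.stringify(rules.map(([name]) => name));
  return `
    const re = new RegExp(${JSON.stringify(pattern)}, "y");
    const names = ${names};
    return function tokenize(input) {
      const out = [];
      let pos = 0;
      while (pos < input.length) {
        re.lastIndex = pos;
        const m = re.exec(input);
        if (!m || m[0].length === 0) throw new Error("no rule matches at offset " + pos);
        const i = m.slice(1).findIndex((g) => g !== undefined);
        out.push({ type: names[i], text: m[0] });
        pos = re.lastIndex;
      }
      return out;
    };`;
}

function compile(grammar: string): Tokenizer {
  const hit = cache.get(grammar);
  if (hit) return hit;
  // Evaluate the generated source on the fly; the result is the cached artifact.
  const tokenizer = new Function(generateSource(parseRules(grammar)))() as Tokenizer;
  cache.set(grammar, tokenizer);
  return tokenizer;
}

const grammar = `
  number := \\d+
  ident  := [A-Za-z_]\\w*
  space  := \\s+
  op     := [-+*/=]
`;
const tokens = compile(grammar)("x = 42");
// tokens[0] is { type: "ident", text: "x" }; tokens[4] is { type: "number", text: "42" }
```

The grammar text is the only thing that would need to be shared across editors; the generated function is disposable and can be rebuilt whenever the grammar changes. A real generator would of course need nesting, states, and scopes, none of which this sketch attempts.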

I can see that reusing .tmLanguage files saves a lot of work, but that format is atrocious -- hard to both read and write. (I once wrote a parser/highlighter for it in ObjC; it was not a lot of fun.)