rthy's comments

rthy · on April 18, 2019

I have been experimenting with tokenization in Rust in https://github.com/rth/vtext, mostly relying on Unicode segmentation plus a few custom tokenization rules. The resulting tokenizer was also around 10x faster than spaCy at comparable precision (see the benchmarks section of the README).
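To illustrate the general idea of "segmentation plus a few custom rules", here is a rough Python sketch. It is not vtext's implementation or API: a regex on word characters stands in for proper Unicode word segmentation, and the rule that keeps apostrophes and hyphens inside a word is a hypothetical example of a custom rule.

```python
import re

# Baseline (stand-in for Unicode segmentation): runs of word characters,
# or any single character that is neither a word character nor whitespace.
# Hypothetical custom rule: an apostrophe or hyphen between word characters
# does not split the token ("Don't", "state-of-the-art").
TOKEN_RE = re.compile(r"\w+(?:['\-]\w+)*|[^\w\s]")


def tokenize(text):
    """Return the list of tokens found in text, in order."""
    return TOKEN_RE.findall(text)


tokens = tokenize("Don't split state-of-the-art words, please.")
print(tokens)
```

Real Unicode segmentation handles many more cases (scripts without spaces, numbers, emoji), so this only shows where custom rules would plug in.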

rthy · on Jan 3, 2019

Alternatively, the toolz package (https://toolz.readthedocs.io/en/latest/) is a nice way of getting some additional functional programming capabilities while using the standard CPython interpreter.
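As a small sketch of what that looks like in practice (assuming toolz is installed, e.g. via pip), the snippet below chains curried functions with `pipe`. The `word_counts` helper and the sample sentence are made up for illustration.

```python
from toolz import frequencies, pipe
from toolz.curried import filter, map  # curried versions of the builtins


def word_counts(text):
    """Count lowercase words longer than one character."""
    return pipe(
        text.split(),                  # start from a list of raw words
        map(str.lower),                # normalise case
        filter(lambda w: len(w) > 1),  # drop single-letter words
        frequencies,                   # build a {word: count} dict
    )


counts = word_counts("The cat and a hat and THE bat")
print(counts)
```

Each step is an ordinary function, so the pipeline can be rearranged or extended without intermediate variables.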

rthy · on Dec 15, 2018

An example of compiling the CPython interpreter to WebAssembly can be found in Pyodide (https://github.com/iodide-project/pyodide/).