I have been experimenting with tokenization in Rust in https://github.com/rth/vtext, mostly relying on Unicode segmentation with a few custom rules. In benchmarks it was around 10x faster than spaCy at comparable precision (see the benchmarks section of the README).
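To give a rough idea of what "Unicode segmentation plus custom rules" means in practice, here is a toy Python sketch (not vtext's implementation, which is full UAX #29 segmentation in Rust): split on word characters, with one hypothetical custom rule that keeps apostrophe contractions together.

```python
import re

def simple_tokenize(text):
    """Toy word tokenizer: \\w+ runs, optionally followed by an
    apostrophe contraction (a made-up 'custom rule' for illustration)."""
    return re.findall(r"\w+(?:['\u2019]\w+)?", text, flags=re.UNICODE)

tokens = simple_tokenize("Can't stop won't stop")
# -> ["Can't", 'stop', "won't", 'stop']
```

A real tokenizer needs to handle far more cases (hyphens, URLs, emoji, scripts without spaces), which is where proper Unicode segmentation pays off.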
Alternatively, the toolz package ( https://toolz.readthedocs.io/en/latest/ ) is a nice way to get some additional functional programming capabilities while staying on the standard CPython interpreter.
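As a minimal sketch of the style toolz enables (assuming toolz is installed; `pipe` and `frequencies` are both part of its public API), a hypothetical word-count pipeline:

```python
from toolz import pipe, frequencies

# pipe threads a value through a sequence of functions, left to right:
# frequencies(str.split(text))
text = "the quick brown fox jumps over the lazy dog the fox"
counts = pipe(text, str.split, frequencies)
# counts["the"] -> 3, counts["fox"] -> 2
```

The same thing is expressible with nested calls or `functools.reduce`, but `pipe` keeps the data flow readable in the order it happens.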