Regarding state space blow-up, the only thing I'd add is that DFA minimization always gives you a canonical state machine (unique up to state relabeling), so the size will be the same regardless of how the DFA was constructed. (This is a DFA-specific property; minimal NFAs aren't unique, and minimizing them is much harder.) This doesn't mean that state space blow-up won't happen, but it does mean there's no inherent overhead (after minimization) compared to writing an equivalent regular expression by hand without using the product/complement constructions. But minimization interacts poorly with some practical features like capture groups, so you could probably only use the minimized automaton for matching, and you'd have to compute the captures after a match some other way.
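To make the canonical-minimal-DFA point concrete, here's a toy Moore-style partition-refinement minimizer in Python. Everything here (the names, the DFAs, the example language "strings over {a,b} ending in 'ab'") is invented for illustration, and it assumes every state is reachable and the transition function is total. Two differently constructed DFAs for the same language come out the same size:

    # Toy partition refinement; assumes all states reachable, delta total.
    def minimize(states, alphabet, delta, accepting):
        """Return the number of states in the minimal DFA."""
        # Initial split: accepting vs. non-accepting.
        partition = [p for p in (set(accepting), set(states) - set(accepting)) if p]
        changed = True
        while changed:
            changed = False
            refined = []
            for block in partition:
                # Group states by which block each input symbol sends them to.
                groups = {}
                for s in block:
                    key = tuple(
                        next(i for i, b in enumerate(partition) if delta[s][c] in b)
                        for c in alphabet
                    )
                    groups.setdefault(key, set()).add(s)
                refined.extend(groups.values())
                changed |= len(groups) > 1
            partition = refined
        return len(partition)

    ALPHABET = "ab"

    # DFA 1: the obvious 3-state machine for "ends in 'ab'".
    d1 = {0: {"a": 1, "b": 0},
          1: {"a": 1, "b": 2},
          2: {"a": 1, "b": 0}}

    # DFA 2: same language with a redundant state, as might fall out of a
    # product or subset construction (state 3 behaves exactly like state 0).
    d2 = {0: {"a": 1, "b": 3},
          1: {"a": 1, "b": 2},
          2: {"a": 1, "b": 3},
          3: {"a": 1, "b": 3}}

    print(minimize([0, 1, 2], ALPHABET, d1, {2}))     # -> 3
    print(minimize([0, 1, 2, 3], ALPHABET, d2, {2}))  # -> 3

Hopcroft's algorithm does the same refinement in O(n log n), but the fixed point is the same partition, so the result is identical.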
DFAs can't do capturing anyway; they aren't expressive enough. You can do it with "tagged" DFAs, which I believe are equivalent to transducers. In that case, I would imagine minimization would work just fine, it would just probably be ineffective, since tag actions distinguish transitions between states that a plain DFA minimizer would otherwise merge. The paper that came out of the re2c project is good here.[1]
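For a sense of what "tagged" means, here's a hand-derived toy tagged DFA for the pattern a*(b*)c, where transitions carry register-writing actions that record capture positions as a side effect of the ordinary DFA walk. This is a sketch of the idea only, not re2c's actual construction or output; the state and tag names are made up:

    # delta[state][char] -> (next_state, tags_to_write)
    # Tag "t1" marks where the (b*) group starts, "t2" where it ends.
    delta = {
        "S0": {"a": ("S0", []), "b": ("S1", ["t1"]), "c": ("S2", ["t1", "t2"])},
        "S1": {"b": ("S1", []), "c": ("S2", ["t2"])},
    }
    ACCEPTING = {"S2"}

    def match(s):
        state, regs = "S0", {}
        for pos, ch in enumerate(s):
            if ch not in delta.get(state, {}):
                return None
            state, tags = delta[state][ch]
            for t in tags:
                regs[t] = pos  # the tag action: write the current position
        if state not in ACCEPTING:
            return None
        return s[regs["t1"]:regs["t2"]]  # the captured (b*) group

    print(match("aabbc"))  # -> 'bb'
    print(match("ac"))     # -> '' (group matched empty)
    print(match("ab"))     # -> None (no trailing 'c')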
But sure, I think it's just a matter of adding features even though most uses of them will result in pretty poor performance. In RE2's case, state blow-up manifests as thrashing of its hybrid NFA/DFA cache, which slows it down considerably and eventually pushes RE2 to fall back to the NFA simulation.
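Here's a rough sketch of that failure mode, under assumptions of my own: a lazy DFA that determinizes on the fly, caches subset-construction states, and flushes the whole cache when it exceeds a cap. The pattern (a|b)*a(a|b){3} needs 2^4 = 16 DFA states, so a cap of 8 forces constant flushing on random input. RE2's real cache management is far more careful; this just shows the shape of the problem:

    import random

    # Epsilon-free NFA for (a|b)*a(a|b){3}: state -> char -> successor set.
    NFA = {
        0: {"a": {0, 1}, "b": {0}},  # loop, and guess the distinguished 'a'
        1: {"a": {2}, "b": {2}},
        2: {"a": {3}, "b": {3}},
        3: {"a": {4}, "b": {4}},
        4: {},                       # accepting
    }
    START, ACCEPT, CACHE_CAP = frozenset({0}), 4, 8

    def run(text):
        cache, flushes, state = {}, 0, START
        for ch in text:
            row = cache.setdefault(state, {})
            if ch not in row:
                # Lazily determinize: subset construction, one edge at a time.
                row[ch] = frozenset(n for s in state for n in NFA[s].get(ch, ()))
                if len(cache) > CACHE_CAP:
                    cache, flushes = {}, flushes + 1  # cache full: flush it all
            state = row[ch]
        return ACCEPT in state, flushes

    text = "".join(random.choice("ab") for _ in range(10_000))
    matched, flushes = run(text)
    print(f"matched={matched}, flushes={flushes}")

Every flush throws away transitions the walk will need again almost immediately, which is the slowdown; RE2 additionally notices when this keeps happening and bails out to the NFA simulation.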
And I'm actually not sure off the top of my head how complement/intersection would impact the Thompson NFA construction. That construction has to be reasonably fast and space efficient on its own.
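For what it's worth, the textbook answer is that they don't fit inside it: Thompson's construction has linear-size gadgets for concatenation, alternation, and star, but nothing local for intersection or complement, so you typically fall back to a product construction (and, for complement, determinization plus flipping the accept states), whose size is multiplicative rather than additive. A minimal product-construction sketch, with toy DFAs and made-up names:

    # DFA A: even number of 'a's.  DFA B: ends in 'b'.
    A = {"delta": {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1},
         "start": 0, "accept": {0}}
    B = {"delta": {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1},
         "start": 0, "accept": {1}}

    def intersect(A, B, alphabet="ab"):
        """Product construction: run both DFAs in lockstep on paired states."""
        states_a = {s for (s, _) in A["delta"]}
        states_b = {s for (s, _) in B["delta"]}
        delta = {((p, q), c): (A["delta"][(p, c)], B["delta"][(q, c)])
                 for p in states_a for q in states_b for c in alphabet}
        return {"delta": delta,
                "start": (A["start"], B["start"]),
                "accept": {(p, q) for p in A["accept"] for q in B["accept"]}}

    def accepts(dfa, s):
        state = dfa["start"]
        for ch in s:
            state = dfa["delta"][(state, ch)]
        return state in dfa["accept"]

    AB = intersect(A, B)
    print(len({s for (s, _) in AB["delta"]}))  # 4: state counts multiply
    print(accepts(AB, "aab"))                  # True  (two a's, ends in b)
    print(accepts(AB, "ab"))                   # False (odd number of a's)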