I'm assuming you are talking about https://github.com/onnx/onnx-mlir?
In your experience, how much faster is a "compiled" onnx model vs. using an onnx runtime?
Back in the day TensorFlow had tfdeploy, which compiled TensorFlow graphs into plain NumPy matrix operations. In our synthetic tests we saw speedups of up to 50x.
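For context, the idea behind that kind of compilation can be sketched in plain NumPy. This is a hypothetical illustration of the concept, not tfdeploy's actual API: instead of dispatching each node through a graph runtime, the whole forward pass is emitted as direct matrix operations.

```python
import numpy as np

# Hypothetical "compiled" forward pass: the graph
#   y = softmax(x @ W + b)
# is emitted as direct NumPy calls, with no per-op
# runtime dispatch overhead.
def compiled_forward(x, W, b):
    z = x @ W + b
    # numerically stable softmax
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
W = rng.standard_normal((8, 3))
b = np.zeros(3)

y = compiled_forward(x, W, b)
assert y.shape == (4, 3)
assert np.allclose(y.sum(axis=-1), 1.0)  # each row is a probability distribution
```

The speedup in such approaches comes mostly from eliminating per-op interpreter overhead, which matters most for small models where dispatch cost dominates the actual math.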