I'm assuming you are talking about https://github.com/onnx/onnx-mlir?
In your experience, how much faster is a "compiled" onnx model vs. using an onnx runtime?
Back in the day TensorFlow had tfdeploy, which compiled TensorFlow graphs into plain NumPy matrix operations. In our synthetic tests we saw speedups of up to 50x.
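For context, the idea behind that kind of compilation can be sketched in plain NumPy. This is a hypothetical illustration of the concept, not tfdeploy's actual API: instead of dispatching each node through a graph runtime, the whole forward pass is emitted as direct matrix operations.

```python
import numpy as np

# Hypothetical "compiled" forward pass: the graph
#   y = softmax(x @ W + b)
# is emitted as direct NumPy calls, with no per-op
# runtime dispatch overhead.
def compiled_forward(x, W, b):
    z = x @ W + b
    # numerically stable softmax
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
W = rng.standard_normal((8, 3))
b = np.zeros(3)

y = compiled_forward(x, W, b)
assert y.shape == (4, 3)
assert np.allclose(y.sum(axis=-1), 1.0)  # each row is a probability distribution
```

The speedup in such approaches comes mostly from eliminating per-op interpreter overhead, which matters most for small models where dispatch cost dominates the actual math.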