
It’s not terribly bad because CPUs execute out of order. As far as I can tell, there’s no single dependency chain running through all the instructions in the loop body, so some of these FMAs are going to run in parallel in your ISPC version. Still, I would expect manually vectorized code to be slightly faster.

