Is the AVX2 implementation using the full 256-bit register width? If so, I am su... | Hacker News


intalentive on Aug 22, 2024 | on: SIMD Matters: Graph Coloring

Is the AVX2 implementation using the full 256-bit register width? If so, I am surprised that it is only 14% faster than SSE. If not, I would like to see how the results compare with the rest of the tests.

p0nce on Aug 22, 2024

Doubling the register size only helps if CPU operations were the bottleneck. Often the bottleneck is memory access.
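To put rough numbers on that point, here is a minimal roofline-style sketch. It assumes a kernel's run time is simply the slower of its compute time and its memory-transfer time. Every constant and name in it (`LANE_OPS_PER_S`, `BYTES_PER_S`, `kernel_time`, `avx2_over_sse`) is hypothetical and chosen for illustration; none of it is measured from the graph coloring benchmark.

```python
# Toy model: run time is whichever of compute or memory traffic takes longer.
# All constants are made-up illustrative values, not measurements.

N_ELEMENTS = 1_000_000   # 32-bit elements processed
BYTES_PER_ELEMENT = 4
LANE_OPS_PER_S = 1e9     # hypothetical throughput of one SIMD lane
BYTES_PER_S = 1e10       # hypothetical memory bandwidth

SSE_LANES = 4            # 128-bit register / 32-bit elements
AVX2_LANES = 8           # 256-bit register / 32-bit elements


def kernel_time(ops_per_element, simd_lanes):
    """Estimated seconds for one pass over the data."""
    compute_s = N_ELEMENTS * ops_per_element / (simd_lanes * LANE_OPS_PER_S)
    memory_s = N_ELEMENTS * BYTES_PER_ELEMENT / BYTES_PER_S
    return max(compute_s, memory_s)


def avx2_over_sse(ops_per_element):
    """Speedup from doubling the register width in this toy model."""
    return kernel_time(ops_per_element, SSE_LANES) / kernel_time(
        ops_per_element, AVX2_LANES
    )


if __name__ == "__main__":
    for ops in (16, 2, 1):
        print(f"{ops:2d} ops/element: AVX2 is {avx2_over_sse(ops):.2f}x SSE")
```

With these made-up numbers, a kernel doing 16 operations per element is compute-bound in both cases and gets the full 2x. At 2 operations per element the SSE version is still compute-bound but the AVX2 version hits the memory limit, so the gain drops to 1.25x. At 1 operation per element both are memory-bound and wider registers give nothing. A small gain from AVX2 over SSE would be consistent with the middle case, though only profiling the real code could confirm that.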
