
My understanding from chatting with them is that tensor core operations aren't supported yet, so FlashAttention likely won't work. I think it's on their to-do list, though!
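For context, the algorithmic heart of FlashAttention is independent of tensor cores: a streaming ("online") softmax that keeps a running max and a rescaled running sum so the full score vector never has to be materialized at once. A toy plain-Python sketch of that trick (not the actual kernel, and nothing here is AMD- or Nvidia-specific):

```python
import math

def online_softmax(scores):
    """One-pass (streaming) softmax using a running max and a rescaled
    running sum -- the numerical trick FlashAttention tiles over."""
    running_max = float("-inf")
    running_sum = 0.0
    for s in scores:
        new_max = max(running_max, s)
        # When the running max changes, rescale the partial sum so all
        # accumulated terms stay expressed relative to the new max.
        running_sum = running_sum * math.exp(running_max - new_max) + math.exp(s - new_max)
        running_max = new_max
    return [math.exp(s - running_max) / running_sum for s in scores]
```

The one-pass form matches the usual two-pass max-then-sum softmax exactly; the tensor cores only come into play for the Q·K and P·V matrix multiplies around it.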

Nvidia actually has more numerous and more capable matrix-multiplication units, so even with a translation layer I wouldn't expect comparable performance until AMD produces better ML cards.

Additionally, these kernels are usually highly sensitive to cache and shared-memory (smem) sizes, so they would likely need to be retuned.
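As a toy illustration of that kind of retuning, here is a hypothetical autotuner (all names are made up for this sketch) that times a blocked matrix multiply at several tile sizes and keeps the fastest. Real GPU kernels tune shared-memory tile shapes the same way, just against actual hardware:

```python
import random
import time

def blocked_matmul(a, b, n, tile):
    """Naive blocked (tiled) n x n matmul over lists-of-lists."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for kk in range(0, n, tile):
            for jj in range(0, n, tile):
                for i in range(ii, min(ii + tile, n)):
                    for k in range(kk, min(kk + tile, n)):
                        aik = a[i][k]
                        for j in range(jj, min(jj + tile, n)):
                            c[i][j] += aik * b[k][j]
    return c

def autotune(n=64, tiles=(8, 16, 32, 64)):
    """Time each candidate tile size and return the fastest one."""
    a = [[random.random() for _ in range(n)] for _ in range(n)]
    b = [[random.random() for _ in range(n)] for _ in range(n)]
    timings = {}
    for t in tiles:
        start = time.perf_counter()
        blocked_matmul(a, b, n, t)
        timings[t] = time.perf_counter() - start
    return min(timings, key=timings.get)
```

The best tile size depends on the cache hierarchy it runs on, which is exactly why tuning results from one vendor's hardware don't transfer to another's.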



So the only part that anyone actually cares about, as usual, is not supported. Same story as it was in 2012 with AMD vs Nvidia (and likely much before that too!). The more things change, the more they stay the same.


People were doing GPGPU computing long before deep learning. Simply look at the list of tested, supported projects on their docs page!



