Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

FLOP-wise that makes sense. But for deep learning, the big deal is in the 12Gb GPU-local memory, which has enormous bandwidth (and can store more of your dataset / parameters at once). The largest concern with GPU processing is keeping the GPU adequately fed with data - and avoiding round-trips of blobs of data with the CPU helps a lot.


Oh I agree and the article talks plenty about that topic as well. For me the temptation with Titan X is primarily the "laziness" of a) not manualy having to try to parallelize AWS units and b) not needing to try to squeeze in models into 4-6gb.

Rather than a speedup factor of 2-3.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: