
The models are huge, so no latest-gen model can fit on a single GPU.

It's likely that these are small, unpopular (non-flagship) models, or that they only pack, e.g., one layer of each model.
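
A rough back-of-envelope sketch of that reasoning, counting weights only (no KV cache or activations). The parameter counts, the 2 bytes per parameter, and the 80 GB card are all hypothetical figures picked for illustration, not numbers from the article:

```python
def weight_bytes(num_params: int, bytes_per_param: int = 2) -> int:
    """Memory for the weights alone; ignores KV cache and activations."""
    return num_params * bytes_per_param


def fits_on_gpu(num_params: int, gpu_bytes: int, bytes_per_param: int = 2) -> bool:
    """True if the whole model's weights fit in one GPU's memory."""
    return weight_bytes(num_params, bytes_per_param) <= gpu_bytes


def models_per_gpu(num_params: int, gpu_bytes: int, bytes_per_param: int = 2) -> int:
    """How many whole copies of equally sized models one GPU could hold."""
    return gpu_bytes // weight_bytes(num_params, bytes_per_param)


# All figures below are assumptions for illustration only.
GPU_BYTES = 80 * 10**9      # hypothetical 80 GB accelerator
LARGE_MODEL = 70 * 10**9    # hypothetical 70B-parameter model
SMALL_MODEL = 7 * 10**9     # hypothetical 7B-parameter model

print(fits_on_gpu(LARGE_MODEL, GPU_BYTES))     # False: 140 GB of weights
print(models_per_gpu(SMALL_MODEL, GPU_BYTES))  # 5: each copy needs 14 GB
```

Under these assumptions, packing whole models onto one GPU only works for the small ones; a large model would have to be split up, e.g. by layer.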



Per the very short article, the solution was to pack multiple models per GPU.


Yes, but that could mean one layer per model.



