
This is absolutely the case; TPUs scale very well: https://github.com/google/maxtext


The repo mentions a Karpathy tweet from Jan 2023. Andrej has since released llm.c, and the same model trains about 32x faster on the same NVIDIA hardware mentioned in that tweet. I don't think the performance estimate the repo used (based on that early tweet) accurately reflects the performance of the NVIDIA hardware itself.
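
A rough sketch of why that matters for the comparison (all throughput numbers below are hypothetical placeholders, not figures from MaxText, the tweet, or llm.c): if a TPU-vs-GPU estimate is anchored to GPU code that was later sped up ~32x on the same hardware, the implied hardware gap shrinks by roughly that factor.

    # Hypothetical illustration (Python): a software-only speedup on the same GPUs
    # changes the implied hardware-vs-hardware ratio.
    gpu_baseline_tps = 1.0   # normalized throughput of the tweet-era training code (placeholder)
    tpu_tps = 8.0            # hypothetical TPU throughput used in such a comparison (placeholder)
    llmc_speedup = 32.0      # software speedup reported on the same NVIDIA hardware

    old_ratio = tpu_tps / gpu_baseline_tps
    new_ratio = tpu_tps / (gpu_baseline_tps * llmc_speedup)
    print(f"implied TPU advantage vs. old GPU code: {old_ratio:.1f}x")
    print(f"implied TPU advantage vs. llm.c on GPU: {new_ratio:.2f}x")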



