
This points to an interesting future for foundation models. This is an 18x cost reduction in only 2 years. Either foundation models are going to get much bigger, or variations will become common.


V100 GPUs are from 2017, so it's more than two years. The A100 already appeared three years ago, btw.

An eight-GPU DGX-1 server cost ~$149k back then (per news postings I googled). A current-gen DGX H100 is $520k with 5 years of support. Of course it holds 5x the memory, and the GPUs and interconnect are much faster. But when comparing costs, take price hikes into account.


An important thing to also keep in mind is how much inflation changed prices over that period. $520k in 2023 dollars is around $420k in 2017 dollars. Sure, still almost 3x more expensive, but that's better than the nominal ~3.5x.
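The adjustment above can be sketched in a few lines. The ~24% cumulative US CPI factor for 2017 to 2023 is an assumption (rough figure, not from the thread):

```python
INFLATION_2017_TO_2023 = 1.24  # assumed cumulative CPI factor, 2017 -> 2023

dgx1_2017 = 149_000      # DGX-1 price in 2017 dollars
dgx_h100_2023 = 520_000  # DGX H100 price in 2023 dollars

# Deflate the 2023 price back into 2017 dollars.
dgx_h100_in_2017_dollars = dgx_h100_2023 / INFLATION_2017_TO_2023

nominal_ratio = dgx_h100_2023 / dgx1_2017            # comparing raw sticker prices
real_ratio = dgx_h100_in_2017_dollars / dgx1_2017    # comparing in constant dollars

print(round(dgx_h100_in_2017_dollars))  # ~419355, i.e. around $420k
print(round(nominal_ratio, 1))          # 3.5
print(round(real_ratio, 1))             # 2.8
```

So the price roughly 3.5x'd in nominal terms, but closer to 2.8x in constant 2017 dollars — consistent with the "almost 3x" figure.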


Variations in the sense of specializations, I guess.

For writing code you don't care about feeding world history to your model, so a smaller model might be better at a specialized task.

Sure, having a big multi-modal model is great, but with specialized models you can spread tasks better.


But I am sure prompt understanding improves with more text data. Same with reasoning ability.



