april7's comments

"Utilizing the state-of-art deep learning technique to quantify facial attractiveness" we're really there


Forget the research, release this as an app lol


YMMV, but having worked at a remote-first big-tech company, then an in-person startup, then a big-tech remote-first company that bought our in-person startup, I feel a stark difference in engineering velocity.

This doesn't mean that in-person is always better - but corporate leadership is hearing enough anecdotes such as mine to push for RTO again.


Kind of. This tweet better explains what they did:

https://twitter.com/abhi_venigalla/status/167381386318645248...


So it would be more like 46 hours to train GPT-3 from scratch.

    (300BT / 1.2BT) * 11 min * (1 hr / 60 min) = 45.8 hr

Still pretty incredible. That's an 18.8x speedup over 36 days.
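
For anyone who wants to redo the arithmetic, here is a minimal Python sketch of the same back-of-envelope estimate. The variable and function names are made up for illustration, and it assumes throughput stays constant when scaling the 11-minute, 1.2B-token run up to 300B tokens, which a real run may not achieve.

```python
# Figures as quoted in the comment above; nothing here is measured.
TARGET_TOKENS_B = 300.0  # tokens for a full GPT-3 run, in billions
BENCH_TOKENS_B = 1.2     # tokens covered by the benchmark run, in billions
BENCH_MINUTES = 11.0     # wall-clock time of the benchmark run
BASELINE_DAYS = 36.0     # older training time used for comparison


def scaled_hours(target_b: float, bench_b: float, bench_minutes: float) -> float:
    """Linear extrapolation: assumes constant tokens per minute (an assumption)."""
    return target_b / bench_b * bench_minutes / 60.0


hours = scaled_hours(TARGET_TOKENS_B, BENCH_TOKENS_B, BENCH_MINUTES)
speedup = BASELINE_DAYS * 24.0 / hours

print(f"estimated training time: {hours:.1f} hr")
print(f"speedup over {BASELINE_DAYS:.0f} days: {speedup:.2f}x")
```

The speedup lands between 18.8x and 18.9x depending on where you round.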


This points to an interesting future for foundation models. This is an 18x cost reduction in only 2 years. Either foundation models are going to get much bigger, or variations will become common.


V100 GPUs are from 2017, so it's more than two years. A100 already appeared three years ago, btw.

An eight GPU DGX-1 server cost ~149k$ back then (googled news postings). A current gen DGX H100 is 520k$ with 5 years of support. Of course it holds 5x the memory, plus GPUs and interconnect are much faster. But when comparing costs, take price hikes into account.


Another important thing to keep in mind is how much inflation changed prices over that period. $520k in 2023 dollars is around $420k in 2017 dollars. Sure, that's still almost 3x the DGX-1 price (about 2.8x), but it's better than the roughly 3.5x you get from the nominal figures.
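
A quick sketch of that comparison in Python, using only the prices quoted in this thread. The $420k figure is the commenter's own rough inflation adjustment, not something computed from CPI data here, and the variable names are illustrative.

```python
# Prices as quoted in the thread (USD).
DGX1_2017 = 149_000              # eight-GPU DGX-1, 2017 price
DGX_H100_2023 = 520_000          # DGX H100 with 5 years of support, 2023 price
DGX_H100_IN_2017_USD = 420_000   # rough inflation-adjusted value (assumption)

# Ratio using the sticker prices as-is.
nominal_ratio = DGX_H100_2023 / DGX1_2017
# Ratio after expressing the 2023 price in 2017 dollars.
real_ratio = DGX_H100_IN_2017_USD / DGX1_2017

print(f"nominal: {nominal_ratio:.2f}x")
print(f"inflation-adjusted: {real_ratio:.2f}x")
print(f"difference: {nominal_ratio - real_ratio:.2f}x")
```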


Variations in the form of specialization, I guess.

For writing code, you don't care about feeding world history to your model, so a smaller model might be better at a specialized task.

Sure, having a big multimodal model is great, but with specialized models you can spread tasks out better.


But I am sure prompt understanding improves with more text data. Same with reasoning ability.


I have many friends in San Diego who work comfortable DOD jobs, have job security, and put in minimal effort. Given that most of the pack at those places puts in minimal effort, I gather it should be easy to stand out and impress if you put in a little more.

Furthermore, these engineers get every second Friday off (common practice at many DOD firms). Just something to consider.

