
vLLM in a Docker container, FP16 quantized, on an 8x MI300X cluster. Very lazy hackjob, I didn't even set up an interface. I was constructing curl commands from string templates. I worked out that if I paid that compute cost for a whole month, it would be twice the monthlies you'd pay for owning a very nice 2000 sq ft non-co-op apartment in Midtown Manhattan. I was paying rock-bottom prices, too.
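For anyone curious what "curl commands from string templates" might look like, here is a rough sketch, not the actual script. It assumes vLLM's OpenAI-compatible server is listening on localhost:8000 with the /v1/completions route; the model name `my-model`, the `build_curl` helper, and the default `max_tokens` are hypothetical placeholders.

```python
import json
import shlex
from string import Template

# Template for the shell command; $url and $body are filled in per request.
CURL_TEMPLATE = Template(
    "curl -s $url -H 'Content-Type: application/json' -d $body"
)


def build_curl(prompt, model="my-model", host="localhost", port=8000, max_tokens=128):
    """Return a curl command string for a completion request.

    Assumes an OpenAI-compatible /v1/completions endpoint; the model name
    and defaults are placeholders, not values from the original setup.
    """
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return CURL_TEMPLATE.substitute(
        # shlex.quote keeps the JSON body intact even if the prompt has quotes.
        url=shlex.quote(f"http://{host}:{port}/v1/completions"),
        body=shlex.quote(json.dumps(payload)),
    )


if __name__ == "__main__":
    print(build_curl("Hello, world"))
```

Quoting the body with `shlex.quote` is the one part worth keeping even in a hackjob, since prompts with apostrophes will otherwise break the shell command.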