Yes, that's correct. If you're using, say, OpenAI, then every semantic op is an API call to OpenAI. If you're hosting a local LLM via llama.cpp, then there's obviously no inference cost beyond the cost of hosting the model.
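
To make that concrete, here's a minimal sketch (my own illustration, not from the project) using the openai Python client. It assumes llama.cpp is running via `llama-server`, which exposes an OpenAI-compatible endpoint, so switching between billed API calls and free local inference is just a matter of the base URL:

    from openai import OpenAI

    # Hosted: every semantic op becomes a billed API call to OpenAI.
    remote = OpenAI(api_key="sk-...")  # placeholder key

    # Local: llama-server speaks the same protocol on localhost, so the
    # only cost is whatever it takes to host the model yourself.
    local = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")

    resp = local.chat.completions.create(
        # llama.cpp serves whatever model it was started with; the
        # model field is effectively ignored here.
        model="local",
        messages=[{"role": "user", "content": "Classify: 'great product!'"}],
    )
    print(resp.choices[0].message.content)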