Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I recommend use low quantized models first. for example anywhere between q4 and q8 gguf models. Also dont need high context to fiddle around and learn the ins and outs. for example 4k context is more then enough to figure out what you need in agentic solutions. In fact thats a good limit to impose on yourself and start developing decent automatic context management systems internally as that will be very important when making robus agentic solutions. with all that you should be able to load an llm no issues on many devices.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: