
And that's at unusable speeds - it takes about triple that amount to run it decently fast at int4.

Now, as the other replies say, you should very likely run a quantized version anyway.
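For a rough sense of why quantization matters here, this is a back-of-the-envelope sketch of weights-only memory at different precisions. The function name and the 7-billion-parameter count are illustrative assumptions, not figures from this thread. It ignores KV cache, activations, and runtime overhead, so real requirements will be higher.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    bytes_total = n_params * bits_per_weight / 8
    return bytes_total / 2**30


if __name__ == "__main__":
    # Hypothetical parameter count, chosen only for illustration.
    n_params = 7e9
    for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
        print(f"{name}: {weight_memory_gib(n_params, bits):.1f} GiB")
```

The footprint scales linearly with bits per weight, so int4 weights take roughly a quarter of the space of fp16 weights for the same model.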


