
This code runs Llama2, quantized and unquantized, in a roughly minimal way: https://github.com/srush/llama2.rs (though extracting the quantized 70B weights takes a lot of RAM). I'm running the 13B quantized model in ~10-11GB of CPU memory.
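For context on where the memory savings come from: quantization packs the model's f16/f32 weights into a few bits each, plus a per-block scale. The sketch below is a minimal Rust illustration of block-wise 4-bit quantization and dequantization, assuming a block size of 32 and a simple symmetric scaling scheme; the names and layout here are illustrative, not the actual llama2.rs code.

    // Illustrative only: block-wise 4-bit weight quantization, the rough idea
    // behind fitting a 13B model in ~10-11GB instead of ~26GB of f16 weights.
    const BLOCK: usize = 32; // assumed block size for this sketch

    struct QBlock {
        scale: f32,              // one f32 scale per block
        packed: [u8; BLOCK / 2], // two 4-bit values per byte
    }

    fn quantize_block(w: &[f32; BLOCK]) -> QBlock {
        // Scale so the largest magnitude maps into the 4-bit range [-8, 7].
        let max = w.iter().fold(0f32, |m, &x| m.max(x.abs()));
        let scale = if max > 0.0 { max / 7.0 } else { 1.0 };
        let mut packed = [0u8; BLOCK / 2];
        for i in 0..BLOCK / 2 {
            let lo = ((w[2 * i] / scale).round().clamp(-8.0, 7.0) as i8 & 0x0F) as u8;
            let hi = ((w[2 * i + 1] / scale).round().clamp(-8.0, 7.0) as i8 & 0x0F) as u8;
            packed[i] = lo | (hi << 4);
        }
        QBlock { scale, packed }
    }

    fn dequantize_block(q: &QBlock) -> [f32; BLOCK] {
        let mut out = [0f32; BLOCK];
        for i in 0..BLOCK / 2 {
            // Sign-extend each 4-bit nibble back to i8 before rescaling.
            let lo = ((q.packed[i] & 0x0F) as i8) << 4 >> 4;
            let hi = (q.packed[i] as i8) >> 4;
            out[2 * i] = lo as f32 * q.scale;
            out[2 * i + 1] = hi as f32 * q.scale;
        }
        out
    }

    fn main() {
        let weights: [f32; BLOCK] = core::array::from_fn(|i| (i as f32 - 16.0) * 0.03);
        let q = quantize_block(&weights);
        let restored = dequantize_block(&q);
        println!("first weight: {:.4} -> {:.4}", weights[0], restored[0]);
    }

Each block of 32 weights shrinks from 128 bytes (f32) to 20 bytes (16 packed bytes plus a 4-byte scale), at the cost of some precision, which is roughly why the quantized models fit in so much less memory.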


From what I gather, this is a Rust implementation that runs Llama2. Can it run other models as well? I'm having trouble finding information about that.



