What it is: A website to run a Ghostwriter-like code generation assistant for free!
Backstory: Hi all, I recently stumbled upon GGML 4-bit quantized LLMs and the fact that the small versions of these GGML models (i.e., up to 7B) can run smoothly on a CPU! The GGML 4-bit quantized version of Replit’s codeInstruct-3b model only requires 2GB of RAM! So I quickly tested it and hosted the model on a free Hugging Face Space, and it works!
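For a rough sense of why the 4-bit version fits in about 2GB, here is a back-of-the-envelope sketch. It assumes roughly 3 billion parameters (read off the "3b" in the model name) and counts the weights only. The function name is just for illustration, and a real GGML file will be somewhat larger because of per-block quantization scales, with extra memory needed at runtime for the context.

```python
def estimate_weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: "3b" is taken to mean roughly 3 billion parameters.
n_params = 3e9

fp16_gb = estimate_weight_memory_gb(n_params, 16)  # unquantized 16-bit weights
q4_gb = estimate_weight_memory_gb(n_params, 4)     # nominal 4-bit quantization

print(f"16-bit weights: ~{fp16_gb:.1f} GB")
print(f"4-bit weights:  ~{q4_gb:.1f} GB")
```

Under these assumptions the weights drop from about 6 GB to about 1.5 GB, which leaves some headroom within a 2GB budget.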
Let me know your thoughts on it!
It’s neat, and probably beyond what I would have expected a few years ago, but it doesn’t seem useful in practice.
It’s very cool that it works on CPU quickly and doesn’t take too much RAM, though! Thanks for letting me try it!