
A lot of it is just from HN osmosis, but /r/LocalLLaMA/ is a good place to hear about the latest open-weight models, if that's interesting.

gpt-oss 120b is an open-weight model that OpenAI released a while back, and Cerebras (a startup making massive wafer-scale chips that keep models in SRAM) runs it as one of the models they provide. They're a small-scale contender against Nvidia, but by keeping the model weights in SRAM they get very high token throughput at low latency.

In terms of making your own agent, this one's pretty good as a starting point, and you can ask the models to help you make tools for, e.g., running ls on a subdirectory or editing a file (a rough sketch of those two tools is below). Once you have those two, you can ask it to edit itself, and you're off to the races.
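
To make that concrete, here's a minimal sketch of those two tools. The function names, JSON schemas, and dispatcher are my own assumptions about an OpenAI-style tool-calling setup, not anything from the article; the idea is just that each tool is a plain function plus a schema describing it to the model.

    # Minimal sketch of the two tools: listing a subdirectory and
    # editing (overwriting) a file. Names and schemas are illustrative;
    # any chat API with OpenAI-style tool calling can consume schemas
    # shaped like these.
    import json
    import os

    def list_directory(path: str) -> str:
        # Return the entries of a directory as a JSON list, like a bare ls.
        return json.dumps(sorted(os.listdir(path)))

    def write_file(path: str, content: str) -> str:
        # Overwrite (or create) a file with the given content and confirm.
        with open(path, "w") as f:
            f.write(content)
        return f"wrote {len(content)} bytes to {path}"

    # Tool schemas you pass to the model so it knows what it can call.
    TOOLS = [
        {"type": "function", "function": {
            "name": "list_directory",
            "description": "List the files in a directory.",
            "parameters": {"type": "object",
                           "properties": {"path": {"type": "string"}},
                           "required": ["path"]}}},
        {"type": "function", "function": {
            "name": "write_file",
            "description": "Overwrite a file with new content.",
            "parameters": {"type": "object",
                           "properties": {"path": {"type": "string"},
                                          "content": {"type": "string"}},
                           "required": ["path", "content"]}}},
    ]

    # When the model responds with a tool call, dispatch it to the
    # matching function and feed the string result back as the tool output.
    def dispatch(name: str, arguments: str) -> str:
        args = json.loads(arguments)
        if name == "list_directory":
            return list_directory(**args)
        if name == "write_file":
            return write_file(**args)
        return f"unknown tool: {name}"

The agent loop itself is just: send the conversation plus TOOLS to the model, run dispatch() on any tool call it returns, append the result as a tool message, and repeat until it answers in plain text.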


