You can counter the context rot and requirement drift many users experience here by using a recursive, self-documenting workflow: https://github.com/doubleuuser/rlm-workflow
> Traditional Chinese relies on context: “Rain heavy, not go”, “雨大,不去了”.
> Modern Chinese demands explicit logic: “Because the rain is heavy, therefore I will not go.” “因为雨下得很大,所以我决定不去了。”
I would say “下雨了,我不去” (“It's raining, I'm not going”) or something like that. The second example is perhaps what a language learner would say in order to "speak correctly", but nobody actually speaks or writes like that.
Totally. I also feel such a disconnect with HSK material: no one speaks like that or even uses that vocabulary. But I guess that's the case with almost every language/language course.
What's gone unnoticed with the Gemma 4 release is that it crowned Qwen as the small-model SOTA. So for the first time a Chinese lab holds the frontier in a model category. It's a minor DeepSeek moment, because Western labs now have to catch up with Alibaba.
Depends on usage: Gemma 4 is better at visuals/HTML/CSS and language understanding (which probably plays a role in prompting), but it's worse at code in general compared to Qwen 3.5 27B.
Most codebases don't have traces to train on. If you use rlm-workflow, you will build up rich traceability in the form of requirements, plans, and implementation artifacts, along with worktree diffs. With these, you can then use self-distillation on models, or use autoagent to improve your harness. https://github.com/doubleuuser/rlm-workflow
China can't get good chips. But I don't understand why they can't license their closed-source models to US inference providers, so we could get more than 80% reliability on their models on OpenRouter.
The biggest story here is that this is Google handing Qwen the SOTA crown for small and medium models.
For the first time ever, a Chinese lab is at the frontier. Google and Nvidia are significantly behind, not just on benchmarks but real-world performance like tool calling accuracy.