I also have a benchmark that I'm using for my nanoagent[1] controllers.
Qwen3 is impressive in some aspects but it thinks too much!
Qwen3-0.6B shows even better benchmark results than Llama 3.2 3B... but it is 6x slower.
The results are similar to Gemma3 4B, but the latter is 5x faster on Apple M3 hardware. So maybe the utility is running better models where memory, rather than compute, is the limiting factor, such as on Nvidia GPUs?
What's cool with those models is that you can tweak the thinking process, all the way down to "no thinking". It may not be available in your inference engine, though.
FWIW, their readme states /nothink - and that's what works for me.
>/think and /nothink instructions: Use those words in the system or user message to signify whether Qwen3 should think. In multi-turn conversations, the latest instruction is followed.
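For example, against an OpenAI-compatible endpoint the soft switch is just text appended to a message. A minimal sketch (the URL and model name are placeholders, not nanoagent's API):

```typescript
// Sketch: toggling Qwen3's thinking via the in-message soft switch.
// Endpoint and model name are assumptions (e.g. a local Ollama server).
const res = await fetch("http://localhost:11434/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "qwen3:0.6b",
    messages: [
      { role: "system", content: "You are a terse assistant." },
      // The switch is plain text: "/no_think" (or "/nothink",
      // depending on the build) should disable the thinking block.
      { role: "user", content: "List three prime numbers. /no_think" },
    ],
  }),
});
const data = await res.json();
console.log(data.choices[0].message.content);
```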
Turns out "just" is not the word here. My benchmark is built from conversations, with a SystemMessage and some structured content in a UserMessage.
But Qwen3 seems to ignore /no_think when appended to the SystemMessage. I could add it to the structured content, but that would be a bit weird. It would have been better to have a "think" parameter, like temperature. A workaround is sketched below.
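Since the readme says the latest instruction is followed, one workaround (a hypothetical helper, untested here) is to tag the most recent UserMessage instead of the SystemMessage:

```typescript
// Hypothetical helper: append /no_think to the last user message,
// on the assumption that the latest instruction wins.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

function withNoThink(messages: ChatMessage[]): ChatMessage[] {
  const out = messages.map((m) => ({ ...m }));
  for (let i = out.length - 1; i >= 0; i--) {
    if (out[i].role === "user") {
      out[i].content += " /no_think";
      break;
    }
  }
  return out;
}
```

That keeps the structured content untouched and only decorates the final turn, so the tag stays out of the system prompt.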
[1] github.com/hbbio/nanoagent