Hacker News | beAroundHere's comments

After GLM and Z.ai released huge models, thanks to the Qwen team we now have models that can run on low-end devices.

Qwen3.5-35B-A3 especially looks great for cheaper GPUs, since a quantized version of it would need less than 32 GB of RAM.
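A back-of-the-envelope check supports that figure. This is only a rough sketch (assuming common quant levels and a hypothetical ~10% overhead for KV cache and runtime buffers, which varies in practice):

```python
def quant_size_gb(params_billion: float, bits_per_weight: float,
                  overhead: float = 1.1) -> float:
    """Rough RAM estimate for a quantized model:
    parameters * bits / 8 bytes, scaled by an assumed
    ~10% overhead for KV cache, activations, and buffers."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total * overhead / 1e9  # decimal GB

# A 35B-parameter model at a few quant levels:
for bits in (4, 5, 8):
    print(f"{bits}-bit: ~{quant_size_gb(35, bits):.1f} GB")
```

At 4 or 5 bits per weight this lands comfortably under 32 GB; only at 8-bit does it blow past that budget.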


It's not just Qwen; we also recently had GLM-4.7-Flash in the same roughly 30B-A3 range. Seems to me like there's no shortage of competition for good old GPT-OSS 20B (not just Qwen3.5-35B and GLM-4.7-Flash, but also Qwen3(-Coder)-30B or Granite 4 Small).


Hey, can you please point out the inaccuracies in the article?

I wrote this post to give a higher-level understanding of traditional vs Taalas's inference, so it does abstract away a lot of things.


Yeah, I wrote the blog to wrap my head around the questions 'how would someone even print weights on a chip?' and 'how do you even start to think in that direction?'.

I didn't explore the actual manufacturing process.


You should add an RSS feed so I can follow it!


I don't post blogs often, so I haven't added RSS there, but I will. I mostly post to my linkblog[1], which does have RSS.

[1] https://www.anuragk.com/linkblog


That's the kind of hardware I'm rooting for, since it'll encourage open-weights models and would be much more private.

In fact, I was thinking: robots of the future could have such slots, letting them swap in different models depending on the task they're given. Like a hardware MoE.


> Since it'll encourage open-weights models

Is this accurate? I don't know enough about hardware, but perhaps someone could clarify: how hard would it be to reverse engineer this to "leak" the model weights? Is it even possible?

There are some labs that sell access to their models (Mistral, Cohere, etc.) without having their models open. I could see a world where more companies do this if it turns out to be viable, even for end customers, if reverse engineering is deemed impossible. You could have a device that does most of the inference locally and only "calls home" when stumped (think Alexa with local processing for intent detection and cloud processing for the rest, but better).
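The "local first, call home when stumped" pattern can be sketched roughly like this (all names and the confidence threshold here are hypothetical illustrations, not any real Alexa or vendor API):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Answer:
    text: str
    confidence: float  # 0.0..1.0, self-reported by the local model

def hybrid_infer(prompt: str,
                 local_model: Callable[[str], Answer],
                 cloud_model: Callable[[str], str],
                 threshold: float = 0.8) -> str:
    """Run the on-device model first; fall back to the cloud
    endpoint only when the local answer is low-confidence."""
    local = local_model(prompt)
    if local.confidence >= threshold:
        return local.text       # handled fully on-device
    return cloud_model(prompt)  # "call home" only when stumped

# Toy stand-ins for the two models:
def stub_local(p: str) -> Answer:
    if "lights" in p:
        return Answer("turning on the lights", 0.95)
    return Answer("not sure", 0.1)

def stub_cloud(p: str) -> str:
    return f"cloud answer for: {p}"

print(hybrid_infer("turn on the lights", stub_local, stub_cloud))
print(hybrid_infer("explain quantum tunneling", stub_local, stub_cloud))
```

The privacy win is that simple, high-frequency requests never leave the device; only the rare hard query reaches the server.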


It's likely possible to extract model weights from the chip's design, but you'd need tooling at the level of an Intel R&D lab, not something any hobbyist could afford.

I doubt anyone would have the skills, wallet, and tools to RE one of these and extract model weights to run them on other hardware. Maybe state actors like the Chinese government or similar could pull that off.


Or a grinder and a camera. See CCC of years past.


I'd say that they're super confident about the GLM-5 release, since they're directly comparing it with Opus 4.5 and don't mention Sonnet 4.5 at all.

I am still waiting to see if they'll launch a GLM-5 Air series, which would run on consumer hardware.


Qwen and GLM both promise the stars in the sky every single release and the results are always firmly in the "whatever" range


Qwen famously benchmaxxes. GLM is more robust, I'd say it's comparable to DeepSeek in that regard.


I place GLM 4.7 behind Sonnet.


https://www.anuragk.com/linkblog/

My linkblog is a collection of interesting ideas and snippets I've found around the web, both tech and non-tech.


I think you're quoting the sci-fi author Ken Liu, from his article in some major news outlet.

I related to that analogy too; in fact, the whole piece is worth reading. I can't seem to find its link, though!


That's because you've mixed him up with Ted Chiang. The article in question is "Why A.I. Isn't Going to Make Art" [1].

[1] https://archive.is/kS8NY


I came across KOReader when I was trying to jailbreak my Kindle. Its UI looked great on an e-ink screen, and it handled almost all ebook formats properly.

Lately I've used it on Android; the UI, being more suited to e-ink screens, looks less polished on phones, but that's just nitpicking. It's fully usable and keeps adding support for new platforms.


It really shines on e-ink Android readers such as the tablets Boox makes. I almost exclusively use it on my Boox because the built-in reading app is terrible.


https://www.experimental-history.com/

I found Adam Mastroianni's blog through an HN post titled "How to debog Yourself". Unlike other pop-sci articles, this one had actual depth, and I enjoyed reading it.

Since then, I've read and digested most of his posts and comments. He usually writes about human behavior, not so much offering solutions as explaining the reasons.

He is the author I've screenshotted most in 2024. I'd recommend starting with this post:

https://www.experimental-history.com/p/you-cant-reach-the-br...

