Hacker News | WiSaGaN's comments

Great article. To me, this highlights a key question in the era of rapidly advancing machine intelligence: if we know machine intelligence is progressing, what is more valuable to build for? As humans, we still find many tools useful even when doing knowledge work. For instance, a calculator. Sure, a smart person can perform calculations in their head, but it’s much easier to teach everyone how to use a calculator, which is 100% reliable in its intended domain.

In this era, we should build these kinds of tools for problems that are simple and well-defined enough that more intelligence doesn't help, even as intelligence continues to advance. Using tools like "bash" or command-line interfaces originally designed for humans is a good initial approach, since we can essentially reuse much of what was built for human use. Later, we can optimize specifically for machines, either accounting for their different cognitive structures (e.g., the ability to memorize extremely long contexts compared to humans) or adapting to the stream-based input/output patterns of current autoregressive token generators.

Eventually, I believe machine intelligence will build its own tools on these foundations, a milestone likely similar to when humans first began using tools.


Yes, for agentic work, Claude Opus is best, and for complex coding, GPT-5.x. But for raw smartness, I have always felt Gemini 3 Pro is best.

Can you give an example of smartness where Gemini is better than the other two? I have found Gemini 3 Pro the opposite of smart on the tasks I gave it (evaluation, extraction, copywriting, judging, synthesising), with GPT 5.2 xhigh first and Opus 4.5/4.6 second. Not to mention it likes to hallucinate quite a bit.

I use it for classic engineering a lot; it beats out ChatGPT and Opus (I haven't tried it as much with Opus as with ChatGPT, though). Flash is also way stronger than it should be.

OpenAI has a former NSA director on its board. [1] This connection makes the dilution of the term "PRISM" in search results a potential benefit to NSA interests.

[1]: https://openai.com/index/openai-appoints-retired-us-army-gen...


Rust's `serde_json` recently switched to a new library for floating-point string conversion: https://github.com/dtolnay/zmij.


I was impressed by how fast the Rust folks adopted this! Kudos to David Tolnay and the others involved.


A market maker needs a premium to provide liquidity. If all else is equal, why would they take on execution time risk? This is a universal feature of continuous-trading Central Limit Order Books (CLOBs), not something unique to prediction markets.
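A toy simulation (entirely hypothetical numbers, just to make the point concrete): with zero spread, the maker's expected PnL is zero while it still bears the full execution-time risk, so the half-spread is exactly the premium for holding that risk.

```python
import random
import statistics

random.seed(42)

def fill_pnls(half_spread, vol, n=50_000):
    """One maker fill per sample: buy at fair - half_spread, then unwind
    after the fair value drifts by a N(0, vol) amount. PnL per fill =
    spread earned + price move borne while holding the position."""
    return [half_spread + random.gauss(0.0, vol) for _ in range(n)]

for hs in (0.0, 0.05):
    pnl = fill_pnls(hs, vol=1.0)
    print(f"half_spread={hs}: mean={statistics.mean(pnl):+.3f}, "
          f"risk(std)={statistics.stdev(pnl):.3f}")
```

The risk (the standard deviation of PnL) is the same in both cases; only the spread moves the mean above zero, which is why a risk-averse maker won't quote without one.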


How do you know "GPT-5, Claude, Llama, Gemini. Under the hood, they all do the same thing: x+F(x)."?


I’m referring specifically to the fundamental residual connection backbone that defines the transformer architecture (x_{l+1} = x_l + F(x_l)).

While the sub-modules differ (MHA vs GQA, SwiGLU vs GeLU, Mixture-of-Depths, etc.), the core signal propagation in Llama, Gemini, and Claude relies on that additive residual stream.

My point here is that DeepSeek's mHC challenges that fundamental additive assumption by introducing learnable weighted scaling factors to the residual path itself.
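For concreteness, here is a minimal numpy sketch of the two update rules being contrasted. The scaled variant is a generic illustration of weighting the residual path; it is not necessarily mHC's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x, W):
    # Stand-in sub-layer F (in a real transformer: attention or FFN)
    return np.tanh(x @ W)

def residual_step(x, W):
    # Standard additive residual stream: x_{l+1} = x_l + F(x_l)
    return x + f(x, W)

def scaled_residual_step(x, W, a, b):
    # Weighted residual path: x_{l+1} = a * x_l + b * F(x_l)
    # (a, b would be learnable; the plain transformer fixes a = b = 1)
    return a * x + b * f(x, W)

d = 8
x = rng.standard_normal(d)
W = rng.standard_normal((d, d)) / np.sqrt(d)
print(residual_step(x, W))
print(scaled_residual_step(x, W, a=0.9, b=0.5))
```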


I guess I am asking how we know Gemini and Claude rely on the additive residual stream. We don't know the architecture details of these closed models, do we?


That's a fair point. We don't have the weights or code for the closed models, so we can't be 100% certain.

However, being transformer-based (which their technical reports confirm) implies the standard pre-norm/post-norm residual block structure. Without those additive residual connections, training networks of that depth (100+ layers) becomes difficult due to the vanishing-gradient problem.

If they had solved deep signal propagation without residual streams, that would likely be a bigger architectural breakthrough than the model itself (akin to Mamba/SSMs). It’s a very high-confidence assumption, but you are right that it is still an assumption.
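A quick numpy sketch of the vanishing-gradient point: the backward signal through a deep stack behaves like a product of per-layer Jacobians, and the identity term in the residual form keeps that product from collapsing. (Toy random Jacobians here, not any real model's.)

```python
import numpy as np

rng = np.random.default_rng(0)
d, depth = 64, 100

def sub_layer_jacobian():
    # Random Jacobian rescaled to spectral norm 0.9 (< 1),
    # mimicking a mildly contracting sub-layer.
    J = rng.standard_normal((d, d))
    return 0.9 * J / np.linalg.norm(J, 2)

g_plain = np.eye(d)   # gradient through a plain stack: product of J_l
g_resid = np.eye(d)   # gradient through a residual stack: product of (I + J_l)
for _ in range(depth):
    J = sub_layer_jacobian()
    g_plain = g_plain @ J
    g_resid = g_resid @ (np.eye(d) + J)

# ||prod J_l|| <= 0.9**100 (about 3e-5): the plain-stack gradient vanishes,
# while the identity term keeps the residual-stack gradient alive.
print(np.linalg.norm(g_plain, 2), np.linalg.norm(g_resid, 2))
```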


I had a similar experience when I found out that Claude Code can use ssh to connect to a remote server and diagnose sysadmin issues there. It just feels really empowering.


I think it's likely someone has already discovered this. It's just that the info hasn't been broadcast to the people commenting on this thread.


I can't find the source. Can anyone point to it?


The Download link in the header (https://lib25519.cr.yp.to/download.html).


This argument falls apart when you look at Rust and Cargo. uv is literally trying to be "Python's Cargo." The entire blueprint came from a flagship FOSS project.

Rust's development used a structured, community RFC process (endless planning, by your definition). The result was a famously well-designed toolchain that the entire community praises. FOSS didn't hold it back; it made it good.

So no, commercial backing isn't the only way to ship something good. FOSS is more than capable of shipping great software when done right.

