
> its scaling keeps going with no end in sight.

Not only is the end in sight, we're more or less there. o1 isn't just another 10x scale-up of parameter count to make GPT-5, because at this point on the scaling curve of parameter count versus model performance, that's no longer an effective approach.
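
For a sense of why, here's a minimal sketch of the diminishing returns, using the scaling-law fit published in the Chinchilla paper (Hoffmann et al., 2022). The coefficients below are from that paper; the parameter counts and token budget are made up, and none of this is a claim about OpenAI's actual models:

    # Chinchilla-style scaling law: L(N, D) = E + A/N^alpha + B/D^beta
    # Published fit from Hoffmann et al. (2022); purely illustrative here.
    E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

    def loss(n_params, n_tokens):
        """Predicted pretraining loss for N parameters and D training tokens."""
        return E + A / n_params**alpha + B / n_tokens**beta

    # Hypothetical 10x parameter jumps at a fixed (made-up) data budget:
    for n in (1e10, 1e11, 1e12):
        print(f"N={n:.0e}: loss={loss(n, 1e13):.3f}")

Each 10x jump in parameters buys a smaller loss improvement than the last (~0.09, then ~0.04 here), which is the sense in which pure parameter scaling runs out of road.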

I agree with the broader point: it may well be consistent with current neuroscience that our brains are doing nothing more than predicting their next inputs in a broadly similar way, and drawing any categorical distinction between AI and human intelligence seems quite challenging.
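
To make "predicting next inputs" concrete: in a language model this is just cross-entropy on the input sequence shifted by one position. A toy sketch, with a made-up vocabulary size and random logits standing in for a real model:

    import numpy as np

    vocab, seq_len = 50, 8
    rng = np.random.default_rng(0)
    tokens = rng.integers(0, vocab, size=seq_len)   # a toy "sentence"
    logits = rng.normal(size=(seq_len - 1, vocab))  # stand-in model outputs

    def next_token_loss(logits, tokens):
        """Mean cross-entropy of predicting token[t+1] from context up to t."""
        targets = tokens[1:]  # targets are the inputs shifted by one
        log_probs = logits - np.log(np.exp(logits).sum(-1, keepdims=True))
        return -log_probs[np.arange(len(targets)), targets].mean()

    print(next_token_loss(logits, tokens))  # ~log(50) = 3.9 for random logits

Whether cortex optimizes anything like this objective is, of course, exactly the open neuroscience question.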

I disagree, however, that we can draw a straight line from scaling current transformer models to AGI. A model that is great at communicating with people in natural language may not be the best at deep reasoning, abstraction, sustaining unified creative visions over long-form generation, motor control, planning, etc. The history of computer science is littered with simple extrapolations from existing technology that completely missed the need for a paradigm shift.



The fact that OpenAI created and released o1 doesn't mean they won't also keep scaling models up, or that they don't think scaling is their best hope. Plenty has been said implying that they do.

I definitely agree that AGI isn't just a matter of scaling transformers, and, as you say, that they "may not be the best" for such tasks. (Vanilla transformers are extremely inefficient.) But the really important point is that transformers can abstract, reason, form world models and theory of mind, etc., to a significant degree (a much greater degree than virtually anyone would have predicted 5-10 years ago), all learnt automatically. It shows these problems are actually tractable for connectionist machine learning, without the paradigm shift you and many others allege is needed. That is the part I disagree with. But more breakthroughs are needed.
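
On the inefficiency point, a back-of-envelope: vanilla self-attention cost grows quadratically with sequence length. The dimensions below are illustrative, not any particular model's config:

    # Rough FLOPs for the two attention matmuls (QK^T and attn x V),
    # summed over layers; ignores projections, MLPs, etc.
    d_model, n_layers = 4096, 32

    def attention_flops(seq_len):
        return n_layers * 2 * (2 * seq_len**2 * d_model)

    for n in (1_000, 10_000, 100_000):
        print(f"seq_len={n:>7,}: ~{attention_flops(n):.2e} FLOPs")

10x the context means 100x the attention FLOPs, which is part of why so much current work goes into more efficient attention variants.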


To wit: OpenAI was until quite recently investigating having TSMC build a dedicated semiconductor fab to produce OpenAI chips [1]:

(Translated from Chinese) > According to industry insiders, OpenAI originally negotiated actively with TSMC to build a dedicated wafer fab, but shelved the plan after evaluating the costs and benefits. Strategically, OpenAI instead sought cooperation with American companies such as Broadcom and Marvell to develop its own ASIC chips, and OpenAI is expected to become one of Broadcom's top four customers.

[1] https://money.udn.com/money/story/5612/8200070 (Chinese)

Even if OpenAI doesn't build its own fab -- a wise move, if you ask me -- the investment required to develop an ASIC on the very latest node is eye-watering. Most people -- even people in tech -- just don't have a good sense of how "out there" semiconductor manufacturing has become. It's basically a dark art at this point.

For instance, TSMC themselves [2] don't yet know whether the A16 node chosen by OpenAI will require the forthcoming High NA lithography machines from ASML. High NA machines cost nearly twice as much as the already exceptional Extreme Ultraviolet (EUV) machines do [3] -- close to $400M each.

I'm sure some gurus here on HN have a more up-to-date picture of A16, but the fundamental question is this: if OpenAI doesn't think scaling will be needed to get to AGI, why would they be considering spending many billions on the latest semiconductor tech?

Citations: [2] https://www.asiabusinessoutlook.com/news/tsmc-to-mass-produc... [3] https://www.phonearena.com/news/apple-paid-twice-as-much-for...



