ademeure's comments | Hacker News

There's definitely something to be said for giving interesting people a platform to express their views unconditionally. Unfortunately, that can also be a very dangerous thing. I have been less and less impressed over the years with Lex's approach here.

I'm personally very glad that Dwarkesh isn't like that. He's not perfect, but I think he's doing a way better job than other podcasters in the field right now.


This is very cool!

I've been working on something somewhat similar over the last few weeks, but trying to be much more general and arguably over-engineered! I like the scope of this project: keeping it limited to Triton and specific kinds of kernels makes it quite simple and efficient.

I'm confused by the progress graph though; it looks like it's benchmarking a 4096x4096x4096 fp16 matmul rather than a full repo, and it claims a 1.31x improvement vs cuBLAS... while running at 187 TFLOPS which is 18.9% of peak utilization? cuBLAS definitely gets much closer to peak than that - most likely it's limited by CPU overhead or something else? Benchmarking is hard!
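
If you want to sanity-check the cuBLAS baseline yourself, here's roughly how I'd time it (a minimal sketch assuming PyTorch on an NVIDIA GPU; the warm-up and iteration counts are arbitrary). Using CUDA events rather than wall-clock time keeps CPU launch overhead out of the measurement:

    import torch

    a = torch.randn(4096, 4096, dtype=torch.float16, device="cuda")
    b = torch.randn(4096, 4096, dtype=torch.float16, device="cuda")

    for _ in range(10):              # warm up so algorithm selection isn't timed
        torch.matmul(a, b)

    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    iters = 100
    start.record()
    for _ in range(iters):
        torch.matmul(a, b)           # dispatches to cuBLAS for fp16 GEMM
    end.record()
    torch.cuda.synchronize()

    ms = start.elapsed_time(end) / iters
    tflops = 2 * 4096**3 / (ms * 1e-3) / 1e12   # 2*M*N*K FLOPs per GEMM
    print(f"{tflops:.1f} TFLOPS")

On a modern datacenter GPU that should land much closer to peak than 18.9%.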

Either way I'm excited to see other people working on this, I think it's an extremely promising area over the next 6 months.


Am I the only one who doesn't like these specific voices? The quality is incredible, but they feel too cheery/enthusiastic/casual and it just gets annoying after a while.

I made an iOS shortcut a while ago that uses Siri with the ChatGPT app (it has iOS shortcut bindings) and despite Siri being a useless pile of junk compared to this, I actually prefer Siri's voice to this in some ways, because it doesn't feel so over the top.

Maybe this is partly because of different cultural expectations between the USA and Europe? Or maybe I'm just being too cynical and ChatGPT really is that happy talking with me!...


Nope, you're not the only one. I posted as well: they sound to me like your classic, well-trained call center agents: fake-friendly (but please kill me now).

Reminds me way too much of some of the people I had to talk to, when cleaning up my mother's affairs. Places trying to get me to pay bills I did not owe, call center agents "cheerfully" following scripts that they themselves hated. The voices sound exactly like that.

Give me a neutral voice. This is a computer I'm talking to, not a fake friend.


It is very much a cultural thing, the voice equivalent of decorating your Instagram. Ordering pizza after a 60h work week? Well, better make it sound like fun!!


Of all of them, Sky's voice seems the most sober. I've been using it as a Plus subscriber for several weeks now and am also very impressed.

Yes, sometimes it thinks I'm done speaking when I'm not, but on the whole it's very good. Siri, Alexa, et al. are not only unusable by comparison but now supremely frustrating.


I don’t care about the voices themselves but the speech recognition is borderline unusable sometimes. It interjects when it shouldn’t and will frequently hear things incorrectly.

At one point it misinterpreted me mentioning “tai chi” as “I can’t breathe” and responded with advice about medical emergencies.


Do you mean Siri's voice recognition? If so, 100% agreed. My iOS shortcut uses OpenAI's Whisper API for voice recognition, and Siri (English United Kingdom - Siri Voice 1) for text to speech.

I really like dictating things sometimes, and Whisper is perfect for that (automatic paragraphs inside the model itself would be nice but not a big deal).

If anyone is interested - the "Whisper speech recognition in iOS" part is based on this shortcut I found that you can easily use yourself on both iOS and MacOS (free except for the OpenAI API usage fees obviously): https://giacomomelzi.com/transcribe-audio-messages-iphone-ai...
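
Under the hood it's essentially one HTTP call; here's a rough Python equivalent of what the shortcut does (the endpoint and "whisper-1" model name are OpenAI's documented ones; the filename is made up):

    import os
    import requests

    with open("recording.m4a", "rb") as f:
        r = requests.post(
            "https://api.openai.com/v1/audio/transcriptions",
            headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
            files={"file": f},
            data={"model": "whisper-1"},
        )
    print(r.json()["text"])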


No, I mean the voice recognition in ChatGPT.

> free except for the OpenAI API usage fees

There are several versions of Whisper which have been distilled and can run locally, so I don't see what advantage making API calls has, other than increased latency and decreased reliability and data security.
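
Running it locally is only a couple of lines (a sketch assuming the openai-whisper package; distilled variants like faster-whisper work similarly):

    import whisper  # pip install openai-whisper

    model = whisper.load_model("base.en")    # small English-only model, runs on CPU
    result = model.transcribe("recording.m4a")
    print(result["text"])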


That's really interesting. Whisper is generally considered the current state of the art in STT, and I've personally never experienced errors like the ones you describe - in fact, I've never had a single error from Whisper.

First question, is there another STT you have used which works better for you?

Second question, is there any reason your voice might be considered unusual, like having a strong Welsh, Irish, or Indian accent, or being Deaf or Hard of Hearing?


Yeah, Whisper is pretty good out of the box in my experience, but the vast majority of the time I'm using it in my car. So the conditions aren't ideal, or are out of distribution for Whisper. However, CarPlay is detectable and common enough, from what I've heard.

Second, even if the transcription is correct, it cuts me off at inappropriate times. It’s hard to talk naturally without pauses.

I haven’t used a better transcription model, no.


Oh that's really interesting. Probably an acoustic environment it's not used to, like you said, but also people talk differently when they're driving. Like the cadence of our speech is significantly different because of the way our mental focus changes. I have to imagine that changes some things.


It is probably cultural or linguistic. I love audio books but I cringe when I find a book I want to listen to that has an English voice actor. I don't think it is just the accent but all the pacing and emphasis.

I also don't like most of the ChatGPT voice models, though, besides Sky. Sky to me is really good. Robertson Dean reading an audio book is perfection, but Sky is pretty awesome.

I should add that as an American there are a ton of American voice actors that ruin books for me too. Sometimes this can be fixed if played at 1.2X speed.


Sam implied OpenAI had a major breakthrough a few weeks ago in a panel yesterday:

"Like 4 times now in the history of OpenAI, the most recent time was just in the last couple of weeks, I've gotten to be in the room when we sort of like, pushed the veil of ignorance back and the frontier of discovery forward. And getting to do that is like the professional honor of a lifetime".

https://www.youtube.com/watch?v=ZFFvqRemDv8#t=13m22s

This is going to sound terrible, but I really hope this is a financial or ethical scandal about Sam Altman personally and he did something terribly wrong, because the alternative is that this is about how close we are to true AGI.

Superhuman intelligence could be a wonderful thing if done right, but the world is not ready for a fast take-off, and the governance structure of OpenAI certainly wouldn't be ready for it either it seems.


On the contrary, the video you linked to is likely to be part of the lie that ousted Altman.

He's also said very recently that to get to AGI "we need another breakthrough" (source https://garymarcus.substack.com/p/has-sam-altman-gone-full-g... )

To predicate a company as massive as OpenAI on a premise that you know not to be true seems like a big enough lie.


Fair enough, but having worked for an extremely secretive FAANG myself, "we need XYZ" is the kind of thing I'd expect to hear if you have XYZ internally but don't want to reveal it yet. It could basically mean "we need XYZ relative to the previous product" or more specifically "we need another breakthrough than LLMs, and we recently made a major breakthrough unrelated to LLMs". I'm not saying that's the case but I don't think the signal-to-noise ratio in his answer is very high.

More importantly, OpenAI's claim (whether you believe it or not) has always been that their structure is optimised towards building AGI, and that everything else including the for-profit part is just a means to that end: https://openai.com/our-structure and https://openai.com/blog/openai-lp

Either the board doesn't actually share that goal, or what you are saying shouldn't matter to them. Sam isn't an engineer; it's not his job to make the breakthrough, only to keep the lights on until they do, if you take their mission literally.

Unless you're arguing that Sam claimed they were closer to AGI to the board than they really are (rather than hiding anything from them) in order to use the not-for-profit part of the structure in a way the board disagreed with, or some other financial shenanigans?

As I said, I hope you're right, because the alternative is a lot scarier.


I think my point is different than what you're breaking down here.

The only way that OpenAI was able to sell MS and others on the 100x capped non-profit and other BS was the AGI/superintelligence narrative. Sam was that salesman. And Sam does seem to sincerely believe that AGI and superintelligence are realities on OpenAI's path, which makes him the perfect salesman.

But then... maybe that AGI conviction was oversold? To a level some would have interpreted as "less than candid," that's my claim.

Speaking as a technologist actually building AGI up from animal levels following evolution (and as a result totally discounting superintelligence), I do think Sam's AGI claims veered close enough to the edge of reality to count as lies.


Both factions in this appear publicly to see AGI as imminent, and mishandling its imminence to be an existential threat; the dispute appears to be about what to do about that imminence. If they didn't both see it as imminent, the dispute would probably be less intense.

This has something of the character of a doctrinal dispute among true believers in a millennial cult.


I totally agree.

They must be under so much crazy pressure at OpenAI that it is indeed like a cult. I'm glad to see the snake finally eat itself. Hopefully that'll return some sanity to our field.


Sam has been doing a pretty damn obvious charismatic cult leader thingy for quite a while now. The guy is dangerous as fuck and needs to be committed to an institution, not given any more money.


Why would they fire him because they are close to AGI? I get that they would go into full panic mode, but firing the CEO wouldn't make sense, since OpenAI has AGI as an objective. The board wasn't exactly unaware of that.


You're right. I was imagining that he decided to hide the breakthrough (or its full extent?) from the board and do things covertly for some reason that could warrant firing him, but that's a pretty unlikely prior: why would he hide it from the board in the first place, given AGI is literally the board's mission? One reason might be that he wants to slow down AGI progress until they've made more progress on safety and decided to hide it for that reason, and the board disagrees, but that sounds too much like a movie script to be real, and is very unlikely!

As I said, while I do have a mostly positive opinion of Sam Altman (I disagree with him on certain things, but I trust him a lot more than the vast majority of tech CEOs and politicians, and I'd rather he be in the room when true superhuman intelligence is created than them), I hope this has nothing to do with AGI and it's "just" a personal scandal.


Altman told people on Reddit that OpenAI had achieved AGI and then, when they reacted in surprise, said he was "just meming".

https://www.independent.co.uk/tech/chatgpt-ai-agi-sam-altman...

I don't really get "meme" culture but is that really how someone who believed their company is going to create AGI soon would behave? Turning the possibility of the success of their mission into a punchline?


I think they fired him because they are _not_ close to AGI (no one is), but he lied to potential investors about how close they are.

That goes against the popular sentiment about an upcoming "breakthrough", but it's also the most probable explanation given the characteristics of the approach they took.


No, we are not close to AGI. And AGIs can't leave machines yet, so humans will still be humans. This paranoia about a parroting machine is unwarranted.


You're right. AGI has been here since GPT-3 at the least.

It's honestly sad when people who have clearly not used GPT-4 call it a parroting machine. That is incredibly ignorant.


Let me know when GPT can even play chess without making invalid moves, then we can talk about how capable it is of logical thinking.
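
For what it's worth, this is easy to check automatically (a sketch; python-chess is a real library, but get_model_move() is a hypothetical stand-in for however you query the model):

    import chess

    board = chess.Board()
    while not board.is_game_over():
        san = get_model_move(board.fen())  # hypothetical: ask the LLM for a move in SAN
        try:
            board.push_san(san)  # raises ValueError if the move is illegal here
        except ValueError:
            print(f"invalid move: {san}")
            break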


Let me know when you can prove that "logical" and "intelligent" were ever stored on the same shelf, much less being meaningfully equivalent. If anything, we know that making a general intelligence (the only natural example of it we know) emulate logic is crazily inefficient and susceptible to biases that are entirely non-existent (save for bugs) in much simpler (and more energy-efficient) implementations of said logic.


An AGI that can't even play a game of chess, a game that children learn to play, without making an invalid move doesn't really sound like an AGI.


I made a few tools using nomnoml in the past, including control flow & dependency graphs for GPU assembly code. I really like it; the only negative was that there's no reliable way to push certain things to be positioned close together, so for extremely large diagrams it sometimes makes bad choices and gets a bit messy.

The code is reasonably easy to modify as well, even if it isn't fully documented, so I was able to hack in tooltips on hover and make certain boxes clickable, linking to other diagrams. I'm really grateful for this awesome tool being open source!


As a former GPU architect, that's really interesting, thanks! I didn't realise A53's caches were strictly in-order and couldn't service hits ahead of misses, I always assumed this was something even much simpler designs were capable of.

I think complexity of verification as an argument against out-of-order is questionable, because if out-of-order resulted in a better core and a competitor did manage to build and properly verify such a core, then they would have a strong competitive advantage. But that might not be true in practice given the area/power cost.

As an aside: different GPU vendors also have different limitations when it comes to in-order vs out-of-order caches, and GPUs have the extra complexity that loads are effectively doing "gather", e.g. 32-wide warps doing a load with 32 addresses that may or may not uniquify, so a single "return" to the shader processor may be anything from 1 to 32 (or even 64) cachelines.
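
To make that concrete, here's a toy illustration (my own Python sketch, not any vendor's actual hardware) of how the number of distinct cachelines behind a single 32-wide gather can vary:

    def cachelines_for_gather(addresses, line_bytes=128):
        # coalescing hardware effectively deduplicates the per-lane byte
        # addresses down to the set of distinct cachelines they touch
        return {addr // line_bytes for addr in addresses}

    # 32 lanes reading consecutive fp16 elements: a single 128B line
    print(len(cachelines_for_gather([lane * 2 for lane in range(32)])))     # 1
    # a 4KB-strided gather: 32 distinct lines for the same one instruction
    print(len(cachelines_for_gather([lane * 4096 for lane in range(32)])))  # 32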

And GPUs get even trickier with the texture unit doing trilinear + anisotropic filtering, so a single pixel may require 32x as many inputs, and you may even get into situations where the cache isn't big enough (or doesn't have enough ways) to handle the worst case, and you have to fall back to in-order for certain modes, or process things at a finer granularity than entire warps! Or just do in-order for everything with huge latency FIFOs and accept the latency cost. There are lots of different ways to handle this, also depending on what granularity of returns your shader processor can handle. As you said, both modern CPUs and GPUs can't really be defined using simple labels.

Gather makes things a lot harder for load pipelines, so I'm not surprised Zen 4 seems to still just split it into uOps, but I'm curious exactly how Intel handles it in their CPU microarchitecture. Sadly this is the kind of thing that's practically impossible to know as an outsider!


Can you recommend something to read and learn from, for an experienced hardware designer (with a bit of graphics pipeline knowledge) who wants to make their own toy GPU? The field seems exceptionally interesting and I have no idea how to get in :)

