
Few people are talking about it but... what do you think about the very over-the-top enthusiasm?

To me, it sounds like TikTok TTS; it's a bit uncomfortable to listen to. I've been working with TTS models, and they can produce much more natural-sounding speech, so it is clearly a stylistic choice.

So what do you think?



I like having that degree of expressiveness available as an option, although it would be really irritating if I were trying to use it to study academic coursework or something.

But if it's one in a range of possible stylistic flourishes and personalities, I think it's a plus.


All these language models are very malleable. They demonstrated changing the temperament during the storytelling part of the demo.


Looks like their TTS component is separate from the model. I just tried 4o, and there is a list of voices to select from. If they really only allowed that one voice or burned it into the model, then that would probably have made the model faster, but I think it would have been a blunder.


The new voice capabilities haven't rolled out yet.


Oh, very interesting. The 4o model does now have TTS with a voice option similar to the one in the video, although objectively less over the top.



