Hacker News | bkitano19's comments

Notable omission of Deepgram models in the comparisons?

Deepgram seems really good (esp. the Enhanced and Nova 3 models).

Same for Gladia; it's ranked top 1 in the STT blind tests: https://compare-stt.com/

+1 to running. If you run consistently, you'll learn to believe in your body as something that naturally improves if you train it well, and that belief will cross over to your mind and heart.


+2 for running. Running can become a nice little exercise and data collecting obsession.


You can use voice prompting; it's supported on ElevenLabs and Hume.


Awesome post!


Indeed, the title undersells it, and I'm glad I didn't skip over it. The article is basically an information-dense but approachable summary of audio generation.




Related work:

Interpreting Modular Addition in MLPs https://www.lesswrong.com/posts/cbDEjnRheYn38Dpc5/interpreti...

Paper Replication Walkthrough: Reverse-Engineering Modular Addition https://www.neelnanda.io/mechanistic-interpretability/modula...


And more recently, Language Models Use Trigonometry to Do Addition https://arxiv.org/abs/2502.00873


hume.ai specializes in expressive prosody for TTS (disclaimer - I work here)


Time to first token is just as important to know for many use cases, yet people rarely report it.
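
For anyone who wants to report it themselves, here is a rough sketch of how the number could be measured against any streaming response. It assumes only that the client exposes the stream as a Python iterable of chunks; `measure_stream` and `fake_stream` are hypothetical names made up for this example, not part of any vendor's SDK, and the fake stream stands in for a real model.

```python
import time
from typing import Iterable, Iterator, Tuple


def measure_stream(chunks: Iterable[bytes]) -> Tuple[float, float, int]:
    """Return (time_to_first_chunk_s, total_time_s, n_chunks) for a stream.

    The clock starts when iteration begins, so call this right where the
    request would be issued.
    """
    start = time.perf_counter()
    first = None
    n_chunks = 0
    for _ in chunks:
        if first is None:
            # Latency until the first token / audio chunk arrives.
            first = time.perf_counter() - start
        n_chunks += 1
    total = time.perf_counter() - start
    if first is None:
        raise ValueError("stream produced no chunks")
    return first, total, n_chunks


def fake_stream(startup_s: float = 0.05,
                per_chunk_s: float = 0.01,
                n: int = 5) -> Iterator[bytes]:
    """Hypothetical stand-in for a model: a startup delay, then n chunks."""
    time.sleep(startup_s)
    for i in range(n):
        if i:
            time.sleep(per_chunk_s)
        yield b"x"


if __name__ == "__main__":
    first, total, n_chunks = measure_stream(fake_stream())
    print(f"first chunk: {first * 1000:.0f} ms, "
          f"total: {total * 1000:.0f} ms, chunks: {n_chunks}")
```

Reporting the first-chunk latency next to the total time makes it easier to tell a model that starts quickly from one that only finishes quickly.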



this is nuts


We think so too, big things coming :)


www.juicelabs.co

