Hacker News | mnbbrown's comments

Ran it over our internal dataset of ~250 recordings of people saying British postcodes (all kinds of accents, etc.) - it's competitive for sure!

Soniox (stt-async-v4): 176/248 (71.0%)

ElevenLabs (scribe_v2): 170/248 (68.5%)

AssemblyAI (universal-3-pro): 166/248 (66.9%)

Deepgram (nova-3): 158/248 (63.7%)

AssemblyAI (universal-2): 148/248 (59.7%)

Cohere (transcribe-03-2026): 148/248 (59.7%)

Speechmatics (enhanced): 134/248 (54.0%)

P.s. how do I get this to render correctly on here?


Did you try Gladia? It's ranking #1 on the STT blind test at https://compare-stt.com/

Added Gladia.

- 1. Soniox (stt-async-v4): +176 new cases, running total 176/248 (71.0%)

- 2. ElevenLabs (scribe_v2): +26 new cases, running total 202/248 (81.5%)

- 3. Speechmatics (enhanced): +12 new cases, running total 214/248 (86.3%)

- 4. NVIDIA Parakeet (TDT 0.6B v2): +6 new cases, running total 220/248 (88.7%)

- 5. Mistral (voxtral-mini): +3 new cases, running total 223/248 (89.9%)

- 6. Gladia: +2 new cases, running total 225/248 (90.7%)

- 7. AssemblyAI (universal-2): +1 new case, running total 226/248 (91.1%)

- 8. Deepgram (nova-3): +1 new case, running total 227/248 (91.5%)

- 9. Cohere (transcribe-03-2026): +0 new cases, running total 227/248 (91.5%)

- 10. AssemblyAI (universal-3-pro): +0 new cases, running total 227/248 (91.5%)
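
The "+N new cases, running total" list above reads like a greedy coverage ordering: at each step, pick the model that correctly transcribes the most recordings nobody earlier in the list got right. That is a guess about how it was produced, not something stated in the thread. A minimal Python sketch under that assumption, where `results` (model name to set of correctly transcribed recording IDs) and the function name are hypothetical:

```python
def greedy_coverage(results: dict[str, set[str]]) -> list[tuple[str, int, int]]:
    """Order models by marginal coverage.

    Returns (model, new_cases, running_total) tuples. Ties go to whichever
    model appears first in `results`.
    """
    covered: set[str] = set()
    remaining = dict(results)  # shallow copy so the caller's dict is untouched
    ranking = []
    while remaining:
        # Pick the model that adds the most not-yet-covered recordings.
        name = max(remaining, key=lambda m: len(remaining[m] - covered))
        gained = remaining.pop(name) - covered
        covered |= gained
        ranking.append((name, len(gained), len(covered)))
    return ranking


# Toy data, not the real benchmark results.
toy = {
    "a": {"r1", "r2", "r3"},
    "b": {"r3", "r4"},
    "c": {"r1", "r4", "r5"},
}
for name, new_cases, total in greedy_coverage(toy):
    print(f"{name}: +{new_cases} new cases, running total {total}")
```

This also explains why a model with a low solo score can sit high in such a list: what counts is which recordings it gets right that the others miss, not how many it gets right overall.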


This benchmark should have Whisper large-v3 as one of the models.

Try two newlines between each one

That, or add 4 spaces before each line (renders as a <pre>).

Two spaces: https://news.ycombinator.com/formatdoc

It's for code though, not lists or bullet points.


Is the human baseline 248/248?

Assuming all the accents are British, I doubt it. I probably couldn't get all 248 myself.

They are all transcribed by multiple blinded "accent natives". But yes, your point is valid - going to see if I can tease out the "single person accuracy".

Incredible! Competitive with (if not better than) Deepgram nova-3, and much better than AssemblyAI and ElevenLabs in basically all cases on our internal streaming benchmark.

The dataset is ~100 8kHz call recordings with gnarly UK accents (which I consider to be the final boss of English-language ASR). It seems like it's SOTA.

Where it does fall down seems to be the latency distribution, but I'm testing against the API. Running it locally would presumably improve that?


Elyos (https://elyos.ai) | London, UK | ONSITE | YC S23

Company: We're the UK's fastest growing AI voice company (~30% MoM) - very deeply focused on the trades industry right now (think your friendly neighbourhood plumbing company).

We're a small but experienced team hiring 3 more to join:

- Founding Engineer

- Founding SDR

- Founding Operations Lead (customer success)

Tech: GKE, Python, Postgres, telephony/media streaming, realtime LLMs, contextual end-of-utterance (EOU) detection, the occasional REST API.

https://careers.elyos.ai/ or message me - matt (at) elyos.ai

Mention bananas for extra kudos.


As someone who’s from Brisbane but spent the last 7 years in London you’re 100% correct. Brisbane is the best city in the world. I’m excited to eventually move back.


Decarbonising buildings and massive warehouses..

It's a very fun mix of hardware (for data collection), and crazy SQL queries to model energy flows between buildings, solar, batteries, etc. Considering just one building is pretty easy:

consumption = imported - exported + generated - stored + dispatched

carbon = carbon intensity * imported

cost = tariff * imported
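
For anyone who wants to poke at the single-building case, here is a minimal Python sketch of the three formulas above. The field names, units (kWh, gCO2/kWh, currency per kWh) and example numbers are all assumptions for illustration, not the real model:

```python
from dataclasses import dataclass


@dataclass
class Interval:
    """Energy flows for one building over one metering interval (assumed kWh)."""
    imported: float    # drawn from the grid
    exported: float    # sent back to the grid
    generated: float   # on-site solar generation
    stored: float      # charged into the battery
    dispatched: float  # discharged from the battery


def consumption(i: Interval) -> float:
    return i.imported - i.exported + i.generated - i.stored + i.dispatched


def carbon(i: Interval, carbon_intensity: float) -> float:
    # carbon_intensity is assumed to be gCO2 per imported kWh
    return carbon_intensity * i.imported


def cost(i: Interval, tariff: float) -> float:
    # tariff is assumed to be a flat price per imported kWh
    return tariff * i.imported


# Made-up numbers for a single half-hour interval.
interval = Interval(imported=10.0, exported=2.0, generated=5.0, stored=1.0, dispatched=0.5)
print(consumption(interval))               # 12.5
print(carbon(interval, carbon_intensity=200.0))  # 2000.0
print(cost(interval, tariff=0.25))         # 2.5
```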

But then you add a site with a couple of buildings, solar on one of them, grid-limited exports, etc., and modelling these flows gets challenging. Consider the case where one building gets 10% of its imported power from another building's excess solar: calculating carbon becomes more difficult.
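
To make the shared-solar case concrete, here is one possible way to attribute carbon when a building's imports come from more than one source. This is only a sketch: the function name is hypothetical, the numbers are made up, and treating the neighbour's excess solar as zero-carbon is an assumption (a real model might assign it a non-zero intensity).

```python
def attributed_carbon(sources: list[tuple[float, float]]) -> float:
    """Sum kWh * carbon intensity over each supply source for one building.

    Each source is (energy_kwh, intensity_g_per_kwh).
    """
    return sum(kwh * intensity for kwh, intensity in sources)


GRID_INTENSITY = 200.0  # gCO2/kWh, made-up value
SOLAR_INTENSITY = 0.0   # ASSUMPTION: neighbour's excess solar counted as zero-carbon

# Naive: treat all 100 kWh of imports as grid power.
naive = attributed_carbon([(100.0, GRID_INTENSITY)])

# Split: 90 kWh from the grid, 10 kWh (10%) from the other building's solar.
split = attributed_carbon([(90.0, GRID_INTENSITY), (10.0, SOLAR_INTENSITY)])

print(naive, split)  # 20000.0 18000.0
```

The hard part in practice is knowing the split per interval, since the meter on the importing building only sees total imports.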

And once you've figured all that out, you then have to work out what makes commercial sense to do next: install a battery, expand solar, move onto a TOU tariff, do nothing. That's a whole other world of optimisation problems.


Also somewhat working in this space. Building a BMS (building management system) to manage and control everything in commercial buildings. Think HomeKit for commercial. Something like 70% of buildings don't use one, and they could be much more environmentally friendly.

Utilizing https://project-haystack.org/


That’s cool!

Very familiar with BMSs but the lack of open standards and protocols has been extremely frustrating - makes me appreciate how good we have it with HTTP, etc.

Lots say they support BACnet but that’s only if they’ve been configured and the points exported, etc.

Haystack is a great step forward for labelling too but adoption seems filled with complexity :)


Have had to implement the BACnet spec for scheduling, and wow, that BACnet Standards PDF is huge :'}

Haystack definitely has its challenges. My main concern is it's not very client-side friendly when attempting to use haystack-core types. But it's a cool framework.

Would love some modern toolsets in the space.


Could ask for a username the first time they publish. Low friction


GoCardless (YC11) | Senior Software Engineer | Full-time | London or Riga | https://boards.greenhouse.io/gocardless

GoCardless is used for domestic and international payments by 75,000+ organisations and counting, processing more than $30 billion across 30 countries.

Come work on new products, or gnarly scaling challenges.

Stack: Ruby, Rails, PostgreSQL, GCP, React

https://boards.greenhouse.io/gocardless or DM me.


We have something similar https://github.com/gocardless/nandi

It does signature checking and some other helpful things.


Website is out of date I think. It’s been open for a couple of weeks.



Thanks, I also found https://www.cnbc.com/2021/04/28/worlds-first-floating-sky-po.... So, unlike what some of the renderings suggest, it's not an "infinity pool" but rather has high glass walls on the sides, which was one of the things I was wondering about. It also has those two... cables? running below that aren't in the renderings.


seems like tension cables to pre-stress the structure


Interested why you use the coldest colour for waking up? I do the same, but use a sunrise simulation instead that goes from warm to cold.


Hm, maybe it's self-inflicted torture. I just feel uncomfortable with the blueish light and will get up quickly to leave that behind me. Maybe I will give your sunrise simulation a try, might work just as well, not sure.


Cold light helps you wake up in the morning, since it's more stimulating. That being said, it shouldn't be harsh like midday light; closer to an in between. I've heard it described as more of a green light than blue.

