
> Some sources mention that o3 scores 63.8 on SWE-bench, while Gemini 2.5 Pro scores 69.1.

It's the opposite. o3 scores higher.



On SWE-bench? Show your source.



