
Then how do you know it got it right?


I was given a battery of externally produced tests, benchmark scripts, etc., and was told to assume that the tests were comprehensive.

Independent of this, I used competing models produced by different organizations (e.g. OpenAI vs. Google) to test & verify each other's work.
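
To make the cross-checking idea concrete, here is a minimal sketch of how it could be wired up. Everything in it is an assumption for illustration: `cross_verify`, the `Solver`/`Reviewer` types, and the toy "models" are hypothetical stand-ins, not any vendor's API and not necessarily how I ran it.

```python
from typing import Callable, Dict

# Hypothetical stand-ins: a solver produces an answer for a task,
# a reviewer judges whether a given answer to that task is acceptable.
Solver = Callable[[str], str]
Reviewer = Callable[[str, str], bool]


def cross_verify(
    task: str,
    solvers: Dict[str, Solver],
    reviewers: Dict[str, Reviewer],
) -> Dict[str, bool]:
    """Have each model's answer judged by every *other* model.

    Returns a map from solver name to True only if at least one other
    reviewer exists and all other reviewers accept the answer.
    """
    verdicts: Dict[str, bool] = {}
    for name, solve in solvers.items():
        answer = solve(task)
        # Exclude the model's own reviewer so it never grades itself.
        others = [rev for r_name, rev in reviewers.items() if r_name != name]
        verdicts[name] = bool(others) and all(rev(task, answer) for rev in others)
    return verdicts


# Toy usage: fake "models" answering a fixed arithmetic task.
toy_solvers: Dict[str, Solver] = {
    "model_a": lambda task: "4",  # correct answer to "2+2"
    "model_b": lambda task: "5",  # deliberately wrong
}
toy_reviewers: Dict[str, Reviewer] = {
    "model_a": lambda task, answer: answer == "4",
    "model_b": lambda task, answer: answer == "4",
}

result = cross_verify("2+2", toy_solvers, toy_reviewers)
print(result)
```

The point of excluding a model's own reviewer is that agreement only counts when it comes from an independently built system. It does not help if the models share the same blind spot, which is why the external tests still matter.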

I could also follow along with the math itself, to some extent.



