Then how do you know it got it right?

acuozzo · 2025-10-02T19:44:15

I was provided with a battery of externally-produced tests, benchmark scripts, etc. I was told to assume that the tests were comprehensive.

Independent of this, I used competing models produced by different organizations (e.g. OpenAI vs. Google) to test & verify each other's work.
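To make the cross-checking idea concrete, here is a rough sketch of what comparing two independently produced implementations can look like. It is not the actual harness I used: `impl_a` and `impl_b` are hypothetical stand-ins for code written by two different models, and the function being computed, the input range, and the tolerances are all assumptions picked for illustration.

```python
import math
import random

def impl_a(x: float) -> float:
    # Hypothetical stand-in for the implementation produced by model A.
    return math.log1p(x)

def impl_b(x: float) -> float:
    # Hypothetical stand-in for the implementation produced by model B.
    return math.log(1.0 + x)

def cross_check(f, g, inputs, rel_tol=1e-9, abs_tol=1e-12):
    """Return the inputs on which the two implementations disagree."""
    return [
        x for x in inputs
        if not math.isclose(f(x), g(x), rel_tol=rel_tol, abs_tol=abs_tol)
    ]

# Fixed seed so a disagreement can be reproduced and handed back to either model.
rng = random.Random(0)
inputs = [rng.uniform(0.0, 10.0) for _ in range(1000)]

disagreements = cross_check(impl_a, impl_b, inputs)
print(f"{len(disagreements)} disagreements out of {len(inputs)} inputs")
```

Agreement between the two only shows they are consistent with each other, not that either is correct, which is why this was a supplement to the provided test battery rather than a replacement for it.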

I also could, somewhat, follow along with the math itself.