It doesn't get the actual optimal Q values from Stockfish (computing those would presumably take unbounded compute). Instead, it gets estimates obtained by polling Stockfish for only 50ms.

So you're estimating, from data, a function which is itself not necessarily optimal. Moreover, the point is more to see how far we can get with a really generic transformer architecture that, unlike Stockfish, is not tuned to the domain-specific details of the problem.
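
One way to make that precise, as a rough sketch: the notation here is my own assumption, not taken from the paper. Write Q^* for the true optimal action values, \hat{Q}_{50ms} for the 50ms Stockfish estimates used as labels, and Q_\theta for the transformer's prediction. The triangle inequality then splits the model's error into a fitting term and a label-quality term:

```latex
\[
\bigl| Q_\theta(s,a) - Q^*(s,a) \bigr|
\;\le\;
\underbrace{\bigl| Q_\theta(s,a) - \hat{Q}_{50\mathrm{ms}}(s,a) \bigr|}_{\text{how well the model fits its labels}}
\;+\;
\underbrace{\bigl| \hat{Q}_{50\mathrm{ms}}(s,a) - Q^*(s,a) \bigr|}_{\text{how far the 50ms labels are from optimal}}
\]
```

Training only acts on the first term directly; the second depends on the labeling budget, not on the model.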


