
For those who want to dive deeper, here’s a 300 LOC implementation of GRPO in pure NumPy: https://github.com/superlinear-ai/microGRPO

The implementation learns to play Battleship in about 2000 steps, pretty neat!
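
For anyone unfamiliar with GRPO, the core idea is that each sampled completion's reward is scored relative to the other completions in its group, rather than against a learned value function. Below is a minimal NumPy sketch of just that advantage step. It is not taken from the microGRPO repo; the function name `group_relative_advantages`, the `eps` parameter, and the example rewards are my own assumptions for illustration.

```python
import numpy as np


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within each group (hypothetical helper, not from microGRPO).

    `rewards` has the group of sampled completions on the last axis.
    Each reward is centered on its group's mean and scaled by its group's std.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    mean = rewards.mean(axis=-1, keepdims=True)
    std = rewards.std(axis=-1, keepdims=True)
    # eps avoids division by zero when every reward in a group is identical.
    return (rewards - mean) / (std + eps)


# Made-up rewards for one prompt with four sampled completions.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Completions that beat their group's average get a positive advantage and the rest get a negative one. The full algorithm then feeds these into a clipped policy-gradient loss, which this sketch leaves out.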


