
All of the above. The most frustrating one in the Putnam example with Claude was that it generated solutions that obviously didn't compile. This feels like plan collapse: the model isn't verifying its own work. I'm sure that even with a dumb two-model setup it would eventually get to compiling code after n runs, but that was just this one failure mode.
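For what it's worth, the "dumb" retry setup could look roughly like the sketch below. Everything here is hypothetical: `generate` stands in for whatever model call you use, and Python's built-in `compile()` is only a stand-in for the real compiler or proof checker. It only catches the "doesn't compile" failure, nothing else.

```python
from typing import Callable, Optional, Tuple


def compile_error(source: str) -> Optional[str]:
    """Return None if the source compiles, otherwise a short error message.

    Stand-in checker: uses Python's built-in compile(); swap in your real
    compiler or proof checker.
    """
    try:
        compile(source, "<candidate>", "exec")
        return None
    except SyntaxError as err:
        return f"{err.msg} (line {err.lineno})"


def generate_until_compiles(
    generate: Callable[[str], str],  # hypothetical model call: prompt -> source
    task: str,
    max_runs: int = 5,
) -> Tuple[Optional[str], int]:
    """Re-prompt with the compile error until a candidate compiles or runs run out."""
    feedback = ""
    for attempt in range(1, max_runs + 1):
        candidate = generate(task + feedback)
        error = compile_error(candidate)
        if error is None:
            return candidate, attempt
        # Feed the failure back so the next run can react to it.
        feedback = f"\nPrevious attempt failed to compile: {error}"
    return None, max_runs
```

The second "model" here is really just the checker, which is about as dumb as a two-model setup gets.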



You can use hooks to prevent it from stopping until the build succeeds.
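A minimal sketch of such a hook script, assuming your agent tool runs a command when the model tries to stop and treats a specific exit code as "don't stop yet". The exit code 2, the default `make` command, and the idea that stderr gets shown back to the model are all assumptions here, so check them against your tool's hook docs.

```python
import subprocess
import sys
from typing import List, Tuple

BLOCK_EXIT_CODE = 2  # assumption: the exit code your tool reads as "keep working"


def check_build(build_cmd: List[str]) -> Tuple[int, str]:
    """Run the build; return (0, "") on success, else (BLOCK_EXIT_CODE, log tail)."""
    try:
        result = subprocess.run(build_cmd, capture_output=True, text=True)
    except OSError as err:
        # Build command missing or not executable: block and say why.
        return BLOCK_EXIT_CODE, f"could not run {build_cmd!r}: {err}"
    if result.returncode == 0:
        return 0, ""
    # Keep only the tail of the log so the feedback stays short.
    return BLOCK_EXIT_CODE, (result.stdout + result.stderr)[-2000:]


def main(argv: List[str]) -> int:
    """Hook entry point; argv is the build command (hypothetical default: make)."""
    code, log = check_build(argv or ["make"])
    if log:
        print(log, file=sys.stderr)
    return code
```

To use it as a script, end the file with `sys.exit(main(sys.argv[1:]))` and register it as the stop hook with your real build command as arguments.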


