

jtefera 11 months ago | on: 30% drop in O1-preview accuracy when Putnam proble...

The paper mentions that on several occasions the LLM gives a correct answer but either makes big jumps without justifying them or takes illogical steps and still ends up with the right solution. Did you check for that?

e1g 11 months ago

No, I don't know enough math to test the logic. I only checked the questions against their expected answers in https://anonymous.4open.science/r/putnam-axiom-B57C/data/Put...

zeroonetwothree 11 months ago

Putnam problems need to actually be graded; often the answer itself is trivial.
