Hacker News
jtefera 11 months ago | on: 30% drop in O1-preview accuracy when Putnam proble...
The paper mentions that on several occasions the LLM provides a correct answer but either takes big jumps without justifying them or takes illogical steps and still ends up with the right solution. Did you check for that?
e1g 11 months ago
No, I don't know enough math to test the logic, only enough to check the questions against their expected answers in https://anonymous.4open.science/r/putnam-axiom-B57C/data/Put...
zeroonetwothree 11 months ago
Putnam problems need to actually be graded; often the answer itself is trivial.