
It certainly feels like certain patterns are hardcoded special cases, particularly to do with math.

"Solve (1503+5171)*(9494-4823)" reliably gets the correct answer from ChatGPT

"Write a poem about the solution to (1503+5171)*(9494-4823)" hallucinates an incorrect answer though

That suggests to me that they've papered over the model's inability to do basic math, but it's a hack that doesn't generalize beyond the simplest cases.



There are a few things that could be going on here that seem more likely than "hardcoded".

1. The part of the network that does complex math and the part that writes poetry are overlapping in strange ways.

2. Most of the models nowadays are assumed to be some mixture of experts, so it's possible that asking for the answer as a poem activates a different part of the model.


Watch for ChatGPT or Claude saying "analyzing" - that means they've identified that they need to run a calculation and have outsourced it to Python (ChatGPT) or JavaScript (Claude).
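For reference, the kind of tool call being described is trivial - roughly what the model would hand off to its Python interpreter (a minimal sketch; the exact code the tool generates isn't visible to us):

```python
# The arithmetic from the original prompt, evaluated directly.
result = (1503 + 5171) * (9494 - 4823)
print(result)  # 6674 * 4671 = 31174254
```

So the failure in the poem case isn't that the math is hard - it's that the model apparently never decides to invoke the tool at all.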

The poem framing probably causes them not to decide to use those tools.



To be clear, I was testing with 4o; good to know that o1 has a better grasp of basic arithmetic. Regardless, my point was less about the model's ability to do math and more about OpenAI seeming to cover up its lack of ability.


I think it's mostly that o1-mini can think through the solution before it starts writing the poem.

I'm able to reproduce your failure on 4o.


"A poem about" reads, to me at least, like the solution need not appear in the answer; maybe try something like "a poem that includes the answer in the last stanza".


Yeah, but it actually gets the answer wrong - it doesn't just omit it.



