The issue is that both the harness and the specific model matter a lot in what type of instruction works best. If you used Anthropic's models with the prompting style that works best for Codex and GPT models, you'd get much worse results than if you used GPT models with Codex, prompted the way GPTs respond to best.
I don't think people realize exactly how important the specific prompts are. With the same prompt you'd get wildly different results from different models, and when you're iterating on a prompt (say, for some processing task), you'd make different changes depending on which model is being used.
Having experimented with soft-linking AGENTS.md into CLAUDE.md and GEMINI.md, this lines up well with my experience. I now just let each one maintain its own file and don't try to combine them. If it's something like my custom "## Agent Instructions" section, I just copy-pasta it, and that hasn't been hard: since that section is mostly identical, I treat AGENTS.md as canonical and copy/paste any changes over to the others.
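For anyone who'd rather script that copy/paste step, here is a rough sketch in Python. It assumes the shared section is delimited by a level-2 heading and runs until the next "## " heading or end of file. The function names (`section_bounds`, `sync_section`, `sync_files`) are made up for illustration and not part of any tool.

```python
from pathlib import Path

# Assumption: the shared block lives under this exact heading in every file.
SECTION_HEADING = "## Agent Instructions"


def section_bounds(lines, heading=SECTION_HEADING):
    """Return (start, end) line indices of the section, or None if absent.

    The section runs from the heading line up to (not including) the next
    level-2 heading, or to the end of the file.
    """
    for start, line in enumerate(lines):
        if line.strip() == heading:
            for end in range(start + 1, len(lines)):
                if lines[end].startswith("## "):
                    return start, end
            return start, len(lines)
    return None


def sync_section(canonical, target, heading=SECTION_HEADING):
    """Copy the section from `canonical` text into `target` text.

    Everything else in `target` is left alone. If `target` lacks the
    section, it is appended at the end.
    """
    src = canonical.splitlines()
    dst = target.splitlines()
    src_bounds = section_bounds(src, heading)
    if src_bounds is None:
        return target  # nothing to copy
    section = src[src_bounds[0]:src_bounds[1]]
    dst_bounds = section_bounds(dst, heading)
    if dst_bounds is None:
        merged = dst + [""] + section if dst else section
    else:
        merged = dst[:dst_bounds[0]] + section + dst[dst_bounds[1]:]
    return "\n".join(merged).rstrip("\n") + "\n"


def sync_files(canonical_path, target_paths):
    """Hypothetical helper: apply sync_section to files on disk."""
    canonical = Path(canonical_path).read_text(encoding="utf-8")
    for path in map(Path, target_paths):
        current = path.read_text(encoding="utf-8") if path.exists() else ""
        path.write_text(sync_section(canonical, current), encoding="utf-8")
```

Usage would be something like `sync_files("AGENTS.md", ["CLAUDE.md", "GEMINI.md"])`. The model-specific parts of each file stay untouched, which is the point of not symlinking.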
If you create a CLAUDE.md whose contents are @AGENTS.md, that works, and it has the added benefit of letting you add Claude-specific instructions below it.
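To make that layout concrete, here is a minimal sketch that renders such a CLAUDE.md as a string: the `@AGENTS.md` line first, then a Claude-only section. The function name `build_claude_md`, the "## Claude-specific instructions" heading, and the example bullet are all made up for illustration. Only the `@AGENTS.md` line comes from the comment above.

```python
def build_claude_md(extra_instructions):
    """Render CLAUDE.md text: import AGENTS.md, then Claude-only notes.

    `extra_instructions` is a list of plain strings, one per bullet.
    The section heading below is an arbitrary choice, not a convention.
    """
    lines = ["@AGENTS.md"]
    if extra_instructions:
        lines += ["", "## Claude-specific instructions", ""]
        lines += [f"- {item}" for item in extra_instructions]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Prints the file contents; redirect to CLAUDE.md if it looks right.
    print(build_claude_md(["Keep commit messages under 72 characters."]), end="")
```

The shared rules live in one place and the file only carries what differs for Claude, so there is nothing to keep in sync by hand.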
I think one of the main examples I saw, in a swyx article a while back, is that the sort of ALL CAPS and *IMPORTANT* language that works decently with Claude will actually detune the Codex models and make them perform worse. I'll see if I can find the post.
Because that just does it for you, it doesn't help me understand how to write better prompts.
Actually, I can just read the skill with my own eyes and then I can also learn. So, thank you for sharing. It's interesting to read through what it suggests for different models - it fits for the ones I work with regularly, but there are many I don't know the strengths and weaknesses of.