Actually I think it's sometimes harmful to take the maths too seriously. There are three parts to the ideal paper:
1. Describe a new technique;
2. Show that it works;
3. Explain why it works.
Understanding why things work is easily the hardest part. This is where most of the maths gets deployed. But people often reach for fancier maths when they can't find a simpler intuition behind the idea. Fancier analysis can also be used to make up for less impressive empirical results. These explanations might convince reviewers, but that doesn't make them any more likely to be correct.
I find it effective to take a very "computer's eye view" of things. Instead of thinking primarily about the formalisation, I mostly think about what's being computed. What sort of information is flowing around, during both the prediction and the updates? What dynamics emerge?