One whole technique not mentioned in the paper or comments is bitslicing. For non-branching code (e.g. symmetric ciphers) it's guaranteed constant-time and it would be a remarkable compiler indeed which could introduce optimizations and timing variations to bit-sliced code...