This is usually one of the first things I try to add to projects that have stranded because of reliability issues or because there is code that nobody dares to touch for fear of upsetting some fragile balance.
It can be quite a bit of work, but the pay-off cannot be overestimated.
The problem is that the programmers there can't make a ratchet: a monotonically improving codebase where each commit is like the click of a ratchet. If you know your invariants hold then it is fairly easy to establish whether or not a change is an improvement.
Test cases can supply this function, as can counters set up for that purpose (statsd or something to that effect).
Once you have enough of those you can (slowly) start to make changes and observe the effects, and once you understand the codebase you can add more invariants that you have now determined should hold.
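
To make the counter idea concrete, here is a rough sketch of what I mean, not code from any real project. Everything in it is made up for illustration: `check_invariant`, `ratchet_holds`, `apply_payment` and the invariant name are hypothetical, and a plain in-process `Counter` stands in for statsd or whatever metrics backend you actually use.

```python
from collections import Counter

# Stand-in for a statsd-style client (assumption); a real setup would
# ship these counts to a metrics backend instead of keeping them in memory.
violations = Counter()

def check_invariant(name, condition):
    """Count a violation instead of raising, so the running system is not disturbed."""
    if not condition:
        violations[name] += 1
    return condition

def ratchet_holds(baseline, current):
    """A change passes only if no invariant is violated more often than in the baseline."""
    return all(current[name] <= baseline[name] for name in current)

def apply_payment(balance, amount):
    # Hypothetical piece of legacy code being observed.
    new_balance = balance - amount
    check_invariant("balance_non_negative", new_balance >= 0)
    return new_balance

apply_payment(100, 30)   # invariant holds
apply_payment(20, 50)    # invariant violated, counted but not fatal
```

You record the counts before a change, run the same workload after it, and only accept the change if `ratchet_holds(before, after)` is true. Counting rather than asserting is a deliberate choice here: in a fragile codebase you usually want to observe first and tighten later.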
The last project where I took that approach (about 4 years ago now) went from 'intractable' to 'stable' in a relatively short time but it required a lot of thinking and some really hard work to get it there.
Much better if your language/platform supports that sort of thing out of the box, even better if it can be done across subsystems.