He gives reduce as an example of the claim that "purely functional programs have no side effects and are thus trivially parallelizable", but reduce is not, by definition, trivially parallelizable.
Reduce is trivially parallelizable if you can ensure your operation is associative, but you have to work in languages like Epigram or Agda for the compiler to guarantee that.
No, you just have to make sure the function you pass is associative.
Only in really strongly typed languages like Agda can you make the associativity constraint /part of the type/ of the function, and thus have it statically checked by the compiler; but it's perfectly reasonable to make it part of the specification of the function's behavior that it expects its argument to be associative. (Or, slightly more generally, to specify that the order of reduction is unspecified, in which case only an associative function will produce a deterministic result.)
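To make the point concrete, here's a small Python sketch (my own illustration, not anyone's actual implementation): a chunked reduce that combines per-chunk partial results the way parallel workers would. With an associative operation the grouping doesn't matter; with a non-associative one like subtraction, it does.

```python
from functools import reduce
from operator import add, sub

def chunked_reduce(f, xs, chunks=4):
    """Reduce each chunk independently (as parallel workers would),
    then combine the per-chunk results.
    Only produces the same answer as a sequential reduce when f is associative."""
    n = max(1, len(xs) // chunks)
    partials = [reduce(f, xs[i:i + n]) for i in range(0, len(xs), n)]
    return reduce(f, partials)

xs = list(range(1, 17))

# Addition is associative: any grouping gives the same answer.
assert chunked_reduce(add, xs) == reduce(add, xs)

# Subtraction is not: regrouping changes the result.
print(chunked_reduce(sub, xs), reduce(sub, xs))  # two different numbers
```

This is why a deterministic parallel reduce either demands an associative argument as part of its contract, or must give up on a specified reduction order.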
Even when you can't have a _proof_ of associativity in the type, it's perfectly reasonable to document the associativity requirement in types -- e.g. the Monoid typeclass in Haskell.
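A rough Python analogue of documenting-but-not-proving the requirement, sketched with typing.Protocol (the names Monoid, Sum, and mconcat are my own, loosely modeled on Haskell's, not from any library). The type states the associativity contract in its docstring; nothing checks it.

```python
from typing import Protocol, TypeVar

T = TypeVar("T")

class Monoid(Protocol[T]):
    """Implementations are expected (but not statically verified) to satisfy:
      combine(a, combine(b, c)) == combine(combine(a, b), c)   -- associativity
      combine(identity(), a) == a == combine(a, identity())    -- identity laws
    """
    def identity(self) -> T: ...
    def combine(self, a: T, b: T) -> T: ...

class Sum:
    """Integers under addition: an associative operation with identity 0."""
    def identity(self) -> int:
        return 0
    def combine(self, a: int, b: int) -> int:
        return a + b

def mconcat(m: Monoid[T], xs: list[T]) -> T:
    # The Monoid contract licenses folding in any order; this one is sequential.
    out = m.identity()
    for x in xs:
        out = m.combine(out, x)
    return out

print(mconcat(Sum(), [1, 2, 3, 4]))  # → 10
```

As with Haskell's Monoid, the laws live in documentation and convention; only a dependently typed language can move them into the type itself.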
On a related note, I would like to see a "perverse" compiler and standard library which exhibit random behavior wherever the language leaves behavior undefined.