Often there is one big thread running that can't scale to multiple cores. So at least one core has to be clocked high, and higher clocks generally require higher voltage, which in turn means more power consumption and higher temperatures. It doesn't matter that you have 15 other cores to spare unless the application can split its serial workload into parallel chunks.
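To put rough numbers on this, here is a minimal sketch using two textbook approximations that are not part of the original comment: Amdahl's law for the speedup limit of a partly serial workload, and the first-order CMOS dynamic power relation (power roughly proportional to V² · f). The function names and all input values are illustrative assumptions, not measurements from any real chip.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def relative_dynamic_power(freq_ratio: float, volt_ratio: float) -> float:
    """Dynamic power relative to a baseline, using P ~ C * V^2 * f.

    First-order approximation only: leakage and other effects are ignored.
    """
    return volt_ratio ** 2 * freq_ratio


if __name__ == "__main__":
    # Assumed workload: 80% of the time is spent in one serial main thread.
    print(f"speedup on 16 cores: {amdahl_speedup(0.2, 16):.2f}x")

    # Assumed operating point: +20% clock that needs +10% voltage.
    print(f"relative power: {relative_dynamic_power(1.2, 1.1):.2f}x")
```

With these assumed inputs, 16 cores give only about a 1.23x speedup, while pushing one core 20% faster costs roughly 45% more dynamic power. That is the trade-off described above: the spare cores barely help, so the load lands on clock speed and voltage instead.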
Surprisingly many demanding applications, such as high-end mobile games, are essentially single-threaded: there is one big main thread plus a number of helper threads that each carry only a fraction of the load. They therefore produce asymmetric loads and also very serialized execution between CPU and GPU, which is exactly the case where managing power is most difficult.
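As a toy model of why serialized CPU/GPU execution hurts, the sketch below compares a frame where the GPU waits for the CPU to finish against one where the two overlap. The function name, the timings, and the simple "sum vs. max" model are assumptions made for illustration; real frame pipelines are more complicated.

```python
def frame_time_ms(cpu_ms: float, gpu_ms: float, serialized: bool) -> float:
    """Frame time under a simple model of CPU/GPU scheduling.

    serialized=True:  the GPU starts only after the CPU finishes (times add up).
    serialized=False: CPU and GPU work fully overlaps (the slower one dominates).
    """
    if serialized:
        return cpu_ms + gpu_ms
    return max(cpu_ms, gpu_ms)


if __name__ == "__main__":
    cpu_ms, gpu_ms = 10.0, 6.0  # assumed per-frame work, in milliseconds
    for serialized in (True, False):
        total = frame_time_ms(cpu_ms, gpu_ms, serialized)
        label = "serialized" if serialized else "overlapped"
        print(f"{label}: {total:.1f} ms per frame")
```

In the serialized case each unit sits idle while the other works, so hitting the same frame deadline requires both to finish their share faster, which again pushes toward higher clocks.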