I just realized: we would need a circuit polling each of the jobs to see when it finishes, and I don't see how it could poll all of the jobs every clock cycle. So I don't think the per-cycle resolution I suggested above is achievable.
Furthermore, the polling circuit would have to poll each of the n tasks n times, leading us finally back to a running time of O(n^2).
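To make the counting concrete, here is a rough sketch in Python. It is only a toy model, not a real circuit: the function name `count_polls` and the assumption that each task's finish cycle is known up front (and that the finish cycles are a permutation of 1..n) are mine, purely for illustration. The poller checks every task once per cycle until all of them are done.

```python
def count_polls(finish_times):
    """Count the status checks a sequential poller makes.

    finish_times[i] is the clock cycle on which task i completes.
    This is a hypothetical model chosen for illustration only.
    """
    done = [False] * len(finish_times)
    order = []   # tasks in the order the poller notices them finishing
    polls = 0
    cycle = 0
    while not all(done):
        cycle += 1
        # One pass over every task per cycle: n checks each time.
        for task, finish in enumerate(finish_times):
            polls += 1
            if not done[task] and finish <= cycle:
                done[task] = True
                order.append(task)
    return polls, order


polls, order = count_polls([3, 1, 4, 2])
print(polls, order)  # 4 tasks over 4 cycles: 16 polls, order [1, 3, 0, 2]
```

With n tasks finishing over n cycles, the inner loop runs n times per cycle, so the total is n * n checks. Skipping tasks already seen as finished would roughly halve that, but it stays quadratic.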
Still a fun thought experiment, though.