Having a multi-hour delay in retrieval lets them move it into their off-peak hours. Since their bandwidth costs are probably calculated off their peak usage, a service that operates entirely in the shadow of that peak has little to no incremental cost to them.
I'm not talking about load so much as network traffic. Companies like Amazon and Google have huge peak-hour outbound traffic during US waking hours, and then a huge dip during off-hours. If they can push more of that traffic into the off-hours, they can make the marginal cost of the bandwidth basically zero.
So if you make a request at peak hours (say 12 noon ET), they just make you wait until 11 PM ET to start downloading, shifting all that bandwidth off their peak.
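Concretely, that "wait until off-peak" policy is a few lines of scheduling logic. A minimal Python sketch - the 11 PM window start is just the example number above, not anything Amazon publishes:

```python
from datetime import datetime, timedelta, time

# Hypothetical off-peak window start (11 PM ET, per the example above).
OFF_PEAK_START = time(hour=23)

def next_off_peak_start(requested_at: datetime) -> datetime:
    """Delay a retrieval so it begins at the next off-peak window."""
    start = requested_at.replace(hour=OFF_PEAK_START.hour, minute=0,
                                 second=0, microsecond=0)
    if requested_at >= start:
        # Already past today's window start; wait for tomorrow's.
        start += timedelta(days=1)
    return start

# A request at 12 noon gets queued until 11 PM the same day.
print(next_off_peak_start(datetime(2012, 8, 21, 12, 0)))  # 2012-08-21 23:00:00
```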
Even if you're just "flattening" the peak, that's a major cost reduction for both CPU and bandwidth, since in most cases their cost is driven by peak usage, not average usage.
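To see why that matters, here's a toy illustration of 95th-percentile billing, one common way transit is priced. All the numbers are made up; the point is that the same total volume bills very differently depending on how peaky it is:

```python
# Toy 95th-percentile transit billing: the bill is set by near-peak usage,
# so traffic moved into the trough is effectively free.
# 288 samples = one day of 5-minute readings (the usual sampling interval).
samples_peaky = [9.0] * 30 + [1.0] * 258      # Mbps: sharp daytime peak
avg = sum(samples_peaky) / len(samples_peaky)
samples_flat = [avg] * len(samples_peaky)     # same total traffic, flattened

def p95(samples):
    return sorted(samples)[int(len(samples) * 0.95)]

price_per_mbps = 2.0  # $/Mbps, made-up rate
print(f"peaky bill: ${p95(samples_peaky) * price_per_mbps:.2f}")  # $18.00
print(f"flat bill:  ${p95(samples_flat) * price_per_mbps:.2f}")   # ~$3.67
```

Same bytes delivered, roughly a 5x difference in the bill - which is exactly the incentive to push Glacier traffic into the trough.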
Amazon has ridiculous internal bandwidth. The costly bit is external. The time delay is largely internal buffer time - they need to pull your data out of Glacier (a somewhat slow process) and move it to staging storage. Their staging servers can handle the load, even at peak. GETs are super easy for them, and given that you'll be pulling down a multi-TB file via the Internet, your request will likely span multiple days anyhow - through multiple peaks/non-peaks.
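For reference, that two-step flow (slow internal staging, then an easy GET) is exactly what the API exposes. A rough sketch using the boto3 client, with the vault name, archive ID, and poll interval all placeholders:

```python
import time
import boto3

glacier = boto3.client("glacier")

# Step 1: initiate a retrieval job. "my-backups" and the archive ID
# are hypothetical - substitute your own.
job = glacier.initiate_job(
    accountId="-",
    vaultName="my-backups",
    jobParameters={"Type": "archive-retrieval", "ArchiveId": "EXAMPLE_ID"},
)
job_id = job["jobId"]

# Step 2: staging takes hours - poll until the job completes.
while not glacier.describe_job(accountId="-", vaultName="my-backups",
                               jobId=job_id)["Completed"]:
    time.sleep(15 * 60)  # check every 15 minutes

# Step 3: only now does the external-bandwidth download actually start.
output = glacier.get_job_output(accountId="-", vaultName="my-backups",
                                jobId=job_id)
with open("restored.archive", "wb") as f:
    for chunk in iter(lambda: output["body"].read(1024 * 1024), b""):
        f.write(chunk)
```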
I was referring to the external bandwidth. Even if pulling down a request takes hours, forcing it to start off-peak significantly shifts the incremental demand away from the peak. I'm guessing that most download requests won't be for your entire archive - someone might have multiple months of rolling backups on Glacier, but it's unlikely they'd ever retrieve more than one set at a time. And in some cases you might only be retrieving the data for a single use or drive at a time, so it might be 1 TB or less. A corporation with fiber could download that in a matter of hours or less.
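The back-of-envelope arithmetic supports that - the link speeds below are illustrative:

```python
# Transfer time for a 1 TB retrieval at various (assumed) link speeds.
TB_BITS = 1e12 * 8  # 1 TB expressed in bits

for label, mbps in [("100 Mbps", 100), ("1 Gbps fiber", 1_000), ("10 Gbps", 10_000)]:
    hours = TB_BITS / (mbps * 1e6) / 3600
    print(f"{label}: {hours:.1f} hours")
# 100 Mbps: 22.2 hours / 1 Gbps fiber: 2.2 hours / 10 Gbps: 0.2 hours
```

So a 1 TB pull over corporate gigabit fits comfortably inside a single off-peak window; only the multi-TB full-archive case spans multiple peaks.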
I get it - but I'm arguing that the amount of egress traffic Glacier customers (in aggregate) are likely to drive is nothing in comparison to what S3 and/or EC2 already does (in aggregate). They'll likely contribute very little to a given region's overall peakiness.
That said - the idea is certainly sound. A friend and I had talked about ways to incentivize S3 customers to do their inbound and outbound data transfers off-peak (thereby flattening the curve). A very small percentage of customers drive the peak, and usually by doing something they could easily time-shift.
I think the point is that they're trying to average the load across the datacenter, only part of which is Glacier. If they can offset all the Glacier requests by 12 hours, they'll help normalize load when combined with non-Glacier activity.