I have considered this before, but if the content can be cached, why wouldn't the website just do this themselves?
They have the incentive, it's relatively easy, and I don't think there's a huge benefit to centralisation (especially since it would basically end up centralised with one of the big caching providers anyway).
I'm definitely with you that sites should be leveraging CDNs and similar. But I get that many don't want to do any work to support bots that they don't want to exist in the first place.
To me it seems like the companies actually doing the crawling have an incentive to leverage centralized caching. It makes their own crawling faster (since hitting a cache is much faster than using Playwright etc. to load the page) and it reduces the load on all these sites, which would then also decrease the impact of this whole bot situation overall.
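To make that fast path concrete, here's a rough sketch of what a crawler-side lookup could look like. The shared-cache endpoint and its lookup API are purely hypothetical, and Playwright just stands in for whatever heavyweight rendering the crawler would otherwise do:

```python
# Sketch: consult a shared cache first, only render with a headless browser on a miss.
# CACHE_ENDPOINT and its query interface are assumptions for illustration, not a real service.
import requests
from playwright.sync_api import sync_playwright

CACHE_ENDPOINT = "https://shared-cache.example/lookup"  # hypothetical shared cache

def fetch(url: str) -> str:
    # Fast path: ask the shared cache for a recently rendered copy of the page.
    resp = requests.get(CACHE_ENDPOINT, params={"url": url}, timeout=5)
    if resp.status_code == 200:
        return resp.text

    # Slow path: render the page ourselves, which is what hammers the origin site.
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)
        html = page.content()
        browser.close()
    return html
```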
It would shift the complexity and cost of large-scale caching to a provider that would sell to the scrapers. Not sure it has much value, but it's kind of a classic three-tier distribution system with a middleman to make life easier for both producer and consumer.