> only work when you have much more RAM than the files you're mapping in.
Really depends on what you're doing, like memory access patterns. I've definitely seen scenarios where mapping hundreds of gigabytes of data on dozens of gigabytes of RAM made mmap an almost absurd performance boost over traditional I/O, both immediately and asymptotically, as the most frequently accessed data ends up in the page cache and the least accessed data gets paged out.
I don't disagree with the subtlety part though. It's very difficult to reason about I/O performance in general. Modern systems are like an onion of hidden performance optimization tricks and caching layers (both in software and hardware).
Yeah, and on top of that, different systems (software and hardware combos) behave differently, so I can see the performance here depending on the OS's mmap implementation and on the architecture's cache and virtual-memory design. When I've debugged stuff like this, it's either been for myself, in which case I know what combo I'm running on, or for work, where we know which combinations we target and run regression tests to observe the perf implications.