A few observations about what's changed since this was written:
The layout shown in Figure 2.1 with a traditional shared Front Side Bus linking CPUs to the northbridge is a long-dead design, having been retired from mainstream products by AMD around 2003 and by Intel around 2008.
Since then, the northbridge and memory controllers have been on the CPU die(s) for almost all products. AMD's Zen 2 and later desktop/server CPUs split the northbridge back out to a separate die ("IO die").
All recent x86 processors that I can think of off the top of my head have the PCIe ports provided by the northbridge (which means it's usually part of the CPU die itself). The southbridge is connected to the northbridge/CPU by PCIe (usually four lanes), and part of the southbridge functionality is to act as a PCIe fanout switch.
And the terms "northbridge" and "southbridge" have fallen out of common use, largely because the northbridge being on the CPU die was standard practice for both Intel and AMD for over a decade.
DDR5 is coming, with the significant change of splitting a DIMM into two separate 32-bit memory channels rather than one 64-bit channel. That allows doubling bandwidth again without the burst length becoming longer than the standard 64-byte cache line size.
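To make the burst-length arithmetic concrete, here is a rough sketch. The DDR4 figures (64-bit channel, burst length 8) and DDR5 figures (32-bit subchannel, burst length 16) are the commonly cited ones. The function name and the third "hypothetical" configuration are just illustrations, not anything from a spec.

```python
# Bytes delivered by one burst = channel width in bytes * burst length.
def bytes_per_burst(channel_width_bits, burst_length):
    return channel_width_bits // 8 * burst_length

CACHE_LINE_BYTES = 64  # typical x86 cache line size

ddr4 = bytes_per_burst(64, 8)    # one 64-bit channel, burst length 8
ddr5 = bytes_per_burst(32, 16)   # one 32-bit subchannel, burst length 16

# Hypothetical: keep the 64-bit channel but double the burst length anyway.
# One burst would then overshoot a single cache line.
hypothetical = bytes_per_burst(64, 16)

print(ddr4, ddr5, hypothetical)
```

Both the DDR4 and DDR5 configurations come out to exactly one 64-byte cache line per burst, while the hypothetical wide channel with the longer burst would fetch two lines' worth at a time.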
As an extra data point outside the x86 world, this is also how modern POWER processors work. POWER9 integrates the PCIe and memory controllers onto the CPU die itself. This is one reason why the open source P9 computers from Raptor exist and have sort-of sane pricing (for such a tiny and niche market) -- a lot of the traditionally hard components are integrated onto the CPU, which makes designing boards around it that much easier vs needing complex chipsets.
There are also some POWER processors that move most of the memory controller off-chip, and use a media-agnostic high speed serial link between the CPU and each memory controller chip. That allows for a lot of flexibility in tuning for memory capacity or bandwidth, at the expense of a bit of extra latency. Those CPUs end up having all of their IO in the form of high speed serial links, some used for PCIe, some for NVLink, some for CAPI/OMI.
Yeah, the PCI-e root complex is on the CPU dies, because on x86 at least PCI is cache coherent with the CPUs. They need to get pretty close to the CPUs anyway.
This should probably have a (2007) in the title, FYI. “What every programmer should know about memory” is a modern classic and at first I thought this was a contemporary article about the paper.
As this is the introduction, it might be helpful for some readers to read the rest of the document: https://people.freebsd.org/~lstewart/articles/cpumemory.pdf Of course, it’s a 114 page PDF so the first ~20 pages in HTML might be easier to actually finish reading.
But this is a fantastic resource and something I really need to take the time to finish (I myself never get past the first 20 pages.....)
a) “Why do we use DRAM instead of SRAM?” is a perfectly valid question from a programmer that is worth a paragraph of detail, including the capacitor discharge curves. It is not like this is a major focus of the article! It’s just a small detail in the overview of the hardware. Seems very strange to be so upset by this.
b) If you ever write software for something that isn’t a desktop or server (eg a printer, a robot, some new shiny tech) then the hardware details of RAM are absolutely relevant and can’t be abstracted away by a nice alloc API. I can understand not wanting to focus on areas like this (devices are not my cup of tea) but I can’t understand deciding it’s ipso facto irrelevant because you’re a programmer.
I think you are contorting the comment to mean something it doesn't mean. "Every" is the keyword there. Not every programmer needs to know how memory works. Some programmers should know. I don't need to know for 99.99999999% of the work I do. And even for that 0.00000001% it is debatable. I write code that runs on target systems. I can evaluate its performance on the target system. Now if I had to write software for a system and speed was critical and I could choose the type of memory to use, then you bet I would be reading up on this subject.
> I write code that runs on target systems. I can evaluate its performance on the target system. Now if I had to write software for a system and speed was critical and I could choose the type of memory to use, then you bet I would be reading up on this subject.
Differences between types of memory don't just matter when you're picking what hardware to purchase. They also matter greatly when trying to understand why your code achieves a certain level of performance. Understanding the characteristics of your memory matters even if you're targeting a single fixed hardware platform that you can profile your code on.
The why isn't very important. You only need to know the constraints of your system, you don't need to know why the constraints are what they are... maybe you want to, but not need.
The why does come into play, often quite unexpectedly if you don't really understand how your memory system works. I've seen StackOverflow questions about performance anomalies that could only be properly answered by digging into the details of not just cache line size, but also cache associativity.
Not understanding how the system works means that sometimes you'll have to settle for not being able to use the full performance of your hardware, and that your performance constraints will in practice be rather inscrutable and often unpredictable.
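As a sketch of the kind of associativity effect mentioned above: the cache geometry below (32 KiB, 8-way, 64-byte lines) is an assumption picked for illustration, and real caches may index on physical addresses or hash the index bits, so treat this as a simplified model rather than a description of any particular CPU.

```python
# Assumed cache geometry (illustrative only, not from any specific CPU).
CACHE_BYTES = 32 * 1024
WAYS = 8
LINE_BYTES = 64
NUM_SETS = CACHE_BYTES // (WAYS * LINE_BYTES)  # 64 sets

def set_index(address):
    """Simplified model: the set is chosen by the line number modulo NUM_SETS."""
    return (address // LINE_BYTES) % NUM_SETS

def distinct_sets(stride_bytes, accesses):
    """Count how many different sets a strided access pattern touches."""
    return len({set_index(i * stride_bytes) for i in range(accesses)})

# 16 accesses with a 4096-byte stride all land in the same set, which
# exceeds the 8 ways available there, so lines keep evicting each other.
same_set = distinct_sets(4096, 16)

# Padding the stride by one cache line spreads the accesses across sets.
spread = distinct_sets(4096 + LINE_BYTES, 16)

print(same_set, spread)
```

In this model a power-of-two stride touches only one set while the padded stride touches sixteen, which is why knowing only the cache size and line size can leave a slowdown like this looking inexplicable.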
> but I can’t understand deciding it’s ipso facto irrelevant because you’re a programmer.
The negation of “everyone should know this” is not “no one should know this”. I can understand that someone would protest a claim about how “everyone should X” by giving a blanket statement like “no I shouldn’t”, but I interpret that as hyperbole in this case.
"Every" is usually tacked onto these sorts of statements by someone who either has to go around cleaning up after other people, or is just generally tired of dealing with the consequences of other people hiding behind ignorance like it's the best shield ever invented.
Externalizing your problems to others isn't right either.
Being upset is one of the ways we signal that social mores have been violated. Making statements about how it would be good if everyone behaves is another.
Rejecting them and saying 'not my problem' is part of the problem.