No, that's not how it works. The page tables get duplicated and copy-on-write takes care of the pages. As long as they are identical they will be shared; there is no way that 10GB of RAM will be allocated to the forked process with all of the data copied.
This is the only right answer. What actually happens is you instantly have two 10G processes which share the same address space, and:
3. A microsecond later, the child calls exec(), decrementing the reference count on the memory shared with the parent[1] and faulting in a 36k binary, bringing our new total memory usage to 1,048,612KB (1,048,576K + 36K)
CoW has existed since at least 1986, when CMU developed the Mach kernel.
What GP is really talking about is overcommit, which is a feature (on by default) in Linux which allows you to ask for more memory than you have. This was famously a departure from other Unixes at the time[2], a departure that fueled confusion and countless flame wars in the early Internet.
> 2. We could overlook the memory usage increase and pretend that we have enough memory, and only really panic if the second process truly needs its own 10GB RAM that we don't have. That's what Linux does
"pretend" → share the memory and hope most of it will be read-only or unallocated eventually; "truly needs to own" → CoW
It will never happen. To begin with, all of the code pages are going to be shared because they are never modified.
Besides that, the bulk of fork calls are just a preamble to starting up another program and exiting the current one. It's mostly a hack to ensure continuity for stdin/stdout/stderr and some other resources.
It will most likely not happen? It's absolutely possible to write a program that forks and both forks overwrite 99% of shared memory pages. It almost never happens, which is GP's point, but it's possible and the reason it's a fragile hack.
What usually happens in practice is you're almost OOM, and one of the processes running in the system writes to a page shared with another process, forcing the system to invoke the good ol' OOM killer.
Sorry, but no, it can't happen: you cannot fork a process and end up with twice the memory requirements just because of the fork. What you can do is allocate more memory than you were using before and keep writing.
The OOM killer is a nasty hack, it essentially moves the decision about what stays and what goes to a process that is making calls way above its pay grade, but overcommit and OOM go hand in hand.
It does not happen using fork()/exec() as described above. For it to happen we would need to fork() and continue using old variables and data buffers in the child that we used in the parent, which is a valid but rarely used pattern.
Please read the parent comments. Overcommit is necessary precisely because the kernel has to reserve memory for both processes, and overcommit allows it to reserve more memory than is physically present.
If the kernel did not have to reserve memory for the forked process, overcommit would not be necessary.
This is a misconception you and parent are perpetuating. fork() existed in this problematic 2x memory implementation _way_ before overcommit, and overcommit was non-existent or disabled on Unix (which has fork()) before Linux made it the default. Today with CoW we don't even have this "reserve memory for forked process" problem, so overcommit does nothing for us with regard to fork()/exec() (to say nothing of the vfork()/clone() point others have brought up). But if you want you can still disable overcommit on linux and observe that your apps can still create new processes.
What overcommit enables is more efficient use of memory for applications that request more memory than they use (which is most of them) and more efficient use of page cache. It also pretty much guarantees an app gets memory when it asks for it, at the cost of getting oom-killed later if the system as a whole runs out.
I think you've got it backwards: With overcommit, there is no memory reservation. The forked process gets an exact copy of the parent's page table, but with all writable memory marked copy-on-write instead. The kernel might well be tallying these up to some number, but nothing important happens with it.
Only without overcommit does the kernel need to start accounting for hypothetically-writable memory before it is actually written to.
But a large fraction, if all you do afterwards is an exec call. Given 8 bytes per page table entry and 4k pages, that's 1/512 of the memory wasted, so if your process uses 8GB, it's 16MB. Copying that still takes noticeable time if you spawn often.
I've never seen page tables be the cause of out-of-memory issues. They are usually pre-allocated to avoid recursive page faults, but nothing would stop you from making the page tables themselves copy-on-write during a fork.
Aren't page tables nested? I don't know if any OS or hardware architecture actually supports it, but I could imagine the parent-level page table being virtual and copy-on-write itself.