> environment variables are to programs (especially chains of programs, parent processes, sub processes and sub-sub processes, etc.) what parameters are to functions -- and/or what command-line parameters are... they're sort of like global variables that get passed around a lot...
They're not globals, since they're copied into sub-processes; so mutation doesn't propagate upwards.
>They're not globals, since they're copied into sub-processes; so mutation doesn't propagate upwards.
If we broaden our thinking to look at a computer process as a mathematical function, i.e., if we think about a computer process as
y = f(x)
where y is the behavior of the process (the final state it results in / resolves to), f is the process itself, and x is the parameter or list of parameters passed to f, then if that process reads an environment variable and alters its behavior because of it, we should no longer think of that function as merely y = f(x), but as
y = f(x, z)
where z is the set of environment variables that f consumes and that alter its behavior, resulting in a different y.
Now not all processes read environment variables and alter their behaviors because of an environment variable being set.
That is true.
Those processes remain as y = f(x).
But if a process does read and does act on an environment variable (usually without the end users' knowledge, because how many users keep track of which programs read or act on which environment variables?), then this is the same as passing extra parameters -- y = f(x, z) -- to the function!
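A minimal sketch of the y = f(x) versus y = f(x, z) distinction. The program `f` and the `GREETING` variable are made up for illustration; they stand in for any program that quietly consults its environment.

```bash
#!/usr/bin/env bash
# Hypothetical program f: greets its one argument (x), but also
# consults the GREETING environment variable (z) if it is set.
prog='printf "%s, %s\n" "${GREETING:-Hello}" "$1"'

unset GREETING                                   # start with z empty
y1=$(bash -c "$prog" f world)                    # y = f(x)
y2=$(GREETING=Goodbye bash -c "$prog" f world)   # y = f(x, z), same x
echo "$y1"
echo "$y2"
```

Both invocations pass the identical command line, yet the first prints `Hello, world` and the second prints `Goodbye, world`. Nothing on the command line reveals why.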
(Sub-observation: A future OS would have an API call to granularly read a single named environment variable at a time (not the entire block of them at once!), and use of this call could be logged and sorted by program, and there would be a user-settable control to granularly determine which environment variables could be read by which programs/processes...)
>They're not globals, since they're copied into sub-processes; so mutation doesn't propagate upwards.
Mutation or non-mutation upwards -- is not the issue.
The issue is this: if users run program f, pass it various command-line parameters x, and want y, but environment variables are present and alter the behavior of that program, then what the users are really getting is y = f(x, z). That may not be the behavior/result they want, because z influences the behavior and is passed in a not-really-all-that-transparent manner (most people don't check, log or modify environment variables, nor account for them in the determinism of their programs, unless something is broken...)
Phrased another way -- a future OS would have some way of logging everything, all state information (including environment variables and registry settings), that goes into any given program.
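Short of a future OS, a rough approximation of that logging is possible today with a wrapper. This is only a sketch: `log_and_run` is a hypothetical helper, not an existing tool, and `DEMO_SETTING` is a made-up variable. It records the command line and the exported environment, but not other inputs such as files or API results.

```bash
#!/usr/bin/env bash
# log_and_run is a hypothetical helper, not an existing tool.
# Usage: log_and_run LOGFILE COMMAND [ARGS...]
log_and_run() {
  local log=$1
  shift
  {
    printf 'argv:'
    printf ' %q' "$@"     # x: the command line, shell-quoted
    printf '\n'
    env | sort            # z: the exported variables a child inherits
  } > "$log"
  "$@"                    # finally, run the program itself
}

log=$(mktemp)
export DEMO_SETTING=on
log_and_run "$log" echo "hello"
grep -E '^(argv:|DEMO_SETTING=)' "$log"
```

One caveat: values containing newlines make the line-oriented `env | sort` output ambiguous, so a serious version would need a safer encoding.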
Now maybe environment variables aren't "globals" in the strictest definition...
But let's see...
In most modern operating systems as of 2025, environment variables can easily be read by most programs, functions/procedures/methods inside of those programs, sub-functions/sub-procedures/sub-methods of those programs, etc., etc., etc.
Once they can be read... they can become the state of one or more variables in that program, its functions, sub-functions, etc.
And once they have become the state of those internal variables, the program can alter its behavior / result -- based on them.
Global variables -- can do the exact same thing to a program.
So I'll leave it as a linguistic/semantic debate to future readers, mathematicians, programmers and OS designers -- as to whether or not environment variables (and related globally readable objects such as the Windows Registry) are global variables...
(You are very much correct about environment variables -- if they are mutated (aka written to/overwritten) by a sub-process, then that mutation doesn't typically propagate upwards through the parent/creator process chain -- but perhaps in the context of my discussion, I am interested in/concerned about the global readability/accessibility of environment variables... But you are very much correct in your statement!)
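The non-propagation is easy to check with a small sketch (the variable name `foo` is arbitrary):

```bash
#!/usr/bin/env bash
export foo='parent'
# The child receives its own copy of foo and overwrites only that copy.
child_view=$(bash -c 'foo="child"; echo "$foo"')
parent_view=$foo
echo "child saw:  '$child_view'"
echo "parent has: '$parent_view'"
```

The child reports `child`, while the parent still holds `parent` afterwards.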
I think the key point that I am trying to make is that more transparency/insight/logging could always be had as to where exactly programs get ALL of their inputs from (this includes environment variables, this includes API calls which may differ machine to machine, etc., etc.), and what average end users are made aware of...
#!/usr/bin/env bash
# The export keyword turns a shell variable into an env var
export foo='hello'
echo "BEFORE '$foo'"
# Invoke the env command as a subprocess, to list all of the
# environment variables it's inherited. Filter it using grep.
foo='goodbye' env | grep 'foo='
echo "AFTER '$foo'"
Our hypothesis will be that env vars are "globals", i.e. that there's a place in memory that all these occurrences of the name `foo` are referring to. Under this hypothesis, there are several plausible outputs that the above script might give:
Perhaps the `foo='hello'` assignment sets the memory referred to by foo to the value `hello`; then the `foo='goodbye'` assignment sets that memory to `goodbye`. In which case we'd expect to see:
BEFORE 'hello'
foo=goodbye
AFTER 'goodbye'
On the other hand, perhaps the `foo='goodbye'` assignment fails (maybe since that memory already contains the value `hello`?), in which case we would expect to see something like:
BEFORE 'hello'
foo=hello
AFTER 'hello'
It might even be the case that some obscure issue causes both assignments to fail; but, by sheer coincidence, the memory `foo` is referring to just-so-happens to already contain the value `goodbye`. In that case, we'd expect to see:
BEFORE 'goodbye'
foo=goodbye
AFTER 'goodbye'
Now, let's test our hypothesis by performing the experiment, i.e. by executing the script:
BEFORE 'hello'
foo=goodbye
AFTER 'hello'
Uh oh, that doesn't correspond to any of the possibilities I gave! With a little more thought, we might see that this output cannot be produced using a single memory location; since the value `hello` was not "forgotten", even though foo had the value `goodbye`.
The answer is that env vars are not globals; instead, they are "dynamic variables". Normally, programming languages implement dynamic variables internally by traversing the stack; e.g. in Lisp it would look something like:
However, that wouldn't work across process boundaries; which is why env vars get copied (perhaps with additions/removals) when subprocesses are created.
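As a rough in-process illustration using Bash itself rather than Lisp: variables declared with `local` in a Bash function are dynamically scoped, so a callee sees its caller's binding. The names `show` and `with_x` are made up for this sketch, and the last line shows the cross-process analogue, where the binding is copied into the child instead.

```bash
#!/usr/bin/env bash
x='outer'
show()   { echo "x=$x"; }          # looks x up at call time
with_x() { local x=$1; show; }     # new binding, visible to callees

a=$(show)                          # sees the outermost binding
b=$(with_x 'inner')                # show now sees with_x's binding
c=$(show)                          # the old binding was never lost
# Across a process boundary there is no shared stack to traverse,
# so the binding is copied into the child's environment instead.
d=$(x='copied' bash -c 'echo "x=$x"')
printf '%s\n' "$a" "$b" "$c" "$d"
```

This prints `x=outer`, `x=inner`, `x=outer`, `x=copied`: each new binding shadows the old one for the duration of a call (or a child process) and then goes away.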
(I go into more detail in the blog post I linked ;) )
PS: You may be wondering why I wrote a script containing `foo='goodbye'` if I previously said "Forget mutation". That is because we are not mutating the value of `foo`; we are entering a new scope, where `foo` has a different value; but the old scope with the old value still exists; we saw as much when it outputs `AFTER 'hello'`. Similar to how in "lexical scope" (a more common form of scoping, which is different from global scope and from dynamic scope) we can write a whole bunch of functions with arguments called `x`, but that doesn't count as mutating the value of `x`. Or we can even call the same function, perhaps recursively, with different values for the same argument; but those new values are not mutations of the old ones, despite them having the same name and being defined in the same place (like dynamic scope, lexical scope is also typically implemented within a process by using stack frames).
If I am understanding you correctly, you are stating that a given shell, specifically a subshell -- may not in some cases see the same shell variables as other shells...
That is true.
Subshells may not in some cases see the same shell variables as in other shells.
I'm not contesting this.
But let's suppose that we have not a shell variable, but a socket...
A socket that any program can open -- and retrieve a web page from...
For the simplicity of thought, let's say that the data for that web page is always static.
It always returns the same web page; the same data for that web page...
So now my question to you:
Can that socket, which can be opened by any program, mimic a global variable?
Why or why not?
Or perhaps an even simpler question...
Let's suppose that there's a file on the filesystem... globally accessible to be read and written to by all programs...
Your filesystem question is actually far from simple, due to all manner of edge-cases. In particular, since you say things like "globally accessible to be read and written to by all programs", I'll assume we're not talking about chroot, bind-mounts, etc. I'll also ignore the case where we open a file path, then delete the path, then open the path again; since that gives us two separate files, though the first can no longer be accessed via its original path (we can hand-wave these by pretending the path has moved to /proc/<pid>/fd or something).
If we ignore those, and just stick to normal FS operations in an ordinary Linux process, then those filesystem objects are globals, since the filesystem is a global namespace: a name (or path, in this case) will always refer to the same filesystem object (modulo the caveats above).
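A small sketch of the contrast with env vars, assuming an ordinary temporary file with none of the caveats above in play: a write made by a child process through the path is visible to the parent.

```bash
#!/usr/bin/env bash
path=$(mktemp)                      # one path in the global namespace
echo 'written by parent' > "$path"
# A separate process writes through the same path...
bash -c 'echo "written by child" > "$1"' _ "$path"
# ...and the parent observes the change, unlike with an env var.
seen=$(cat "$path")
echo "$seen"
rm -f "$path"
```

Here the parent prints `written by child`: the mutation did propagate "upwards", because both processes name the same object.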
---
> If I am understanding you correctly, you are stating that a given shell, specifically a subshell -- may not in some cases see the same shell variables as other shells...
No, that's not what I'm saying. I'm talking about env vars (bound via `execve`, stored near the spawned process's stack). Not shell variables, or any other language-specific/internal variables (whether a shell, like Bash, or otherwise). That's the entire reason why my code example used `export`.
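To make the "bound at exec time" point concrete, here is a sketch using the standard `env -i` option, which starts the child with an empty environment plus only the bindings listed. It assumes `env` is installed at `/usr/bin/env`; the variable names are arbitrary.

```bash
#!/usr/bin/env bash
export foo='hello'
# The parent's exported foo is NOT handed over: the child's
# environment is exactly what is listed at the point of the exec.
explicit=$(env -i bar='only this' /usr/bin/env)
echo "$explicit"
```

The child lists only `bar=only this`. The environment is an explicit argument to the new process, which the caller can pass through, extend, or replace.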
>No, that's not what I'm saying. I'm talking about env vars (bound via `execve`, stored near the spawned process's stack). Not shell variables, or any other language-specific/internal variables (whether a shell, like Bash, or otherwise). That's the entire reason why my code example used `export`.
OK, fair enough!
>"If we ignore those, and just stick to normal FS operations in an ordinary Linux process, then those filesystem objects are globals, since the filesystem is a global namespace: a name (or path, in this case) will always refer to the same filesystem object (modulo the caveats above)."
Now, here you make an excellent point (and one that escaped my perception at the start of this dialogue, when I focused mostly on considering shell variables as globals), which is simply this:
Any OS object (which includes but is not limited to files, environment variables, shared memory, synchronization objects, lists of things (and other objects) produced by API calls, sockets, OS data structures in memory, etc.) which is global in scope, that is, accessible to processes/programs -- is potentially a global variable... because programs/processes can potentially treat it as one...
(Now, let me nuance that statement, and import some of your arguments!)
...at least a good percentage of the time...
...that is (and here's the import of some of your arguments!), at least, not when edge-cases and other special circumstances and caveats apply (of which you've given many that could potentially apply!)
So is any global OS object -- the same as a global variable?
The short answer might be "a good percentage of the time, yes".
The longer answer might be "a good percentage of the time, yes -- but it can depend on many other factors..."
And the longest, most nuanced answer, might start something like this: "a good percentage of the time, yes, but it can depend on many other factors -- and what follows is a list of all of those potential factors..."
Anyway, you make a whole lot of very excellent, interesting, and certainly thought-provoking points!
I appreciate your engagement in this discussion! (You genuinely broadened my perception in this area!)
They're not globals, since they're copied into sub-processes; so mutation doesn't propagate upwards.
My own opinion is that environment variables are dynamically scoped keyword arguments http://www.chriswarbo.net/blog/2021-04-08-env_vars.html