Your proposed statement “conforming implementations cannot transform code on the basis that some behavior left undefined by the standard is impossible” would have no meaning in the standard, even if it were written somewhere other than a non-normative note.
The standard doesn’t describe transformations; it describes behaviors. You can’t just say that undefined behaviors can’t be “transformed”, because there was nothing to transform them _from_. If you want some behaviors to be possible and some behaviors to be impossible, then you need to define those behaviors.
For C, that’s not something we can realistically do in every case. There’s no way to reasonably constrain the set of allowed behaviors after a program does some crazy pointer operation like * (int *) rand() = rand(), because any implementation of those constraints would more or less require an expensive run-time check at every pointer access.
What you could meaningfully ask is for particular situations (like negative shifts) to be considered unspecified or implementation-defined rather than undefined. That way you don’t give up the ability to constrain the behavior of programs after those situations arise. There are already many unspecified behaviors in the standard, such as reading the value of padding bytes in structs.
Such proposals could be considered on their merits, but they’d have to be discussed individually. There’s no way to remove all undefined behavior from C and still have something that would be recognizable as C.
>The standard doesn’t describe transformations; it describes behaviors
That is not remotely true - as you can see by looking up the example I reference.
>For C, that’s not something we can realistically do in every case. There’s no way to reasonably constrain the set of allowed behaviors after a program does some crazy pointer operation like * (int *) rand() = rand(), because any implementation of those constraints would more or less require an expensive run-time check at every pointer access.
That's such a weird response. There is nothing in the proposal that requires run-time checks. The proposal simply limits the ability of the compiler/translator to make counter-factual assumptions or otherwise violate the C semantics.
>What you could meaningfully ask is for particular situations (like negative shifts) to be considered unspecified or implementation-defined rather than undefined.
I don't want to do that: I want to clarify the meaning of "undefined behavior" to curtail its use as a license to do arbitrary code transformations. One implementation may trap on a negative shift, one may generate negative shift instructions that the architecture treats as no-ops or whatever they do, one might do sufficient static analysis to detect negative shifts and generate an error message - but the compiler is not free to assume that the shift count can never be negative.
I'm not trying to remove undefined behavior from C. I am trying to correct a mistaken analysis of what undefined behavior should mean. C is intended to allow for non-portable operations, for example. These are "undefined" in the standard because they depend on the implementation, not because the C language should be specified so as to allow compilers to introduce hidden booby traps or counter-factual assumptions into the implementation.
You should read the post you are replying to again, as it is well written and precise.
Basically, the intent of the proposal is understandable but it does not make sense as written.
The spec does not explain how to compile C code, it gives you a set of rules so that you could in theory take some piece of C source and a full execution trace and verify whether that trace was a conforming execution for that C code.
You take the example of the negative shift again, and it is addressed in the post you respond to : it's a perfectly fine proposal to demote negative shifts from UB to something less permissive to the compiler.
However, if you do so, of course, you have to frame what behaviors are accepted for a program doing such a negative shift, just as you did in your post!
So let's say: arbitrary (but consistent) values are fine, termination is fine. That's great because it means we can still compile a shift to a CPU shift and there's no UB anymore. Ok.
Same goes for signed overflow.
However this is a case by case process, and you will absolutely not be able to ascribe meaning this way to many forms of UB. For example, as the GP said, storing through an invalid pointer.
If you want to compile a pointer store to a store instruction without runtime checks, then possible observable behaviors include: other variables changing values, function calls not returning to their calling context, and even, of course, running a full ROP chain exploit!
It's no surprise that in that case the spec says that the set of allowed observable behaviors is anything.
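To make the GP's example concrete, a minimal sketch (contrived, obviously):

#include <stdlib.h>

void wild_store(void) {
    /* if this is compiled to a plain store instruction with no checks, the write
       may land on another variable, a saved return address, or anything else
       reachable in the address space - hence "anything" is the honest answer */
    *(int *) rand() = rand();
}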
Arguments like yours seem to rely on an incorrect understanding of the design of C. Languages like Java encapsulate program execution in a runtime that, in theory, can catch and manage all sorts of programmer errors that C does not catch. That's ok. If the programmer references past the end of an array, the compiler may catch it and object, or it may simply compile the instructions and let the program step on its own data structures or encounter a memory fault or even a protection violation if the architecture provides some type of protection. And it's bizarre to see this warning about ROP exploits, when current "optimization" based on UB is a source of ROP exploit vulnerabilities. One cannot reliably put protections in code against bad pointers if upstream errors are grounds for deleting the guards.
> It's no surprise that in that case the spec says that the set of allowed observable behaviors is anything.
And yet UB is frequently described as this scary monster that will eat your hard drive.
Sure there are more UBs in C than there would need to be if the standard restricted more on contemporary hardware. I think it still allows ones' complement representation for example? But I guess it shouldn't be so bad as long as compilers do not exploit the freedom to do anything that comes from UB, more than they need to on a particular architecture?
Compilers do not "exploit the freedom to do anything that comes from UB", they just assume that the program does not have UB. This might or might not lead the compiler to behave as if it was doing anything it wants, but it's not screwing the programmer on purpose.
As for "UB is frequently described as this scary monster that will eat your hard drive", setting aside that UB could be exploited by an attacker, it's not hard to imagine, for instance, a recursive file deletion routine that is affected by UB misbehaving and erasing more than it should.
Let me take a shift operation as an example. On architectures where it's no problem to "overshift" (e.g. the operation always results in 0), the compiler could be just fine and leave shifts alone. Maybe a 0 is exactly what the user wanted!
Or, the compiler could assume that the programmer made sure that overshifting never happens, since it's UB. Based on that assumption, it could make more aggressive (and possibly totally insignificant) optimizations by inferring other assumptions that don't align with the programmer's idea anymore.
I never encountered such a situation myself, but there are stories like that on the internet, where for example the Linux kernel was broken.
Another scenario, the user might always do an operation that sometimes results in a random value (which is UB), but in the end only use the resulting value if it was not UB. The compiler instead could assume the operation can never happen in the first place. Again, there is a misalignment here in how the compiler reasoned and how the programmer reasoned (closer to the actual hardware).
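For instance (a contrived sketch, but it shows the shape of the problem):

unsigned bit_field(unsigned x, unsigned n) {
    unsigned v = x >> n;   /* UB in C when n >= 32 (for a 32-bit unsigned)          */
    if (n >= 32)           /* the programmer's after-the-fact guard...              */
        return 0;          /* ...which the compiler may delete, since the executed  */
    return v;              /* shift lets it conclude n must already be < 32         */
}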
So, the difficult problem is probably defining the UB configurations as narrowly as possible. It might be a good idea to specify UB per architecture in some cases, but then it would lead to a lot of complexity. In any case a guideline for compilers should be to minimize surprises, which means not applying many optimizations around UB. That's easier said than done I guess.
> As for "UB is frequently described as this scary monster that will eat your hard drive", setting aside that UB could be exploited by an attacker, it's not hard to imagine, for instance, a recursive file deletion routine that is affected by UB misbehaving and erasing more than it should.
I'd figure it's much more likely to happen as a result of a logic error introduced by the programmer, without any UB involved.
Thanks, that's interesting. And still, it's artificially constructed, so concerning the point I made (actual logic bugs are more likely) I think it stands.
IMHO this is clearly a compiler bug. Yes, it's technically allowed to do this because calling a NULL function pointer is UB, but nobody wants such an optimization (which is my first point from earlier). If the compiler makes such a decision based on UB, shouldn't it be able to at least issue a warning?
Furthermore, why should this be optimized in the first place? If the user decides to make a function pointer instead of a hard-wired inlineable call, he is aware of the performance implications (which are negligible in 99.9% of the cases). Where function call efficiency matters, nobody ever uses a function pointer.
Shouldn't the compiler architecture be able to avoid making this optimization, simply by avoiding propagating the UB? It should simply not say "Calling NULL can never happen". This way it would never make optimizations that make the situation even worse. I think this would even be less optimization code!
Instead, when the UB happens, the programmer should get what he deserves (which is, well, calling NULL, and most likely resulting in a segfault, which is the right thing).
On the other hand, if the compiler can infer the value of a function pointer based on propagation of values alone, I'm fine with hard-coding the address, even though it's probably not worth the effort anyway.
That's a "devirtualization" optimization. Suppose you have a C++ class with a virtual method (or something similar coded in C), which only has a single concrete implementation. Clearly, every place which calls that virtual method through any pointer to that class can only be calling that single concrete implementation, so the compiler can call the function directly instead of through the function pointer, perhaps even inlining the called function.
Yes, that EraseAll example is silly, but how can the optimizer distinguish it from non-silly examples? Perhaps there are other alternative functions which also set that same function pointer, but they were all hidden by some #ifdef; in that case, devirtualizing when there's only one option left will be a performance win, and can expose other optimization opportunities.
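For readers who haven't seen it, the EraseAll example looks roughly like this (reconstructed from memory, so details may differ from the original write-up):

#include <stdlib.h>

typedef int (*Function)(void);

static Function Do;             /* file-scope, so it starts out as a null pointer */

static int EraseAll(void) {
    return system("rm -rf /");
}

void NeverCalled(void) {
    Do = EraseAll;              /* the only assignment to Do in the whole file */
}

int main(void) {
    return Do();                /* calling a null pointer is UB, so the optimizer may
                                   reason that NeverCalled must have run first and
                                   turn this into a direct call to EraseAll */
}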
Hmm, shouldn't it be possible to do that in some other way? After all, a virtual function is something different than a function pointer (the vtable holding a function pointer is only an implementation detail, isn't it?).
Anyway, I'd rather not have that optimization. When we don't want to pay a small performance hit, we shouldn't make a virtual function in the first place. It's good to see how advanced optimization tricks can combine in dangerous ways with complicated language features!
> distinguish it from non-silly examples?
Yes, probably not possible. There is a real risk, but non-silly examples are less likely than constructed ones.
> all hidden by some #ifdef
If you do it with the preprocessor, there is no need for function pointers in the first place. Just make a #define instead of a variable.
Undefined behavior doesn’t exist to ease portability per se, it exists to handle meaningless yet syntactically valid constructs.
Shifting by a negative number is meaningless and is indicative of programming error. Instead of specifying that an implementation should check and handle these programming errors, which implies runtime costs, we use undefined behavior and leave the responsibility to the programmer.
The reason many compilers assume a non-negative shift count after a shift operation executes is that otherwise the entire program would be undefined. You're effectively asking for undefined programs to have defined behavior. The equivalent would be asking the compiler to correct your typos, and it can't do that. As they say: garbage in, garbage out. The compiler isn't the one booby-trapping you; you're booby-trapping yourself.
I think GP is right that what you want is more definition but it has to be in specific cases because otherwise the runtime costs would be too high. In your negative shift example, the definition would be “shifting by a negative number results in garbage output,” but note that that would burden implementations where negative shifts trap.
> it exists to handle meaningless yet syntactically valid constructs.
Maybe more like "handle bad situations that cannot be precluded at compile time". And since C has "no runtime", and thus can't do much "handling" at run time, it gets UB instead.
It doesn’t need to be unable to be precluded at compile-time to be UB. The expression “1 << -3” can be precluded at compile-time, yet it compiles just fine: instead it invokes UB.
Doing something at runtime is a different concept from having a runtime.
In any case, C, in fact, often does have a runtime. What do you think executes main()?
Yes, 1<<-3 can be detected as UB at compile time when doing constant propagation. But that wasn't my point (you could declare this expression invalid, no need for UB). I was pointing out that UB is not about "meaningless yet syntactically valid constructs" (if there is such a thing at all; I won't argue). UB in general manifests at run time. So I was only proposing a reformulation.
> In any case, C, in fact, often does have a runtime. What do you think executes main()?
Sure, that's why I said it has "no runtime" (in quotes). I'm not trying to split hairs here.
But your formulation isn’t right because it’s not about the UB code being detectable at compile-time or run-time, it’s about describing how the UB code should behave regardless of when it can be analyzed.
In the context of real numbers sqrt(-1) is syntactically valid yet meaningless code. In other contexts, 1 << -3 is too. Also log(-1). Perfectly valid syntax but has no meaning.
You're right, they're often ambiguous, but often I don't know how to say it tersely in a different way. My assumption was that in that context the statement [C has "no runtime"] had clear meaning, but maybe I was wrong. The point was to make it obvious that I don't want that be taken too literally, and that I don't want to start a hair splitting discussion whether C has a runtime or not. Apparently it didn't accomplish that goal.
As to the formulation of UB, it's not about code ("syntactically valid constructs") but about manifestation. And manifestation is a run time thing. 1<<-3 "is" UB, yes. And that can be detected at compile time through constant propagation. But that's not why we need UB. We could just forbid 1<<-3, so there was no need for UB in the first place.
UB is really about values not known at compile time. The expression 1<<x might manifest in UB at run time, depending on the value of x. Since x in general is not known well enough to decide whether it will ever be negative, the question "will this expression ever manifest in UB at run time?" is in general undecidable, so the best answer is "possibly". That does not imply that it's always a bad idea to use the expression 1<<x, and it does not make the construct meaningless. (Really I don't think the construct itself has a meaning, but as I said I won't argue here).
And that's why I think the formulation "handle meaningless yet syntactically valid constructs" is misleading. It's not about the construct (i.e., syntactic construction). It's about run time values that lead to UB, and that are totally independent from the construct itself. They might come from a wrong function invocation, or from malicious user input. It is the programmer's duty to make sure, by whatever means, that UB never manifests.
Maybe that's just what you were trying to say, but I think it's important to be clear that it's not about the syntactic constructs.
I’m not saying that 1<<x is generally meaningless. I’m saying that whenever the shift operand is negative, it’s meaningless. Just like log(x) isn’t generally meaningless, it’s only meaningless when the argument is not positive.
Syntactic constructs in general should and do have meaning. The meaning isn’t inherent, but nothing has inherent meaning. log(x) has no inherent meaning, but when x is positive we have decided that it means that it evaluates to the value to which you would raise 10 to get x.
Bad run-time values only cause UB in the context of improper usage in syntactic constructs. They aren’t independent of syntactic constructs: both syntactic constructs and values dictate behavior.
You’re right that UB only manifests itself at run-time. And you’re right that we wouldn’t need to use UB for constructs that only manifest their behavior at compile-time. Indeed, there exist syntactically valid yet meaningless C++ template substitutions that don’t result in UB; they result in compilation errors.
> The standard doesn’t describe transformations; it describes behaviors.
That doesn't seem to be an argument against, or even related to, what the proposal entails. All of these proposals are reductive attempts to curb the instability that UB introduces, for little value (see the linux kernel).
> because there was nothing to transform them _from_
Yes there was. Anything that can result in UB.
> For C, that’s not something we can realistically do in every case.
That's why the proposal is not exhaustive, as described.
There aren't things that "can result in" UB, though. The C code itself, before any compilation, is already undefined by the C spec.
For instance, the following code is not valid C code:
int a[2], b;
a[2] = 5;
The compiler is free to do anything it wants, because the standard has not defined what it does. It could stop. Most likely it will just overwrite whatever memory is after a[1] with 5, because that's easiest.
If you want the compiler to do something specific, you need to define it. In some cases the standard has language for otherwise-defined behavior that is carved out as undefined (e.g., strict aliasing), and it's easy to go "Never mind"; in some cases it does not (e.g., writing to a string literal). For these ones, the behavior needs to be defined.
Simply saying "Don't optimize this" is just saying "I'm not going to define the behavior, but don't implement it this way," which isn't a particularly useful way to define a standard (e.g., for the null checks example, is the compiler permitted to leave the null check in, but insert a segfault right before it?).
No, the compiler is not entitled to insert a segfault. Segfaults come from architectures and operating systems, not from compilers. The compiler is free to leave in the dereference of the null pointer and then, who knows what happens. But the compiler can't break the code by deleting a perfectly conforming null check on the theory that a previous dereference means that the pointer is not null.
The original Bourne shell used signal to catch memory faults so it could operate dynamic paging! If your OS/implementation permits, C is supposed to allow you to do things like that. The compiler is NOT supposed to insulate the programmer from bumps. (Disclaimer: I am not endorsing the horrible design of the Bourne shell which was also written in a macro style that made C into pseudo Algol).
> Segfaults come from architectures and operating systems, not from compilers.
But when you are programming in C, you are programming against the C abstract machine, not against an architecture or operating system. It's this machine that you are breaking when you invoke ub.
It targets the abstract machine, but the dispute here is about what should happen when the program operation goes into territory that the abstract machine does not specify. And C code is EXPECTED to go beyond the territory of the abstract machine.
> And C code is EXPECTED to go beyond the territory of the abstract machine.
Passive voice! You expect C code to go beyond the territory of the abstract machine. The authors of the standard, quite clearly, do not.
I can see both of these arguments. I think there is a place in the world for a C that is machine-specific. But it's clearly not what the standard authors have in mind, there are clearly people who find the genericness of C useful (cf. "portable assembler"), and piecemeal solutions like skipping individual bad optimizations won't solve the underlying problem that there's no way in C to say "Dude, I'm writing for a reasonable UNIX on a reasonable modern machine with a single flat address space for code and data and non-exception-raising two's-complement arithmetic and the ability to do nonaligned reads/writes and other normal things, please stop thinking I might also be writing for a PDP-11 bootloader."
I think you should go to the standards committee with a single proposal, which is that C should stop trying to be portable to weird / old architectures (and people who need them can use weird / old compilers, anyway). Among other things, you can now enable certain types of optimizations if the compiler is permitted to assume that arithmetic is two's-complement and nonaligned reads/writes work and so forth.
> if the compiler is permitted to assume that [...] nonaligned reads/writes work
One of the most popular architectures for "a reasonable UNIX on a reasonable modern machine with a single flat address space for code and data" is the x86, and on it, some nonaligned reads/writes do trap. Certain optimizations can be done only if the compiler is permitted to assume that the data is aligned.
>You expect C code to go beyond the territory of the abstract machine. The authors of the standard, quite clearly, do not.
That is 100% wrong.
>Strictly conforming programs are intended to be maximally portable among conforming implementations. Conforming programs may depend upon nonportable features of a conforming implementation.
The whole concept of "undefined behavior" is to cope with the reality that C programs routinely go beyond the bounds of the abstract machine.
What is your definition of valid? It's not valid by any C standard or common compiler implementation. For the sake of any argument, we should at least agree on a common definition for valid.
As for deleting null checks, time-travelling UB etc. the compiler always assumes the code is valid, bug-free for the compiler-specific definition of valid. This is at least the standard definition but it can be expanded by flags and extensions. Then they optimize that code without regard to what happens if the code actually had programming bugs.
Is k = i+j valid C? Can it be translated sensibly to "add i,j; jump on overflow to random; store k"?
How about sometimes omitting the jump to random and sometimes not?
Signed integer overflow is undefined, so the only obligation is that the code operates correctly when signed integer overflow doesn't happen. So, yes all those options are permissible, because as far as the C standard specifies, it's "add i,j; if false jump to random; store k".
If you want to add a definition for signed integer overflow (as C++ is considering, see http://wg21.link/P0907r0 ) then it would be defined and the compiler would no longer get to consider "if signed overflow" and "if false" the same.
I 100% doubt that Ritchie would have agreed. This is the existing interpretation of the standard, which is exactly what I propose to change. I believe the standard interpretation has strayed from both the language design and the objectives of the language.
> But the compiler can't break the code by deleting a perfectly conforming null check on the theory that a previous dereference means that the pointer is not null.
Why is this different from letting the compiler optimize out the second check here?
int foo(unsigned int i) {
    if (i < 5)
        return this();
    if (i < 4)
        return that();
    return those();
}
The C standard knows that NULL is special, a marker value for an invalid pointer. (The implementation does not have to translate it as the pointer with numeric value zero; the abstract machine just needs some value.) Absent an Optional/Maybe/etc. type, C needs some invalid pointer.
What about the following—are you okay with compilers optimizing it?
struct pointer {
    int *value;
    char valid;
};

int deref(struct pointer p) {
    if (!p.valid)
        abort();
    return *p.value;
}

void stuff(struct pointer p) {
    int n = deref(p);
    if (!p.valid) {
        fprintf(stderr, "Hey, that pointer wasn't valid!\n");
        return;
    }
    ...
}
No NULLs involved.
> The original Bourne shell used signal to catch memory faults so it could operate dynamic paging! If your OS/implementation permits, C is supposed to allow you do do things like that.
I'd believe that (especially given the state of optimizers in Bourne's era) C does allow you to do things like this. I'd also believe that, with sufficient -fno-unwanted-cleverness flags, C is the best way to do things like this.
But asking C to be designed to support things like this is quite a change. I get that this is exactly the change you're pushing for, but, the only way it can work is if the C abstract machine becomes hardware-dependent enough that you can embed platform assumptions in your C—and so C is no longer a "portable assembler."
There's precedent for this. One that comes to mind is the ability to cast a data pointer to a function pointer (which doesn't have to be valid; the abstract machine doesn't know if your physical machine uses different address spaces and different representations for the two, and even the brand new architecture WebAssembly uses different address spaces, but compatible representations). dlsym() implicitly requires this, and POSIX.1-2013 finally made an explicit requirement that a C implementation conforming to POSIX must define it in a meaningful way. http://pubs.opengroup.org/onlinepubs/9699919799/functions/dl...
I would certainly support a variant of C that did things like assume a flat address space, validity of all pointer values besides NULL (and an expectation that signals can catch and fix invalid pointer dereferences), char = unsigned char, 8-bit bytes, two's complement arithmetic, etc. It would not support every architecture C supports. It wouldn't be a good fit as a C standard, but as a variant for people who are happy to never target segmented real-mode x86 or the PDP-11, it seems pretty great.
There is nothing in what I have proposed that requires assuming a flat address space or that signals can catch and fix invalid pointer dereferences etc.
First, the compiler is allowed to compile your program as-if all values of all variables are such that undefined behaviour doesn't happen, but isn't allowed to extract value bounds based on that for the purposes of optimising across more than one expression. So the compiler is allowed to assume pointers aren't NULL where they are dereferenced, because that is undefined behaviour and the compiler either has to assume the pointer isn't NULL, or require proof from the programmer, or refuse to compile any C program with pointer dereferencing because it is potentially invalid C. However, it is not allowed to assume that because a pointer was dereferenced, that it is not NULL. This restriction should be tight enough that
int a, b;
... set a and b ...
if (a > 0 && b > 0 && ((a+b)&(int)(-1)) > a) {
    ...
} else {
    ...
}
is compiled to something like
mov eax a
add eax b
cmp eax a
jle .else
Second, you can define "trivially compiled C" for every platform as a set of substitution rules that transform every C operation into machine instructions without any optimization. Such a trivial compilation would have every variable write correspond to a memory write, keep no variables in registers, always compile the same expression to the exact same machine code (except for memory addresses of variables involved), never inline functions etc. But it would still be allowed to e.g. put 32-bit integers in 64-bit registers on the basis that int operations are assumed not to overflow. The committee should define that for every supported platform as a reference C compiler, then require that C compilation produce machine code with observable side-effects for single-threaded execution equivalent to the reference trivially compiled C.
Isn't the reference compiler with trivial compilation equivalent to trying to make all UB implementation-defined where the implementation is the "trivial compilation"? For example this would preclude optimization based on aliasing rules which a lot of tight loops rely on.
float max (float f[], int length, int* max_index) {
    float max_value = -FLT_MAX;
    *max_index = -1;
    for (int i = 0; i < length; i++) {
        if (max_value < f[i]) {
            *max_index = i;
            max_value = f[i];
        }
    }
    return max_value;
}
Take the code above. It obviously has a well-defined "trivial compilation" implementation even for cases where max_index aliases an element of the f[] array. There is a good reason that implementations are allowed to assume that doesn't happen: it produces better code in the case where the program has no bugs. Otherwise, even at -O3, the implementation would have to read f[i] twice from memory, since the value might have changed when *max_index was written, if it wants the same behavior as the "trivial compilation" implementation.
> Second, you can define "trivially compiled C" for every platform as a set of substitution rules that transform every C operation into machine instructions without any optimization. Such a trivial compilation would have every variable write correspond to a memory write, keep no variables in registers, always compile the same expression to the exact same machine code (except for memory addresses of variables involved), never inline functions etc. But it would still be allowed to e.g. put 32-bit integers in 64-bit registers on the basis that int operations are assumed not to overflow. The committee should define that for every supported platform as a reference C compiler, then require that C compilation produce machine code with observable side-effects for single-threaded execution equivalent to the reference trivially compiled C.
Such an approach would preclude putting any variable in a register if there is any not-provably-valid pointer store anywhere in the program.
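A small sketch of why (contrived, and assuming the "reference trivially compiled C" rule as described above):

int accumulate(int *q, int n) {
    int total = 0;                 /* in the trivial compilation, total lives in memory  */
    for (int i = 0; i < n; i++) {
        *q = i;                    /* if q is invalid, this store could physically hit   */
        total += i;                /* total's slot, so matching the reference behavior   */
    }                              /* forces total to be reloaded/stored every iteration */
    return total;
}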
Feel free to try, but I guarantee you that for most non-trivial optimizations it's going to be easy to find a piece of code that is UB today and where the optimization is observably different from the reference compilation.
This sounds like Regehr's "Proposal for a Friendly Dialect of C" [1]. I believe that was a well thought out argument, but ultimately did not make headway [2].
I haven’t seen a good explanation of the arguments against this kind of thing. Who doesn’t want to fix the UB weirdness? Are there really many people worried it will completely break optimization? If so I’d love to hear about it.
Even if you don’t agree with this proposal, or with Regehr’s, surely a reasonable response would be “okay, yeah, it would be good to do something along those lines, but not that; how about this?”
> I haven’t seen a good explanation of the arguments against this kind of thing. Who doesn’t want to fix the UB weirdness? Are there really many people worried it will completely break optimization? If so I’d love to hear about it.
If you're not a compiler writer, you probably have a mental model of undefined behavior that works like this:
if (expr->invokes_undefined_behavior()) {
    // Yay! Silly user!
    shit_on_users_code();
}
Sure, you probably don't think it's so brazenly stated, but the general sentiment from compiler users tends to be that all you have to do is delete that if statement and everyone's happy.
But that's not how compilers work, not in the slightest. Instead, the undefined behavior effects tend to work like a chain: you dereferenced a pointer, therefore it's now safe to assume on any subsequent path that it will always be safe to dereference that pointer, therefore we can move that later load out of that loop even though we don't know if the loop will be executed in the first place. And the reason that logic works is because we're allowed to optimize under the assumption of undefined behavior in one of those steps.
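To make that chain concrete, here's a small illustrative sketch (simplified, not taken from any real compiler's output):

struct node { int weight; int *values; };

int total(struct node *p, int n) {
    int sum = p->weight;            /* dereference: p is treated as valid from here on  */
    for (int i = 0; i < n; i++)     /* n may be 0, so the body may never execute...     */
        sum += p->values[i];        /* ...yet the loop-invariant load of p->values can  */
                                    /* be hoisted above the loop, because the earlier   */
                                    /* dereference already "established" that p is safe */
    return sum;
}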
Now in some cases--signed integer overflow and strict aliasing come most readily to mind--there is a single knob you can twist to disable the undefined behavior (and most compilers let you twist that knob with command-line flags). Arguing whether or not the standard itself should dictate a different position on that knob is one thing, and it's by no means a bad discussion to have. But instead arguing "let's let you keep undefined behavior, but let's try to specify what you can do with undefined behavior" runs into the problem that it's very difficult to actually specify what somewhat-constrained undefined behavior really means. Look up the massive email threads on LLVM that try to tease out what poison and undef actually mean.
Actually you should correct that as "If you're not a C compiler writer".
Undefined behavior is one of the tricks that enabled C compilers to catch up with what everyone else was doing in big iron outside AT&T walls.
Fran Allen, the Turing awardee on high-performance computing and compiler research, stated this on C:
"Oh, it was quite a while ago. I kind of stopped when C came out. That was a big blow. We were making so much good progress on optimizations and transformations. We were getting rid of just one nice problem after another. When C came out, at one of the SIGPLAN compiler conferences, there was a debate between Steve Johnson from Bell Labs, who was supporting C, and one of our people, Bill Harrison, who was working on a project that I had at that time supporting automatic optimization...The nubbin of the debate was Steve's defense of not having to build optimizers anymore because the programmer would take care of it. That it was really a programmer's issue....
Seibel: Do you think C is a reasonable language if they had restricted its use to operating-system kernels?
Allen: Oh, yeah. That would have been fine. And, in fact, you need to have something like that, something where experts can really fine-tune without big bottlenecks because those are key problems to solve. By 1960, we had a long list of amazing languages: Lisp, APL, Fortran, COBOL, Algol 60. These are higher-level than C. We have seriously regressed, since C developed. C has destroyed our ability to advance the state of the art in automatic optimization, automatic parallelization, automatic mapping of a high-level language to the machine. This is one of the reasons compilers are ... basically not taught much anymore in the colleges and universities."
-- Fran Allen interview, Excerpted from: Peter Seibel. Coders at Work: Reflections on the Craft of Programming
> If you're not a compiler writer, you probably have a mental model of undefined behavior that works like this
I find that very patronizing.
Yes, we mere compiler users complain about specific examples; that’s because we keep finding specific examples that break in weird ways!
But at the same time, we realize that it’s a general approach that causes trouble. As I understand it: assuming that undefined behavior cannot happen and optimizing on that basis.
I’m not arguing for specific small tweaks to fix a few known weirdnesses, or even a more general adjustment to the notions of poison and undef (as your mention of “massive email threads” suggests).
I’m arguing, first, that the entire approach seems to be misguided and not working well; as evidence, many real-world examples of code doing extremely unexpected things, and the position of the Linux kernel maintainers (surely an important group of users by any standard).
Second, that the problem seems to be largely ignored by compiler writers; as evidence, no changes or improvements, despite several proposals from outsiders. You can point to massive internal debate on email threads but it amounts to nothing if nothing comes out of it.
Since -fno-delete-null-pointer-checks works in both GCC and clang and, nevertheless, the compilers are able to do significant optimization, you might want to think it out again.
Yes, by removing the undefined behavior. The parent post differentiated this from keeping the behavior undefined but somehow constraining it, which is what your proposal tries to do.
I don't think there are good arguments against it, and I agree it would be great to fix it. But I think it's very important to understand the reasons why Regehr's proposal failed. I also think it's incumbent on new proposals to argue why their effort is likely to succeed where Regehr's did not.
If I were more cynical, I'd say that the C-standards community has simply ceded the ecosystem niche of a low-level language which has common-sense behavior, creating an opportunity for other languages such as Rust, D, Zig, etc. But I don't really believe that. Almost everything we do relies on C at the bottom, and it's pretty much the only viable way to publish an API that can be consumed by multiple languages.
> Almost everything we do relies on C at the bottom, and it's pretty much the only viable way to publish an API that can be consumed by multiple languages.
I think it's true that almost everything we do relies on the C ABI at the bottom. That's different from relying on C, and I think these are actually the same problem: the C ABI is straightforward and has been implicitly stable (whereas e.g. the Rust ABI is explicitly not stable) because the C ABI has very few types and very little information in types (aliasing, ownership, alignment, thread-safety, parameter types of functions, etc.) and the more common-sense lower-level languages are obligated to have meaningful types.
You can certainly publish a C API / ABI from a Rust library and consume it from a D program without either side knowing that the other side isn't actually written in C.
I am curious what is the next meaningful step beyond the C ABI and the C type system: even stabilizing the Rust ABI (which I would love to see) isn't going to help a lot for writing D libraries that implement them or Python programs that consume them, partly because Rust has unique ideas of things like lifetimes that don't immediately translate to other languages. What's the subset of stuff that's in all these languages' type systems that isn't in the C ABI?
I'm reminded of Vala (which I kind of miss), which was GNOME's attempt a couple years ago at a better-than-C language that was very explicitly C-ABI-compatible. It used the GObject protocols for types - is GObject the shape of thing to standardize?
* Distinguishing between "byte array" and "string" types
* Vtables
* Something that lets GC'd languages cooperate more nicely. I'm not sure how exactly to specify this, but this is the sort of thing we could have if we stopped insisting that everything must be in terms of C ABI.
Hm, are there any cases of two GC'd languages that interop well that were not designed to interop from day one (that is, JVM languages don't count)?
The only one that comes to mind is https://research.mozilla.org/2014/08/26/javascript-servos-on... , which is Rust (pre-1.0) and JavaScript, but that somewhat doesn't count because Rust isn't really a GC'd language. (This article was written about a year after the special syntax for garbage-collected references was removed and Gc<T> became just a thing in the standard library, and about a month before that type was also removed because it wasn't a very good GC.)
It seems to me that if you make your GC a purely refcounting system this is easier: you just need to specify how to incref and decref objects, and put a well-known function in the vtable for destroying and deallocating the object. I'd guess that's how things like GObject or Qt bindings in Python or similar languages work.
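A rough sketch of the minimal contract I have in mind (just a guess at the shape, not any existing binding's actual ABI):

struct object;

struct object_vtable {
    void (*retain)(struct object *);    /* incref                                  */
    void (*release)(struct object *);   /* decref; destroys and frees at zero      */
};

struct object {
    const struct object_vtable *vtable; /* well-known first field                  */
    long refcount;
    /* language- or library-specific data follows */
};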
COM was an example of using refcounting to drive a richer cross-language object model. And these days, WinRT is another take on the same (and, indeed, is COM under the hood, but with a richer type system and metadata). So you can have e.g. JS interop with C# via WinRT, but only because their shared view of the world is all refcount-based.
If you codify things that some vendors (notably Microsoft) are already doing with their calling conventions, that would give you alignment specifiers, and vector types passed in SIMD registers.
I personally think ownership is a biggie. If I pass in a pointer, is it expected to outlive the call?
> Almost everything we do relies on C at the bottom, and it's pretty much the only viable way to publish an API that can be consumed by multiple languages.
Not quite true; on Windows it is called COM, WinRT or UWP as of Windows 10. Or MSIL as an alternative.
On IBM i, IBM z, and Unisys ClearPath, it takes a variation of what is known as a language environment.
On Android you will have more luck targeting DEX than C.
On browsers better compile to JavaScript in some way.
FWIW, my personal response is "C does not carry enough information with variables to make productive high-performance programming possible; how about switching to a language that does?"
The trouble is that "carry information with variables" means one of two things, runtime checks (which often - but not always! - rule out "high performance") or types.
Rust is my personal preference here, but it's hardly the only option - Go and idiomatic C++14 are also good direct competitors, Java and PyPy can be faster than AOT-compiled code if you don't mind startup time, many applications actually aren't in need of a language that aggressively optimizes (e.g., it's relatively short logic on top of a database that someone else is writing in C), etc.
It's perfectly reasonable to argue that C is not as good as XYZ. You may even be right. But it is not reasonable to conclude from that that it is ok for C implementations to make C programming more hazardous than it needs to be. Go out and make Rust a more appealing alternative to C and good luck with it. But don't tell me C can be turned into mush because you don't much like C.
Don't get me wrong, I like C. I don't want to write production software in C any more, but I do actually like the language a lot. Le cœur a ses raisons.
C is already mush. Maybe it wasn't in the 1980s, but it certainly has been since I first started learning C (around 2000).
I'm simply responding out of rational self-interest to recognizing that C cannot be what I want it to be, and what I want it to be looks an awful lot like Rust (or Go, or Swift, or...).
C doesn't even have sum types with a language construct for exhaustive matching. That, by itself, makes C programming dangerous. For every NULL check unexpectedly deleted by a compiler, there are a thousand NULL checks that weren't written in the first place, because C has no mechanism to require a NULL check before using a pointer.
It is perfectly ok for C implementations to make C programming as "hazardous" as they want to. It's up to you whether you want to use them.
Frankly, I don't understand the UB hate at all. There is a rapidly growing space of programming languages specifically designed to give you safety guarantees and well defined behavior. If that's what you want, by all means use them. There are still those of us out there for whom even `-fwrapv` represents an unacceptable loss in performance for certain programs. There is an important and valuable niche for a language that enables compilers to optimize code without regard for saving programmers from mistakes in their code, and that's always going to be the case. Most people probably shouldn't be using it very much, but that's an entirely different issue from trying to "fix" what's not broken in C.
That's easy, I'm working on a computational kernel (for a scientific simulation of fluid dynamics) right now that loses ~20% performance the second you compile it with -fwrapv in GCC (the corresponding factor is slightly smaller in clang, because it's worse at optimizing it in the first place). I haven't invested enough effort in it yet to tell you if I could beat the compiler with manual optimizations in a sane amount of time, but the difference is certainly pronounced.
I haven't stumbled across anything new or unique here, it's just that most people don't write compute-constrained kernels that often these days. The optimization benefits of UB for signed integers are well known, and that's just one useful example of UB in C compilers.
And again, I'm fully on board with the notion that many people would be better off with a more error tolerant subset of what C does - and there's a lot of languages that provide just that. I'm just saying there are real costs to the change and they matter.
I'd love to see code snippets. Otherwise this makes zero sense. The check for overflow is super cheap on any processor you'd run fluid dynamics code on. You'd have to be running pathologically tiny loops. I have never seen any papers on this "well known" benefit. Perhaps you can provide me a link. In any event, it's interesting that you think that compute-intensive kernels are self-evidently more important uses of C than device drivers, embedded systems, middleware, etc. Perhaps you'd be better off using a language specifically designed for optimized numerical codes - FORTRAN for example.
I can't release the whole codebase before publication, but if you're really serious about analyzing it, I can probably sanitize it in an evening and get you an MWE.
Regardless, apart from pathological integer math cases, these things have little to do with the cost of overflow checks. Rather, it's about the followup optimizations the compiler can do after the UB frees it from a restrictive assumption. The trivial common examples (optimizing x*2/2 to x for integers, etc.) don't really capture the impact of these things very well, because you can't demonstrate it in a oneliner, but you can derive representative examples pretty easily if you take tight-ish loops (>16 independent loads, some SIMD-able compute, a few dependent stores at the end) and switch the index types. The vectorizer is often easy to defeat if it can't assume no overflows, the cost of the check itself is irrelevant.
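To give a flavor (this is an illustrative sketch, not my actual kernel):

void scale(float *restrict dst, const float *restrict src, int n, int stride) {
    for (int i = 0; i < n; i++)
        dst[i * stride] = 2.0f * src[i * stride];
    /* with signed overflow as UB, i * stride can be widened to 64 bits and the
       accesses treated as a simple strided pattern; with -fwrapv the compiler has
       to allow i * stride to wrap around, which can keep the vectorizer away */
}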
On your last point, I don't think compute intensive kernels are "self-evidently more important" than anything else. I think there are usecases - like scientific computing, device drivers and embedded software, to name a few - that often benefit from prioritizing programmer control over any safety net, and FORTRAN doesn't cut it for many of those uses. There are plenty of languages that give you all the safety you could possibly want. C/C++ have never been about that and I don't see why that should change. Use the right tool for the job, by all means advocate for people who don't need this level of control to use something "safer", but the notion that every tool should be blunted to some common denominator just makes no sense to me.
I'm puzzled how I could have given the impression that I am looking for safety. I just don't want the compiler to defeat programmer intentions. I don't see how your example can depend on overflow semantics. The vectorizer is free to assume there is no overflow and let the programmer suffer if she screwed up. What I don't want the optimizer to do is to prevent the programmer from checking for overflow because it can't happen in theory while it can happen in practice. That is, I am interested in prioritizing programmer control over the safety net.
Well, now I'm confused. When does the compiler defeat programmer intentions? If you specify -fwrapv and insert manual overflow checks wherever you wish, the compiler will never optimize them away and you will have as many runtime checks as you want. Programmer control is not compromised, which is good.
One of the motivating cases for this proposal is that gcc/clang, without fwrapv, will both implement wrapping on most architectures (since that is what the processor implements) and silently remove manual overflow checks because overflow is "impossible" under a stupid interpretation of what UB means. Similarly, those compilers will remove manual checks for null pointers if there is a prior dereference of the pointer, on the theory that dereferences of null pointers are "impossible" even when they happen.
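To be concrete, the kind of check I mean looks something like this (a minimal sketch, not from any particular codebase):

int checked_add(int a, int b, int *out) {
    int sum = a + b;           /* on most targets this is just a wrapping add          */
    if (b > 0 && sum < a)      /* manual test on the wrapped result...                 */
        return -1;             /* ...which gcc/clang may fold away, because a conforming
                                  program "cannot" overflow, so sum < a implies b < 0  */
    *out = sum;
    return 0;
}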
Which is a very reasonable assumption for an optimizer. Every major compiler has switches to turn that behavior off. I'd be sympathetic to an argument that perhaps those switches should be on by default, but if you change the language standard, you're essentially saying entire classes of optimizations should be outright prohibited by a conformant compiler, with no programmer control. That's a classic case of throwing out the baby with the bathwater.
I do not see how FALSE is ever a reasonable assumption for an optimizer:
" a+1 > a" is not a true statement for code emitted by gcc/clang on most architectures (with or without frwapv)
Can you put one of your fwrapv deoptimizations on godbolt or otherwise show it?
What exactly is the point of this article? Despite its title it feels neither modest, nor like a good-faith attempt of a proposal. It more feels like a cheap snipe against the standards group, regurgitating the same tired points. Does anyone really believe that anything productive comes out of this?
No, but perhaps if people keep regurgitating these same tired points, compiler developers will get bored of listening, and start to do something different.
What is outrageous about this argument, which has been going on for far too long, is how people who don't get C or don't use it keep dismissing critiques from expert practitioners like DJB or Torvalds and from academics like Regehr as if those people are all too damn stupid to understand the issues.
Specifying behavior is a bit more complex than that. There is a wide range of "unspecified", and the "full UB" one is the easiest from the spec point of view. Using the shift example:
- allowed to assume that the negative shift does not happen (what it says now)
- the value of each dynamic negative shift is unspecified but it must have a well defined value, however the same shift operation in the source code could return a different output given the same inputs dynamically (eg in a different loop iteration or function call)
- each source code shift is a well defined (unspecified on negative inputs) function
- all shifts in the program are a well defined (unspecified) function
So, for example, the last (least surprising) version means that you have to compile every shift to the target CPU's shift and that you also have to use the target's behavior when you constant fold a shift at compile time.
I think what the author actually wants from (1) is for common (or maybe all) UB to be more strongly specified as implementation-defined. That could be reasonable, with specific scope. I'm having trouble determining the proposed scope from the post, though.
(2) seems like it is already specified; the trouble is in determining what "the semantics of the program" are. This relates to (1) although it doesn't seem like the author has a completely solid grasp of the legalities of the language.
What the author wants, essentially, is to dictate that common disable-UB flags (e.g., -fwrapv, -fno-strict-aliasing) be folded into the C standard. Well, he probably knew that the C standards committee isn't going to accept that, so instead he's trying to "only" disable the bits of undefined behavior that don't screw over the user. Except he doesn't have a good enough grasp of the standard to actually effect proper wording:
(1) essentially disables inlining as a side effect, at least if undefined behavior is involved. (2) either is 100% redundant or means that every implementation must strictly execute the C abstract machine with absolutely no optimization permitted. (3) means you can no longer access any type with a char. You also couldn't access the bits of a float by casting to an int.
The first proposal is extremely welcome. Still, you need to expand on it, I feel. But yes, I think it is becoming the dream and nightmare, along with aliasing (...), of a lot of engineers. It has become ridiculously hard to write serious code without hitting UB land.
The second, although I think I get the reasoning, seems not well-defined to me. I am not talking about expanding into it with concrete steps. You need to define what kind of semantic invariance to require. You might just not want the compiler deleting sections it thinks do not affect the output; you might not want the compiler to alter nop segments for timing reasons (e.g. when writing a cryptographic primitive). There is a wide variety of different aspects of semantics and it is not clear what you propose. This has been a discussion topic among cryptographers. Or you are a kernel programmer and actually do not want checks removed or aliasing messed with etc. (One reason people compile their kernels with specific options -- see Linus' rants about UB if you want to pull some extra hair out of your head.)
Anyways -- I am afraid at times GCC might be too far gone to tone down undefined behavior. Does anyone working on a compiler have any input on whether there are plans, in the (unlikely) case the next WG decides to go sane on undefined behavior, for how their compiler would address that new standard?
> Does anyone working on a compiler have any input on whether there are plans, in the (unlikely) case the next WG decides to go sane on undefined behavior, for how their compiler would address that new standard?
Disclaimer: I have contributed on my own time to one of the GCC frontends, so I guess I know a little bit more about compilers and the compiler development than the average Joe, but I wouldn't call myself an expert nor am I in a position to do any decisions wrt this for GCC. With that out of the way:
- You can see compiler development, particularly the optimizer stuff, as a game, where the standard limits what you can do, and the goal is to make SPEC CPU [1] run as fast as possible (your employer says which target CPU(s) you should focus on).
- The standard is the "contract" between the user and the compiler writers. Compiler writers do take standards conformance pretty seriously.
- If the standard changes, say to limit undefined behavior, compiler developers adapt to the new rules, and the game continues. No big drama here.
- AFAICT GCC developers wouldn't complain that loudly about "losing" previously allowed optimizations. There is (again, AFAICT) sympathy towards the viewpoint of changing the standard towards improving robustness.
- What you won't see is GCC, or any other compiler, doing it on their own without the standard backing them. That would just lose performance compared to the competition, and wouldn't help users write portable code anyway.
[1] Yes, there are other benchmarks too, but SPEC CPU is the most influential one.
> 6. EXAMPLE Code that checks to see if a pointer value is null cannot be omitted during translation solely on the basis that the pointer is dereferenced before the check and that earlier dereference would produce undefined behavior if the pointer value was null.
Here's the thing I don't quite understand: I gather that this is advocating for something like GCC's -fno-delete-null-pointer-checks to be part of the spec, but, how is -fno-delete-null-pointer-checks actually helpful?
I think it's because if you write code like this:
void foo(struct foo *param) {
    int n = param->length, i;
    if (param == NULL) {
        return;
    }
    for (i = 0; i < n; i++) {
        do_something(param->data[i]);
    }
}
with -fdelete-null-pointer-checks the compiler might optimize it in two successive steps to delete the check (because param->length was already dereferenced) and then move the dereference down to the beginning of the for loop, and with -fno-delete-null-pointer-checks the compiler might skip the first optimization and keep the second.
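In other words, after those two steps the function may effectively become something like this (my sketch of the effect, not actual compiler output):

void foo(struct foo *param) {
    int i;
    /* step one deleted the null check (param was already dereferenced);
       step two sank the dereference of param->length down to its first use */
    for (i = 0; i < param->length; i++) {
        do_something(param->data[i]);
    }
}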
But for -fno-delete-null-pointer-checks, you're relying on the second optimization, moving the dereference later. Otherwise you'd just have segfaulted already (or, in a kernel context, the read might have succeeded from a userspace page mapped at the null page). Right?
It seems reasonable for the compiler (or some other tool) to produce a hard error here and prevent this code from compiling, saying, hey, this is clearly doing something unwanted. It doesn't seem reasonable to expect the compiler to do certain optimizations to make your code safe, without which your code would have a security vulnerability, and also get mad at the compiler for doing more optimizations that return it to the original state of things.
> The limitation of lvalue access by types has never made much sense and required a undefensible exception for character pointers.
My understanding of type-based alias analysis is that it's required for code like this to optimize the way you'd want:
struct string {
size_t len;
char *data;
};
void copy_string(struct string *dest, struct string *src) {
size_t i;
if (dest->len < src->len) return;
for (i = 0; i < src->len; i++) {
dest->data[i] = src->data[i];
}
}
Otherwise the compiler would have to worry that dest->data might overlap with src->len, and would have to re-read src->len on every iteration!
You can do this without type-based alias analysis in a language that gave you different tools for managing variables, but doing so backwards-compatibly in C seems difficult, and this proposed rule does not seem to help.
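As a cleaner illustration with no character-pointer complications, here's a sketch (hypothetical names fbuf and scale) of the kind of reload TBAA lets the compiler skip: a store through a float lvalue may not modify a size_t object, so the length can stay in a register across the loop.

#include <stddef.h>

struct fbuf {                /* hypothetical type, for illustration only */
  size_t len;
  float *data;
};

void scale(struct fbuf *dst, const struct fbuf *src) {
  size_t i;
  for (i = 0; i < src->len; i++)            /* src->len may be hoisted... */
    dst->data[i] = src->data[i] * 2.0f;     /* ...a float store can't alias a size_t */
}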
> It seems reasonable for the compiler (or some other tool) to produce a hard error here and prevent this code from compiling, saying, hey, this is clearly doing something unwanted.
The thing is, the redundant null check usually doesn’t appear literally in the source, but comes from an inlined function call (or even a macro). In that case, the null check is probably not a mistake; it’s just that this particular caller of the function that got inlined never passes null, and so will never trigger that code path.
I do think there’s unexplored potential for compilers to identify when they’ve eliminated code that didn’t come from inlining, and warn the user about the potential bug. However, this is much easier said than done, since the IR has gone through a lot of transformations by that point and there are a lot of edge cases to worry about. For example, even if all the IR instructions corresponding to a given piece of source code are dead, that code might still have affected the function compilation in some manner and thus isn’t useless – such as by being merged with a different (or perhaps redundant) computation, or by affecting control flow.
Also, even if some code path really is useless as compiled, it might be for innocuous reasons. For example, perhaps a given null check was not itself inlined, but the value being checked comes from an inlined function:
void *get_a_pointer(void) {
static int x;
return &x; // not null!
}
int main(void) {
void *p = get_a_pointer();
if (p == NULL) …
}
Unlike in the reverse situation where an inlined null check might be useful to another caller, here we know that that null check will never fire – that is, in this program, in this configuration, the check is useless. But perhaps with a different set of configuration flags, or on a different platform, the code uses a different implementation of get_a_pointer() which could return null. Or perhaps there is an abstraction boundary between main and get_a_pointer, where the documented interface of get_a_pointer says it might return null: even if the current implementation never does, the author expects that that might change in a future version of the code, so clients should proactively include a null check to avoid future breakage. (Of course, get_a_pointer might be part of a third-party library, and the author of main might have never even peeped at its implementation. These situations are especially common if you’re using LTO.)
> But for -fno-delete-null-pointer-checks, you're relying on the second optimization, moving the dereference later. Otherwise you'd just have segfaulted already (or, in a kernel context, the read might have succeeded from a userspace page mapped at the null page). Right?
Something that's actually a pointer offset, not a dereference, like `int *n = &param->length`, is also an assertion of being non-null.
It seems reasonable to amend the C standard to say "That's not an assertion of being non-null" (and more generally to make &invalid_pointer->item and &invalid_pointer[n] valid; IIRC implementing offsetof as (size_t)&(((type *)NULL)->member) is UB). Would that prevent any significant optimizations?
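(For reference, the classic pre-__builtin_offsetof implementation being alluded to looks roughly like this; the macro name is made up, and real code should just use offsetof from <stddef.h>:)

#include <stddef.h>

/* hypothetical name; computes the byte offset of member within type by
   forming an address relative to a null pointer - UB under a strict reading */
#define my_offsetof(type, member) ((size_t)&(((type *)0)->member))

struct example { int a; double b; };
/* my_offsetof(struct example, b) yields the byte offset of b */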
Also does the rule that pointers can only point to elements of allocated objects or one-past-the-end enable any significant optimizations?
> Also does the rule that pointers can only point to elements of allocated objects or one-past-the-end enable any significant optimizations?
Yes, for segmented architectures like the 8086, it means that pointers don't have to be normalized after every operation. That is, when you increment or decrement a pointer, you only need to do the operation on the offset, you don't have to touch the segment. When all you have is less than 10 MHz, every operation counts.
(For those who don't remember the 8086, it's not as simple as concatenating the segment and the offset to obtain the physical address: to get the 20-bit physical address from the 16-bit segment and the 16-bit offset, the formula is segment*16+offset. As a consequence, the same memory location could be accessed through several distinct segment:offset pairs. If the memory allocator chose the segment value right, objects of 64 KiB or less could be indexed by doing all operations on the offset value, without having to touch the segment value.)
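(As a quick sketch of that arithmetic in plain C - not 8086 code, and the function name is made up:)

/* 20-bit physical address from a 16-bit segment and a 16-bit offset */
unsigned long phys(unsigned segment, unsigned offset) {
  return (unsigned long)segment * 16u + offset;
}
/* phys(0x1234, 0x0005) == phys(0x1230, 0x0045) == 0x12345UL:
   distinct segment:offset pairs can name the same memory location */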
Well, what benefit does -fdelete-null-pointer-checks bring? Why would I ever want to remove that check? The more conservative version either segfaults or (quite likely) accidentally does the right thing. That situation is not great but it’s better than nothing. The fancy optimization just makes a not-great situation much worse.
I agree, it would be best if the compiler rejected this as clearly incorrect code. I think that needs a change in the spec relating to undefined behavior, though (and more importantly a change in direction from compiler vendors).
Edit to add: your second example, type aliasing, is a good one too. I think that one could also be usefully addressed by a compiler that flags it as a likely error. Although you’d need some way to annotate the code in the event that the weird aliasing behavior is actually correct and expected (I dunno, make “len” volatile or something?)
> Well, what benefit does -fdelete-null-pointer-checks bring? Why would I ever want to remove that check? The more conservative version either segfaults or (quite likely) accidentally does the right thing. That situation is not great but it’s better than nothing.
My concern is largely about how to phrase "The compiler may not delete this dead code" in a way that's meaningful. For instance, I assume the intent of the proposal is not to forbid optimizing this code in the obvious way, right?
int n = param->length, i;
if (param == NULL) {
return;
}
if (param == NULL) {
return;
}
for (i = 0; i < n; i++) {
do_something(param->data[i]);
}
What about this?
int n = param->length, i;
for (i = 0; i < n; i++) {
if (param == NULL) {
return;
}
do_something(param->data[i]);
}
which is more likely to show up in the form of an inline function that checks for NULL - can the compiler refactor that check into a single one?
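For concreteness, here's a sketch of how that pattern tends to arise (do_one and walk are hypothetical names; the struct layout is only assumed to match the earlier example): a defensive helper gets inlined into every iteration, and the question is whether the compiler may merge those copies into a single check.

struct foo { int length; int *data; };   /* assumed layout, matching the usage above */
void do_something(int x);

static inline void do_one(struct foo *p, int i) {
  if (p == NULL) {
    return;                  /* the helper defends itself against NULL */
  }
  do_something(p->data[i]);
}

void walk(struct foo *param) {
  int n = param->length, i;
  for (i = 0; i < n; i++) {
    do_one(param, i);        /* inlining copies the NULL check into each iteration */
  }
}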
> I think that one could also be usefully addressed by a compiler that flags it as a likely error.
The compiler can't know when compiling that one function whether src->len and dest->data alias. It might be compiling that file as a standalone object file, with no other context about who's calling it. Should it emit a warning when compiling that code? (What's the warning, "I'd like to optimize this but I can't?")
Alternatively, should it flag the function ... somehow ... so that when someone else calls it with aliasing arguments, that's an error? That isn't even a capability available to the C language as written: it's valid to call that function knowing nothing other than the struct definition and the function's prototype.
You can imagine a language where variables carried information about whether they alias. (Actually you don't really have to imagine; the `restrict` keyword does a subset of this in C itself, and other languages have much more explicit ways of tracking this, such as Rust's shared vs. mutable references.) But it would be hard to retrofit such a thing onto existing C code.
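A sketch of that restrict-based opt-in, applied to the copy_string example above (the _r suffix is just for the example): by qualifying src with restrict, the programmer promises that anything read through src isn't also modified through other pointers during this call, so the compiler may keep src->len in a register across the stores.

void copy_string_r(struct string *restrict dest,
                   const struct string *restrict src) {
  size_t i;
  if (dest->len < src->len) return;
  for (i = 0; i < src->len; i++) {
    dest->data[i] = src->data[i];   /* need no longer force a reload of src->len */
  }
}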
The fundamental problem with the C language working group is that they're obligated to keep working on C as it already exists.
One of the reasons for this proposal is that LTO is going to make C completely unusable otherwise. Suppose that a driver produces a structure that points to data buffers. Core code interacting with the drivers knows that drivers are of variable quality and checks that the structure is properly set up: if (x->buffer == 0) panic("bad driver") or something more sophisticated. Everything works beautifully until some driver is added with a dereference of x->buffer - which then defeats the guard in the core code, because now the compiler "knows" that x->buffer is not null. Separate compilation and inlining can't work reliably in this situation.
Are you suggesting that driver code has an opportunity to run before core code can implement checks?
If so, then again, this has the fundamental problem that you're expecting the compiler to implement some optimizations - like pushing the dereference out of the buggy driver via LTO - and not others. If the compiler had all optimizations off, it would just generate code to read from x->buffer in the obvious place, which would page-fault at runtime before the check from core code even happened.
If the core code implements the check first, then this optimization won't cause you problems, because this optimization is strictly about checking for a NULL pointer after using it.
For the specific case of computing the address of x->buffer without actually dereferencing it, I do agree that the compiler should not consider that to be an implicit non-null promise. That seems much more likely to be a change you can make without breaking the language spec. See the subthread with 'dbaupp.
(Alternatively, you could work on building tools for driver writers to write in a less error-prone language than C, because drivers being of poor quality is a much broader problem with much broader impacts than just triggering ill-advised optimizations under LTO, and simply optimizing drivers less efficiently isn't going to solve the problem that the driver shouldn't be dereferencing x->buffer without its own NULL check in the first place. If you check my recent GitHub activity you might notice a project to do exactly this in Linux....)
This is a really common code pattern: client sends request to server; server sanity checks request. If you omit sanity checks from the server because you assume there are no errors in the client, you break a lot of systems.
I am expecting the compiler to only make optimizations based on things it knows to be true - things that actually are true. So if the source code is "check for null; check for null; do something;" the compiler should be able to delete the second check (hopefully with a warning). In fact, the Standard is very explicit about permitting optimizing compilers to delete repeated operations. However, in the case I described, the compiler does not know that the pointer is non-null; it's just easier to assume it is not null. That leads to an "optimization" that is not an optimization. I agree that requiring compilers to only use true optimizations is more of a challenge than allowing them to use false ones, but nobody said it was going to be easy. In particular, it looks like Clang loses way too much information going from source to LLVM IR, which makes it difficult to carry information about whether, for example, a pointer dereference is safe or not - but that's a defect in the compiler, not in the source language.
My instinctive response was going to be "That can't work, there are too many of them," but apparently you can do this particular one with -Wnull-dereference (which requires -fdelete-null-pointer-checks enabled to work, but you can just compile twice, once as a throwaway and once for production with -fno-delete-null-pointer-checks, or do the former for non-release builds, or something).
More generally, it might be the case that enough of the last several years of unpopular optimizations are controlled by flags that you can compile the code twice, once with the optimization enabled and once without, and warn whenever the optimization changes something. It might even be possible to do this with a shell script that treats the compiler as a black box with -ffunction-sections and uses objdump or something.
I do wonder how hard it is to write code that produces the same object code regardless of something more major like -ftrapv, though.
My view is that the compiler cannot remove any of these NULL checks. There should be a separate tool that gives the programmer hints that she is making unneeded checks and suggests program transformations, but removing NULL checks is unsafe when done automatically. Because I might be on a platform where dereferencing param->length through a null param actually reads valid memory, so my program won't segfault and the NULL check is the only way to know.
a) these are marginal optimizations, at best. b) because it violates C practice. My parsing library is written to distrust inputs so as to protect against buffer exploits. Your compiler "optimizes" away my security check because some buggy code dereferences a null pointer - which actually works in this implementation. Great. Now you have "optimized" to create a security hole in the program by removing a single instruction that is actually free in many pipelined architectures. Wonderful. Thanks so much.
The optimization isn't written in the form of "If you see this ridiculous code, make it less ridiculous." It's written in the form of "You can refactor repeated checks of presumed-non-aliased variables into a single check." Otherwise code like
for (i = 0; i < *plen; i++)
would involve dereferencing plen on every single iteration, which for many applications is definitely not marginal!
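Spelled out (a sketch with hypothetical functions, just to show the transformation being licensed):

#include <stddef.h>

/* as written: the loop condition names *plen on every iteration */
void fill(int *buf, const size_t *plen) {
  size_t i;
  for (i = 0; i < *plen; i++)
    buf[i] = 0;
}

/* what the compiler would like it to mean: the length is read once.
   This is only valid if the store buf[i] = 0 cannot modify *plen. */
void fill_hoisted(int *buf, const size_t *plen) {
  size_t i, n = *plen;
  for (i = 0; i < n; i++)
    buf[i] = 0;
}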
> some buggy code dereferences a null pointer - which actually works in this implementation.
What implementation is this? (Are you mmapping at NULL, or writing for an architecture without an MMU, or something?)
NULL doesn't have to be address zero. It's denoted by literal zero, in the C source, but it can be something totally different when translated to the machine. Can you arrange for some other address to generate a segfault, and tell your C compiler to use that as NULL? (I don't know how to do this with gcc, and I'd like to know how, honestly - it seems like a significant win for kernel security if NULL pointers in kernel code were an invalid kernelspace address and not an invalid userspace address.)
Many C programs assume that initializing a variable to all-bits-zero, through memset, calloc, the .bss section, or on the kernel the GFP_ZERO flag, will initialize all pointers within it to NULL. See also: http://austingroupbugs.net/view.php?id=940 ("all 0 bits should be a null pointer on POSIX systems").
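To spell out the distinction (a sketch; struct widget is a made-up type): assigning the constant 0 is guaranteed to give you a null pointer regardless of its representation, while zeroing the bytes is only equivalent on platforms where null happens to be all-bits-zero.

#include <string.h>

struct widget { int id; char *name; };   /* hypothetical */

void init(struct widget *w) {
  w->name = 0;             /* guaranteed null, whatever bit pattern null has */
}

void init_by_memset(struct widget *w) {
  memset(w, 0, sizeof *w); /* all-bits-zero: a null name only where the
                              platform represents null pointers that way */
}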
Not correct. g(x) { ... if (x == NULL) panic(); else x->v = n; } can currently be optimized to x->v = n; given f() { ... y = x->v; ... g(x); }
because null dereferences are assumed impossible in some twisted interpretation of the standard. Imagine the effects with inlining and LTO. That's my issue, not perfectly ordinary standard value caching etc.
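Spelled out as compilable C (a sketch; struct s, panic, and the extern n stand in for the elided parts of the example above):

struct s { int v; };
extern void panic(const char *msg);
extern int n;

void g(struct s *x) {
  if (x == NULL)
    panic("null x");       /* the guard the author is relying on */
  else
    x->v = n;
}

void f(struct s *x) {
  int y = x->v;            /* earlier dereference of x */
  (void)y;
  g(x);                    /* after inlining/LTO, the compiler may treat x as
                              provably non-null here and drop the check in g,
                              leaving just x->v = n; */
}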
For certain types of image manipulations or other computations, it may well be that the pointers are supposed to refer to overlapping regions. This is the kind of problem that, as Dennis Ritchie pointed out, bedeviled FORTRAN forever. I think the solution is to fix `restrict` and let programmers opt in. If the programmer doesn't do that, or gets it wrong, too bad.
The issue isn’t just that src->data and dest->data might overlap, which TBAA doesn’t forbid anyway (although that’s what restrict can help with). It’s that dest->data might alias src->len, and magically change the length midloop. I doubt there’s any legitimate image manipulation where that would happen, but the compiler doesn’t know that...
This is exactly why I don't propose to do away with UB but to limit what compilers can do with UB. The programmer in C cannot be prevented from writing terrible code where e.g. a loop counter is written over by an aliased pointer in the middle of the loop - the current standard doesn't prevent it either. In fact, strictly conforming C code can use char pointers to randomly write over any aliased variables, if I understand it correctly. The positive thing about UB is that it makes this the programmer's problem, not the compiler's problem. The negative thing is that in current interpretations, the compiler is free to detect such an error and then, instead of either failing to compile with a warning or just letting things happen as the programmer has specified, it can introduce global program failures because "it's impossible to have UB".
Your proposals are going to slow down tons of real-world C programs that are not engaged in these types of manipulations, to the point where people writing high-performance software are better off using Python, or need to make so many changes to their source code to recover reasonable performance that they might as well make a few more changes and end up with Go or Rust.
If you really think C is enough of a disaster that this is justified, why even care about fixing C?
As somebody primarily focused on other languages, I've always found it absolutely bizarre that in C-land it's considered even remotely acceptable for "bad stuff" to result in invisible-to-the-user transformation of code instead of actual error handling.
I guess you're focused on languages that don't have the following two constraints from C: (1) be as fast as possible, (2) compile ahead-of-time. The "weird" behavior in case of UB in C/C++ is a direct result of these two constraints and the fact that the transformations you're referring to only affect code that already has _programming bugs._ The standard already specifies that optimizations don't change semantics -- and really they don't -- for bug-free code. Adding runtime error checks to that code only helps buggy programs and pessimizes the performance of correct ones. This is a tradeoff C/C++ does not make.
That's a good analogy, except that with traffic you have factors outside your control that can cause a crash. Avoiding bugs in code is theoretically completely within the programmer's control. There are various tools that help with this in different languages to compensate for the fact that programmers are human: compiler warnings, best practices and linters, sanitizers, testing, static typing, runtime checks, etc. Different safety mechanisms offer different tradeoffs, both for cars and for programming.
One could argue with your analogy that UB in C is silly because it assumes programmers who make no mistakes. In reality, writing C code can be economical (adequate safety, low cost, high performance) -- pricing in all the human errors -- and making the proposed change would also cost performance, and possibly money, for some; that's why there is pushback on such proposals.
In the utopian world where C developers are all 10x, have 100% control over the complete source code of their applications, and don't depend on third-party C libraries.
What I implied by "programmers are human" was exactly that they are 1x and make mistakes. The tools help in the situations where "100%" assumptions cannot be made. But you can't put a seat belt on everything and contain every issue, because it will cost more.
But you must also know that you have to trust at least part of the environment you're operating in. There will never be a language or a tool that can make a program useful and correct when nothing can be trusted. C/C++ regarding UB is just explicit about trusting the code that is currently being compiled, for the sake of optimizing the code that actually is correct. I get where you're coming from, but this is one of the possible solutions, and a very widely accepted one given that the whole world runs on C.
Managed languages might avoid some class of bugs, but they will always retain some, and I'm not sure it is possible to guard against all without solving the halting problem. Some C/C++ people might on the contrary find it bizarre that your languages do absolutely unnecessary bounds checks. None of them are better than the other on an absolute scale, it all depends on what you want to use the tool for.
Ada is not a managed language for example, yet it avoids such errors.
That is the typical C hand-waving: every language has errors, so no big issue.
Of course there are no perfect programming languages, however it is quite different having to handle "Logic Errors" or "Logic Errors + Memory Corruption Errors + UB Errors + Implicit Conversion Errors".
Ada overflow semantics are precisely what programmers assumed C overflow semantics were on architectures that roll over on signed arithmetic (e.g. basically every processor architecture in current wide use). The original intent of the C "undefined behavior" category, as far as I can tell, was, in cases like this, to permit compilers on dissenting architectures to make another choice. For example, on the Itanium you might want to trap on overflow. Because C is comfortable with "it works differently on different processors". So it is my intention to repair the obscurity in the language by spelling out what I am 100% sure would have been obvious to C programmers and language developers at one time.
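The classic illustration of what the current reading licenses (a sketch; whether a given compiler actually folds this depends on version and flags such as -fwrapv):

int will_not_overflow(int x) {
  /* With signed overflow undefined, this may be folded to 1, even though
     on wrap-around hardware x == INT_MAX would make it false. */
  return x + 1 > x;
}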
It's probably an accurate comparison, given the amount of code out there that really doesn't need to be written in C, but nevertheless it is (and the majority of people writing it aren't capable of writing bug-free C).
Just think about the memory corruption issues that occasionally pop up on Linux, in spite of their static analyzers and the review process of getting anything into kernel mainline.
> conforming implementations cannot transform code on the basis that some behavior left undefined by the standard is impossible
You may have misunderstood how rogue optimizers obtain their license to miscompile.
It is not by assuming that the "behavior left undefined is impossible".
As boring and tautological as it sounds, it is by assuming that the operation with "behavior left undefined", when it actually happens, has undefined (and therefore arbitrary) behavior.