As a boring platform for the portable parts of boring crypto software, I'd like to see a free C compiler that clearly defines, and permanently commits to, carefully designed semantics for everything that's labeled "undefined" or "unspecified" or "implementation-defined" in the C "standard" (DJ Bernstein)
And yeah I feel this:
The only thing stopping gcc from becoming the desired boringcc is to find the people willing to do the work.
(Because OSH has shopt --set strict:all, which is "boring bash". Not many people understand the corners well enough to disallow them - https://oils.pub/ )
It is kind of ironic, given the existence of Orthodox C++, and it kind of proves the point: C isn't as simple as people think who have only read the K&R book and nothing else.
It's still not really wrong though. The C standard is just the minimal common feature set guaranteed by different C compilers, and even then there are significant differences between how those compilers implement the standard (e.g. the new C23 auto behaves differently between gcc and clang - and that's fully sanctioned by the C standard).
The actually interesting stuff happens outside the standard in vendor-specific language extensions (like the clang extended vector extension).
Off topic but if you're the author of sokol, I'm so thankful because it led to my re-learning the C language in the most enjoyable way. Started to learn Zig these days and I see you're active in the community too. Not sure if it's just me but I feel like there's a renaissance of old-school C, the language but more the mentality of minimalism in computing that Zig also embodies.
So basically back to C89... I'm not a fan since the changes in C99 made the language significantly more convenient, more enjoyable and actually safer, and even the MSVC C frontend has a nearly complete C99 implementation since around 2015 (the parts of C99 that matter anyway).
Case in point: the article has this somewhere in the example code:
Your example of an uninitialized memory situation will not be so compelling for old-school C engineers because they've "solved" the issue decades ago by integrating tools like valgrind into their work-flows. They don't expect (or require) the compiler to help them deal with issues like this.
The problem with post-C89 is that you lose the unique features of old-school C.
For example, it can be compiled on basically any platform that contains a CPU. It has tool support everywhere. And it can be fairly easily adapted to run using older C compilers going back to the 1980s.
Or that (old-school) C is basically a high level assembly language is actually a feature and not a bug. It's trivial to mentally map lines of C89 code to assembly.
No other widely available language can tick these boxes.
So the problem with later versions of C is that you lose these unique features while you are now competing for mindshare with languages that were designed in the modern age. And I just don't see it winning that battle. If I wanted a "modern" C, I'll use Zig or Rust or something.
Just because a language (C89) doesn't evolve doesn't mean it's dead.
Also let's not forget that C99 is a quarter century old by now. That's about as old as K&R C was in 1999 ;)
> I'll use Zig or Rust or something.
I like Zig a lot (Rust not so much), but both Zig and Rust still have feature gaps compared to C99 which often make me prefer C99 for day-to-day coding. Most importantly, struct initialization in both Zig and Rust is very rudimentary compared to C99's designated-init feature set - and somehow the authors of modern languages often don't see the benefits of having powerful data initialization features in the language. There are also lots of little paper cuts where both Zig and Rust balance convenience versus safety just a little too far away from convenience.
I honestly prefer it if variables that aren't explicitly initialized stay uninitialized, since then the compiler/analyzer/fuzzer can find undesired accesses instead of them just silently working. And yes, I always try to invoke UB as soon as possible.
The text following this heading seems to take the opposite view. I suspect this is a typo.
However, I think the heading is accurate as written. The "C is not a high level assembler" crowd, in my view, is making a category error, conflating C itself with an ISO standard and abstract machine concept coming decades later.
By the same token, "C is a high level assembler" is a gross oversimplification.
> The "C is not a high level assembler" crowd, in my view, is making a category error, conflating C itself with an ISO standard and abstract machine concept coming decades later.
"C isn't not a high level assembler" captures it almost perfectly, and it also captures the documented intent of the people who created the ANSI C standard, which was that ANSI C should not preclude the (common) use of C as a high-level assembly language.
Here's the quote:
C code can be non-portable. Although it strove to give programmers the opportunity to write truly portable programs, the C89 Committee did not want to force programmers into writing portably, to preclude the use of C as a “high-level assembler”: the ability to write machine-specific code is one of the strengths of C. It is this principle which largely motivates drawing the distinction between strictly conforming program and conforming program (§4).
I see a huge semantic gap between assembly language and C.
An assembly language program specifies a sequence of CPU instructions. The mapping between lines of code and generated instructions is one-to-one, or nearly so.
A C program specifies run-time behavior, without regard to what CPU instructions might be used to achieve that.
C is at a lower level than a lot of other languages, but it's not an assembly language.
Java also targets an abstract machine model (the JVM) - such a statement really doesn't mean much.
Assembly is not about corresponding to exactly which gates open when in the CPU. It's just the human-writable form of whatever the CPU ingests, whereas C is an early take on a language reasonably capable of expressing higher-level ideas with less low-level noise.
I seriously doubt anyone who has written projects in assembly would make such comparisons...
>I seriously doubt anyone who has written projects in assembly would make such comparisons...
With genuine respect, I believe this type of insinuation is rarely productive.
Someone might still have silly opinions, even if they have been paid to write assembly for 8-, 24-, and 64-bit CISC, RISC, in-order and out-of-order ISAs, and maybe compilers too. Peace :)
Yes but someone might also have silly opinions from having no experience how production assembly actually looks, such as underestimating just how different working with that is to working in high-level languages like C and why such languages were quite revolutionary. :)
This should not be mistaken as appeal to authority, it is merely reasonable discrimination between those speaking from experience, and those forming opinions without experience.
If one believes those with experience have poorly informed opinions, they're always free to gain experience and the associated perspective. They will then either have the fundamentals to properly push their viewpoint, or end up better understanding and aligning with the common viewpoint.
Yes and no - you can use C in situations where there's no "assembly", for instance when synthesizing FPGAs. You target flow graphs directly in that case, IIRC.
I have empathy for this having written compiler passes for 10ish years of my career. But as I've studied register renaming, speculative branch prediction and trace caches I would no longer agree with your last sentence. It's fine though, totally just an opinion.
Sure, I was thinking of large out-of-order cores. "Corresponds to the instructions the CPU runs and their observable order" is how I'd characterize C as well, but to each their own.
C is a programming language. It makes for a very shitty high level assembler.
Here's a trivial example clang will often implement differently on different systems, producing two different results. Clang x64 will generally mul+add, while clang arm64 is aggressive about fma.
x = 3.0f*x+1.0f;
But that's just the broad strategy. Depending on the actual compiler flags, the assembly generated might include anything up to multiple function calls under the hood (sanitizers, soft floats, function profiling, etc).
I don't think clang is being "aggressive" on ARM, it's just that all aarch64 targets support fma. You'll get similar results with vfmadd213ss on x86-64 with -march=haswell (13 years old at this point, probably a safe bet).
The point is that there are multiple, meaningfully different implementations for the same line, not that either is wrong. Sometimes compilers will even produce both implementations and call one or the other based on runtime checks, as this ICC example does:
I don't understand the argument you're trying to make. You seem to be arguing that C isn't a high-level assembler because some compilers generate different machine code for the same source code. But (a) that doesn't contradict the claim in any way; and (b) this happens in assembly all the time too—some synthetic instructions generate different machine code depending on circumstances.
I'm saying that "C is not a high level assembler" because it doesn't have any of the characteristics that make a good assembler. Your original post made a distinction between C as practically implemented vs the ISO standard, so the example was chosen as a practical example of something an assembler should never do: change the precision and rounding of arithmetic expressions.
Now let's say you're working on a game with deterministic lockstep. How do you guarantee precision and rounding with an assembler? Well, you just write the instructions or pseudoinstructions that do what you want. Worst case, you write a thin macro to generate the right instructions based on something else that you also control. In C or C++, you either abuse the compiler or rely on a library to do that for you ([0], [1]).
This is the raison d'etre of modern assemblers: precise control over the instruction stream. C doesn't give you that and it makes a lot of things difficult (e.g. constant time cryptography). It's also not fundamental to language design. There's a long history of lisp assemblers that do give you this kind of precise control, it's just not a guarantee provided by any modern C implementations unless you use the assembly escape hatches. The only portable guarantees you can rely on are those in the standard, hence the original link.
Low level control over the instruction stream is ultimately a spectrum. On one end you can write entirely in hex, then you have simple and macro assemblers. At the far end you have the high level languages. Somewhere in the middle is C and however you want to categorize FASM.
I think I understand now. That's reasonably fair. I don't necessarily share in your ultimate conclusion, but I can see how your opinion is well-reasoned.
This was only the case when the machine code generated by C compilers was almost 1:1 with the PDP-11, or similar 16-bit machines.
Since optimizing compilers became a thing in the C world, and WG14 never considered modern CPU architectures when deciding what hardware features C should expose, this idea lost its meaning.
However many people hold on to old beliefs that C is still the same kind of C that they learnt with the first edition of K&R C book.
Before dismissing it as the author having no idea what he is talking about: David Chisnall used to be a GCC contributor, was one of the main GNUstep contributors back in the original days, and is one of the key researchers behind the CHERI project.
David is quite accomplished, but in this instance he is simply wrong. For two reasons:
1. All the reasons he cites that depend on "what the metal does" being different and quite a bit more complex than what is surfaced in C apply equally to machine/assembly language. So the CPU's instruction set is not a low-level language? Interesting take, but I don't think so: it is the lowest level language that is exposed by the CPU.
2. The other reasons boil down to "I would like to do this optimization", and that is simply inapplicable.
Look at the compiler assembly output after optimization kicks in. The resulting assembly code is usually significantly different from the input source code. With modern optimizer passes, C is much closer to any other high level programming language than to handwritten assembly code.
Languages like C are simply very unforgiving to amateurs, and naive arbitrary code generators. Bad workmanship writes bad code in any language. Typically the "easier" the compiler is to use... the more complex the failure mode. =3
Reply to bypass a nonsense slop-article that doesn't actually offer any legitimate insights into workmanship standards.
Follow the 10 rules on the single wiki page, and C becomes a lot less challenging to stabilize. Could also look at why C and assembly are still used where metastability considerations matter. If you spend your days in user-space applications, then don't worry about it...
A bit of history where these rules came from, and why they matter. =3
On an iPad I can't read the web page at all. The insert at the upper right overlies and obscures the main body of text.
It'd also be a good starting point to be more concrete in your ambitions. What version of C is your preferred starting point, the basis for your "Better C"?
I'd also suggest the name "Dependable C" confuses readers about your objective. You don't seek reliability but a return to C's simpler roots. All the more reason to choose a recognized historical version of C as your baseline and call it something like "Essential C".
There are several ways to do that. The best method is to use a real computer.
Of course, the website operator can do something about this as well. On my website I redirect all traffic from mobile devices to the kids section of YouTube.
I was talking about the posts on HN and the motivation why they are posted right now, not the author, with whom I'm familiar.
Also there's literally a discussion on C for vibecoding on the front page atm https://news.ycombinator.com/item?id=46207505.
Honestly, as a hobbyist programmer I'm more interested in knowing exactly which platforms/compilers don't support the non-dependable patterns and why I should care about them. Even better if the author can host a list of "supported platforms" that's guaranteed to work if people's projects invest in the style.
https://gcc.gnu.org/wiki/boringcc
---
And Proposal for a Friendly Dialect of C (2014)
https://blog.regehr.org/archives/1180