
> It's miserable to parse C++ and that's fine, because only a few people have to write a parser while 5 orders of magnitude more have to read and write it.

Really? I was under the impression that the fact that it is miserable to parse C++ directly means that it's also miserable to compile C++ - it can't be done quickly - which is something that everyone has to do all the time.



FYI: Parsing and compiling in the programming language sense are orthogonal problems. Both are major challenges in C++ compilers.


What I've read is that C++'s biggest compiling problem is specifically that the language is difficult to parse. You can't compile without parsing, so no, they're not orthogonal problems. Compiling is a parsing step followed by an emission step.

(And just to be completely clear, I'm not saying that the difficulty of parsing C++ makes it miserable to write a compiler. I'm saying that the difficulty of parsing C++ makes it miserable to run a compiler.)


> FYI: Parsing and compiling in the programming language sense are orthogonal problems.

How so? In Ada, Fortran, C, C++, Java, Python, etc. parsing is one of the many phases of compiling. Far from being orthogonal problems, parsing is a sub-problem of compiling.


Parsing is almost always a part of a programming language compiler but it doesn't have to be.

When we parse a string, we go up from low-level human-readable strings to intermediate parse trees to a high-level AST.

Compilation goes down from some kind of high-level internal representation, through a series of phases, to something close to the low-level target language.

It can go down from an AST, or LLVM's internal representation, or whatever.

Problems solved by both processes are different:

1. Parsing is about finding string patterns (using regexps or PEGs or whatever) and representing them in a stricter, higher-level structure. The structure is usually tree-like but is also often a graph.

2. Compiling takes something high-level (a tree, a graph, etc) and emits something different, usually simplified.

A good example is a typical lisp implementation, where the parser is trivial but compilation consists of numerous simplifying phases.
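To make the split concrete, here is a minimal sketch of a toy lisp-like language, assuming only integers and the `+` and `*` operators. Everything in it (`tokenize`, `parse`, `compile_expr`, `run`, the `PUSH`/`ADD`/`MUL` instruction names) is made up for illustration and not taken from any real implementation. The parser goes up from a string to a nested list; the compiler goes down from that list to flat stack-machine instructions.

```python
def tokenize(src):
    # Pad parentheses with spaces so a plain split() yields tokens.
    return src.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Parsing: flat token list -> nested lists (the tree)."""
    tok = tokens.pop(0)
    if tok == "(":
        node = []
        while tokens[0] != ")":
            node.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return node
    # Atoms are either integers or operator symbols.
    return int(tok) if tok.lstrip("-").isdigit() else tok


# Hypothetical instruction names for a toy stack machine.
OPS = {"+": "ADD", "*": "MUL"}


def compile_expr(tree, out=None):
    """Compiling: tree -> flat list of stack-machine instructions."""
    out = [] if out is None else out
    if isinstance(tree, int):
        out.append(("PUSH", tree))
    else:
        op, first, *rest = tree
        compile_expr(first, out)
        # Fold n-ary calls into a chain of binary instructions.
        for arg in rest:
            compile_expr(arg, out)
            out.append((OPS[op],))
    return out


def run(code):
    """Tiny interpreter for the emitted instructions, to check the result."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if instr[0] == "ADD" else a * b)
    return stack.pop()


tree = parse(tokenize("(+ 1 (* 2 3))"))
code = compile_expr(tree)
print(tree)
print(code)
print(run(code))
```

Note that `compile_expr` never looks at a string: you could hand it a tree built any other way and it would work the same. In a real compiler the "down" half is many phases rather than one function, which is the point being made above.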


The amount of time consumed by parsing is vanishingly small. It's a lot like how the time spent decoding x86 instructions is marginal nowadays compared to the speculation and reordering logic.

YACC was called "Yet Another Compiler-Compiler" because back in the day parsing was the bulk of compilation; now it's relatively minimal.



