
I disagree with the include files section.

As long as each file has a proper include guard, it won't matter how many times you've included a file. I'd much rather have the preprocessor figure out when and where to include a file than do it myself. Maybe this was a significant performance issue in 1989, but it shouldn't be anymore.

That said, you should only include files that are direct dependencies. There's no need to include stdio.h in every other file.



That's a naive viewpoint. It's more of a concern in C++ ... if your codebase grows to massive proportions, and you #include headers within headers, then modifying a header can result in ~60% of your codebase being rebuilt. (At work, I deal with this pretty much every day.)

Including header files within headers is simply unnecessary, especially in C. You need to forward declare in header files, then include in source files where you actually use that header file.


C is not C++.

In C you shouldn't be chaining header files because you don't make function calls in header files in C code (well most good C code at least, people abuse macros sometimes).

In C you should only include header files in header files if you have types in the other header files. It's okay to forward declare if you must, but it's often quite a bit clearer not to.

Additionally, you should always include all the h files that have your undefined functions (only relying on include chaining for header files at most). So if you have:

DogModule.c, which calls Sounds.c functions and exports a public function called "PlayBark(struct soundNode);", then DogModule.h and the things that use DogModule should both include Sounds.h.
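A rough single-file sketch of that layout, with each file's contents marked by a comment. The fields of struct soundNode, the PlaySound function, the int return types, and the guard names are all assumptions made for illustration; only the names DogModule, Sounds, PlayBark, and soundNode come from the comment above.

```c
#include <assert.h>

/* ---- Sounds.h (contents are hypothetical) ---- */
#ifndef SOUNDS_H
#define SOUNDS_H
struct soundNode {
    int frequency_hz;
    int duration_ms;
};
int PlaySound(struct soundNode node);   /* assumed helper, returns ms played */
#endif /* SOUNDS_H */

/* ---- DogModule.h ---- */
#ifndef DOGMODULE_H
#define DOGMODULE_H
/* A real DogModule.h would #include "Sounds.h" here: PlayBark takes a
   struct soundNode by value, so the complete type must be visible. */
int PlayBark(struct soundNode bark);
#endif /* DOGMODULE_H */

/* ---- Sounds.c ---- */
int PlaySound(struct soundNode node)
{
    assert(node.duration_ms >= 0);      /* a negative duration is a caller bug */
    return node.duration_ms;
}

/* ---- DogModule.c: includes DogModule.h and Sounds.h directly ---- */
int PlayBark(struct soundNode bark)
{
    return PlaySound(bark);
}
```

Because PlayBark takes the struct by value, a forward declaration would not be enough here, which is the "types in the other header files" case.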

Good C habits and structure are very different than good C++ habits and structure. They're not the same language, not even close. Mangling your concepts of Good C and Good C++ is a great way to write both poorly.


Insightful, thank you.


Suppose you've written some key data structure in its own file. Let's say it's a binary tree. The binary tree's functions are all implemented in bintree.c, and its functions, defines, and struct are all declared in bintree.h.

Now, you want to write some application-specific functionality that works on your binary tree. You need to pass a pointer to the binary tree to your new functions. Of course, you implement these new functions in app.c, and the functions are declared in app.h so that other parts of your application can call them.

You're going to need to include bintree.h in app.h to ensure that the compiler knows about the binary tree's structure. Otherwise, you'll have to make sure to always include bintree.h before you include app.h. Is that really something you want to be bothered with?


In app.h, you should forward-declare "typedef struct bintree_t bintree;"

Then, in each source file that #includes "app.h" and also CALLS binary-tree-related functions, you #include "bintree.h".
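A minimal single-file sketch of this arrangement, with each file's contents marked by a comment. The node layout, the binary-search-tree ordering, and the function app_tree_min are assumptions for illustration; only bintree.h, app.h, app.c, and the typedef come from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* ---- app.h ---- */
/* The incomplete type is enough to declare functions that take pointers. */
typedef struct bintree_t bintree;
int app_tree_min(const bintree *tree);   /* hypothetical app function */

/* ---- bintree.h (layout is hypothetical) ---- */
/* A real bintree.h would likely carry the same typedef; repeating an
   identical typedef is only valid from C11 on, so it appears once here. */
struct bintree_t {
    int value;
    struct bintree_t *left;
    struct bintree_t *right;
};

/* ---- app.c: includes app.h AND bintree.h, since it dereferences nodes ---- */
int app_tree_min(const bintree *tree)
{
    assert(tree != NULL);                /* an empty tree has no minimum */
    while (tree->left != NULL)           /* assumes smaller values go left */
        tree = tree->left;
    return tree->value;
}
```

A source file that only passes a bintree pointer through to app_tree_min could include app.h alone and would not need rebuilding when the node layout in bintree.h changes.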

Like I said, this is more of a concern in C++ where compile times are way, way slower than C. And in C++, it's very much worth it to forward declare in header files. It makes a big difference once your source code base has grown to epic proportions.


It's a technique, but not one so universally applicable that you can say "you should" do it.


I totally agree.

Not only will headers including headers cause files to rebuild unnecessarily, but it can cause build times to increase simply because the compiler has to compile all those extra headers each time they're included in a source file. (Precompiled headers can help, but they aren't always an option.)


Yes, and that's why using the Pimpl idiom to decouple things can improve build times. If your whole codebase rebuilds every time a header is touched, then your headers are probably not correctly organized. Having minimal headers (using forward decls as much as possible) is vital to keeping rebuild times down.
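Pimpl itself is a C++ idiom; the closest C analogue is probably an opaque pointer, sketched here in a single file under that assumption. The counter type and its functions are hypothetical names, not anything from the thread.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- counter.h (hypothetical public header) ---- */
typedef struct counter counter;          /* layout deliberately hidden */
counter *counter_create(int start);
int      counter_next(counter *c);
void     counter_destroy(counter *c);

/* ---- counter.c (private implementation) ---- */
struct counter {
    int value;   /* fields can change without touching counter.h */
};

counter *counter_create(int start)
{
    counter *c = malloc(sizeof *c);
    if (c != NULL)
        c->value = start;
    return c;                            /* NULL if allocation failed */
}

int counter_next(counter *c)
{
    assert(c != NULL);
    return c->value++;                   /* returns the value before the bump */
}

void counter_destroy(counter *c)
{
    free(c);                             /* free(NULL) is a no-op */
}
```

Callers only ever see the incomplete type, so adding or reordering fields in struct counter touches counter.c and nothing that includes counter.h.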


I think that advice is just outdated. It's a different situation now that it's a standard practice and there are optimizations supporting it (http://gcc.gnu.org/onlinedocs/gcc-2.95.3/cpp%5F1.html#SEC8).


I think Pike's point about "thousands of needless lines of code" is that people do this:

#include "foo.h"

instead of this:

#ifndef __FOO_H__

#include "foo.h"

#endif

where "foo.h" looks like this:

#ifndef __FOO_H__

#define __FOO_H__

/* contents */

#endif /* __FOO_H__ */

which in the former case results in having to lex foo.h in order to skip the contents.

However, as pointed out below, modern compilers will effectively do the same thing for you automatically.

[Edited to make the code readable]


Just a quick pedantic point - __FOO_H__ is a reserved identifier (C90 reserved). You shouldn't be writing it in your code.
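A small sketch of the same guard with a name outside the reserved namespace (C reserves identifiers that begin with two underscores, or with an underscore followed by an uppercase letter). The names FOO_H, struct foo, and foo_id are assumptions for illustration, and the header body is pasted twice to mimic a double #include within one file.

```c
#include <assert.h>
#include <stddef.h>

/* ---- first inclusion of the hypothetical foo.h ---- */
#ifndef FOO_H
#define FOO_H
struct foo {
    int id;
};
static int foo_id(const struct foo *f)
{
    assert(f != NULL);
    return f->id;
}
#endif /* FOO_H */

/* ---- second inclusion: FOO_H is already defined, so the body is skipped.
   Without the guard, struct foo would be redefined and compilation
   would fail. ---- */
#ifndef FOO_H
#define FOO_H
struct foo {
    int id;
};
static int foo_id(const struct foo *f)
{
    assert(f != NULL);
    return f->id;
}
#endif /* FOO_H */
```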


If you keep your header files independent, you have no reason to include a single header multiple times, and you can't do it accidentally through another header.

If your header files don't transitively include other headers, then

    #include "a.h"
    #include "b.h"
will never cause an error. Therefore, in well written code, the include guards will be no-ops, and for simplicity's sake, they may as well be left off.


Sure, this isn't a problem in trivial projects. If you want to use types other than what the language gives you, you're going to have to introduce a dependency.


Even with internal include guards, just the time to read through (and skip) the whole files for later inclusions can come to dominate cpp processing time (i.e. it's mostly waiting on I/O, reading the same headers many times over). That's why Lakos advises external include guards for large systems.


Unless you have gigantic projects, your project source code is likely to be mostly in memory through kernel IO buffers nowadays, so even this is not true anymore IMO.

EDIT: just for kicks, I compiled numpy (100 kLOC) twice, the first time just after flushing IO buffers (echo 1 > /proc/sys/vm/drop_caches in recent Linux kernels):

- cold case: 24 seconds
- hot case: 16 seconds

Of course, it is hard to say what contributes to the slowness (files themselves, loading in memory all the programs needed for compilation, etc...).


If you include `ccache` (or similar), the recompilation time will drop to ~0. One-off compilation time doesn't matter that much and developers have many other tools they can use (pch?).


Partial rebuilds are obviously quite important, but in that case IO is even less of an issue, because it is the hot case, assuming you have enough memory. I forgot about one case where you may not have enough memory: a large C++ program with multiple compilations running in parallel. This can easily take GBs of memory for template-heavy code.


One easy way to slow down compiles on Windows is to include windows.h everywhere. This pulls in about 2 MB of declarations that the cpp front-end has to parse just to reach the internal guards... It all depends on how large your header files are (and, recursively, how large all the headers they pull in are).


C is not C++

C compilers are leagues, elephants and donuts faster than C++ compilers.

In C, internal guards are preferred and cause very little to no compilation time hit. In C++, external are often preferred, especially in large files.

Do not take C++ advice to apply to C. The best practices of both languages are very different.



