Hashing up #include in C++

NB First published in .EXE Magazine, December 1994

Introduction

How many times have you wondered if perhaps other ‘more professional’ programmers do things more effectively and easily? For me it’s a permanent state of mind; I just know that someone, somewhere is doing things, that take me half an hour, in just a couple of minutes. It’s a sort of paranoia I suppose - the fear that nearly everyone’s better than I am, and none of them will let me in on the secrets of how to do things easily. So, I’m always thinking: ‘I bet those cool dudes at (some prestigious software house) do it a better way!’

I don’t know if it’s healthy or not, but it doesn’t do one’s self esteem any good. The only solution is now and then to step back and say ‘OK, OK, I give in! Yes, I’m fed up of doing it the tedious and hard way. Let’s see if I can figure out how the smart-arses do it. Or, at least I’ll see if I can begin to get close to doing it along the same lines.’

And then there is always the nagging doubt that maybe my solution is actually ridiculously overcomplicated and that they’re all using a far simpler, more elegant solution. The paranoia continues...

#include

Now, the #include directive is pretty innocent on the face of it. It just saves retyping or pasting the same block of source code over and over again. This is a jolly indispensable facility. Just think about the implications of doing without it, having to copy and paste every class declaration in every source file you use the class in. The real advantage of #include is that any changes can be made in just one place, rather than requiring modifications to every declaration.

If you’ve been programming in C/C++ for a while, you begin to forget the essential simplicity of #include and consider it as a higher level kind of construct. You tend to come to think of #include “MyClass.hpp” as meaning ‘Hey, compiler! Just thought I’d let you know that I’m going to be using this class of mine later on. I know how you like to be warned of such things well in advance.’

In fact, I tend to think of #include in this way so much that if you asked me for a neat way of effectively replicating a large block of code without pasting or using line continuations, I’d probably be stumped. Of course, I know none of you would be.

Anyway, that’s the way I use #include, and that’s the way I want it to work. Pretty arrogant, but then I think that’s the attitude you have to have toward compilers. After all, who’s writing the program for heaven’s sake! Unfortunately, contemptuous use of #include soon obtains unpleasant rewards such as ‘Redefinition of symbol’ errors. Even more unpleasant is ‘#includes nested too deep’. Most of you will have independently solved at least one of these problems (I touch upon them again later).

In C, life was pretty straightforward. You could consider #include to mean ‘Hey, compiler! I’m going to be using some of the functions in this module’ without too much difficulty. Again, the only problems to solve were ensuring that the declarations were only seen once, and trapping circular inclusion. You could then #include a header any time, any place, just on the off-chance you might be using some of the functions (or other declarations). Of course, you had to do a bit of cleaning up now and then. Unnecessary #includes not only impaired compile time, but increased the likelihood of circular inclusion.

It’s this sort of ‘use and forget’ attitude that I wanted to continue when I came to C++.

Just like proponents of multiple inheritance, I’m going to give you a contrived example that demonstrates the need for a solution. Consider the two classes listed in figure 1.

// Root.hpp
#include "Node.hpp"
class Root: public Node
{
public:
    Root():Node() { }
    ...
};

// Node.hpp
class Root;
class Node
{
private:
    Root* m_pRoot;
    Node* m_pNext;
protected:
    Node():m_pRoot(NULL),m_pNext(NULL) { }
public:
    Node(Node* pN):m_pRoot(pN->m_pRoot),m_pNext(pN) { }
    Root* GetRoot() const { return m_pRoot; }
    ...
};
#include "Root.hpp"

Let’s assume that the implementation is sound (don’t ask me what it’s meant to do - it’s contrived remember). Anyway, at least the classes are semantically valid.

I want to be able to use the Node class with as little hassle as possible. I just want to say #include “Node.hpp”. Now I should be able to use the pointer returned by GetRoot() as though it points to a Node, but this won’t be possible unless “Node.hpp” automatically does an #include “Root.hpp”, because the compiler must see the definition of Root before it knows that Root is derived from Node. So the declaration ‘class Root’ is not enough on its own. This leads to a general rule that it is the header’s responsibility to ensure that any other necessary header is included.
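A minimal single-file sketch of that point follows. It is a simplification of figure 1 rather than the original classes: the constructors, RootAsNode and CheckRootConversion are illustrative assumptions, and the position of the Root definition stands in for what Node.hpp achieves by including Root.hpp at its end.

```cpp
#include <cassert>

class Root;   // forward declaration: enough to declare Root* members and returns

class Node {
public:
    explicit Node(Root* pRoot = nullptr) : m_pRoot(pRoot) {}
    Root* GetRoot() const { return m_pRoot; }
private:
    Root* m_pRoot;
};

// At this point Root is still an incomplete type, so this would not compile:
//     Node* p = someNode.GetRoot();   // no known conversion from Root* to Node*

class Root : public Node {   // stands in for Node.hpp including Root.hpp
public:
    Root() : Node(this) {}   // a root is its own root (assumption for the demo)
};

// Root is now complete, so Root* converts implicitly to Node*.
inline Node* RootAsNode(const Node& node) { return node.GetRoot(); }

// Hypothetical helper exercising the conversion.
inline void CheckRootConversion() {
    Root root;
    Node child(&root);
    assert(RootAsNode(child) == &root);
}
```

Anyone who includes only the forward declaration can still pass Root pointers around, but the moment they treat one as a Node they need the full definition, which is why the header should fetch it for them.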

Now, you can’t simply #include everything, because that would be lazy and wasteful. If it was a class in a DLL say, you’d certainly only want to expose declarations necessary for the applications programmer. If the class contained a private pointer to a class of some gargantuan library, you wouldn’t want to burden them with all that baggage (the library may even have a license that prohibits this). Rather than have to bear these and other considerations in mind all the time, I thought there should be some systematic way of structuring a header file to accord with the needs of the files including it.

Consider that I’ve used rather simple interdependent classes as an example. More complex classes can have even more involved interdependencies. So what do we do? When things start getting complicated there are a few approaches to resolving inclusion:

1. Have a multi-levelled system of header files: the bare class, one that has additional includes for the applications programmer’s benefit, one that is oriented toward the ‘.cpp’ file, one that is oriented toward other implementation header files, etc.

2. Header files #include no others, but just do declarations. A super-header is created that includes all headers in the correct order. Source files always #include this super-header, they can’t pick and choose. The compiler’s precompilation option is used to compensate.

3. Group interdependent classes into a ‘module’ with one header containing all classes, and one source file.

4. Do it any old way, but whenever there are problems, bodge a solution.

5. Find out the simple and elegant solution that the experts have been keeping from some of us all these years.

6. Resort to the Fitch method.

If you choose any option except the last, you have my best wishes. If you choose option 5 and succeed, please let us know. (Roll on the hypertext version of EXE eh?!) If you choose option 6, OK, let’s rock...

Assumptions

Source filenames are suffixed .h & .c for C compatible header and code respectively, while C++ source filenames are suffixed .hpp & .cpp (or .hxx & .cxx).

Class headers are in the .hpp file. Member definitions are in the .cpp file.

Some possible reasons for including a class header file

From source header files:

1. To define pointers or references to objects of the class.

2. To define another class in which the class is a member object.

3. To define another class derived from the class.

From source code files:

4. To define the members of the class.

5. To define a friend class/function of the class.

6. To instantiate objects of the class.

7. To handle objects of the class through a pointer or reference.

The responsibility of a header file

When I include a header, I expect it to do everything I need (given the reason I included it).

If I’m defining the class members, I want the header to declare and define the class, its bases, and all the classes used by the members and those of the bases. Naturally, I don’t want everything included. So I wouldn’t want definitions of classes used by bases’ private member functions.

If I’m defining a derived class, although I want the definitions of the bases I don’t want definitions of all the classes that the bases’ member functions use, for those I just need declarations.

If I’m defining a member object I only want declarations of classes used by the class’s public member functions.

When I’m using a class I just want to include one header, I don’t want to have to remember that the class uses other classes and that I should also include other headers. I should only have to include headers for what I’m explicitly using. Similarly, in the source code file, I am relying on the class header to do the necessary includes for all the classes used in the class definition. The source code file should only have additional includes for those classes that are not used by the class definition.

Moreover, the header should also prevent multiple inclusion and optionally produce a warning upon circular inclusion.

Do you think I’m being unreasonable?

If you do, then you will enjoy programming tedium too much to use the following technique.

Defining an improved include process

Pretend that the #include directive is a C function call. As standard it does this:

void Include(char* p_sFilename);

I’d like something more like:

enum EAccess {e_Nil,e_Public,e_Protected,e_Private,e_CPP};

void Include(char* p_sFilename,EAccess p_eAccess=e_Nil,BOOL p_bDefinition=TRUE);

The definition of Include is as follows:

void Include(char* p_sFilename, EAccess p_eAccess, BOOL p_bDefinition)
{
    if (p_bDefinition) // If definition of class required
    {
        Include(<CLib.h>); // Include Standard C Libraries first
        Include("Libraries.h"); // Include other C++ libraries
        Include("Templates.hpp"); // Include templates
        Include("Bases.hpp"); // Include base classes (not in a library)
        Include("Members.hpp"); // Include definitions of member classes and arguments to templates
        class References; // Do declarations here of pointers and references to classes
        class ThisClass: public Base // The class definition
        {
        public:
            ThisClass();
        };
    }
    switch (p_eAccess)
    {
    case e_CPP:
    case e_Private:
        Include("Befrienders.hpp",e_Private); /* Full access definitions of classes declaring us friends */
        Include("Privates.hpp",e_Public); /* Public definitions of classes privately referenced */
        // (break deliberately absent)
    case e_Protected:
        Include("Protecteds.hpp",e_Public); /* Public definitions of classes protectedly referenced */
        // (break deliberately absent)
    case e_Public:
        Include("Bases.hpp",p_eAccess==e_Public?e_Public:e_Protected,FALSE);
        /* Public (or protected too if base) definitions of classes used by bases (but we don't need the definitions of the bases because we've had those earlier) */
        Include("Publics.hpp", e_Public); /* Include definitions of classes publicly referenced */
        // (break deliberately absent)
    }
}

The calls to Include() are either in the form above or from a CPP file in which case the usage is Include(“Header.hpp”,e_CPP);. NB the use of e_CPP instead of e_Private is only to accord with the pre-processor version (later).

What could be easier?

By default the header will define the class - that is, after all, the primary purpose of a header. The novel part is that if requested, the header will also include headers necessary according to particular requirements. These requirements are met in the switch statement, which includes other headers on an effective access basis. That is, the highest access gets everything and the lowest access (e_Nil) gets nothing.

If we’re being included by a source code file, a class declared friend, a derived class, or being requested for all publicly referred classes, we include all publicly referred classes not previously included and request that those classes also include their publicly referenced classes, and so on ad nauseam. Not forgetting the definitions of classes publicly referenced by the base classes and, if this class is being included as a base, the definitions of the classes used by their protected members. Note though, that we’ve included the definitions of the base classes themselves earlier in the header.

If we’re being included as a base, our includer will also need definitions of the classes referenced by our protected members, and definitions of the classes publicly referenced by them, etc.

If we’re being included by a source code file or a class we’ve declared a friend then we need complete definitions of the classes declaring us a friend. We also need definitions of classes used by our private members and the classes used by their public members.
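The cascade described above can be tried out with a small runnable model. This is only a sketch under stated assumptions: HeaderInfo, IncludeSimulator and IncludeAll are hypothetical names, each header is reduced to lists of class names, library and template includes are left out, and "including" a class just records its name in the order its definition would be seen.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

enum EAccess { e_Nil, e_Public, e_Protected, e_Private, e_CPP };

typedef std::vector<std::string> Names;

// Hypothetical description of what one class header would pull in.
struct HeaderInfo {
    Names bases, members, befrienders, privates, protecteds, publics;
};

class IncludeSimulator {
public:
    std::map<std::string, HeaderInfo> headers;  // unknown names act as leaf headers
    Names order;                                // class definitions, in the order seen

    void Include(const std::string& name, EAccess access = e_Nil, bool definition = true) {
        assert(!name.empty());
        if (!m_locked.insert(name).second) return;  // circular inclusion: back out
        const HeaderInfo h = headers[name];         // copy, since recursion may grow the map
        if (definition && m_defined.insert(name).second) {
            IncludeAll(h.bases, e_Nil, true);       // bases first
            IncludeAll(h.members, e_Nil, true);     // then member objects
            order.push_back(name);                  // "the class definition"
        }
        switch (access) {
        case e_CPP:
        case e_Private:
            IncludeAll(h.befrienders, e_Private, true);
            IncludeAll(h.privates, e_Public, true);
            // fall through
        case e_Protected:
            IncludeAll(h.protecteds, e_Public, true);
            // fall through
        case e_Public:
            IncludeAll(h.bases, access == e_Public ? e_Public : e_Protected, false);
            IncludeAll(h.publics, e_Public, true);
            break;
        case e_Nil:
            break;
        }
        m_locked.erase(name);
    }

private:
    void IncludeAll(const Names& names, EAccess access, bool definition) {
        for (Names::size_type i = 0; i < names.size(); ++i)
            Include(names[i], access, definition);
    }
    std::set<std::string> m_locked;   // headers currently being processed
    std::set<std::string> m_defined;  // classes whose definitions have been seen
};
```

As a usage example, suppose a made-up Widget has base Base, publicly references Colour and privately references Impl, while Base protectedly references Helper. A plain Include("Widget") records Base then Widget and nothing more. Following it with Include("Widget", e_CPP) on the same simulator adds Impl, Helper and Colour in that order, which is the 'highest access gets everything' behaviour in miniature.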

Simulating parameters using the pre-processor

Unfortunately, parameters are not one of the pre-processor’s features, at least not stackable ones. Of course, this article wouldn’t exist if there weren’t some way around this problem. There is a way of passing Boolean parameters, and it isn’t too difficult to turn Boolean parameters into enumerated parameters. There’s no way of copying or stacking pre-processor symbol values (at least, none that I know of - and you may laugh if I’ve missed the obvious) so the only state we can copy is the defined state. Note that even Microsoft appear to appreciate this shortcoming in the pre-processor and you may come across their “#pragma pack( [ [ { push | pop}, ] [ identifier, ] ] [ n] )” - oh, how nice it would be to be able to push/pop any symbol value.

There is some work to be done by the programmer, though I’ve tried to make it as little as possible. All that’s needed is one global search and replace per header file. This is because each header file has to have unique symbols in which to save the states of the parameter symbols.
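The save-and-restore idea can be sketched in a single file. Everything here is an assumption made for illustration: Inner.hpp is a hypothetical header whose body is written out in place, INNER_PUBLIC is its unique save symbol, and the k... constants exist only so the pre-processor's decisions can be inspected at run time.

```cpp
#include <cassert>

// The includer "passes" a Boolean parameter by defining a symbol.
#define PUBLIC

// ---- start of what would be the body of a hypothetical Inner.hpp ----
#if defined(PUBLIC)
#define INNER_PUBLIC   // save the entry state under a name unique to this header
#undef PUBLIC          // release, so nested includes start with a clean slate
#endif

#if defined(INNER_PUBLIC)
const bool kInnerSawPublic = true;       // act on the saved parameter
#else
const bool kInnerSawPublic = false;
#endif

#if defined(PUBLIC)
const bool kPublicLeakedInside = true;   // would be visible to nested includes
#else
const bool kPublicLeakedInside = false;
#endif

#if defined(INNER_PUBLIC)                // restore the caller's state on the way out
#undef INNER_PUBLIC
#define PUBLIC
#endif
// ---- end of the Inner.hpp body ----

#if defined(PUBLIC)
const bool kPublicRestored = true;
#else
const bool kPublicRestored = false;
#endif
#undef PUBLIC   // tidy up after the demonstration

// Hypothetical helper confirming the three states.
inline void CheckParameterPassing() {
    assert(kInnerSawPublic);
    assert(!kPublicLeakedInside);
    assert(kPublicRestored);
}
```

Only the defined or undefined state survives the trip, which is why each header needs its own family of save symbols and hence the global search and replace.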

A few words to the wise

Whenever you use #include, please, please make life easy for yourself. Don’t do the following:

#if !defined(MYHEADER)
#define MYHEADER
#include "MyHeader.hpp"
#endif

Such stuff belongs in the header - not surrounding the include directive. I want to see a clean include. I feel guilty enough as it is, having to define a symbol as a header parameter (see later).

I have also come across the following:

#include MYHEADER

Now, what I want to know is, what’s wrong with the filename? Now, if use of a symbol enabled a choice of paths to be prepended to the filename, then maybe I might be persuaded of its utility. Well, I suppose perhaps the stringizing operator could be used, but really, such things are best sorted out either by using relative paths or within the compiler’s include path.

If it’s a versioning problem then make it explicit:

#if defined(RELEASE)
#include "../v2/MyHeader.hpp"
#else
#include "../Test/MyHeader.hpp"
#endif

Even then, such stuff should be put inside the header - don’t put unnecessary burdens on the programmer. Encapsulation is the name of today’s game!

Now, no doubt nearly everyone has adopted the technique of preventing redundant repeated inclusion by only performing the header definition if a symbol is undefined, and defining that symbol immediately on doing so.
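For anyone who has not met it, the guard just described can be sketched in one file by writing the body of a hypothetical MyHeader.hpp out twice, as though it had been included twice. MyClass and GuardDemo are illustrative names only.

```cpp
#include <cassert>

// First "inclusion" of the hypothetical MyHeader.hpp
#if !defined(MYHEADER_H)
#define MYHEADER_H
struct MyClass { int value; };
#endif

// Second "inclusion": skipped entirely, because MYHEADER_H is now defined.
// Without the guard this would be a redefinition of MyClass.
#if !defined(MYHEADER_H)
#define MYHEADER_H
struct MyClass { int value; };
#endif

// Hypothetical helper; it compiles only because MyClass was defined exactly once.
inline int GuardDemo() {
    MyClass c = { 42 };
    assert(c.value == 42);
    return c.value;
}
```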

Another slightly less well known technique is trapping circular inclusion.

#if !defined(MYHEADER_LOCK)
#define MYHEADER_LOCK
  #if !defined(MYHEADER_H)
  #define MYHEADER_H
  ... // Body of header
  #include "Other.hpp" // Some includes
  ...
  #endif
#undef MYHEADER_LOCK
#else
#pragma message("Circular inclusion of module MYHEADER attempted")
#endif

The definition of MYHEADER_H prevents repeated inclusion. MYHEADER_LOCK will generate a warning if any sub-include attempts to include the includer. It is just a warning, because in complex libraries circular inclusion is almost unavoidable, but it is sometimes useful to know when classes become intimately intertwined. The warning means one of two things. Either you really do have a circular dependency, in which case the compiler will also complain in its own right, or your include file is simply being a little bit overzealous in including every other header that could possibly be needed, and there is actually no problem. I suggest you make the message conditional upon the definition of an appropriate symbol.

Why have I just told you these minor details? Because I have also used them in the enhanced header.

Implementing the enhanced header

Note that to emulate parameter passing the header becomes a little bit complicated, and I have to be extremely sober to understand it myself. You should find that it operates in exactly the same way as the pseudo include function.

The best thing to do is to use it as the basis for a header template. That way every time the programmer needs to write a new header for a class, they open this template header up and replace all instances of FILENAME with MYCLASS and all instances of Class with MyClass. Note that if you haven’t moved to NTFS yet, you should keep to the eight character limit for pre-processor symbols - that way they can be used for filenames too.

// Header template (FILENAME.HPP)
// Replace FILENAME with the filename minus suffix in upper case
// Replace Class with class name
#if !defined(FILENAME_LOCK) // This file cannot be included recursively
#define FILENAME_LOCK

#if defined(DEFINITION)
  #define FILENAME_DEFINITION // Save DEFINITION entry state
#endif

#if defined(PRIVATE) // Ensure only one defined
  #if defined(PROTECTED) // Assert
    #error PROTECTED defined in addition to PRIVATE
  #elif defined(PUBLIC) // Assert
    #error PUBLIC defined in addition to PRIVATE
  #endif
  #define FILENAME_PRIVATE // Save PRIVATE entry state
  #undef PRIVATE // Release
#elif defined(PROTECTED) // Ensure only one defined
  #if defined(PUBLIC) // Assert
    #error PUBLIC defined in addition to PROTECTED
  #endif
  #define FILENAME_PROTECTED // Save PROTECTED entry state
  #undef PROTECTED // Release
#elif defined(PUBLIC)
  #define FILENAME_PUBLIC // Save PUBLIC entry state
  #undef PUBLIC // Release
#elif !defined(DEFINITION) // Default must be from a .cpp file
  #define DEFINITION
  #if defined(CPP) // Called from THE .cpp file
    #undef CPP // A one shot parameter
    #define FILENAME_CPP // Save CPP entry state
    #define FILENAME_PRIVATE // Save effective PRIVATE entry state
  #endif
#endif

#if defined(DEFINITION) // Definition of class required?
  #if !defined(FILENAME_DEFINED) // Defining this class
    #define FILENAME_DEFINED // PREVENT RE-ENTRY PERMANENTLY

// HEADER PREPARATION
// Include CLibs here - Use #include <CLib.h>
// Include Libraries here - Use #include "Library.h"
// Include remaining base classes here - Include definitions of base classes, e.g. #include "Baseclass.h"
// Include remaining template classes, parameter classes to them and declarations here,
// e.g. #include "Tmpl.hpp", #include "MyClass.hpp", typedef class Templ<MyClass> MyTC;
// Include remaining member objects here - Include definitions of classes used as member objects, e.g. #include "MemberObj.hpp"
// Declare remaining classes referenced here - and any typedefs etc., e.g. class CReference;

// --------------------------------------------------------------------------
// HEADER PROPER:- Define class between these dashed lines
class Class: public Baseclass
{
private:
protected:
public:
    Class();
    ~Class();
};
// --------------------------------------------------------------------------

  #endif
  #undef DEFINITION // Release
#endif

#if defined(FILENAME_PRIVATE) // classes this class uses privately (i.e. for cpp or friend file's benefit)
  #define DEFINITION // This will include definitions of classes who have declared us friends
  #define PRIVATE // This will include private, prot & pub class references of classes who have declared us friends
// --------------------------------------------------------------------------
// Include classes declaring us friends, between these dashed lines (even if included earlier)
// --------------------------------------------------------------------------
  #undef DEFINITION
  #undef PRIVATE
#endif

#if defined(FILENAME_PRIVATE) || defined(FILENAME_PROTECTED) || defined(FILENAME_PUBLIC)
  #if defined(FILENAME_PUBLIC)
    #define PUBLIC
  #else
    #define PROTECTED
  #endif
// --------------------------------------------------------------------------
// Include non-library base classes, between these dashed lines (even if included earlier)
// --------------------------------------------------------------------------
  #if defined(FILENAME_PUBLIC)
    #undef PUBLIC
  #else
    #undef PROTECTED
  #endif
#endif

#if defined(FILENAME_PRIVATE)
  #define DEFINITION
  #define PUBLIC
// --------------------------------------------------------------------------
// Include classes privately referenced, between these dashed lines (unless included in header preparation)
// --------------------------------------------------------------------------
  #undef PUBLIC
  #undef DEFINITION
#endif

#if defined(FILENAME_PRIVATE) || defined(FILENAME_PROTECTED)
  #define DEFINITION
  #define PUBLIC
// --------------------------------------------------------------------------
// Include classes protectedly referenced, between these dashed lines (unless included in header preparation)
// --------------------------------------------------------------------------
  #undef PUBLIC
  #undef DEFINITION
#endif

#if defined(FILENAME_PRIVATE) || defined(FILENAME_PROTECTED) || defined(FILENAME_PUBLIC)
  #define DEFINITION
  #define PUBLIC
// --------------------------------------------------------------------------
// Include classes publicly referenced, between these dashed lines (unless included in header preparation)
// --------------------------------------------------------------------------
  #undef PUBLIC
  #undef DEFINITION
#endif

#if defined(FILENAME_PRIVATE) // Restore PRIVATE parameter state
  #undef FILENAME_PRIVATE
  #define PRIVATE
#elif defined(FILENAME_PROTECTED) // Restore PROTECTED parameter state
  #undef FILENAME_PROTECTED
  #define PROTECTED
#elif defined(FILENAME_PUBLIC) // Restore PUBLIC parameter state
  #undef FILENAME_PUBLIC
  #define PUBLIC
#endif

#if defined(DEFINITION) // Assert
  #error DEFINITION should be undefined at this point
#endif

#if defined(FILENAME_CPP)
  #undef FILENAME_CPP
  #undef PRIVATE // Restore PRIVATE state
#endif

#if defined(FILENAME_DEFINITION) // Restore DEFINITION state
  #define DEFINITION
  #undef FILENAME_DEFINITION
#endif

#undef FILENAME_LOCK
#else
#pragma message("Breach of FILENAME_LOCK attempted")
#endif
// Always exits with: DEFINITION unchanged, PRIVATE/PROTECTED/PUBLIC unchanged, CPP undefined, FILENAME_* undefined, except FILENAME_DEFINED defined.

// Source template (FILENAME.CPP)
// Replace FILENAME with the filename minus suffix in upper case
// Replace Class with class name

// Precompilation (include)
#include "STDAFX.H"

// Parent (include)
#define CPP
#include "FILENAME.HPP"

// C Libraries (includes)
// Libraries (includes)
// Classes not in header used (includes)
// Debugging (includes)
// Macro implementations
// Redefinitions (includes)
// Support
// Global definitions
// Message maps

// Class member definitions follow:
// **** private: ****
// **** protected: ****
// **** public: ****

How to use the enhanced header

Whenever you need to develop a new class, say MyClass, take the header template, save it as MyClass.hpp, global replace FILENAME with MYCLASS, global replace Class with MyClass, and then fill in the various sections. You can start off with the class definition itself and then include the necessary base class headers, library headers, etc.

When you come to define the members, take the source template, save it as MyClass.cpp, do the same global replaces and fill it in. I like to keep my member functions organised by access and in the same order as declared in the class definition. Any includes are also put in particular sections.

If I have a number of classes that are fairly interdependent or collectively represent a functionally distinct module, I will combine them into a source library (not necessarily a static library) by creating a .h file which includes each .hpp file. Then any time I make use of a class from this library I only need to include the header once and without worrying about specifying includes in particular places if I use any of the library classes as bases say.

I used to prohibit myself from using in-line member definitions, because I felt that code belonged only in the .cpp file and that one should be able to modify member functions without needing to modify the header. I have now relaxed my policy, but it’s something for everyone’s coding standards committee to deliberate over. Use of in-line definitions requires that certain things be borne in mind when using my enhanced header (perhaps in general). No in-line code should require the inclusion of any header that would not otherwise be required if the function were not defined in-line. This is because the whole design of the enhanced header assumes that member functions are defined in the .cpp source file. Unless the in-line code is very simple, I’d suggest that you keep code to the source file. For example, a constructor that takes a reference to another class and calls its members should not be implemented in-line, because this would require that the header of the referenced class be included first.
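A minimal sketch of that constructor example, collapsed into one file. Car, Engine and InlineDemo are hypothetical names, and the comments mark where the header would end and the source file begin.

```cpp
#include <cassert>

// ---- what would be Car.hpp ----
class Engine;   // a forward declaration is all the header needs

class Car {
public:
    explicit Car(const Engine& engine);     // declared only: no Engine definition required
    int Power() const { return m_power; }   // simple in-line code needing no other header
private:
    int m_power;
};

// ---- what would be Car.cpp, after it has included the header defining Engine ----
class Engine {
public:
    explicit Engine(int power) : m_power(power) {}
    int Power() const { return m_power; }
private:
    int m_power;
};

// Defined out of line because it calls a member of Engine. Had it been written
// in-line inside Car, the header itself would have needed Engine's definition.
Car::Car(const Engine& engine) : m_power(engine.Power()) {}

// Hypothetical helper showing the two classes in use.
inline int InlineDemo() {
    Engine engine(90);
    Car car(engine);
    assert(car.Power() == 90);
    return car.Power();
}
```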

Rounding off

I have been using this enhanced header for a couple of years now and I have enjoyed being able to forget worrying about whether the right files are being included and in the right order. Nearly all the includes are performed by the header, making life easier when it comes to pruning out includes that are no longer necessary because of a design change.

Yes, there are drawbacks: a large header file, a search and replace upon first use, having to put particular includes in appropriate places, and longer compilation times. Believe me, it’s worth it. The extra compilation time and other minor hassle is about the same as the time that would have been spent in extra thought and fixing include conflicts. The enhanced header is really all about structuring the include process, and we all know the received wisdom that structured is better than unstructured.

Further development

The enhanced header assumes public and non-virtual base classes. While modifying it to eliminate unnecessary includes for protected and private bases is possible, it may not be worth the effort if you rarely use such access for your base classes. Virtual base classes may or may not be a headache. The enhanced header may need to give these special attention, then again, it may not. Something for those intrepid users of multiple inheritance to consider. Being a single inheritance advocate I can only offer an uninformed opinion that I don’t think there should be any problem using the enhanced header with virtual or multiple base classes.

To minimise the size of my examples, I have left out sections for documentation, version control, etc. There is obviously plenty of scope for incorporating your own header and code file templates.

Epilogue (December 2001)

For some strange reason I no longer need to use this technique. I find that a well designed class library has fairly simple inter-class relationships and that the inclusion process is relatively straightforward, i.e. you rarely need to worry about inclusion ordering, or paring down dependencies to a minimum. There you go.

If you're interested in how far I took this, the most recent versions of the header and implementation file are here: