C++ ESSENTIALS
PART I

VI. Program Files and Linkage

Functions represent a fundamental level of program organization in C++. All C++ programs group their instructions into functions, and programs of any complexity have thousands of functions, or even more. A second level of organization is the program file, or source code file; most programs have many such files. C++ itself comes with a standard set of such files for input/output processing, mathematical operations, memory allocation, string manipulation, and so on.

There are two kinds of program files. First, there are the so-called 'header files'. These contain type, constant, class, function and other types of declarations. Their main purpose is to declare the objects (not necessarily in the object oriented programming sense) and functions that will be needed in a program. C++ provides a number of these header files, many of which declare the functions needed for the input/output processing, mathematical operations, memory allocation, and string manipulation mentioned above.

Some declarations are also definitions. For example, the declaration of a constant implies that memory is set aside for a value for the constant, which implies a definition. This adds a bit to the complexity of the header files as we will see below.
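The difference between a declaration and a definition can be sketched as follows. The names 'counter', 'triple' and 'LIMIT' are hypothetical and were chosen only for illustration.

```cpp
// Pure declarations: each introduces a name and its type, nothing more.
extern int counter;        // a variable that is defined elsewhere
int triple(int value);     // a function prototype, with no body

// Declarations that are also definitions.
const int LIMIT = 10;      // the constant receives its value here
int counter = 0;           // storage for 'counter' is set aside here

int triple(int value)      // the body makes this a definition
{
    return 3 * value;
}
```

The first two lines could be repeated in any number of files, while each of the definitions should appear only once in a program.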

The second type of program file primarily contains function and other definitions. These files contain the majority of the code that computers execute. They can also contain declarations. In fact, every program must contain at least one file with both function declarations and definitions, the file containing the function 'main'. It is common for this file to include, in addition to the declaration and definition of 'main', various type, constant, function and other declarations needed by 'main' or one of the functions called by 'main'. Usually the functions declared in this file are also defined in this file.

Users can also create their own header and definition files. It is common, for example, to create a header file that declares commonly used constants such as TRUE and FALSE, along with commonly used typedefs and functions. A corresponding definition file is then created to hold the definitions of the functions declared in the header file. Such files are also used when working with classes. The class declarations are put into a header file and the member function definitions into the definition file.

Taking advantage of the name 'header file', files containing declarations, by tradition, use the extension '.h' while the file containing the definitions has the same extension as any other C++ file. (In Borland C++ this is '.cpp'.) For example, if one wants a class called 'Container', one might have a header file containing the class declarations with the name 'contain.h' and a file holding the definitions with the name 'contain.cpp'.
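As a rough sketch of how the 'Container' example might be divided, the code below shows both parts in a single listing, with comments marking where each part would go. The member functions and the fixed capacity are assumptions made for illustration; the text above does not say what 'Container' holds.

```cpp
#include <cassert>

// ---- what might go in 'contain.h': the class declaration ----
class Container
{
public:
    Container();                  // member functions are only declared here
    bool add(int value);          // returns false when the container is full
    int  get(int index) const;
    int  size() const;
private:
    enum { CAPACITY = 8 };        // hypothetical fixed capacity
    int items[CAPACITY];
    int count;
};

// ---- what might go in 'contain.cpp', after #include "contain.h" ----
Container::Container() : count(0) {}

bool Container::add(int value)
{
    if (count == CAPACITY)
        return false;
    items[count++] = value;
    return true;
}

int Container::get(int index) const
{
    assert(index >= 0 && index < count);   // caller must pass a valid index
    return items[index];
}

int Container::size() const
{
    return count;
}
```

Code that only uses 'Container' needs to see the first part; the second part is compiled once and supplied at link time.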

At compile time, any code that uses the declarations held in a header file must know about those declarations. Thus, the declarations must be part of the source code. On the other hand, the actual definitions of the functions used in the code are not needed until later, when the linker program (described below) is called at link time. This is one of the reasons for the separation of code into header files and definition files.

A second reason is to allow separate compilation. As programs get larger it takes longer to compile them as a whole. Usually a programmer (or team of programmers) develops the code in parts with each part having its own header and definition files. Once the code for a particular part has been written and tested, it may never need to be changed again. It is a waste of time to continue to compile this code over and over every time a change is made in other parts of the program. Those 'other parts' may, however, need access to declarations created in this part. By separating the function definitions from the declarations, one can use the declarations in new compiles while not doing anything with the already compiled definitions.

An example of this involves the code that comes with a standard C++ compiler. It would be a waste of time to require the re-compiling of the function definitions supplied with C++ every time they are used in a program. Instead this code collection (the function definitions for stream processing, mathematics, string manipulation etc.) is compiled once by the C++ developers and placed in a library supplied with the compiler. The related header files for any of these functions are also supplied and are available to any program at compile time as 'include' files. After a successful compile, the appropriate library code is linked in at link time.

The code in header files is included into a program using the preprocessor directive '#include'. Preprocessor directives are commands that begin with a '#' symbol and are executed before the compiler begins the translation process. To include the header file 'iostream.h', one would write:

#include <iostream.h>

If you wanted to include the file 'contain.h', a file that is not part of the standard set of C++ header files, you would write:

#include "contain.h"

The '<...>' symbols indicate to the preprocessor that the file inside should be sought in the system directory containing the built-in header files. When the file name is enclosed in double quotes instead, the system starts looking in the directory of the file containing the '#include' directive and, if the file is not found there, in any directories listed in the search path. The programmer must describe this search path using the methodology appropriate for the compiler in question. Traditionally, this was done through the use of 'make' files but, more recently, other, more intuitive approaches have become available. (Borland C++ uses the idea of a project to handle this and the other purposes of the make file.)

It is good practice to place all the '#include' directives at the top of the program. When the preprocessor encounters an '#include' directive, it copies in all the code in the included file. When the compiler then reads the file, it sees the code you wrote plus any code you included.

In complex code, header files themselves often have their own set of '#include' directives and it is therefore possible for the same file to be included twice - once indirectly by way of a header file, and once directly. To avoid having the same code appear twice in one source file (a problem, for example, when header files include definitions), writers of header files use three other directives, '#define', '#ifndef', and '#endif'. For example, in the container class example, the header file, 'contain.h' might be written as:

#ifndef CONTAIN_H
#define CONTAIN_H
   // code for the class declaration appears here
#endif

The first line checks to see if the preprocessor knows about an identifier called 'CONTAIN_H'. If it doesn't, this code has not been read before, because the next line defines such an identifier. In this case the code between the '#ifndef' and '#endif' directives should be, and is, processed. If the identifier is already known, this is an indication that the code in this file has already been read. Therefore, the preprocessor should (and will) skip to the '#endif' directive. None of the code in between is seen or processed by the compiler. The result is that the code between the '#ifndef' and '#endif' directives is compiled only once in a given source file - the first time the header is read, when the identifier is defined.

To avoid creating an identifier via the '#define' directive with the same name as some identifier in the rest of the code, it is standard practice to use the form used here - some name in upper case letters followed by '_H'. Identifiers with this form should, therefore, be avoided in the rest of the code.
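The effect of these directives can be seen in a single file by writing out the text of a guarded header twice, which is what the compiler would see if 'contain.h' were included both directly and indirectly. The class body and the function 'container_capacity' are hypothetical and serve only as a sketch.

```cpp
// First copy of the text of 'contain.h', as if included directly.
#ifndef CONTAIN_H
#define CONTAIN_H
class Container
{
public:
    int capacity() const { return 8; }
};
#endif

// Second copy, as if pulled in again by way of some other header.
// CONTAIN_H is now defined, so everything up to '#endif' is skipped
// and the class is not defined a second time.
#ifndef CONTAIN_H
#define CONTAIN_H
class Container
{
public:
    int capacity() const { return 8; }
};
#endif

// Code that uses the class sees exactly one definition of it.
int container_capacity()
{
    Container c;
    return c.capacity();
}
```

Without the three directives, the second copy would define 'Container' again and the compiler would reject the file.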


When the compiler successfully compiles a source code file, it creates object code in what is called an object file. Such code contains machine language versions of all the definitions found in the file. It does not, however, contain code for functions that are declared in the file or its included header files but not defined in the file. Instead, any time such functions are called in the source code file, the corresponding object file contains markers indicating that the machine code for these functions must be supplied later - at link time. The source code files containing these function definitions must also be compiled into object code. Thus a large program may consist of any number of object code files. The C++ standard function library also consists of object code. Once object code has been created for all elements of the program, the linker (a second program, separate from the compiler) goes through all the files looking for the markers indicating the need for object code and tries to find the appropriate object code. If all the links are resolved, executable code is created and the program is runnable.
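The following sketch shows the kind of code that leaves such a marker. The file names 'sums.cpp' and 'square.cpp' and both functions are hypothetical; the two parts are shown in one listing here, but are meant to stand for two separately compiled files.

```cpp
// ---- 'sums.cpp': calls 'square' but does not define it ----
int square(int n);               // declaration only

int sum_of_squares(int a, int b)
{
    // The object file for this source file would hold machine code for
    // 'sum_of_squares' plus markers saying that code for 'square' is needed.
    return square(a) + square(b);
}

// ---- 'square.cpp': compiled separately into its own object file ----
int square(int n)
{
    return n * n;
}
```

At link time the linker would match the markers left in the first object file with the code for 'square' found in the second.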

We often speak of compiling when we really mean compiling and linking. These really are two different processes that perform different functions and produce their own error messages. When you get a "linker error", you can be sure that the code compiled correctly but that the linker was unable to find code to match some function you called. This can happen because you:

  1. Misspelled the function name in the definition,
  2. Provided the wrong number or type of parameters,
  3. Forgot to tell the linker about where to look for the object code, or
  4. Simply forgot to define the function.

Errors 1, 2, and 4 are really compile-time problems, but ones that the compiler could not catch. Error 3 involves the way the 'make' file is written, or the methodology your compiler supplies in place of 'make' files. You should check your C++ manuals for information on how to set your programs up for compiling and linking. (Read the document "BorlandProject.htm" for help with Borland C++ project files.)
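Error 2 can be sketched as follows. The function 'area' and its callers are hypothetical. The call that would fail at link time is left as a comment so that the listing itself still compiles and links.

```cpp
// Declaration, as it might appear in a header file.
double area(double width, double height);

// The intended definition, mistakenly written with one parameter. The
// compiler accepts it as a separate, overloaded function named 'area'.
double area(double side)
{
    return side * side;
}

double room_area()
{
    // The next line would compile, because a two-parameter 'area' has been
    // declared, but the linker would find no matching definition:
    // return area(3.0, 4.0);

    return area(3.0);    // links, because area(double) is defined above
}
```

Since each source file compiles cleanly on its own, the mismatch shows up only when the linker looks for the two-parameter version.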
