A short introduction to the compilation workflow

The runtime environment

From an abstract point of view, a program is nothing but a bunch of program code and program data stored in memory. On decent OSes, the memory as seen by a program is a flat address space. On a typical 32-bit processor, this address space starts at address zero and ends at the 32-bit 4 GB boundary. In theory, the program should be able to access any memory location within this address space. In practice, this is not possible except in very special configurations.

On a normal x86 Linux system, the program only has access to a fragmented subset of that flat address space, shown in the accompanying figure (image courtesy of lwn.net). The text fragment contains the user code, the heap fragment contains the user-allocated memory, and the stack fragment contains the user's stack area. The text fragment is usually itself split into multiple fragments, each of which represents one of the shared libraries used by the program.

Shared libraries are typically used in modern OSes to avoid keeping multiple copies in memory of library code used simultaneously by multiple programs. The library code is loaded only once in physical RAM and is sometimes mapped at different addresses in the virtual address spaces of different user processes. The fact that these libraries can be loaded at different locations in different programs means that:
How the library or the program is generated so that it correctly handles the PIC (position-independent code) case is beyond the scope of this document. For more information on this topic, I recommend reading Linkers and Loaders by John R. Levine. An early version is available online, but nothing replaces the real paper thing.

When the user asks the OS to run a program, the OS is responsible for establishing the memory map described above. On most Linux systems, it then leaves the problem of handling the program-library communication to a user program named the loader. On most Linux systems which use the ELF binary format, this loader is implemented by the glibc package. On my system, this loader is located in /lib/ld-2.3.3.so.

The job of this loader is to load into memory the code of the libraries the program depends upon (well, only if those libraries have not been loaded already) and to resolve all the references the program makes to these libraries. Of course, this process is recursive, since a library can depend on another library and its references to this other library might need to be resolved too. This whole resolution process is a bit complex on UNIX systems, but there are numerous tools you can use to debug it:
In most cases, this complicated resolution process can be debugged with little to no knowledge of exactly how things work. The loader first builds a list of directories to search for the libraries on which the binary depends. Then, it uses this ordered list of directories to locate the requested libraries. If one of the libraries is not found in this list of paths, loading is aborted and an obscure error is printed on screen.

How the search list is built
References
The C compilation process

The C compilation process is most often implemented by a set of different cooperating programs: the pre-processor, the compiler, the assembler and the linker.
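The gcc program, which you can invoke from the command line, can be instructed to stop the compilation process after any of the stages described above: -E stops after preprocessing, -S stops after compilation proper (before the assembler runs), and -c stops before the linking step.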
The pre-processor

The C pre-processor is mainly responsible for expanding macro definitions and include statements and for removing comments. This means that the following program will be transformed into the program shown next:
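The original listing is not reproduced here. As a rough stand-in, the following sketch shows the kind of input the preprocessor works on. The macro names BUFFER_SIZE and SQUARE and the function name are hypothetical, chosen only for illustration. After preprocessing, the comments are gone and each macro use is replaced by its definition.

```c
/* Hypothetical example: all names are illustrative, not from the original text. */
#define BUFFER_SIZE 16
#define SQUARE(x) ((x) * (x))   /* parenthesized so that expansion stays safe */

/* Before preprocessing, the body reads:  return SQUARE(BUFFER_SIZE) + 1;
 * After preprocessing, the compiler proper sees roughly:
 *     return ((16) * (16)) + 1;
 * Note that SQUARE expands to a use of BUFFER_SIZE, which is expanded in turn.
 */
int squared_buffer_size_plus_one(void)
{
    return SQUARE(BUFFER_SIZE) + 1;  /* this comment is stripped as well */
}
```

Running gcc -E on a file containing this code should show the expanded form directly, which is usually the quickest way to check what a macro really turns into.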
The macro replacement and include replacement processes are, of course, performed recursively until no macro definitions, macro uses, or include statements remain in the source file. To understand the exact behavior of the preprocessor and debug preprocessing bugs (i.e., macro bugs), the developer can request from the C compiler an annotated output of the source file after preprocessing. With gcc, the -E compilation option instructs the compiler to stop at this point and output the preprocessed file. The format of the preprocessor output is described in the cpp manual. For example, gcc 3.3.2 uses the format described in this webpage. The macro definition step can usually be controlled by a few options:
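With gcc, the -D and -U switches define and undefine macros from the command line. The sketch below assumes a hypothetical macro name, LOG_LEVEL, and a hypothetical function name. It falls back to a default when the macro is not defined on the command line, so that a build with -DLOG_LEVEL=3 changes the result without touching the source.

```c
/* LOG_LEVEL is a hypothetical macro used only for illustration.
 * Build with, for example:  gcc -DLOG_LEVEL=3 -c loglevel.c
 * Without any -D switch, the default below is used. */
#ifndef LOG_LEVEL
#define LOG_LEVEL 1
#endif

/* Returns the value the preprocessor substituted for LOG_LEVEL. */
int configured_log_level(void)
{
    return LOG_LEVEL;
}
```

The #ifndef guard is a design choice rather than a requirement: it keeps the file compilable when no switch is given, while still letting the command line win.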
The preprocessing step can be controlled and influenced by many more parameters. By default, if no parameters are specified, the compiler searches a number of default directories for the headers requested by the user. The exact rules used by gcc are described in the cpp manual. gcc allows the user to add directories to the include search path with the -I switch.

The compiler

If you do not specify the -E option to request gcc to stop after the preprocessing stage, the compiler will proceed with the compilation itself, which generates assembly output. To get an idea of what this output looks like, you can use the -S option to look at the assembly output before the assembler runs. The following C code, run through the command gcc -S test.c -o test.S, generates the following test.S file.
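The original listing and its assembly output are not reproduced here. A small file such as the following is enough to experiment with; it is a hypothetical stand-in, including the function name. The exact assembly that gcc -S produces depends on the compiler version, the target architecture and the optimization level, so no output is shown.

```c
/* test.c: hypothetical stand-in for the original listing.
 * Generate the assembly with:  gcc -S test.c -o test.S */
int add_numbers(int a, int b)
{
    int sum = a + b;   /* without optimization, typically a few moves and one add */
    return sum;
}
```

Comparing the output of this command at -O0 and at -O2 is a simple way to see what the optimizer does to even a trivial function.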
Warnings

gcc offers a great many options to enable and disable large classes of warnings. Certain warnings depend on the code optimization level (because only certain optimization passes, enabled at certain optimization levels, perform the code analysis required to generate the warning correctly). The full list of warning options is, of course, described in the gcc manual. Generally, each warning has two flavors, for example -Winline and -Wno-inline: the -W flavor enables the warning while the -Wno- flavor disables it.
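As an illustration of why such warnings are worth enabling, the sketch below (the function name is hypothetical) compares a signed and an unsigned value. gcc reports this comparison with -Wsign-compare, which for C is also enabled by -Wextra, and -Wno-sign-compare silences it.

```c
/* Compile with:  gcc -Wsign-compare -c compare.c     (warning reported)
 *           or:  gcc -Wno-sign-compare -c compare.c  (warning suppressed)
 * The file name compare.c is only an example. */
int minus_one_is_less_than_one(void)
{
    int i = -1;
    unsigned int n = 1;
    /* i is converted to unsigned int before the comparison, so -1 becomes
     * UINT_MAX and the test is false: the function returns 0. */
    return i < n;
}
```

The code is valid C and compiles silently by default, which is exactly the kind of surprise the warning is meant to catch.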
Debugging options

To debug the generated code, you need to ask gcc to generate debugging information and store it in the resulting binary. Typically, this is done with the -g option. However, gcc supports a number of different debugging information formats, and experience has shown that the default format used by the -g switch is not so great. As such, I strongly advise you to use the dwarf2 debugging information format, which is supported by all decent debuggers; this means specifying the -gdwarf-2 switch rather than the -g switch.

gcc code optimization

The optimization level is the easiest code-generation option available: gcc offers -O0, -O1, -O2 and -O3, which control how many speed-related optimization passes are applied to the code. The higher optimization levels are often slower to compile but are supposed to generate faster code. It has often been reported that -O3 generates buggy code, so it should be used with care. If you want to debug your program, I strongly advise against using anything but the -O0 option, which ensures that the generated code will behave nicely in a debugger. The -Os option can be used to optimize for code size. Surprisingly, it also often improves code speed (through better cache locality). All these optimization options can be combined with the -march=, -mtune= and -mcpu= options, which control what type of assembly instructions the compiler will use and what exact CPU the compiler will optimize the instruction scheduling for. The exact semantics of these options depend on the target architecture, which means that you should read the gcc manual prior to using them.

The assembler

The job of the assembler is to convert the assembly output shown in the previous section into a binary file. Of course, it does not make much sense to look at the binary output with your own eyes, but you can use deciphering tools such as readelf or objdump.
readelf is available on any recent Linux system I know of and provides great information on the generated object files. It is especially useful for debugging linker-related problems. objdump, on the other hand, provides less detailed information but works on a wider variety of systems and can be used to disassemble a binary object file. The following output shows the result of running these commands:

    gcc -S test.c -o test.S
    gcc -c test.S -o test.o
    objdump -d test.o
The linker

If you do not disable the linking step with the -c switch, gcc runs its assembled output through the link process. gcc itself provides a number of switches to control the behavior of the linker, most of which are simple wrappers for switches provided by the linker itself. Although it is possible to pass switches directly to the linker through gcc with the -Wl, syntax, it is almost never necessary (i.e., it has not happened to me in a long, long time), which is why I present here not the linker switches themselves but the gcc wrappers. On most normal Linux systems, users are interested in generating only three kinds of binaries from a set of object files: executables, static libraries and shared libraries.
Linking an executable

If you have a set of object files, one of which defines a main function compatible with the canonical definition int main (int, char *[]), you can safely run gcc file1.o file2.o file3.o -o binary-name, which will generate an executable binary from that set of object files. If your executable depends on either a static library or a shared library, you need to specify the dependency as well as the location of the library:
Linking a static library

Linking a static library is pretty trivial and does not involve the compiler. In fact, it is not really a link step: it just involves creating an archive which contains the .o files you want to put into the library. If you have a list of .o files, the command to create a static library (in fact, an object file archive) is: ar rcs libtest.a foo1.o foo2.o foo3.o foo4.o. The generated .a file can then be used during the final link of an executable as shown above, or during the link of a shared library, to resolve any missing symbols. It is not possible to:
Linking a shared library

Linking a shared library is also relatively simple:
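The original steps are not reproduced here. As a sketch of the usual gcc approach (an assumption, since the original text is missing), the source is first compiled to position-independent code with -fPIC and the resulting object file is then linked with -shared. The file, library and function names below are hypothetical.

```c
/* foo1.c: hypothetical source file destined for a shared library.
 * Typical gcc commands (assumed, not taken from the original text):
 *     gcc -fPIC -c foo1.c -o foo1.o
 *     gcc -shared foo1.o -o libtest.so
 */
int test_answer(void)
{
    /* Any ordinary function works; nothing in the source marks it as
     * belonging to a shared library. */
    return 42;
}
```

The same source file could just as well go into a static archive; what makes the result a shared library is the way it is compiled and linked, not the code itself.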
A few notes about Visual C++

Visual C++ provides a Graphical User Interface to the underlying compilation system, but it is possible to access this underlying compilation environment from the DOS command line. Sometimes, it is necessary. The Visual C++ C/C++ preprocessor is very close feature-wise to the gcc preprocessor. Its compilation options are very similar: