Chapter 3. Compiling and Building

Red Hat Enterprise Linux 6 includes many packages used for software development, including tools for compiling and building source code. This chapter discusses several of these packages and tools used to compile source code.

3.1. GNU Compiler Collection (GCC)

The GNU Compiler Collection (GCC) is a set of tools for compiling a variety of programming languages (including C, C++, ObjectiveC, ObjectiveC++, Fortran, and Ada) into highly optimized machine code. These tools include various compilers (like gcc and g++), run-time libraries (like libgcc, libstdc++, libgfortran, and libgomp), and miscellaneous other utilities.

3.1.1. Language Compatibility

Application Binary Interfaces specified by the GNU C, C++, Fortran and Java Compiler include:
  • Calling conventions. These specify how arguments are passed to functions and how results are returned from functions.
  • Register usage conventions. These specify how processor registers are allocated and used.
  • Object file formats. These specify the representation of binary object code.
  • Size, layout, and alignment of data types. These specify how data is laid out in memory.
  • Interfaces provided by the runtime environment. Where the documented semantics do not change from one version to another they must be kept available and use the same name at all times.
The default system C compiler included with Red Hat Enterprise Linux 6 is largely compatible with the C99 standard. Deviations from the C99 standard in GCC 4.4 are tracked online.
In addition to the C ABI, the Application Binary Interface for the GNU C++ Compiler specifies the binary interfaces required to support the C++ language, such as:
  • Name mangling and demangling
  • Creation and propagation of exceptions
  • Formatting of run-time type information
  • Constructors and destructors
  • Layout, alignment, and padding of classes and derived classes
  • Virtual function implementation details, such as the layout and alignment of virtual tables
The default system C++ compiler included with Red Hat Enterprise Linux 6 conforms to the C++ ABI defined by the Itanium C++ ABI (1.86).
Although every effort has been made to keep each version of GCC compatible with previous releases, some incompatibilities do exist.
ABI incompatibilities between Red Hat Enterprise Linux 6 and Red Hat Enterprise Linux 5

The following is a list of known incompatibilities between the Red Hat Enterprise Linux 6 and 5 toolchains.

  • Passing/returning structs with flexible array members by value changed in some cases on Intel 64 and AMD64.
  • Passing/returning of unions with long double members by value changed in some cases on Intel 64 and AMD64.
  • Passing/returning structs with complex float member by value changed in some cases on Intel 64 and AMD64.
  • Passing of 256-bit vectors on x86, Intel 64 and AMD64 platforms changed when -mavx is used.
  • There have been multiple changes in passing of _Decimal{32,64,128} types and aggregates containing those by value on several targets.
  • Packing of packed char bitfields changed in some cases.
ABI incompatibilities between Red Hat Enterprise Linux 5 and Red Hat Enterprise Linux 4

The following is a list of known incompatibilities between the Red Hat Enterprise Linux 5 and 4 toolchains.

  • There have been changes in the library interface specified by the C++ ABI for thread-safe initialization of function-scope static variables.
  • On Intel 64 and AMD64, the medium model for building applications whose data segment exceeds 4GB was redesigned to match the latest ABI draft at the time. The ABI change results in incompatibility among medium model objects.
The compiler flag -Wabi can be used to get diagnostics indicating where these constructs appear in source code, though it will not catch every single case. This flag is especially useful for C++ code to warn whenever the compiler generates code that is known to be incompatible with the vendor-neutral C++ ABI.
Excluding the incompatibilities listed above, the GCC C and C++ language ABIs are mostly compatible between these releases. The vast majority of source code will not encounter any of the known issues, and can be considered compatible.
Compatible ABIs allow the objects created by compiling source code to be portable to other systems. In particular, for Red Hat Enterprise Linux, this allows for upward compatibility. Upward compatibility is defined as the ability to link shared libraries and objects, created using a version of the compilers in a particular Red Hat Enterprise Linux release, with no problems. This includes new objects compiled on subsequent Red Hat Enterprise Linux releases.
The C ABI is considered to be stable, and has been so since at least Red Hat Enterprise Linux 3 (again, barring any incompatibilities mentioned in the above lists). Libraries built on Red Hat Enterprise Linux 3 and later can be linked to objects created on a subsequent environment (Red Hat Enterprise Linux 4, Red Hat Enterprise Linux 5, and Red Hat Enterprise Linux 6).
The C++ ABI is considered to be stable, but less stable than the C ABI, and only as of Red Hat Enterprise Linux 4 (corresponding to GCC version 3.4 and above). As with C, this is only an upward compatibility. Libraries built on Red Hat Enterprise Linux 4 and above can be linked to objects created on a subsequent environment (Red Hat Enterprise Linux 5 and Red Hat Enterprise Linux 6).
To force GCC to generate code compatible with the C++ ABI in Red Hat Enterprise Linux releases prior to Red Hat Enterprise Linux 4, some developers have used the -fabi-version=1 option. This practice is not recommended. Objects created this way are indistinguishable from objects conforming to the current stable ABI, and can be linked (incorrectly) amongst the different ABIs, especially when using new compilers to generate code to be linked with old libraries that were built with tools prior to Red Hat Enterprise Linux 4.

Warning

The above incompatibilities make it incredibly difficult to maintain ABI shared library sanity between releases, especially when developing custom libraries with multiple dependencies outside of the core libraries. Therefore, if shared libraries are developed, it is highly recommended that a new version is built for each Red Hat Enterprise Linux release.

3.1.2. Object Compatibility and Interoperability

Two items are important here: the changes and enhancements in the underlying tools used by the compiler, and the compatibility between different versions of a language's compiler.
Changes and new features in tools like ld (distributed as part of the binutils package) or in the dynamic loader (ld.so, distributed as part of the glibc package) can subtly change the object files that the compiler produces. These changes mean that object files moving to the current release of Red Hat Enterprise Linux from previous releases may lose functionality, behave differently at runtime, or otherwise interoperate in a diminished capacity. Known problem areas include:
  • ld --build-id
    In Red Hat Enterprise Linux 6 this is passed to ld by default, whereas Red Hat Enterprise Linux 5 ld doesn't recognize it.
  • as .cfi_sections support
    In Red Hat Enterprise Linux 6 this directive allows .debug_frame, .eh_frame or both to be emitted from .cfi* directives. In Red Hat Enterprise Linux 5 only .eh_frame is emitted.
  • as, ld, ld.so, and gdb STB_GNU_UNIQUE and %gnu_unique_symbol support
    In Red Hat Enterprise Linux 6 more debug information is generated and stored in object files. This information relies on new features detailed in the DWARF standard, and also on new extensions not yet standardized. In Red Hat Enterprise Linux 5, tools like as, ld, gdb, objdump, and readelf may not be prepared for this new information and may fail to interoperate with objects created with the newer tools. In addition, Red Hat Enterprise Linux 5 produced object files do not support these new features; these object files may be handled by Red Hat Enterprise Linux 6 tools in a sub-optimal manner.
    An outgrowth of this enhanced debug information is that the debuginfo packages that ship with system libraries allow you to do useful source level debugging into system libraries if they are installed. See Section 4.2, “Installing Debuginfo Packages” for more information on debuginfo packages.
Object file changes, such as the ones listed above, may interfere with the portable use of prelink.

3.1.3. Running GCC

To compile using GCC tools, first install the binutils and gcc packages. Doing so will also install several dependencies.
In brief, the tools work via the gcc command. This is the main driver for the compiler. It can be used from the command line to pre-process or compile a source file, link object files and libraries, or perform a combination thereof. By default, gcc takes care of the details and links in the provided libgcc library.
Using GCC tools from the command line interface consumes fewer system resources than using graphical tools. This also allows finer-grained control over compilers; GCC's command line tools can even be used outside of the graphical mode (runlevel 5).

3.1.3.1. Simple C Usage

Basic compilation of a C language program using GCC is easy. Start with the following simple program:

Example 3.1. hello.c

#include <stdio.h>
int main()
{
  printf ("Hello world!\n");
  return 0;
}
The following procedure illustrates the compilation process for C in its most basic form.

Procedure 3.1. Compiling a 'Hello World' C Program

  1. Compile Example 3.1, “hello.c” into an executable with:
    ~]$ gcc hello.c -o hello
    Ensure that the resulting binary hello is in the same directory as hello.c.
  2. Run the hello binary, that is, ./hello.

3.1.3.2. Simple C++ Usage

Basic compilation of a C++ language program using GCC is similar. Start with the following simple program:

Example 3.2. hello.cc

#include <iostream>
using namespace std;
int main()
{
  cout << "Hello World!" << endl;
  return 0;
}
The following procedure illustrates the compilation process for C++ in its most basic form.

Procedure 3.2. Compiling a 'Hello World' C++ Program

  1. Compile Example 3.2, “hello.cc” into an executable with:
    ~]$ g++ hello.cc -o hello
    Ensure that the resulting binary hello is in the same directory as hello.cc.
  2. Run the hello binary, that is, ./hello.

3.1.3.3. Simple Multi-File Usage

To use basic compilation involving multiple files or object files, start with the following two source files:

Example 3.3. one.c

#include <stdio.h>
void hello()
{
  printf("Hello world!\n");
}

Example 3.4. two.c

extern void hello();
int main()
{
  hello();
  return 0;
}
The following procedure illustrates a simple, multi-file compilation process in its most basic form.

Procedure 3.3. Compiling a Program with Multiple Source Files

  1. Compile Example 3.3, “one.c” into an object file with:
    ~]$ gcc -c one.c -o one.o
    Ensure that the resulting object file one.o is in the same directory as one.c.
  2. Compile Example 3.4, “two.c” into an object file with:
    ~]$ gcc -c two.c -o two.o
    Ensure that the resulting object file two.o is in the same directory as two.c.
  3. Link the two object files one.o and two.o into a single executable with:
    ~]$ gcc one.o two.o -o hello
    Ensure that the resulting binary hello is in the same directory as one.o and two.o.
  4. Run the hello binary, that is, ./hello.

3.1.3.4. Recommended Optimization Options

Different projects require different optimization options. There is no one-size-fits-all approach when it comes to optimization, but here are a few guidelines to keep in mind.
Instruction selection and tuning

It is very important to choose the correct architecture for instruction scheduling. By default GCC produces code optimized for the most common processors, but if the CPU on which your code will run is known, use the corresponding -mtune= option to optimize the instruction scheduling and the -march= option to optimize the instruction selection.

The option -mtune= optimizes instruction scheduling to fit your architecture by tuning everything except the ABI and the available instruction set. This option will not choose particular instructions, but instead will tune your program in such a way that executing on a particular architecture will be optimized. For example, if an Intel Core2 CPU will predominantly be used, choose -mtune=core2. If the wrong choice is made, the program will still run, but not optimally on the given architecture. The architecture on which the program will most likely run should always be chosen.
The option -march= optimizes instruction selection. As such, it is important to choose correctly as choosing incorrectly will cause your program to fail. This option selects the instruction set used when generating code. For example, if the program will be run on an AMD K8 core based CPU, choose -march=k8. Specifying the architecture with this option will imply -mtune=.
The -mtune= and -march= options should only be used for tuning and selecting instructions within a given architecture, not to generate code for a different architecture (also known as cross-compiling). For example, they are not to be used to generate PowerPC code from an Intel 64 or AMD64 platform.
For a complete list of the available options for both -march= and -mtune=, see the GCC documentation available here: GCC 4.4.4 Manual: Hardware Models and Configurations
General purpose optimization flags

The compiler flag -O2 is a good middle of the road option to generate fast code. It produces the best optimized code when the resulting code size is not large. Use this when unsure what would best suit.

When code size is not an issue, -O3 is preferable. This option produces code that is slightly larger but runs faster because of more frequent inlining of functions. This is ideal for floating point intensive code.
The other general purpose optimization flag is -Os. This flag also optimizes for size, and produces faster code in situations where a smaller footprint will increase code locality, thereby reducing cache misses.
Use -frecord-gcc-switches when compiling objects. This option records the options used to build an object in the object itself, so that you can later determine which options were used to build it. The options are recorded in a section called .GCC.command.line within the object and can be examined with the following:
$ gcc -frecord-gcc-switches -O3 -Wall hello.c -o hello
$ readelf --string-dump=.GCC.command.line hello
String dump of section '.GCC.command.line':
  [     0]  hello.c
  [     8]  -mtune=generic
  [    17]  -O3
  [    1b]  -Wall
  [    21]  -frecord-gcc-switches
It is very important to test and try different options with a representative data set. Often, different modules or objects can be compiled with different optimization flags in order to produce optimal results. See Section 3.1.3.5, “Using Profile Feedback to Tune Optimization Heuristics” for additional optimization tuning.

3.1.3.5. Using Profile Feedback to Tune Optimization Heuristics

During the transformation of a typical set of source code into an executable, tens or hundreds of choices must be made about the importance of speed in one part of code over another, or code size as opposed to code speed. By default, these choices are made by the compiler using reasonable heuristics, tuned over time to produce the optimum runtime performance. However, GCC also has a way to teach the compiler to optimize executables for a specific machine in a specific production environment. This feature is called profile feedback.
Profile feedback is used to tune optimizations such as:
  • Inlining
  • Branch prediction
  • Instruction scheduling
  • Inter-procedural constant propagation
  • Determining of hot or cold functions
With profile feedback, a program is compiled twice: first to generate an instrumented program that is run and analyzed, and then a second time to optimize it using the gathered data.

Procedure 3.4. Using Profile Feedback

  1. The application must be instrumented to produce profiling information by compiling it with -fprofile-generate.
  2. Run the application to accumulate and save the profiling information.
  3. Recompile the application with -fprofile-use.
Step three will use the profile information gathered in step two to tune the compiler's heuristics while optimizing the code into a final executable.

Procedure 3.5. Compiling a Program with Profiling Feedback

  1. Compile source.c to include profiling instrumentation:
    gcc source.c -fprofile-generate -O2 -o executable
  2. Run executable to gather profiling information:
    ./executable
  3. Recompile and optimize source.c with profiling information gathered in step two:
    gcc source.c -fprofile-use -O2 -o executable
Multiple data collection runs, as seen in step two, will accumulate data into the profiling file instead of replacing it. This allows the executable in step two to be run multiple times with additional representative data in order to collect even more information.
The executable must be run on a machine representative of the production environment, with a representative data set large enough to exercise the required input. This ensures optimal results are achieved.
By default, GCC will generate the profile data into the directory where step one was performed. To generate this information elsewhere, compile with -fprofile-dir=DIR where DIR is the preferred output directory.

Warning

The format of the compiler feedback data file changes between compiler versions. It is imperative that the program compilation is repeated with each version of the compiler.

3.1.3.6. Using 32-bit compilers on a 64-bit host

On a 64-bit host, GCC will build executables that can only run on 64-bit hosts. However, GCC can be used to build executables that will run both on 64-bit hosts and on 32-bit hosts.
To build 32-bit binaries on a 64-bit host, first install 32-bit versions of any supporting libraries the executable may require. This must at least include supporting libraries for glibc and libgcc, and libstdc++ if the program is a C++ program. On Intel 64 and AMD64, this can be done with:
yum install glibc-devel.i686 libgcc.i686 libstdc++-devel.i686
There may be cases where it is useful to install additional 32-bit libraries that a program may require. For example, if a program uses the db4-devel libraries to build, the 32-bit version of these libraries can be installed with:
yum install db4-devel.i686

Note

The .i686 suffix on the x86 platform (as opposed to x86-64) specifies a 32-bit version of the given package. For PowerPC architectures, the suffix is ppc (as opposed to ppc64).
After the 32-bit libraries have been installed, the -m32 option can be passed to the compiler and linker to produce 32-bit executables. Provided the supporting 32-bit libraries are installed on the 64-bit system, this executable will be able to run on both 32-bit systems and 64-bit systems.

Procedure 3.6. Compiling a 32-bit Program on a 64-bit Host

  1. On a 64-bit system, compile hello.c into a 64-bit executable with:
    gcc hello.c -o hello64
  2. Ensure that the resulting executable is a 64-bit binary:
    $ file hello64
    hello64: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, not stripped
    $ ldd hello64
    linux-vdso.so.1 =>  (0x00007fff242dd000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f0721514000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f0721893000)
    
    The command file on a 64-bit executable will include ELF 64-bit in its output, and ldd will list /lib64/libc.so.6 as the main C library linked.
  3. On a 64-bit system, compile hello.c into a 32-bit executable with:
    gcc -m32 hello.c -o hello32
  4. Ensure that the resulting executable is a 32-bit binary:
    $ file hello32
    hello32: ELF 32-bit LSB executable, Intel 80386, version 1 (GNU/Linux), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, not stripped
    $ ldd hello32
    linux-gate.so.1 =>  (0x007eb000)
    libc.so.6 => /lib/libc.so.6 (0x00b13000)
    /lib/ld-linux.so.2 (0x00cd7000)
    
    The command file on a 32-bit executable will include ELF 32-bit in its output, and ldd will list /lib/libc.so.6 as the main C library linked.
If you have not installed the 32-bit supporting libraries you will get an error similar to this for C code:
$ gcc -m32 hello32.c -o hello32
/usr/bin/ld: crt1.o: No such file: No such file or directory
collect2: ld returned 1 exit status
A similar error would be triggered on C++ code:
$ g++ -m32 hello32.cc -o hello32-c++
In file included from /usr/include/features.h:385,
     from /usr/lib/gcc/x86_64-redhat-linux/4.4.4/../../../../include/c++/4.4.4/x86_64-redhat-linux/32/bits/os_defines.h:39,
     from /usr/lib/gcc/x86_64-redhat-linux/4.4.4/../../../../include/c++/4.4.4/x86_64-redhat-linux/32/bits/c++config.h:243,
     from /usr/lib/gcc/x86_64-redhat-linux/4.4.4/../../../../include/c++/4.4.4/iostream:39,
     from hello32.cc:1:
/usr/include/gnu/stubs.h:7:27: error: gnu/stubs-32.h: No such file or directory
These errors indicate that the supporting 32-bit libraries have not been properly installed as explained at the beginning of this section.
It is also important to note that building with -m32 will not adapt or convert a program to resolve any issues arising from 32/64-bit incompatibilities. For tips on writing portable code and converting from 32 bits to 64 bits, see the paper entitled Porting to 64-bit GNU/Linux Systems in the Proceedings of the 2003 GCC Developers Summit.

3.1.4. GCC Documentation

For more information about GCC compilers, see the man pages for cpp, gcc, g++, gcj, and gfortran.
The following online user manuals are also available:
The main site for the development of GCC is gcc.gnu.org.