Chapter 2. Preparing software for RPM packaging

This section explains how to prepare software for RPM packaging. You do not need to know how to code to do so. However, you need to understand the basic concepts, such as what source code is and how programs are made.

2.1. What source code is

This part explains what source code is and shows examples of source code for a program written in three different programming languages.

Source code is human-readable instructions to the computer, which describe how to perform a computation. Source code is expressed using a programming language.

2.1.1. Source code examples

This document features three versions of the Hello World program, each written in a different programming language:

  • Hello World written in bash
  • Hello World written in Python
  • Hello World written in C

Each version is packaged differently.

These versions of the Hello World program cover the three major use cases of an RPM packager.

2.1.1.1. Hello World written in bash

The bello project implements Hello World in bash. The implementation only contains the bello shell script. The purpose of the program is to output Hello World on the command line.

The bello file has the following content:

#!/bin/bash

printf "Hello World\n"

2.1.1.2. Hello World written in Python

The pello project implements Hello World in Python. The implementation only contains the pello.py program. The purpose of the program is to output Hello World on the command line.

The pello.py file has the following content:

#!/usr/bin/python3

print("Hello World")

2.1.1.3. Hello World written in C

The cello project implements Hello World in C. The implementation only contains the cello.c and the Makefile files, so the resulting tar.gz archive will have two files apart from the LICENSE file.

The purpose of the program is to output Hello World on the command line.

The cello.c file has the following content:

#include <stdio.h>

int main(void) {
    printf("Hello World\n");
    return 0;
}

2.2. How programs are made

Methods of conversion from human-readable source code to machine code (instructions that the computer follows to execute the program) include the following:

  • The program is natively compiled.
  • The program is interpreted by raw interpreting.
  • The program is interpreted by byte compiling.

2.2.1. Natively Compiled Code

Natively compiled software is software written in a programming language that compiles to machine code with a resulting binary executable file. Such software can be run stand-alone.

RPM packages built this way are architecture-specific.

If you compile such software on a computer that uses a 64-bit (x86_64) AMD or Intel processor, the resulting binary does not execute on a 32-bit (x86) AMD or Intel processor. The resulting package has the architecture specified in its name.
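
As a rough illustration, the architecture of the build machine can be printed with the uname command. The package file name below is hypothetical (release 1 is an assumption), and the architecture label that RPM uses can differ from the one reported by the kernel on some platforms.

```shell
# Print the machine architecture reported by the kernel, for example x86_64.
arch=$(uname -m)
printf 'build machine architecture: %s\n' "$arch"

# Hypothetical file name in the usual name-version-release.arch.rpm form.
pkg_file="cello-1.0-1.${arch}.rpm"
printf 'example package file name: %s\n' "$pkg_file"
```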

2.2.2. Interpreted Code

Some programming languages, such as bash or Python, do not compile to machine code. Instead, their programs' source code is executed step by step, without prior transformations, by a Language Interpreter or a Language Virtual Machine.

Software written entirely in interpreted programming languages is not architecture-specific. Hence, the resulting RPM Package has the noarch string in its name.

Programs written in interpreted languages are either raw-interpreted or byte-compiled. These two types differ in the program build process and in the packaging procedure.

2.2.2.1. Raw-interpreted programs

Raw-interpreted language programs do not need to be compiled and are directly executed by the interpreter.

2.2.2.2. Byte-compiled programs

Byte-compiled languages need to be compiled into byte code, which is then executed by the language virtual machine.

Note

Some languages offer a choice: they can be raw-interpreted or byte-compiled.

2.3. Building software from source

This part describes how to build software from source code.

For software written in compiled languages, the source code goes through a build process, producing machine code. This process, commonly called compiling or translating, varies for different languages. The resulting built software can be run, which makes the computer perform the task specified by the programmer.

For software written in raw interpreted languages, the source code is not built, but executed directly.

For software written in byte-compiled interpreted languages, the source code is compiled into byte code, which is then executed by the language virtual machine.

2.3.1. Natively Compiled Code

This section shows how to build the cello.c program written in the C language into an executable.

cello.c

#include <stdio.h>

int main(void) {
    printf("Hello World\n");
    return 0;
}

2.3.1.1. Manual building

If you want to build the cello.c program manually, use this procedure:

Procedure

  1. Invoke the C compiler from the GNU Compiler Collection to compile the source code into a binary:

    $ gcc -g -o cello cello.c

  2. Execute the resulting binary cello:

    $ ./cello
    Hello World

2.3.1.2. Automated building

Large-scale software commonly uses automated building that is done by creating the Makefile file and then running the GNU make utility.

If you want to use automated building to build the cello.c program, use this procedure:

Procedure

  1. To set up automated building, create the Makefile file with the following content in the same directory as cello.c:

    Makefile

    cello:
    	gcc -g -o cello cello.c
    clean:
    	rm cello

    Note that the lines under cello: and clean: must begin with a tab character.

  2. To build the software, run the make command:

    $ make
    make: 'cello' is up to date.

  3. Since there is already a build available, run the make clean command, and then run the make command again:

    $ make clean
    rm cello

    $ make
    gcc -g -o cello cello.c

    Note

    Trying to build the program again after a successful build has no effect. The cello target lists no prerequisites, so make treats it as up to date whenever the cello file exists:

    $ make
    make: 'cello' is up to date.

  4. Execute the program:

    $ ./cello
    Hello World

You have now compiled a program both manually and using a build tool.

2.3.2. Interpreting code

This section shows how to byte-compile a program written in Python and raw-interpret a program written in bash.

Note

In the two examples below, the #! line at the top of the file is known as a shebang, and is not part of the programming language source code.

The shebang enables using a text file as an executable: the system program loader parses the line containing the shebang to get a path to the binary executable, which is then used as the programming language interpreter. The functionality requires the text file to be marked as executable.
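
The following is a minimal sketch of what the note above describes, not the loader's actual implementation. It recreates the bello script in a temporary directory, extracts the interpreter path from the first line, and checks the executable bit. It ignores the optional argument that a shebang line may carry after the interpreter path.

```shell
set -eu

# Work in a throwaway directory that is removed when the script exits.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Recreate the bello script from this chapter.
cat > "$workdir/bello" <<'EOF'
#!/bin/bash

printf "Hello World\n"
EOF

# Read the first line and strip the leading "#!" to get the interpreter path.
first_line=$(head -n 1 "$workdir/bello")
case "$first_line" in
    '#!'*) interpreter=${first_line#??} ;;
    *)     interpreter='' ;;
esac
printf 'interpreter: %s\n' "$interpreter"

# A file created by redirection is not executable until chmod +x is applied.
chmod +x "$workdir/bello"
if [ -x "$workdir/bello" ]; then
    executable=yes
else
    executable=no
fi
printf 'executable: %s\n' "$executable"
```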

2.3.2.1. Byte-compiling code

This section shows how to compile the pello.py program written in Python into byte code, which is then executed by the Python language virtual machine.

Python source code can also be raw-interpreted, but the byte-compiled version is faster. Hence, RPM Packagers prefer to package the byte-compiled version for distribution to end users.

pello.py

#!/usr/bin/python3

print("Hello World")

The procedure for byte-compiling programs varies depending on the following factors:

  • Programming language
  • Language’s virtual machine
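  • Tools and processes used with that language

Note

Python is often byte-compiled, but not in the way described here. The following procedure aims to be simple rather than to conform to the community standards. For real-world Python guidelines, see Software Packaging and Distribution.

The sample output in the following procedure comes from Python 2.7, which writes pello.pyc next to the source file. Python 3 writes byte code into a __pycache__ subdirectory by default; the -b option of the compileall module selects the legacy location instead.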

Use this procedure to compile pello.py into byte code:

Procedure

  1. Byte-compile the pello.py file:

    $ python -m compileall pello.py
    
    $ file pello.pyc
    pello.pyc: python 2.7 byte-compiled
  2. Execute the byte code in pello.pyc:

    $ python pello.pyc
    Hello World

2.3.2.2. Raw-interpreting code

This section shows how to raw-interpret the bello program written in the bash shell built-in language.

bello

#!/bin/bash

printf "Hello World\n"

Programs written in shell scripting languages, like bash, are raw-interpreted.

Procedure

  • Make the file with source code executable and run it:

    $ chmod +x bello
    $ ./bello
    Hello World
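
As a complementary sketch, a raw-interpreted script can also be run by naming the interpreter explicitly. In that case, neither the shebang nor the executable bit is involved. The temporary directory is an assumption made to keep the example self-contained.

```shell
set -eu

# Work in a throwaway directory that is removed when the script exits.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Recreate the bello script from this chapter.
cat > "$workdir/bello" <<'EOF'
#!/bin/bash

printf "Hello World\n"
EOF

# No chmod here: the interpreter is named explicitly, so the shebang line
# and the executable bit play no role in starting the script.
output=$(bash "$workdir/bello")
printf '%s\n' "$output"
```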

2.4. Patching software

This section explains how to patch the software.

In RPM packaging, the original source code is kept unmodified, and changes are applied to it as patches.

A patch is a piece of text that describes changes to source code. It is formatted as a diff, because it represents what is different between two versions of the text. A diff is created using the diff utility and is then applied to the source code using the patch utility.

Note

Software developers often use Version Control Systems such as git to manage their code base. Such tools provide their own methods of creating diffs or patching software.

The following example shows how to create a patch from the original source code using diff, and how to apply the patch using patch. Patching is used in a later section when creating an RPM; see Section 3.2, “Working with SPEC files”.

This procedure shows how to create a patch from the original source code for cello.c.

Procedure

  1. Preserve the original source code:

    $ cp -p cello.c cello.c.orig

    The -p option is used to preserve mode, ownership, and timestamps.

  2. Modify cello.c as needed:

    #include <stdio.h>
    
    int main(void) {
        printf("Hello World from my very first patch!\n");
        return 0;
    }
  3. Generate a patch using the diff utility:

    $ diff -Naur cello.c.orig cello.c
    --- cello.c.orig        2016-05-26 17:21:30.478523360 -0500
    +++ cello.c     2016-05-27 14:53:20.668588245 -0500
    @@ -1,6 +1,6 @@
     #include <stdio.h>
    
     int main(void) {
    -    printf("Hello World\n");
    +    printf("Hello World from my very first patch!\n");
         return 0;
     }
    \ No newline at end of file

    Lines starting with - are removed from the original source code, and lines starting with + are added in their place.

    Using the -Naur options with the diff command is recommended because they fit the majority of use cases. However, in this particular case, only the -u option is necessary. The individual options do the following:

    • -N (or --new-file) - Handles absent files as if they were empty files.
    • -a (or --text) - Treats all files as text. As a result, the files that diff classifies as binaries are not ignored.
    • -u (or -U NUM or --unified[=NUM]) - Produces output with NUM (default 3) lines of unified context. This is an easily readable format that allows fuzzy matching when applying the patch to a changed source tree.
    • -r (or --recursive) - Recursively compares any subdirectories that are found.

      For more information on common arguments for the diff utility, see the diff manual page.

  4. Save the patch to a file:

    $ diff -Naur cello.c.orig cello.c > cello-output-first-patch.patch
  5. Restore the original cello.c:

    $ cp cello.c.orig cello.c

    The original cello.c must be retained, because when an RPM is built, the original file is used, not the modified one. For more information, see Section 3.2, “Working with SPEC files”.

The following procedure shows how to patch cello.c using cello-output-first-patch.patch, build the patched program, and run it.

  1. Redirect the patch file to the patch command:

    $ patch < cello-output-first-patch.patch
    patching file cello.c
  2. Check that the contents of cello.c now reflect the patch:

    $ cat cello.c
    #include <stdio.h>
    
    int main(void) {
        printf("Hello World from my very first patch!\n");
        return 0;
    }
  3. Build and run the patched cello.c:

    $ make clean
    rm cello
    
    $ make
    gcc -g -o cello cello.c
    
    $ ./cello
    Hello World from my very first patch!
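
The two procedures above can be rehearsed end to end without a compiler. The following sketch works in a temporary directory and uses sed as a stand-in for editing cello.c by hand; both choices are assumptions made to keep the example self-contained.

```shell
set -eu

# Work in a throwaway directory that is removed when the script exits.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Stand-in for the original source file.
cat > cello.c <<'EOF'
#include <stdio.h>

int main(void) {
    printf("Hello World\n");
    return 0;
}
EOF

# Keep the original, then change the working copy.
cp -p cello.c cello.c.orig
sed 's/Hello World/Hello World from my very first patch!/' cello.c.orig > cello.c

# diff exits with status 1 when the files differ, which is expected here.
diff -Naur cello.c.orig cello.c > cello-output-first-patch.patch || [ "$?" -eq 1 ]

# Restore the original and apply the patch to it.
cp cello.c.orig cello.c
patch < cello-output-first-patch.patch

# Show the line that the patch changed.
grep 'printf' cello.c
```

The explicit exit status check matters in scripts that run with set -e, because diff reports differences through a non-zero exit status.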

2.5. Installing arbitrary artifacts

Unix-like systems use the Filesystem Hierarchy Standard (FHS) to specify a directory suitable for a particular file.

Files installed from the RPM packages are placed according to FHS. For example, an executable file should go into a directory that is in the system $PATH variable.

In the context of this documentation, an Arbitrary Artifact is anything installed from an RPM to the system. For RPM and for the system it can be a script, a binary compiled from the package’s source code, a pre-compiled binary, or any other file.

This section describes two common ways of placing Arbitrary Artifacts in the system:

  • Using the install command
  • Using the make install command

2.5.1. Using the install command

Packagers often use the install command in cases when build automation tooling such as GNU make is not optimal; for example, if the packaged program does not need extra overhead.

The install command is provided to the system by coreutils. It places the artifact in the specified directory in the file system with a specified set of permissions.

The following procedure uses the previously created bello file as the arbitrary artifact to install with this method.

Procedure

  1. Run the install command to place the bello file into the /usr/bin directory with permissions common for executable scripts:

    $ sudo install -m 0755 bello /usr/bin/bello

    As a result, bello is now located in a directory that is listed in the $PATH variable.

  2. Execute bello from any directory without specifying its full path:

    $ cd ~
    
    $ bello
    Hello World
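
To experiment with the install command without root access, the file can be installed into a staging directory instead of the real /usr/bin. The following sketch assumes GNU coreutils (the -D option of install and the -c option of stat); the staging path is hypothetical.

```shell
set -eu

# Work in a throwaway directory that is removed when the script exits.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Recreate the bello script from this chapter.
cat > "$workdir/bello" <<'EOF'
#!/bin/bash

printf "Hello World\n"
EOF

# Hypothetical staging root used instead of / so that sudo is not needed.
stage="$workdir/stage"

# -D creates missing parent directories; -m sets the mode of the installed copy.
install -D -m 0755 "$workdir/bello" "$stage/usr/bin/bello"

# Print the octal mode of the installed file.
mode=$(stat -c '%a' "$stage/usr/bin/bello")
printf 'installed mode: %s\n' "$mode"
```

The installed copy receives the requested mode even though the source file was never made executable.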

2.5.2. Using the make install command

Using the make install command is an automated way to install built software to the system. In this case, you need to specify how to install the arbitrary artifacts to the system in the Makefile that is usually written by the developer.

This procedure shows how to install a build artifact into a chosen location on the system.

Procedure

  1. Add the install section to the Makefile:

    Makefile

    cello:
    	gcc -g -o cello cello.c
    
    clean:
    	rm cello
    
    install:
    	mkdir -p $(DESTDIR)/usr/bin
    	install -m 0755 cello $(DESTDIR)/usr/bin/cello

    Note that the lines under cello:, clean:, and install: must begin with a tab character.

    Note

    The $(DESTDIR) variable is not defined by GNU make itself. It is a widely used convention for installing to a directory other than the root directory. When the variable is not set, it expands to an empty string, and the files are installed relative to the root directory.

    Now you can use the Makefile not only to build software, but also to install it to the target system.

  2. Build and install the cello.c program:

    $ make
    gcc -g -o cello cello.c
    
    $ sudo make install
    mkdir -p /usr/bin
    install -m 0755 cello /usr/bin/cello

    As a result, cello is now located in a directory that is listed in the $PATH variable.

  3. Execute cello from any directory without specifying its full path:

    $ cd ~
    
    $ cello
    Hello World
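
The effect of DESTDIR can be tried without root access by pointing it at a staging directory. The following is a sketch under stated assumptions: it requires GNU make, and it uses a hypothetical Makefile that contains only an install target for the bello script, so that no compiler is needed.

```shell
set -eu

# Work in a throwaway directory that is removed when the script exits.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Stand-in artifact: the bello script from earlier in this chapter.
cat > bello <<'EOF'
#!/bin/bash

printf "Hello World\n"
EOF

# Hypothetical Makefile with only an install target. printf is used so that
# each recipe line starts with a real tab character (\t).
printf 'install:\n\tmkdir -p $(DESTDIR)/usr/bin\n\tinstall -m 0755 bello $(DESTDIR)/usr/bin/bello\n' > Makefile

# Install into a staging directory instead of the real root directory.
stage="$workdir/stage"
make DESTDIR="$stage" install

# List what ended up in the staging tree.
find "$stage" -type f
```

RPM spec files commonly pass a build root to make install in the same way, which is why a Makefile that honors DESTDIR is convenient to package.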

2.6. Preparing source code for packaging

Developers often distribute software as compressed archives of source code, which are then used to create packages. RPM packagers work with a ready source code archive.

Software should be distributed with a software license.

This procedure uses the GPLv3 license text as an example content of the LICENSE file.

Procedure

  • Create a LICENSE file in the /tmp directory, and make sure that it includes the following content:

    $ cat /tmp/LICENSE
    This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
    
    This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
    
    You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.

2.7. Putting source code into a tarball

This section describes how to put each of the three Hello World programs introduced in Section 2.1.1, “Source code examples” into a gzip-compressed tarball, which is a common way to release the software to be later packaged for distribution.

2.7.1. Putting the bello project into a tarball

The bello project implements Hello World in bash. The implementation only contains the bello shell script, so the resulting tar.gz archive will have only one file apart from the LICENSE file.

This procedure shows how to prepare the bello project for distribution.

Prerequisites

This procedure assumes that this is version 0.1 of the program.

Procedure

  1. Put all required files into a single directory:

    $ mkdir /tmp/bello-0.1
    
    $ mv ~/bello /tmp/bello-0.1/
    
    $ cp /tmp/LICENSE /tmp/bello-0.1/
  2. Create the archive for distribution and move it to the ~/rpmbuild/SOURCES/ directory, which is the default directory where the rpmbuild command stores the files for building packages:

    $ cd /tmp/
    
    $ tar -cvzf bello-0.1.tar.gz bello-0.1
    bello-0.1/
    bello-0.1/LICENSE
    bello-0.1/bello
    
    $ mv /tmp/bello-0.1.tar.gz ~/rpmbuild/SOURCES/

For more information about the example source code written in bash, see Section 2.1.1.1, “Hello World written in bash”.
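
Before moving an archive to ~/rpmbuild/SOURCES/, it can be useful to check its layout. The following sketch builds a stand-in bello-0.1 tree with a placeholder license text in a temporary directory, creates the archive, and lists its members. Every member is expected to sit under the single top-level bello-0.1/ directory.

```shell
set -eu

# Work in a throwaway directory that is removed when the script exits.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Stand-in project tree; the license text is only a placeholder.
mkdir bello-0.1
cat > bello-0.1/bello <<'EOF'
#!/bin/bash

printf "Hello World\n"
EOF
printf 'placeholder license text\n' > bello-0.1/LICENSE

# Create the gzip-compressed tarball, then list its members in a stable order.
tar -czf bello-0.1.tar.gz bello-0.1
listing=$(tar -tzf bello-0.1.tar.gz | LC_ALL=C sort)
printf '%s\n' "$listing"
```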

2.7.2. Putting the pello project into a tarball

The pello project implements Hello World in Python. The implementation only contains the pello.py program, so the resulting tar.gz archive will have only one file apart from the LICENSE file.

This procedure shows how to prepare the pello project for distribution.

Prerequisites

This procedure assumes that this is version 0.1.2 of the program.

Procedure

  1. Put all required files into a single directory:

    $ mkdir /tmp/pello-0.1.2
    
    $ mv ~/pello.py /tmp/pello-0.1.2/
    
    $ cp /tmp/LICENSE /tmp/pello-0.1.2/
  2. Create the archive for distribution and move it to the ~/rpmbuild/SOURCES/ directory, which is the default directory where the rpmbuild command stores the files for building packages:

    $ cd /tmp/
    
    $ tar -cvzf pello-0.1.2.tar.gz pello-0.1.2
    pello-0.1.2/
    pello-0.1.2/LICENSE
    pello-0.1.2/pello.py
    
    $ mv /tmp/pello-0.1.2.tar.gz ~/rpmbuild/SOURCES/

For more information about the example source code written in Python, see Section 2.1.1.2, “Hello World written in Python”.

2.7.3. Putting the cello project into a tarball

The cello project implements Hello World in C. The implementation only contains the cello.c and the Makefile files, so the resulting tar.gz archive will have two files apart from the LICENSE file.

Note

The patch file is not distributed in the archive with the program. The RPM Packager applies the patch when the RPM is built. The patch will be placed into the ~/rpmbuild/SOURCES/ directory alongside the .tar.gz archive.

This procedure shows how to prepare the cello project for distribution.

Prerequisites

This procedure assumes that this is version 1.0 of the program.

Procedure

  1. Put all required files into a single directory:

    $ mkdir /tmp/cello-1.0
    
    $ mv ~/cello.c /tmp/cello-1.0/
    
    $ mv ~/Makefile /tmp/cello-1.0/
    
    $ cp /tmp/LICENSE /tmp/cello-1.0/
  2. Create the archive for distribution and move it to the ~/rpmbuild/SOURCES/ directory, which is the default directory where the rpmbuild command stores the files for building packages:

    $ cd /tmp/
    
    $ tar -cvzf cello-1.0.tar.gz cello-1.0
    cello-1.0/
    cello-1.0/Makefile
    cello-1.0/cello.c
    cello-1.0/LICENSE
    
    $ mv /tmp/cello-1.0.tar.gz ~/rpmbuild/SOURCES/
  3. Add the patch:

    $ mv ~/cello-output-first-patch.patch ~/rpmbuild/SOURCES/

For more information about the example source code written in C, see Section 2.1.1.3, “Hello World written in C”.