Chapter 10. Debugging a Crashed Application

Sometimes, it is not possible to debug an application directly. In these situations, you can collect information about the application at the moment of its termination and analyze it afterwards.

10.1. Core dumps: what they are and how to use them

A core dump is a copy of a part of the application’s memory at the moment the application stopped working, stored in the ELF format. It contains all the application’s internal variables and stack, which enables inspection of the application’s final state. When combined with the respective executable file and debugging information, a core dump file can be analyzed with a debugger in a way similar to analyzing a running program.

The Linux operating system kernel can record core dumps automatically, if this functionality is enabled. Alternatively, you can send a signal to any running application to generate a core dump regardless of its actual state.

Warning

Some limits might affect the ability to generate a core dump. To see the current limits:

$ ulimit -a
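
The limit that most directly affects core dumps is the core file size, which `ulimit -c` reports on its own. The following is a minimal sketch of how that value might be interpreted; `describe_core_limit` is a hypothetical helper name, not a standard tool.

```shell
#!/bin/sh
# Sketch: interpret a core file size limit as reported by `ulimit -c`.
# describe_core_limit is a hypothetical helper, not a standard tool.
describe_core_limit() {
    case "$1" in
        unlimited) echo "core file size is not limited" ;;
        0)         echo "core file size limit is 0: no core file is written" ;;
        *)         echo "core files larger than $1 blocks are truncated" ;;
    esac
}

# Report the soft limit of the current shell.
describe_core_limit "$(ulimit -c)"
```

The block size that `ulimit -c` uses depends on the shell, so treat any numeric value as shell-specific.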

10.2. Recording application crashes with core dumps

To record application crashes, set up core dump saving and add information about the system.

Procedure

  1. Enable core dumps. Edit the file /etc/systemd/system.conf and make sure it contains the following lines:

    DumpCore=yes
    DefaultLimitCORE=infinity

    You can also add comments describing whether these settings were previously present, and what the previous values were. This enables you to reverse the changes later, if needed. Comments are lines starting with the # character.

    Changing the file requires administrator level access.

  2. Apply the new configuration:

    # systemctl daemon-reexec
  3. Remove the limits for core dump sizes:

    # ulimit -c unlimited

    To reverse this change, run the command with the value 0 instead of unlimited.

  4. Install the sos package which provides the sosreport utility for collecting system information:

    # yum install sos
  5. When an application crashes, a core dump is generated and handled by systemd-coredump.
  6. Create an SOS report to provide additional information about the system:

    # sosreport

    This creates a .tar archive containing information about your system, such as copies of configuration files.

  7. Locate and export the core dump:

    $ coredumpctl list executable-name
    $ coredumpctl dump executable-name > /path/to/file-for-export

    If the application crashed multiple times, the output of the first command lists several captured core dumps. In that case, use the other information in the listing, such as the PID, to construct a more precise query for the second command. See the coredumpctl(1) manual page for details.

  8. Transfer the core dump and the SOS report to the computer where the debugging will take place. Transfer the executable file, too, if it is known.

    Important

    When the executable file is not known, subsequent analysis of the core file identifies it.

  9. Optional: Remove the core dump and SOS report after transferring them to free up disk space.
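
Before relying on this procedure, you might want to confirm that the two settings from step 1 are active in the configuration file. The following is a minimal sketch of such a check; `has_core_dump_settings` is a hypothetical helper name, and the check assumes the settings are in /etc/systemd/system.conf itself rather than in a drop-in file.

```shell
#!/bin/sh
# Sketch: check that a systemd configuration file enables core dumps.
# has_core_dump_settings is a hypothetical helper name.
has_core_dump_settings() {
    conf=$1
    # Match uncommented lines only; lines starting with # are comments.
    grep -Eq '^[[:space:]]*DumpCore=yes[[:space:]]*$' "$conf" &&
    grep -Eq '^[[:space:]]*DefaultLimitCORE=infinity[[:space:]]*$' "$conf"
}

conf=/etc/systemd/system.conf
if [ -r "$conf" ] && has_core_dump_settings "$conf"; then
    echo "$conf: core dump settings present"
else
    echo "$conf: core dump settings missing or file not readable"
fi
```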

10.3. Inspecting application crash states with core dumps

Prerequisites

  • You must have a core dump file and an SOS report from the system where the crash occurred.
  • GDB and elfutils must be installed on your system.

Procedure

  1. To identify the executable file where the crash occurred, run the eu-unstrip command with the core dump file:

    $ eu-unstrip -n --core=./core.9814
    0x400000+0x207000 2818b2009547f780a5639c904cded443e564973e@0x400284 /usr/bin/sleep /usr/lib/debug/bin/sleep.debug [exe]
    0x7fff26fff000+0x1000 1e2a683b7d877576970e4275d41a6aaec280795e@0x7fff26fff340 . - linux-vdso.so.1
    0x35e7e00000+0x3b6000 374add1ead31ccb449779bc7ee7877de3377e5ad@0x35e7e00280 /usr/lib64/libc-2.14.90.so /usr/lib/debug/lib64/libc-2.14.90.so.debug libc.so.6
    0x35e7a00000+0x224000 3ed9e61c2b7e707ce244816335776afa2ad0307d@0x35e7a001d8 /usr/lib64/ld-2.14.90.so /usr/lib/debug/lib64/ld-2.14.90.so.debug ld-linux-x86-64.so.2

    The output lists one module per line, with the details separated by spaces. The information is listed in this order:

    1. The memory address where the module was mapped, followed by the size of the mapping
    2. The build-id of the module and the memory address where it was found
    3. The module’s executable file name, displayed as - when unknown, or as . when the module has not been loaded from a file
    4. The source of debugging information, displayed as a file name when available, as . when contained in the executable file itself, or as - when not present at all
    5. The shared library name (soname), or [exe] for the main module

    In this example, the important details are the file name /usr/bin/sleep and the build-id 2818b2009547f780a5639c904cded443e564973e on the line containing the text [exe]. With this information, you can identify the executable file required for analyzing the core dump.

  2. Get the executable file that crashed.

    • If possible, copy it from the system where the crash occurred. Use the file name extracted from the core file.
    • You can also use an identical executable file on your system. Each executable file built on Red Hat Enterprise Linux contains a note with a unique build-id value. Determine the build-id of the relevant locally available executable files:

      $ eu-readelf -n executable_file

      Use this information to match the executable file on the remote system with your local copy. The build-id of the local file and build-id listed in the core dump must match.

    • Finally, if the application is installed from an RPM package, you can get the executable file from the package. Use the sosreport output to find the exact version of the package required.
  3. Get the shared libraries used by the executable file. Use the same steps as for the executable file.
  4. If the application is distributed as a package, load the executable file in GDB to display hints for missing debuginfo packages. For more details, see Section 7.4, “Getting debuginfo packages for an application or library using GDB”.
  5. To examine the core file in detail, load the executable file and core dump file with GDB:

    $ gdb -e executable_file -c core_file

    Further messages about missing files and debugging information help you identify what is missing for the debugging session. Return to the previous step if needed.

    If the application’s debugging information is available as a file instead of as a package, load this file in GDB with the symbol-file command:

    (gdb) symbol-file program.debug

    Replace program.debug with the actual file name.

    Note

    It might not be necessary to install the debugging information for all executable files contained in the core dump. Most of these executable files are libraries used by the application code. These libraries might not directly contribute to the problem you are analyzing, and you do not need to include debugging information for them.

  6. Use the GDB commands to inspect the state of the application at the moment it crashed. See Chapter 8, Inspecting Application Internal State with GDB.

    Note

    When analyzing a core file, GDB is not attached to a running process. Commands for controlling execution have no effect.
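
The identification in step 1 can be scripted. The following is a minimal sketch that extracts the build-id and file name of the main module from eu-unstrip -n output, using the field order described in step 1. `main_module` is a hypothetical helper name, and the sketch assumes that file names contain no spaces.

```shell
#!/bin/sh
# Sketch: extract the build-id and file name of the main module from
# `eu-unstrip -n` output. main_module is a hypothetical helper name.
main_module() {
    # Field 2 is BUILD-ID@ADDRESS, field 3 is the file name, and the
    # last field is [exe] for the main module.
    awk '$NF == "[exe]" { split($2, id, "@"); print id[1], $3 }'
}

# Sample line taken from the example output in step 1.
sample='0x400000+0x207000 2818b2009547f780a5639c904cded443e564973e@0x400284 /usr/bin/sleep /usr/lib/debug/bin/sleep.debug [exe]'
printf '%s\n' "$sample" | main_module
```

With a real core file, the same helper would be fed by `eu-unstrip -n --core=./core.9814 | main_module`. The printed build-id can then be compared with the value that eu-readelf -n reports for a local copy of the executable file.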

10.4. Creating and accessing a core dump with coredumpctl

The coredumpctl tool of systemd can significantly streamline working with core dumps on the machine where the crash happened. This procedure outlines how to capture a core dump of an unresponsive process.

Prerequisites

  • The system must be configured to use systemd-coredump for core dump handling. To verify this, run:

    $ sysctl kernel.core_pattern

    The configuration is correct if the output starts with the following:

    kernel.core_pattern = |/usr/lib/systemd/systemd-coredump
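
The same check can be made from a script. The following is a minimal sketch that tests whether a core_pattern value starts with the systemd-coredump handler path. `uses_systemd_coredump` is a hypothetical helper name, and the sketch reads the value from /proc/sys/kernel/core_pattern, which holds the same setting that sysctl prints.

```shell
#!/bin/sh
# Sketch: decide whether a kernel.core_pattern value pipes core dumps
# to systemd-coredump. uses_systemd_coredump is a hypothetical helper.
uses_systemd_coredump() {
    case "$1" in
        "|/usr/lib/systemd/systemd-coredump"*) return 0 ;;
        *) return 1 ;;
    esac
}

# The value that sysctl prints is also available under /proc.
pattern_file=/proc/sys/kernel/core_pattern
if [ -r "$pattern_file" ] && uses_systemd_coredump "$(cat "$pattern_file")"; then
    echo "systemd-coredump handles core dumps"
else
    echo "systemd-coredump is not the configured handler"
fi
```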

Procedure

  1. Find the PID of the hung process, based on a known part of the executable file name:

    $ pgrep -a executable-name-fragment

    The command outputs a line in the following form:

    PID command-line

    Use the command-line value to verify that the PID belongs to the intended process.

    For example:

    $ pgrep -a bc
    5459 bc
  2. Send an abort signal to the process:

    # kill -ABRT PID
  3. Verify that the core has been captured by coredumpctl:

    $ coredumpctl list PID

    For example:

    $ coredumpctl list 5459
    TIME                            PID   UID   GID SIG COREFILE  EXE
    Thu 2019-11-07 15:14:46 CET    5459  1000  1000   6 present   /usr/bin/bc
  4. Further examine or use the core file as needed.

    You can specify the core dump by PID and other values. See the coredumpctl(1) manual page for further details.

    • To show details of the core file:

      $ coredumpctl info PID
    • To load the core file in the GDB debugger:

      $ coredumpctl debug PID

      Depending on availability of debugging information, GDB will suggest commands to run, such as:

      Missing separate debuginfos, use: dnf debuginfo-install bc-1.07.1-5.el8.x86_64

      For more details on this process, see Section 7.4, “Getting debuginfo packages for an application or library using GDB”.

    • To export the core file for further processing elsewhere:

      $ coredumpctl dump PID > /path/to/file_for_export

      Replace /path/to/file_for_export with the file where you want to put the core dump.
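
To try the procedure safely, you can run it against a disposable process first. The following is a minimal sketch in which a background sleep process stands in for the hung application. The choice of sleep and its duration are assumptions made for illustration only.

```shell
#!/bin/sh
# Sketch: abort a disposable process and look for its core dump.
# The sleep process stands in for a hung application.
sleep 300 &
pid=$!

kill -ABRT "$pid"
status=0
wait "$pid" || status=$?
# A shell reports death by signal as 128 + signal number; SIGABRT is 6.
echo "process $pid exited with status $status"

# Query systemd-coredump only where coredumpctl is installed.
if command -v coredumpctl >/dev/null 2>&1; then
    coredumpctl list "$pid" || echo "no core dump recorded for PID $pid"
fi
```

The core dump might take a moment to appear in the listing, so an empty result immediately after the signal does not necessarily mean that the capture failed.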

10.5. Dumping process memory with gcore

The workflow of core dump debugging enables the analysis of the program’s state offline. In some cases, you can use this workflow with a program that is still running, such as when the environment where the process runs is hard to access. You can use the gcore command to dump memory of any process while it is still running.

Prerequisites

  • You must understand what core dumps are and how they are created.
  • GDB must be installed on the system.

Procedure

  1. Find the process ID (PID). Use tools such as ps, pgrep, and top:

    $ ps -C some-program
  2. Dump the memory of this process:

    $ gcore -o filename pid

    This creates a file named filename.pid, where pid is the process ID, and dumps the process memory into it. While the memory is being dumped, the execution of the process is halted.

  3. After the core dump is finished, the process resumes normal execution.
  4. Create an SOS report to provide additional information about the system:

    # sosreport

    This creates a tar archive containing information about your system, such as copies of configuration files.

  5. Transfer the program’s executable file, core dump, and the SOS report to the computer where the debugging will take place.
  6. Optional: Remove the core dump and SOS report after transferring them, to free up disk space.
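
Steps 1 and 2 can be wrapped in a small helper. The following is a minimal sketch; `dump_running` is a hypothetical name, and the sketch assumes that the caller has permission to trace the target process.

```shell
#!/bin/sh
# Sketch: dump the memory of a running process with gcore.
# dump_running is a hypothetical helper; it needs gcore (part of GDB)
# and permission to trace the target process.
dump_running() {
    prefix=$1
    pid=$2
    if ! command -v gcore >/dev/null 2>&1; then
        echo "gcore not found" >&2
        return 127
    fi
    gcore -o "$prefix" "$pid"
}
```

A call such as `dump_running /tmp/core "$(pgrep -o some-program)"` would dump the oldest matching process. The /tmp/core prefix is only an example.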

10.6. Dumping protected process memory with GDB

You can mark the memory of processes as not to be dumped. This can save resources and ensure additional security when the process memory contains sensitive data: for example, in banking or accounting applications or on whole virtual machines. Neither kernel core dumps (kdump) nor manual core dumps (gcore, GDB) include memory marked this way.

In some cases, you must dump the whole contents of the process memory regardless of these protections. This procedure shows how to do this using the GDB debugger.

Procedure

  1. Set GDB to ignore the settings in the /proc/PID/coredump_filter file:

    (gdb) set use-coredump-filter off
  2. Set GDB to ignore the memory page flag VM_DONTDUMP:

    (gdb) set dump-excluded-mappings on
  3. Dump the memory:

    (gdb) gcore core-file

    Replace core-file with the name of the file where you want to dump the memory.
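
The three GDB commands can also be passed on the command line so that the dump runs without an interactive session. The following is a minimal sketch that only prints such a command for review. `gdb_full_dump_cmd` is a hypothetical helper name, attaching with -p is an assumption about how GDB reaches the process, and the PID and output path are example values.

```shell
#!/bin/sh
# Sketch: build a non-interactive GDB command that applies the three
# steps of the procedure to a running process. gdb_full_dump_cmd is a
# hypothetical helper; it only prints the command for review.
gdb_full_dump_cmd() {
    pid=$1
    core_file=$2
    printf '%s\n' "gdb -p $pid -batch -ex 'set use-coredump-filter off' -ex 'set dump-excluded-mappings on' -ex 'gcore $core_file'"
}

# Example values: PID 5459 and the output path are placeholders.
gdb_full_dump_cmd 5459 /tmp/core.5459
```

Running the printed command requires permission to attach to the process, and the resulting file can contain the sensitive data that the protections were meant to exclude.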
