Chapter 6. Running and interpreting hardware and firmware latency tests

With the hwlatdetect program, you can test and verify whether a potential hardware platform is suitable for real-time operations.

Prerequisites

  • Ensure that the RHEL-RT (RHEL for Real Time) and rt-tests packages are installed.
  • Check the vendor documentation for any tuning steps required for low latency operation.

    The vendor documentation can provide instructions to reduce or remove any System Management Interrupts (SMIs) that would transition the system into System Management Mode (SMM). While a system is in SMM, it runs firmware and not operating system code. This means that any timers that expire while in SMM wait until the system transitions back to normal operation. This can cause unexplained latencies, because Linux cannot block SMIs, and the only indication that an SMI occurred is found in vendor-specific performance counter registers.

    Warning

    Red Hat strongly recommends that you do not completely disable SMIs, as it can result in catastrophic hardware failure.

6.1. Running hardware and firmware latency tests

You do not need to run any load on the system while running the hwlatdetect program, because the test looks for latencies introduced by the hardware architecture or by BIOS or EFI firmware. By default, hwlatdetect polls for 0.5 seconds out of each second and reports any gap greater than 10 microseconds between consecutive calls to fetch the time. hwlatdetect returns the best maximum latency possible on the system. Therefore, if you have an application that requires maximum latency values of less than 10us and hwlatdetect reports one of the gaps as 20us, then the system can only guarantee a latency of 20us.

Note

If hwlatdetect shows that the system cannot meet the latency requirements of the application, try changing the BIOS settings or working with the system vendor to get new firmware that meets the latency requirements of the application.
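
The arithmetic behind the default sampling values and the 10us versus 20us comparison above can be sketched in a few lines of Python. This is only an illustration: the function names `sampling_summary` and `meets_requirement` are hypothetical helpers and are not part of hwlatdetect or rt-tests. The sketch also assumes that the non-sampling period is simply the window minus the width, which matches the default output shown in this chapter.

```python
def sampling_summary(window_us, width_us):
    """Derive the non-sampling period and duty cycle from window and width.

    Assumption: non-sampling period = window - width, as in the default
    values (1000000us window, 500000us width, 500000us non-sampling).
    """
    if not 0 < width_us <= window_us:
        raise ValueError("width must be positive and no larger than the window")
    return {
        "non_sampling_us": window_us - width_us,
        "duty_cycle": width_us / window_us,
    }


def meets_requirement(budget_us, observed_max_us):
    """Return True if the worst observed gap fits the application's budget."""
    return observed_max_us <= budget_us


# Default values: poll for 0.5 seconds out of each second.
print(sampling_summary(window_us=1_000_000, width_us=500_000))
# {'non_sampling_us': 500000, 'duty_cycle': 0.5}

# An application with a 10us budget on a system where a 20us gap was seen.
print(meets_requirement(budget_us=10, observed_max_us=20))
# False
```

A result of False here corresponds to the situation described in the note: the firmware or BIOS settings need attention before the platform can meet the application's requirement.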

Prerequisites

  • Ensure that the RHEL-RT and rt-tests packages are installed.

Procedure

  • Run hwlatdetect, specifying the test duration in seconds.

    hwlatdetect looks for hardware-induced and firmware-induced latencies by polling the clocksource and looking for unexplained gaps.

    # hwlatdetect --duration=60s
    hwlatdetect:  test duration 60 seconds
    	detector: tracer
    	parameters:
    		Latency threshold:    10us
    		Sample window:        1000000us
    		Sample width:         500000us
    		Non-sampling period:  500000us
    		Output File:          None
    
    Starting test
    test finished
    Max Latency: Below threshold
    Samples recorded: 0
    Samples exceeding threshold: 0

6.2. Interpreting hardware and firmware latency test results

The hardware latency detector (hwlatdetect) uses the tracer mechanism to detect latencies introduced by the hardware architecture or BIOS/EFI firmware. By checking the latencies measured by hwlatdetect, you can determine whether a potential hardware platform is suitable to support the RHEL for Real Time kernel.

Examples

  • The following example result represents a system tuned to minimize system interruptions from firmware. In this situation, the output of hwlatdetect looks like this:

    # hwlatdetect --duration=60s
    hwlatdetect:  test duration 60 seconds
    	detector: tracer
    	parameters:
    		Latency threshold: 10us
    		Sample window:     1000000us
    		Sample width:      500000us
    		Non-sampling period:  500000us
    		Output File:       None
    
    Starting test
    test finished
    Max Latency: Below threshold
    Samples recorded: 0
    Samples exceeding threshold: 0
  • The following example result represents a system that could not be tuned to minimize system interruptions from firmware. In this situation, the output of hwlatdetect looks like this:

    # hwlatdetect --duration=10s
    hwlatdetect:  test duration 10 seconds
    	detector: tracer
    	parameters:
    		Latency threshold: 10us
    		Sample window:     1000000us
    		Sample width:      500000us
    		Non-sampling period:  500000us
    		Output File:       None
    
    Starting test
    test finished
    Max Latency: 18us
    Samples recorded: 10
    Samples exceeding threshold: 10
    SMIs during run: 0
    ts: 1519674281.220664736, inner:17, outer:15
    ts: 1519674282.721666674, inner:18, outer:17
    ts: 1519674283.722667966, inner:16, outer:17
    ts: 1519674284.723669259, inner:17, outer:18
    ts: 1519674285.724670551, inner:16, outer:17
    ts: 1519674286.725671843, inner:17, outer:17
    ts: 1519674287.726673136, inner:17, outer:16
    ts: 1519674288.727674428, inner:16, outer:18
    ts: 1519674289.728675721, inner:17, outer:17
    ts: 1519674290.729677013, inner:18, outer:17

    The output shows that during the consecutive reads of the system clocksource, there were 10 delays that showed up in the 15-18 us range.

    Note

    Previous versions used a kernel module rather than the ftrace tracer.
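
To check the range of delays in a report like the one above without reading each line by eye, the sample lines can be parsed with a short script. The following is a minimal sketch, assuming the `ts: ..., inner:N, outer:N` line format shown in the example output. The names `parse_samples` and `summarize` are hypothetical helpers, not part of hwlatdetect.

```python
import re

# Matches sample lines such as:
#   ts: 1519674281.220664736, inner:17, outer:15
SAMPLE_RE = re.compile(r"ts:\s*(\d+\.\d+),\s*inner:\s*(\d+),\s*outer:\s*(\d+)")


def parse_samples(report):
    """Return (timestamp, inner_us, outer_us) for every sample line found."""
    return [
        (float(ts), int(inner), int(outer))
        for ts, inner, outer in SAMPLE_RE.findall(report)
    ]


def summarize(samples):
    """Report the sample count and the smallest and largest gap seen."""
    gaps = [gap for _, inner, outer in samples for gap in (inner, outer)]
    if not gaps:
        return {"count": 0, "min_us": None, "max_us": None}
    return {"count": len(samples), "min_us": min(gaps), "max_us": max(gaps)}


# First three sample lines from the example output above.
report = """\
ts: 1519674281.220664736, inner:17, outer:15
ts: 1519674282.721666674, inner:18, outer:17
ts: 1519674283.722667966, inner:16, outer:17
"""
print(summarize(parse_samples(report)))
# {'count': 3, 'min_us': 15, 'max_us': 18}
```

The regular expression ignores every line that is not a sample line, so the whole report text can be passed in unchanged.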

Understanding the results

The following table lists the testing method, parameters, and results reported by the hwlatdetect utility, and describes each value to help you interpret the detected latencies.

Table 6.1. Testing method, parameters, and results

Parameter                    Value       Description

test duration                10 seconds  The duration of the test in seconds.
detector                     tracer      The utility that runs the detector thread.

parameters
Latency threshold            10us        The maximum allowable latency.
Sample window                1000000us   1 second.
Sample width                 500000us    0.5 seconds.
Non-sampling period          500000us    0.5 seconds.
Output File                  None        The file to which the output is saved.

Results
Max Latency                  18us        The highest latency during the test that exceeded the Latency threshold. If no sample exceeded the Latency threshold, the report shows Below threshold.
Samples recorded             10          The number of samples recorded by the test.
Samples exceeding threshold  10          The number of samples recorded by the test where the latency exceeded the Latency threshold.
SMIs during run              0           The number of System Management Interrupts (SMIs) that occurred during the test run.

Note

The values that the hwlatdetect utility prints for inner and outer are maximum latency values. They are the deltas between consecutive reads of the current system clocksource (usually the TSC register, but potentially the HPET or ACPI power management clock), and they reflect any delays between those reads that the hardware-firmware combination introduced.
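
The idea of measuring deltas between consecutive clock reads can be illustrated with a small user-space loop. This sketch is not how hwlatdetect works: the real detector runs in the kernel tracer, while this loop is subject to interpreter overhead and scheduler preemption, so its numbers are not comparable to hwlatdetect results. The definitions used here are an assumption based on the usual description of the kernel hardware latency tracer: inner is the gap between two back-to-back reads, and outer is the gap between the last read of one iteration and the first read of the next. The function name `sample_gaps` is hypothetical.

```python
import time


def sample_gaps(iterations, threshold_ns):
    """Read the clock twice per iteration and record gaps above threshold_ns.

    inner: time between the two back-to-back reads in one iteration.
    outer: time between the last read of the previous iteration and the
           first read of this one (0 for the very first iteration).
    """
    hits = []
    last_t2 = None
    for _ in range(iterations):
        t1 = time.perf_counter_ns()
        t2 = time.perf_counter_ns()
        inner = t2 - t1
        outer = t1 - last_t2 if last_t2 is not None else 0
        last_t2 = t2
        if inner > threshold_ns or outer > threshold_ns:
            hits.append((inner, outer))
    return hits


# Count iterations where either gap exceeded 10 microseconds (10000 ns).
hits = sample_gaps(iterations=100_000, threshold_ns=10_000)
print(f"{len(hits)} gaps exceeded the threshold")
```

On a general-purpose system this loop mostly reports preemption by the scheduler rather than firmware activity, which is why the real test runs in the kernel.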

After you find a suitable hardware-firmware combination, the next step is to test the real-time performance of the system under load.