2.6. Using Hardware Clocks for System Timestamping

Multiprocessor systems such as NUMA or SMP have multiple instances of hardware clocks. During boot time the kernel discovers the available clock sources and selects one to use. For the list of the available clock sources in your system, view the /sys/devices/system/clocksource/clocksource0/available_clocksource file:
~]# cat /sys/devices/system/clocksource/clocksource0/available_clocksource
tsc hpet acpi_pm
In the example above, the TSC, HPET and ACPI_PM clock sources are available.
The clock source currently in use can be inspected by reading the /sys/devices/system/clocksource/clocksource0/current_clocksource file:
~]# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc
Changing Clock Sources

Sometimes the best-performing clock for a system's main application is not used due to known problems on the clock. After ruling out all problematic clocks, the system can be left with a hardware clock that is unable to satisfy the minimum requirements of a real-time system.

Requirements for crucial applications vary on each system. Therefore, the best clock for each application, and consequently each system, also varies. Some applications depend on clock resolution, and a clock that delivers reliable nanoseconds readings can be more suitable. Applications that read the clock too often can benefit from a clock with a smaller reading cost (the time between a read request and the result).
In all these cases it is possible to override the clock selected by the kernel, provided that you understand the side effects of this override and can create an environment which will not trigger the known shortcomings of the given hardware clock. To do so, select a clock source from the list presented in the /sys/devices/system/clocksource/clocksource0/available_clocksource file and write the clock's name into the /sys/devices/system/clocksource/clocksource0/current_clocksource file. For example, the following command sets HPET as the clock source in use:
~]# echo hpet > /sys/devices/system/clocksource/clocksource0/current_clocksource

Note

For a brief description of widely used hardware clocks, and to compare the performance between different hardware clocks, see the Red Hat Enterprise Linux for Real Time Reference guide for Red Hat Enterprise Linux for Real Time.
Configuring Additional Boot Parameters for the TSC Clock

While there is no single clock which is ideal for all systems, TSC is generally the preferred clock source. To optimize the reliability of the TSC clock, you can configure additional parameters when booting the kernel, for example:

  • idle=poll: Forces the clock to avoid entering the idle state.
  • processor.max_cstate=1: Prevents the clock from entering deeper C-states (energy saving mode), so it does not become out of sync.
Note however that in both cases there will be an increase in energy consumption, as the system will always run at top speed.
Controlling Power Management Transitions

Modern processors actively transition to higher power saving states (C-states) from lower states. Unfortunately, transitioning from a high power saving state back to a running state can consume more time than is optimal for a real-time application. To prevent these transitions, an application can use the Power Management Quality of Service (PM QoS) interface.

With the PM QoS interface, the system can emulate the behavior of the idle=poll and processor.max_cstate=1 parameters (as listed in Configuring Additional Boot Parameters for the TSC Clock), but with a more fine-grained control of power saving states.
When an application holds the /dev/cpu_dma_latency file open, the PM QoS interface prevents the processor from entering deep sleep states, which cause unexpected latencies when they are being exited. When the file is closed, the system returns to a power-saving state.
  1. Open the /dev/cpu_dma_latency file. Keep the file descriptor open for the duration of the low-latency operation.
  2. Write a 32-bit number to it. This number represents a maximum response time in microseconds. For the fastest possible response time, use 0.
    An example /dev/cpu_dma_latency file is as follows:
    static int pm_qos_fd = -1;
    
    void start_low_latency(void)
    {
    	s32_t target = 0;
    
    	if (pm_qos_fd >= 0)
    		return;
    	pm_qos_fd = open("/dev/cpu_dma_latency", O_RDWR);
    	if (pm_qos_fd < 0) {
    	   fprintf(stderr, "Failed to open PM QOS file: %s",
    	           strerror(errno));
    	   exit(errno);
    	}
    	write(pm_qos_fd, &target, sizeof(target));
    }
    
    void stop_low_latency(void)
    {
    	if (pm_qos_fd >= 0)
    	   close(pm_qos_fd);
    }
    The application will first call start_low_latency(), perform the required latency-sensitive processing, then call stop_low_latency().
Related Manual Pages

For more information, or for further reading, the following book is related to the information given in this section.

  • Linux System Programming by Robert Love