Chapter 15. Timestamping

15.1. Hardware Clocks

Multiprocessor systems such as NUMA or SMP have multiple instances of clock sources. How these clocks interact with one another and how they react to system events, such as CPU frequency scaling or entering power-saving modes, determines whether they are suitable clock sources for the realtime kernel.
During boot time the kernel discovers the available clock sources and selects one to use. The preferred clock source is the Time Stamp Counter (TSC), but if it is not available the High Precision Event Timer (HPET) is the second best option. However, not all systems have HPET clocks and some HPET clocks can be unreliable.
In the absence of TSC and HPET, other options include the ACPI Power Management Timer (ACPI_PM), the Programmable Interval Timer (PIT) and the Real Time Clock (RTC). The last two options are either costly to read or have a low resolution (time granularity), therefore they are sub-optimal for the realtime kernel.
To list the clock sources available on your system, read the /sys/devices/system/clocksource/clocksource0/available_clocksource file:
~]# cat /sys/devices/system/clocksource/clocksource0/available_clocksource
tsc hpet acpi_pm
In the sample output above, the TSC, HPET and ACPI_PM clock sources are available.
The clock source currently in use can be inspected by reading the /sys/devices/system/clocksource/clocksource0/current_clocksource file:
~]# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc
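The same two files can also be read from a program. The following is a minimal sketch, not part of the original guide: the function names available_clocksources and current_clocksource are hypothetical, and the only assumption about the system is the sysfs layout shown in the commands above.

```python
from pathlib import Path

# sysfs directory used by the cat commands above.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def available_clocksources(base=CLOCKSOURCE_DIR):
    """Return the clock source names listed in available_clocksource."""
    return (Path(base) / "available_clocksource").read_text().split()


def current_clocksource(base=CLOCKSOURCE_DIR):
    """Return the name of the clock source currently in use."""
    return (Path(base) / "current_clocksource").read_text().strip()


if __name__ == "__main__":
    if CLOCKSOURCE_DIR.is_dir():
        print("available:", " ".join(available_clocksources()))
        print("current:  ", current_clocksource())
    else:
        print("no clocksource0 directory found under sysfs")
```

The base parameter exists only so that the functions can be pointed at a different directory, for example when testing against sample files.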
You can select a different clock source from the list in the /sys/devices/system/clocksource/clocksource0/available_clocksource file. To do so, write the name of the clock source into the /sys/devices/system/clocksource/clocksource0/current_clocksource file. For example, the following command sets HPET as the clock source in use:
~]# echo hpet > /sys/devices/system/clocksource/clocksource0/current_clocksource

Important

The kernel selects the best available clock source. Overriding the selected clock source is not recommended unless the implications are well understood.
While TSC is generally the preferred clock source, some of its hardware implementations may have shortcomings. For example, some TSC clocks can stop when the system goes to an idle state, or become out of sync when their CPUs enter deeper C-states (energy saving states) or perform speed- or frequency-scaling operations.
However, you can work around some of these TSC shortcomings by configuring additional kernel boot parameters. For instance, the idle=poll parameter keeps the processor polling instead of entering an idle state, and the processor.max_cstate=1 parameter prevents the processor from entering deeper C-states. Note, however, that both parameters increase energy consumption, because the processor never reaches its deeper power-saving states.
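The text does not describe how to tell whether a given TSC has these shortcomings. On x86 Linux systems, the flags line of /proc/cpuinfo commonly reports constant_tsc (the TSC rate does not change with frequency scaling) and nonstop_tsc (the TSC keeps counting in deep C-states). The sketch below is an illustration under that assumption; the helper name tsc_flags is hypothetical, and on other architectures the flags are simply reported as absent.

```python
def tsc_flags(cpuinfo_text):
    """Report which TSC-related flags appear on the first 'flags' line."""
    wanted = ("tsc", "constant_tsc", "nonstop_tsc")
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Everything after the colon is a space-separated list of flags.
            present = set(line.partition(":")[2].split())
            return {name: name in present for name in wanted}
    # No 'flags' line found (for example, on a non-x86 system).
    return {name: False for name in wanted}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""
    for name, found in tsc_flags(text).items():
        print(f"{name}: {'yes' if found else 'no'}")
```

A missing constant_tsc or nonstop_tsc flag is a hint, not proof, that the TSC may be affected by frequency scaling or deep C-states on that processor.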

Note

For a comprehensive list of clock sources, see the Timing Measurements chapter in Understanding the Linux Kernel by Daniel P. Bovet and Marco Cesati.

15.1.1. Reading Hardware Clock Sources

Reading from the TSC means reading a register from the processor. Reading from the HPET clock means reading a memory area. Reading from the TSC is faster, which provides a significant performance advantage when timestamping hundreds of thousands of messages per second.
Using a simple program that reads the current clock source 10,000,000 times in a row, it is possible to observe the duration required to read the clock sources available:

Example 15.1. Comparing the Cost of Reading Hardware Clock Sources

In this example, the clock source currently in use is TSC, as shown by the output of the cat command. The time command is used to view the duration required to read the clock source 10 million times:
~]# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc
~]# time ./clock_timing

	real	0m0.601s
	user	0m0.592s
	sys	0m0.002s
The clock source is changed to HPET to compare the duration required to generate 10 million timestamps:
~]# echo hpet > /sys/devices/system/clocksource/clocksource0/current_clocksource
~]# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
hpet
~]# time ./clock_timing

	real	0m12.263s
	user	0m12.197s
	sys	0m0.001s
The steps are repeated with the ACPI_PM clock source:
~]# echo acpi_pm > /sys/devices/system/clocksource/clocksource0/current_clocksource
~]# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
acpi_pm
~]# time ./clock_timing

	real	0m24.461s
	user	0m0.504s
	sys	0m23.776s
The time(1) man page provides detailed information on how to use the command and interpret its output. The example above uses the following categories:
  • real: The total elapsed time from program invocation until the process ends. real includes the user and sys times, and is usually larger than their sum. If the process is preempted by a higher-priority application, or interrupted by a system event such as a hardware interrupt (IRQ), the time spent waiting is also counted under real.
  • user: The time the process spent in user space, performing tasks that did not require kernel intervention.
  • sys: The time spent by the kernel performing tasks on behalf of the user process. These tasks include opening files, reading and writing to files or I/O ports, memory allocation, thread creation, and network-related activities.
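The three categories can also be observed from inside a process. The sketch below is an illustration only, not something the time command itself does: it uses Python's resource.getrusage to read the user and sys counters and time.perf_counter for elapsed time, and the helper names measure, busy_user, and busy_sys are hypothetical.

```python
import resource
import time


def measure(func):
    """Run func() and return (real, user, sys) seconds used by this process."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    start = time.perf_counter()
    func()
    real = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_SELF)
    user = after.ru_utime - before.ru_utime
    sys_ = after.ru_stime - before.ru_stime
    return real, user, sys_


def busy_user():
    # Pure computation with no kernel involvement: counted as user time.
    sum(i * i for i in range(2_000_000))


def busy_sys():
    # Each unbuffered read is a system call, so part of the time is counted as sys.
    with open("/dev/zero", "rb", buffering=0) as f:
        for _ in range(200_000):
            f.read(1)


if __name__ == "__main__":
    for name, func in (("user-bound", busy_user), ("sys-bound", busy_sys)):
        real, user, sys_ = measure(func)
        print(f"{name}: real {real:.3f}s  user {user:.3f}s  sys {sys_:.3f}s")
```

On a typical system the second workload reports a visibly larger sys share than the first, although the exact split depends on the machine and the interpreter.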
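As the results of Example 15.1, “Comparing the Cost of Reading Hardware Clock Sources” show, the clock sources rank, from cheapest to most expensive to read: TSC, HPET, ACPI_PM. Ten million reads took about 0.6 seconds with TSC (roughly 60 ns per read), 12.3 seconds with HPET (roughly 1.2 µs per read), and 24.5 seconds with ACPI_PM (roughly 2.4 µs per read), which makes HPET about 20 times and ACPI_PM about 40 times slower than TSC on this system. The difference comes from the higher overhead of accessing time values from the HPET and ACPI_PM timers; in the ACPI_PM run, nearly all of the time is reported under sys, that is, spent in the kernel.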
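The source of the clock_timing program is not listed in this section. As a rough stand-in, the following sketch assumes that such a program does nothing more than read the clock in a tight loop; the function name time_clock_reads and the use of CLOCK_MONOTONIC are assumptions made for illustration. Because the Python interpreter adds a fixed cost to every iteration, absolute figures will not match those in the example, but the relative difference between clock sources should still be visible.

```python
import time


def time_clock_reads(iterations=10_000_000):
    """Read CLOCK_MONOTONIC `iterations` times and return the elapsed seconds."""
    read = time.clock_gettime      # local aliases keep per-iteration overhead low
    clock = time.CLOCK_MONOTONIC
    start = time.perf_counter()
    for _ in range(iterations):
        read(clock)
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 1_000_000  # fewer reads than the 10 million in the text, to keep the run short
    elapsed = time_clock_reads(n)
    print(f"{n} reads in {elapsed:.3f}s ({elapsed / n * 1e9:.0f} ns per read)")
```

Running the script once for each clock source, after switching current_clocksource as shown in the example, gives a comparable per-read cost for each source.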