On RHEL 7, perf stat reports incorrect values when measuring performance counters
Issue
- When measuring the L1-dcache-loads performance counter with perf, the reported value varies depending on which other events are requested.
- In the example below, stalled-cycles* events are requested in addition to the L1-dcache* events, and L1-dcache-loads is reported as 0, which is less than L1-dcache-load-misses.
$ perf stat -e cycles,stalled-cycles-frontend,stalled-cycles-backend,cache-misses,cache-references,L1-dcache-load-misses,L1-dcache-loads ls -lR >/dev/null
Performance counter stats for 'ls -lR':
26,024,782,410 cycles
13,361,154,466 stalled-cycles-frontend # 51.34% frontend cycles idle
8,953,266,724 stalled-cycles-backend # 34.40% backend cycles idle
13,016,947 cache-misses # 6.923 % of all cache refs
188,033,023 cache-references
422,911,848 L1-dcache-load-misses # 0.00% of all L1-dcache hits
0 L1-dcache-loads
11.643342128 seconds time elapsed
- In the second example below, the stalled-cycles* events are not requested, and the L1-dcache statistics look plausible.
$ perf stat -e cycles,cache-misses,cache-references,L1-dcache-load-misses,L1-dcache-loads ls -lR >/dev/null
Performance counter stats for 'ls -lR':
16,372,131,296 cycles
4,779,769 cache-misses # 10.904 % of all cache refs
43,833,706 cache-references
226,938,713 L1-dcache-load-misses # 3.50% of all L1-dcache hits
6,477,653,079 L1-dcache-loads
4.540951230 seconds time elapsed
- The issue is that perf appears to lose counts for some events when more events are requested, as the first example shows.
- When some events are removed, as in the second example, perf reports sensible values for the events that were previously wrong.
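The symptom in the two examples above can be checked mechanically: a load count that is zero, or smaller than the load-miss count, is a sign that the run should not be trusted. The following is a minimal sketch of such a check, not part of perf. The function name check_l1_counts and its OK/SUSPECT/UNKNOWN messages are hypothetical, and it assumes the plain-text layout shown above, where the count is the first field and the event name is the second.

```shell
#!/bin/sh
# check_l1_counts (hypothetical helper, not part of perf):
# reads `perf stat` text on stdin and sanity-checks the L1-dcache counters.
# Assumes the layout shown above: "<count> <event-name> [# comment]".
check_l1_counts() {
    awk '
        # Strip thousands separators from the count before using it as a number.
        $2 == "L1-dcache-loads"       { v = $1; gsub(/,/, "", v); loads = v + 0; have_loads = 1 }
        $2 == "L1-dcache-load-misses" { v = $1; gsub(/,/, "", v); misses = v + 0; have_misses = 1 }
        END {
            if (!have_loads || !have_misses) {
                print "UNKNOWN: L1-dcache-loads or L1-dcache-load-misses not found"
                exit 2
            }
            # Zero loads, or fewer loads than load misses, matches the
            # symptom in the first example.
            if (loads <= 0 || loads < misses) {
                printf "SUSPECT: loads=%.0f load-misses=%.0f\n", loads, misses
                exit 1
            }
            # Otherwise report load misses as a percentage of loads.
            printf "OK: miss ratio %.2f%%\n", 100 * misses / loads
        }
    '
}

# Example use (perf stat writes its statistics to stderr):
#   perf stat -e L1-dcache-load-misses,L1-dcache-loads ls -lR 2>&1 >/dev/null | check_l1_counts
```

Fed the output of the first example, this sketch prints a SUSPECT line. Fed the second, it prints a miss ratio of 3.50%, which is 226,938,713 divided by 6,477,653,079 and matches the percentage perf itself annotates. The check only detects the symptom; it says nothing about the cause.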
Environment
- Red Hat Enterprise Linux 7
- perf