On RHEL 7, perf stat reports incorrect values when measuring performance counters
Issue
- When the L1-dcache-loads performance counter is measured with perf, the reported value varies depending on which other events are requested in the same run.
- In the example below, the stalled-cycles* events are requested in addition to the L1-dcache* events, and L1-dcache-loads is reported as 0. That cannot be right, because it is lower than the L1-dcache-load-misses count.
$ perf stat -e cycles,stalled-cycles-frontend,stalled-cycles-backend,cache-misses,cache-references,L1-dcache-load-misses,L1-dcache-loads ls -lR >/dev/null
Performance counter stats for 'ls -lR':
26,024,782,410 cycles
13,361,154,466 stalled-cycles-frontend # 51.34% frontend cycles idle
8,953,266,724 stalled-cycles-backend # 34.40% backend cycles idle
13,016,947 cache-misses # 6.923 % of all cache refs
188,033,023 cache-references
422,911,848 L1-dcache-load-misses # 0.00% of all L1-dcache hits
0 L1-dcache-loads
11.643342128 seconds time elapsed
- In the second example below, the stalled-cycles* events are not requested, and the L1-dcache statistics look correct.
$ perf stat -e cycles,cache-misses,cache-references,L1-dcache-load-misses,L1-dcache-loads ls -lR >/dev/null
Performance counter stats for 'ls -lR':
16,372,131,296 cycles
4,779,769 cache-misses # 10.904 % of all cache refs
43,833,706 cache-references
226,938,713 L1-dcache-load-misses # 3.50% of all L1-dcache hits
6,477,653,079 L1-dcache-loads
4.540951230 seconds time elapsed
- The problem is that perf loses counters it otherwise reports correctly once more events are requested (as seen in the first example).
- If some events are removed (second example), perf reports sensible values for the events that were previously wrong.
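The symptom described above can be checked automatically when perf stat output is collected from many runs. The following is only a minimal sketch in Python, not part of perf: the helper names `parse_perf_stat` and `l1_dcache_looks_wrong` are hypothetical, and it assumes the plain-text output layout shown in the examples (a comma-grouped count followed by the event name).

```python
import re

# Matches counter lines such as "422,911,848 L1-dcache-load-misses # 0.00% ...".
# Assumption: the plain-text layout shown in the examples above.
COUNTER_LINE = re.compile(r"^\s*([\d,]+)\s+([A-Za-z][\w-]*)")


def parse_perf_stat(text):
    """Return {event_name: count} parsed from plain-text perf stat output."""
    counts = {}
    for line in text.splitlines():
        match = COUNTER_LINE.match(line)
        if match:
            counts[match.group(2)] = int(match.group(1).replace(",", ""))
    return counts


def l1_dcache_looks_wrong(counts):
    """Return True when fewer L1-dcache loads than load misses are reported."""
    loads = counts.get("L1-dcache-loads")
    misses = counts.get("L1-dcache-load-misses")
    if loads is None or misses is None:
        return False  # not enough data to judge
    return loads < misses


# Two counter lines copied from the first example in this article.
sample = """422,911,848 L1-dcache-load-misses # 0.00% of all L1-dcache hits
0 L1-dcache-loads
"""
print(l1_dcache_looks_wrong(parse_perf_stat(sample)))  # prints True
```

A check of this kind only flags the inconsistency. It does not explain why the counter was lost or correct the reported value.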
Environment
- Red Hat Enterprise Linux 7
- perf
