Chapter 6. Interpreting the output of the pmd-stats-show command in Open vSwitch with DPDK
Use this section to interpret the output of the pmd-stats-show command (ovs-appctl dpif-netdev/pmd-stats-show) in Open vSwitch (OVS) with DPDK.
6.1. Symptom
The ovs-appctl dpif-netdev/pmd-stats-show command provides an inaccurate measurement. This is because the statistics it reports have been accumulating since the PMD was started, so they do not describe the current traffic alone.
6.2. Diagnosis
To obtain useful output, put the system into a steady state and reset the statistics that you want to measure:
# put system into steady state
ovs-appctl dpif-netdev/pmd-stats-clear
# wait <x> seconds
sleep <x>
ovs-appctl dpif-netdev/pmd-stats-show
Here’s an example of the output:
[root@overcloud-compute-0 ~]# ovs-appctl dpif-netdev/pmd-stats-clear && sleep 10 && ovs-appctl dpif-netdev/pmd-stats-show | egrep 'core_id (2|22):' -A9
pmd thread numa_id 0 core_id 22:
    emc hits:17461158
    megaflow hits:0
    avg. subtable lookups per hit:0.00
    miss:0
    lost:0
    polling cycles:4948219259 (25.81%)
    processing cycles:14220835107 (74.19%)
    avg cycles per packet: 1097.81 (19169054366/17461158)
    avg processing cycles per packet: 814.43 (14220835107/17461158)
--
pmd thread numa_id 0 core_id 2:
    emc hits:14874381
    megaflow hits:0
    avg. subtable lookups per hit:0.00
    miss:0
    lost:0
    polling cycles:5460724802 (29.10%)
    processing cycles:13305794333 (70.90%)
    avg cycles per packet: 1261.67 (18766519135/14874381)
    avg processing cycles per packet: 894.54 (13305794333/14874381)
Note that core_id 2 is mainly busy, spending 70% of the time processing and 30% of the time polling.
polling cycles:5460724802 (29.10%)
processing cycles:13305794333 (70.90%)
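The percentages in parentheses can be reproduced from the raw counters: each one is that counter's share of the sum of polling and processing cycles. The following is a minimal sketch of that arithmetic, assuming awk is available; the busy_pct function name is hypothetical and is not part of OVS.

```shell
# busy_pct is a hypothetical helper, not an OVS command.
# Usage: busy_pct <processing_cycles> <polling_cycles>
# Prints the share of cycles spent processing, as a percentage.
busy_pct() {
    awk -v proc="$1" -v poll="$2" 'BEGIN {
        total = proc + poll
        # No cycles counted yet: avoid dividing by zero.
        if (total == 0) { print "n/a"; exit }
        printf "%.2f\n", 100 * proc / total
    }'
}

# Counters for core_id 2 from the example output.
busy_pct 13305794333 5460724802   # prints 70.90
```

The result matches the 70.90% that pmd-stats-show reports for processing cycles on core_id 2.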
In this example, miss indicates packets that were not classified in the DPDK datapath ('emc' or 'dp' classifier). Under normal circumstances, they would then be sent to the ofproto layer. On rare occasions, due to a flow revalidation lock or if the ofproto layer returns an error, the packet is dropped. In this case, the value of lost will also be incremented to indicate the loss.
emc hits:14874381
megaflow hits:0
avg. subtable lookups per hit:0.00
miss:0
lost:0
For more information, see OVS-DPDK Datapath Classifier.
6.3. Solution
This section shows the procedures for viewing traffic flow using the ovs-appctl command.
6.3.1. Idle PMD
The following example shows a system where the core_ids serve the PMDs that are pinned to dpdk0, with only management traffic flowing through dpdk0:
[root@overcloud-compute-0 ~]# ovs-appctl dpif-netdev/pmd-stats-clear && sleep 10 && ovs-appctl dpif-netdev/pmd-stats-show | egrep 'core_id (2|22):' -A9
pmd thread numa_id 0 core_id 22:
    emc hits:0
    megaflow hits:0
    avg. subtable lookups per hit:0.00
    miss:0
    lost:0
    polling cycles:12613298746 (100.00%)
    processing cycles:0 (0.00%)
--
pmd thread numa_id 0 core_id 2:
    emc hits:5
    megaflow hits:0
    avg. subtable lookups per hit:0.00
    miss:0
    lost:0
    polling cycles:12480023709 (100.00%)
    processing cycles:14354 (0.00%)
    avg cycles per packet: 2496007612.60 (12480038063/5)
    avg processing cycles per packet: 2870.80 (14354/5)
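The very large avg cycles per packet value for core_id 2 comes from the way the figure is computed: the numerator includes polling cycles, so on an idle PMD almost all of it is time spent polling for packets. The sketch below reproduces the value from the counters in this example. The cycles_per_packet function name is hypothetical and is not part of OVS, and the zero-packet branch is an assumption made only to avoid a division by zero (the output for core_id 22 shows no averages at all).

```shell
# cycles_per_packet is a hypothetical helper, not an OVS command.
# Usage: cycles_per_packet <polling_cycles> <processing_cycles> <packets>
cycles_per_packet() {
    awk -v poll="$1" -v proc="$2" -v pkts="$3" 'BEGIN {
        # With no packets there is no meaningful average.
        if (pkts == 0) { print "n/a"; exit }
        printf "%.2f\n", (poll + proc) / pkts
    }'
}

cycles_per_packet 12480023709 14354 5   # core_id 2: prints 2496007612.60
cycles_per_packet 12613298746 0 0       # core_id 22: prints n/a
```

For a lightly loaded PMD, avg processing cycles per packet (2870.80 here) is therefore the more meaningful of the two averages.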
6.3.2. PMD under load test with packet drop
The following example shows a system where the core_ids serve the PMDs that are pinned to dpdk0, with a load test flowing through dpdk0, causing a high number of RX drops:
[root@overcloud-compute-0 ~]# ovs-appctl dpif-netdev/pmd-stats-clear && sleep 10 && ovs-appctl dpif-netdev/pmd-stats-show | egrep 'core_id (2|4|22|24):' -A9
pmd thread numa_id 0 core_id 22:
    emc hits:35497952
    megaflow hits:0
    avg. subtable lookups per hit:0.00
    miss:0
    lost:0
    polling cycles:1446658819 (6.61%)
    processing cycles:20453874401 (93.39%)
    avg cycles per packet: 616.95 (21900533220/35497952)
    avg processing cycles per packet: 576.20 (20453874401/35497952)
--
pmd thread numa_id 0 core_id 2:
    emc hits:30183582
    megaflow hits:0
    avg. subtable lookups per hit:0.00
    miss:2
    lost:0
    polling cycles:1497174615 (6.85%)
    processing cycles:20354613261 (93.15%)
    avg cycles per packet: 723.96 (21851787876/30183584)
    avg processing cycles per packet: 674.36 (20354613261/30183584)
Where packet drops occur, you can see a high ratio of processing cycles vs polling cycles (more than 90% processing cycles):
polling cycles:1497174615 (6.85%)
processing cycles:20354613261 (93.15%)
Check the average cycles per packet (CPP) and average processing cycles per packet (PCPP). You can expect a PCPP/CPP ratio of 1 for a fully loaded PMD as there will be no idle cycles counted.
avg cycles per packet: 723.96 (21851787876/30183584)
avg processing cycles per packet: 674.36 (20354613261/30183584)
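Because CPP and PCPP share the same packet count as their denominator, the PCPP/CPP ratio equals the fraction of cycles spent processing. The following sketch computes the ratio from the two averages. The pcpp_cpp_ratio function name is hypothetical and is not part of OVS.

```shell
# pcpp_cpp_ratio is a hypothetical helper, not an OVS command.
# Usage: pcpp_cpp_ratio <avg_processing_cycles_per_packet> <avg_cycles_per_packet>
pcpp_cpp_ratio() {
    awk -v pcpp="$1" -v cpp="$2" 'BEGIN {
        if (cpp == 0) { print "n/a"; exit }
        printf "%.2f\n", pcpp / cpp
    }'
}

# core_id 2 under the load test with packet drops.
pcpp_cpp_ratio 674.36 723.96    # prints 0.93
# core_id 2 from the example output in section 6.2, for comparison.
pcpp_cpp_ratio 894.54 1261.67   # prints 0.71
```

A ratio of 0.93 is consistent with the 93.15% processing cycles reported for the same core.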
6.3.3. PMD under load test with 50% of Mpps capacity
The following example shows a system where the core_ids serve the PMDs that are pinned to dpdk0, with a load test flowing through dpdk0 at 6.4 Mpps, which is around 50% of the maximum capacity of this dpdk0 interface (around 12.85 Mpps):
[root@overcloud-compute-0 ~]# ovs-appctl dpif-netdev/pmd-stats-clear && sleep 10 && ovs-appctl dpif-netdev/pmd-stats-show | egrep 'core_id (2|4|22|24):' -A9
pmd thread numa_id 0 core_id 22:
    emc hits:17461158
    megaflow hits:0
    avg. subtable lookups per hit:0.00
    miss:0
    lost:0
    polling cycles:4948219259 (25.81%)
    processing cycles:14220835107 (74.19%)
    avg cycles per packet: 1097.81 (19169054366/17461158)
    avg processing cycles per packet: 814.43 (14220835107/17461158)
--
pmd thread numa_id 0 core_id 2:
    emc hits:14874381
    megaflow hits:0
    avg. subtable lookups per hit:0.00
    miss:0
    lost:0
    polling cycles:5460724802 (29.10%)
    processing cycles:13305794333 (70.90%)
    avg cycles per packet: 1261.67 (18766519135/14874381)
    avg processing cycles per packet: 894.54 (13305794333/14874381)
Where the pps are about half of the maximum for the interface, you can see a lower ratio of processing cycles vs polling cycles (approximately 70% processing cycles):
polling cycles:5460724802 (29.10%)
processing cycles:13305794333 (70.90%)
6.3.4. Hit vs miss vs lost
The following excerpts from the man pages describe these counters:
man ovs-vswitchd
(...)
DPIF-NETDEV COMMANDS
    These commands are used to expose internal information (mostly statistics) about the `dpif-netdev` userspace datapath. If there is only one datapath (as is often the case, unless dpctl/ commands are used), the dp argument can be omitted.

    dpif-netdev/pmd-stats-show [dp]
        Shows performance statistics for each pmd thread of the datapath dp. The special thread ``main'' sums up the statistics of every non pmd thread. The sum of ``emc hits'', ``masked hits'' and ``miss'' is the number of packets received by the datapath. Cycles are counted using the TSC or similar facilities when available on the platform. To reset these counters use dpif-netdev/pmd-stats-clear. The duration of one cycle depends on the measuring infrastructure.
(...)

man ovs-dpctl
(...)
    dump-dps
        Prints the name of each configured datapath on a separate line.

    [-s | --statistics] show [dp...]
        Prints a summary of configured datapaths, including their datapath numbers and a list of ports connected to each datapath. (The local port is identified as port 0.) If -s or --statistics is specified, then packet and byte counters are also printed for each port.

        The datapath numbers consists of flow stats and mega flow mask stats.

        The "lookups" row displays three stats related to flow lookup triggered by processing incoming packets in the datapath. "hit" displays number of packets matches existing flows. "missed" displays the number of packets not matching any existing flow and require user space processing. "lost" displays number of packets destined for user space process but subsequently dropped before reaching userspace. The sum of "hit" and "miss" equals to the total number of packets datapath processed.
(...)

man ovs-vswitchd
(...)
    dpctl/show [-s | --statistics] [dp...]
        Prints a summary of configured datapaths, including their datapath numbers and a list of ports connected to each datapath. (The local port is identified as port 0.) If -s or --statistics is specified, then packet and byte counters are also printed for each port.

        The datapath numbers consists of flow stats and mega flow mask stats.

        The "lookups" row displays three stats related to flow lookup triggered by processing incoming packets in the datapath. "hit" displays number of packets matches existing flows. "missed" displays the number of packets not matching any existing flow and require user space processing. "lost" displays number of packets destined for user space process but subsequently dropped before reaching userspace. The sum of "hit" and "miss" equals to the total number of packets datapath processed.
(...)
Some of the documentation refers to the kernel datapath, so when it says "user space processing" it means that the packet was not classified in the kernel software caches (the equivalents of emc and dpcls) and was sent to the ofproto layer in user space.
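The ovs-vswitchd man page states that the sum of emc hits, masked hits, and miss is the number of packets received by the datapath. The following sketch applies that rule to pmd-stats-show output read from standard input. It assumes the layout shown in the examples in this chapter (a header line ending in a colon, followed by name:value lines), and it accepts both "megaflow hits", as printed in those examples, and "masked hits", as named in the man page. The pmd_packet_totals function name is hypothetical and is not part of OVS.

```shell
# pmd_packet_totals is a hypothetical helper, not an OVS command.
# Reads pmd-stats-show output on stdin and prints, for each thread,
# the packets received (hits + miss) and the lost counter.
pmd_packet_totals() {
    awk -F: '
        function report() {
            if (label != "") printf "%s: packets=%d lost=%d\n", label, packets, lost
        }
        # A line ending in ":" starts a new thread block.
        /:[ \t]*$/ { report(); label = $1; packets = 0; lost = 0; next }
        {
            key = $1
            sub(/^[ \t]+/, "", key)   # strip the indentation of counter lines
        }
        key == "emc hits" || key == "megaflow hits" || key == "masked hits" || key == "miss" { packets += $2 }
        key == "lost" { lost = $2 }
        END { report() }
    '
}

# Counters for core_id 2 from the load test in section 6.3.2.
pmd_packet_totals <<'EOF'
pmd thread numa_id 0 core_id 2:
    emc hits:30183582
    megaflow hits:0
    avg. subtable lookups per hit:0.00
    miss:2
    lost:0
EOF
```

For this input the total is 30183584, which is the packet count that appears as the denominator of avg cycles per packet for the same core. On a live system the function would be fed from the ovs-appctl dpif-netdev/pmd-stats-show command instead of the inline sample.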