RHEL6.4 Apparent reduced PCIe performance in comparison to RHEL5.9

Solution Verified - Updated -

Environment

- Red Hat Enterprise Linux (RHEL) 6.4 (kernel 2.6.32-358)
- Customer-written PCIe driver, application, and tester program.
- Xilinx PCIe card

Issue

  • The bandwidth of a PCIe card dropped dramatically (to roughly 1/20) in RHEL6 compared with that seen in RHEL5.

Resolution

  • There is no performance drop in RHEL6. The RHEL5 figures measured with memmap= were not a valid basis for comparison, because the interface and mechanism were not working correctly in RHEL5.

  • There is a solution that allows the programmer to select a cachable option in RHEL6:

    • Allocate the memory area dynamically and map it to the PCI device as a shared memory area.
    • That is, memory allocated via pci_alloc_consistent() or other DMA APIs can be allocated as cachable (or uncachable if required) and then transferred via an IOMMU that supports a cache coherency protocol with the CPUs (if your hardware supports it).
  • To correct the problem in RHEL5, make the memmap'd area uncachable. This is not necessary in RHEL6, where the PCD (Page Cache Disabled) flag is now set by default in the page table for this mapped area.

    • Testing validated that the data in RHEL5 often miscompared and was erroneous.
    • This was the main reason the community made the memmap= assigned area uncachable: to correct the cache mismatch and data corruption conditions that could occur.

Root Cause

  • In RHEL5, memory reserved with a memmap= parameter on the kernel line in grub was previously assigned as cachable.

  • In RHEL6, memmap reserved areas are now assigned as uncachable using the PCD (Page Cache Disabled) flag for each page assigned to the area.

  • This meant that in RHEL5, the static area was being cached even though the PCIe card was not cache coherent with the CPU.

  • A cached memory area that is remapped for communication with a device can be problematic. If the device writes a value to the memory-mapped area and the CPU then reads it, any value cached from a previous read will mask the device update.

  • Therefore, in RHEL6 (2.6.32), it is our belief that it is now working as designed when using memmap= statically reserved memory areas.
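
To confirm which static areas a system was booted with, the memmap= arguments can be pulled out of the kernel command line. The following is a minimal sketch, not part of the original investigation. The function name list_memmap_args and the sample command line (including the 80M$256M value) are hypothetical and shown only for illustration.

```shell
#!/bin/bash
# Print every memmap= argument found in a kernel command line string.
# With no argument, the running kernel's /proc/cmdline is used.
list_memmap_args() {
    local cmdline="${1:-$(cat /proc/cmdline)}"
    local -a args
    local arg
    # Split the command line into whitespace-separated words.
    read -r -a args <<< "$cmdline"
    for arg in "${args[@]}"; do
        case "$arg" in
            memmap=*) printf '%s\n' "$arg" ;;
        esac
    done
}

# Hypothetical command line, for illustration only.
list_memmap_args 'ro root=/dev/sda1 memmap=80M$256M quiet'
```

Calling list_memmap_args with no argument inspects the running system instead of the sample string.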

NOTE. The addresses in the following example are specific to the environment under test. The script and output are shown only for clarification. Check the contents of /proc/mtrr on your own system before you start modifying it.
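
Because register numbers differ between systems, it may help to look up which register covers a given base address before writing any disable= line. The sketch below assumes the /proc/mtrr line format shown in the output further down. The function name mtrr_reg_for_base is hypothetical, and the sample lines are taken from the environment under test.

```shell
#!/bin/bash
# Read /proc/mtrr-style lines on stdin and print the register number
# whose base address equals the one given as $1.
# Returns non-zero if no register has that base.
mtrr_reg_for_base() {
    local want=$(( $1 ))
    local re='^reg([0-9]+): base=(0x[0-9a-fA-F]+)'
    local line
    while IFS= read -r line; do
        [[ $line =~ $re ]] || continue
        if (( ${BASH_REMATCH[2]} == want )); then
            # 10# forces decimal so that "08" is not read as octal.
            echo $(( 10#${BASH_REMATCH[1]} ))
            return 0
        fi
    done
    return 1
}

# Sample input copied from the environment under test.
mtrr_reg_for_base 0xae000000 <<'EOF'
reg03: base=0xa40000000 (41984MB), size= 256MB: write-back, count=1
reg04: base=0xae000000 (2784MB), size=  32MB: uncachable, count=1
EOF
```

On a live system the same function could be fed from the real file, for example mtrr_reg_for_base 0xae000000 < /proc/mtrr.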

configmem.scr

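#!/bin/bash
# Show the MTRR layout before any changes.
echo /proc/mtrr
cat /proc/mtrr
echo
# Remove the three existing uncachable entries, highest register first.
echo "disable=6" > /proc/mtrr
echo "disable=5" > /proc/mtrr
echo "disable=4" > /proc/mtrr
# Show the layout after the deletions.
echo /proc/mtrr
cat /proc/mtrr
echo
# Mark an 80MB contiguous area uncachable (64MB + 16MB, starting at 256MB).
echo "base=0x010000000 size=0x004000000 type=uncachable" > /proc/mtrr
echo "base=0x014000000 size=0x001000000 type=uncachable" > /proc/mtrr
# Add back the three uncachable entries removed above.
echo "base=0x0ae000000 size=0x002000000 type=uncachable" > /proc/mtrr
echo "base=0x0b0000000 size=0x010000000 type=uncachable" > /proc/mtrr
echo "base=0x0c0000000 size=0x040000000 type=uncachable" > /proc/mtrr
# Show the final layout.
echo /proc/mtrr
cat /proc/mtrr
echo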

Output from running configmem.scr

# ./configmem.scr
/proc/mtrr           <===== Displaying content before any changes
reg00: base=0x00000000 (   0MB), size=32768MB: write-back, count=1
reg01: base=0x800000000 (32768MB), size=8192MB: write-back, count=1
reg02: base=0xa00000000 (40960MB), size=1024MB: write-back, count=1
reg03: base=0xa40000000 (41984MB), size= 256MB: write-back, count=1
reg04: base=0xae000000 (2784MB), size=  32MB: uncachable, count=1
reg05: base=0xb0000000 (2816MB), size= 256MB: uncachable, count=1
reg06: base=0xc0000000 (3072MB), size=1024MB: uncachable, count=1

/proc/mtrr           <===== Displaying content after deleting some entries
reg00: base=0x00000000 (   0MB), size=32768MB: write-back, count=1
reg01: base=0x800000000 (32768MB), size=8192MB: write-back, count=1
reg02: base=0xa00000000 (40960MB), size=1024MB: write-back, count=1
reg03: base=0xa40000000 (41984MB), size= 256MB: write-back, count=1
<===== reg04, 05, 06 deleted

/proc/mtrr           <===== Displaying content after making changes
reg00: base=0x00000000 (   0MB), size=32768MB: write-back, count=1
reg01: base=0x800000000 (32768MB), size=8192MB: write-back, count=1
reg02: base=0xa00000000 (40960MB), size=1024MB: write-back, count=1
reg03: base=0xa40000000 (41984MB), size= 256MB: write-back, count=1
reg04: base=0x10000000 ( 256MB), size=  64MB: uncachable, count=1  \  These 2 make an 80MB
reg05: base=0x14000000 ( 320MB), size=  16MB: uncachable, count=1  /  contiguous area
reg06: base=0xae000000 (2784MB), size=  32MB: uncachable, count=1  \  
reg07: base=0xb0000000 (2816MB), size= 256MB: uncachable, count=1   } Added back in the 
reg08: base=0xc0000000 (3072MB), size=1024MB: uncachable, count=1  /  3 other areas deleted

NOTE: In the last section of output above it notes "Added back in the 3 other areas deleted". This was done solely to keep the list tidy so that all the uncachable areas were sequential.
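
The 80MB area is entered as two registers (64MB and 16MB) rather than one. A likely reason is that x86 variable-range MTRRs require a power-of-two size with the base aligned to that size, which 80MB does not satisfy. The sketch below checks the values from the script against that rule. The function names mtrr_region_ok and region_end are hypothetical.

```shell
#!/bin/bash
# Succeed if base ($1) and size ($2) form a valid variable-range MTRR
# region: size is a power of two and base is aligned to size.
mtrr_region_ok() {
    local base=$(( $1 )) size=$(( $2 ))
    (( size > 0 )) || return 1
    (( (size & (size - 1)) == 0 )) || return 1   # power-of-two size
    (( (base & (size - 1)) == 0 ))               # base aligned to size
}

# Print the first address past the end of a region, in hex.
region_end() {
    printf '0x%x\n' $(( $1 + $2 ))
}

mtrr_region_ok 0x010000000 0x004000000 && echo "64MB region at 256MB: ok"
mtrr_region_ok 0x014000000 0x001000000 && echo "16MB region at 320MB: ok"
mtrr_region_ok 0x010000000 0x005000000 || echo "single 80MB region: not valid"

# The 64MB region ends where the 16MB region starts, so the two are contiguous.
region_end 0x010000000 0x004000000
```

The last call prints 0x14000000, which is the base of the 16MB entry in the script.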

Diagnostic Steps

  • In RHEL5.9 (2.6.18), I was able to toggle the memmap'd reserved area between uncachable and write-back, and ran numerous tests in each mode.

  • Running in the default environment with write-back set, the RHEL5.9 numbers are identical to each other: both 'dma' (dynamically assigned memory areas) and 'static' (statically assigned using memmap= in grub) reported 96.88% throughput.

  • With the memmap'd area toggled to uncachable, the bandwidth figures mirror those produced by RHEL6.4: 'dma'=96.88%, 'static'=4.56%.

    • To be clear, with data checking enabled in the PCIe interface tester (fibTester), the RHEL5 results clearly pointed to a caching problem:

    • In RHEL5.9 with the memmap'd area marked uncachable, NO data mismatches are reported.

    • In RHEL5.9 with the memmap'd area marked write-back, it streams mismatches.
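
The two throughput figures above can be related to the roughly 1/20 drop reported in the Issue section with a quick calculation. This is only an arithmetic sketch. The function name throughput_ratio is hypothetical.

```shell
#!/bin/bash
# Print how many times larger the first throughput figure is than the second.
throughput_ratio() {
    awk -v hi="$1" -v lo="$2" 'BEGIN { printf "%.1f\n", hi / lo }'
}

# 'dma' (96.88%) versus uncachable 'static' (4.56%)
throughput_ratio 96.88 4.56
```

This prints 21.2, which is consistent with the reported reduction to about 1/20.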

