Chapter 28. Persistent Memory: NVDIMMs

Persistent memory (pmem), also called storage class memory, is a combination of memory and storage. pmem combines the durability of storage with the low access latency and high bandwidth of dynamic RAM (DRAM):
  • Persistent memory is byte-addressable, so it can be accessed by using CPU load and store instructions. In addition to the read() and write() system calls that are required for accessing traditional block-based storage, pmem also supports a direct load and store programming model.
  • The performance characteristics of persistent memory are similar to those of DRAM, with very low access latency, typically in the tens to hundreds of nanoseconds.
  • As with storage, the contents of persistent memory are preserved when the power is off.

Using persistent memory is beneficial in use cases such as the following:

Rapid start: the data set is already in memory.
Rapid start is also called the warm cache effect. A file server has none of the file contents in memory after starting. As clients connect and read and write data, that data is cached in the page cache. Eventually, the cache contains mostly hot data. After a reboot, the system must start the process again.
Persistent memory allows an application to keep the warm cache across reboots if the application is designed properly. In this instance, there would be no page cache involved: the application would cache data directly in the persistent memory.
Fast write cache
File servers often do not acknowledge a client's write request until the data is on durable media. Using persistent memory as a fast write cache enables a file server to acknowledge the write request quickly thanks to the low latency of pmem.

NVDIMM Interleaving

Non-Volatile Dual In-line Memory Modules (NVDIMMs) can be grouped into interleave sets in the same way as regular DRAM. An interleave set is like a RAID 0 (stripe) across multiple DIMMs.
NVDIMM interleaving has the following advantages:
  • Like DRAM, NVDIMMs benefit from increased performance when they are configured into interleave sets.
  • Interleaving can be used to combine multiple smaller NVDIMMs into one larger logical device.
Use the system BIOS or UEFI firmware to configure interleave sets.
In Linux, one region device is created per interleave set.
The relationship between region devices and labels is as follows:
  • If your NVDIMMs support labels, the region device can be further subdivided into namespaces.
  • If your NVDIMMs do not support labels, each region device can contain only a single namespace. In this case, the kernel creates a default namespace that covers the entire region.
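
To see how the NVDIMMs in your system are grouped into regions before you configure namespaces, you can list both object types with ndctl. This is a minimal sketch; the output, which is omitted here, varies by platform:
# ndctl list --dimms --regions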

Persistent Memory Access Modes

You can use persistent memory devices in sector, fsdax, devdax (device direct access), or raw mode:
sector mode
Sector mode presents the storage as a fast block device. It is useful for legacy applications that have not been modified to use persistent memory, or for applications that make use of the full I/O stack, including Device Mapper. An example of creating a sector-mode namespace follows this list.
fsdax mode
Fsdax mode enables persistent memory devices to support direct access programming, as described in the Storage Networking Industry Association (SNIA) Non-Volatile Memory (NVM) Programming Model specification. In this mode, I/O bypasses the kernel's storage stack, and many Device Mapper drivers therefore cannot be used.
devdax mode
The devdax (device DAX) mode provides raw access to persistent memory by using a DAX character device node. Data on a devdax device can be made durable by using CPU cache flushing and fencing instructions. Certain databases and virtual machine hypervisors might benefit from devdax mode. File systems cannot be created on devdax instances.
raw mode
Raw mode namespaces have several limitations and should not be used.
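
As referenced in the sector mode description above, the following is a minimal sketch of creating a sector-mode namespace by reconfiguring an existing one; the namespace name namespace1.0 is a placeholder for a namespace that is present on your system:
# ndctl create-namespace --force --reconfig=namespace1.0 --mode=sector
The resulting block device appears with an s suffix, for example /dev/pmem1s.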

28.1. Configuring Persistent Memory with ndctl

Use the ndctl utility to configure persistent memory devices. To install the ndctl utility, use the following command:
# yum install ndctl
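To confirm that the utility is installed, you can print its version:
# ndctl --version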

Procedure 28.1. Configuring Persistent Memory for a device that does not support labels

  1. List the available pmem regions on your system. In the following example, the command lists an NVDIMM-N device that does not support labels:
    # ndctl list --regions
    [
      {
        "dev":"region1",
        "size":34359738368,
        "available_size":0,
        "type":"pmem"
      },
      {
        "dev":"region0",
        "size":34359738368,
        "available_size":0,
        "type":"pmem"
      }
    ]
    
    Because the NVDIMM-N device here does not support labels, Red Hat Enterprise Linux creates a default namespace for each region; hence, the available size is 0 bytes.
  2. List all the inactive namespaces on your system:
    # ndctl list --namespaces --idle
    [
      {
        "dev":"namespace1.0",
        "mode":"raw",
        "size":34359738368,
        "state":"disabled",
        "numa_node":1
      },
      {
        "dev":"namespace0.0",
        "mode":"raw",
        "size":34359738368,
        "state":"disabled",
        "numa_node":0
      }
    ]
    
  3. Reconfigure the inactive namespaces to make use of this space. For example, to use namespace0.0 for a file system that supports DAX, use the following command:
    # ndctl create-namespace --force --reconfig=namespace0.0 --mode=fsdax --map=mem 
    {
      "dev":"namespace0.0",
      "mode":"fsdax",
      "size":"32.00 GiB (34.36 GB)",
      "uuid":"ab91cc8f-4c3e-482e-a86f-78d177ac655d",
      "blockdev":"pmem0",
      "numa_node":0
    }
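
  4. Optionally, create a file system on the resulting /dev/pmem0 block device and mount it with the dax option so that applications can access the persistent memory directly. This is a minimal sketch, assuming an XFS file system and a /mnt/pmem0 mount point:
    # mkfs.xfs /dev/pmem0
    # mkdir -p /mnt/pmem0
    # mount -o dax /dev/pmem0 /mnt/pmem0

    On newer xfsprogs versions, where reflink is enabled by default, create the file system with mkfs.xfs -m reflink=0, because reflink and DAX are incompatible.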
    

Procedure 28.2. Configuring Persistent Memory for a device that supports labels

  1. List the available pmem regions on your system. In the following example, the command lists an NVDIMM-N device that supports labels:
    # ndctl list --regions
    [
      {
        "dev":"region5",
        "size":270582939648,
        "available_size":270582939648,
        "type":"pmem",
        "iset_id":-7337419320239190016
      },
      {
        "dev":"region4",
        "size":270582939648,
        "available_size":270582939648,
        "type":"pmem",
        "iset_id":-137289417188962304
      }
    ]
    
  2. If the NVDIMM device supports labels, default namespaces are not created, and you can allocate one or more namespaces from a region without using the --force or --reconfig flags:
    # ndctl create-namespace --region=region4 --mode=fsdax --map=dev --size=36G
    {
      "dev":"namespace4.0",
      "mode":"fsdax",
      "size":"35.44 GiB (38.05 GB)",
      "uuid":"9c5330b5-dc90-4f7a-bccd-5b558fa881fe",
      "blockdev":"pmem4",
      "numa_node":0
    }
    
    Now, you can create another namespace from the same region:
    # ndctl create-namespace --region=region4 --mode=fsdax --map=dev --size=36G
    {
      "dev":"namespace4.1",
      "mode":"fsdax",
      "size":"35.44 GiB (38.05 GB)",
      "uuid":"91868e21-830c-4b8f-a472-353bf482a26d",
      "blockdev":"pmem4.1",
      "numa_node":0
    }
    
    You can also create namespaces of different modes in the same region, using the following command:
    # ndctl create-namespace --region=region4 --mode=devdax --align=2M --size=36G
    {
      "dev":"namespace4.2",
      "mode":"devdax",
      "size":"35.44 GiB (38.05 GB)",
      "uuid":"a188c847-4153-4477-81bb-7143e32ffc5c",
      "daxregion":
      {
        "id":4,
        "size":"35.44 GiB (38.05 GB)",
        "align":2097152,
        "devices":[
          {
            "chardev":"dax4.2",
            "size":"35.44 GiB (38.05 GB)"
          }]
      },
        "numa_node":0
    }
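
To verify the namespaces created in this procedure, you can list the active namespaces in the region; the output is similar to the JSON shown above:
# ndctl list --region=region4 --namespaces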
    
For more information on the ndctl utility, see man ndctl.