2.2. Using mlock to Avoid Page I/O

The mlock and mlockall system calls tell the system to lock to a specified memory range, and to not allow that memory to be paged. This means that once the physical page has been allocated to the page table entry, references to that page will always be fast.
There are two groups of mlock system calls available. The mlock and munlock calls lock and unlock a specific range of addresses. The mlockall and munlockall calls lock or unlock the entire program space.
Examine the use of mlock carefully and exercise caution. If the application is large, or if it has a large data domain, the mlock calls can cause thrashing if the system cannot allocate memory for other tasks.

Note

Always use mlock with care. Using it excessively can lead to an out of memory (OOM) error. Do not put an mlockall call at the start of your application. It is recommended that only the data and text of the realtime portion of the application be locked.
Use of mlock will not guarantee that the program will experience no page I/O. It is used to ensure that the data will stay in memory, but cannot ensure that it will stay in the same page. Other functions such as move_pages and memory compactors can move data around despite the use of mlock.

Important

Unprivileged users must have the CAP_IPC_LOCK capability in order to be able to use mlockall or mlock on large buffers. See the capabilities(7) man page for details.
Moreover, it is worth noting that memory locks are made on a page basis, and do not stack. It means that if two dynamically allocated memory segments share the same page locked twice by calls to mlock or mlockall, they will be unlocked by a single call to munlock for the corresponding page, or by munlockall. Thus, the application must be aware of which pages it is unlocking in order to prevent this double-lock/single-unlock problem.
The two most common alternatives to mitigate the double-lock/single-unlock problem are:
  • Tracking the memory areas allocated and locked, and creating a wrapper function that, before unlocking a page, verifies how many users (allocations) that page has. This is the resource counting principle used in device drivers.
  • Performing allocations considering the page size and alignment, in order to prevent a double-lock in the same page.
The code examples below represent the second alternative.
The best way to use mlock depends on the application's needs and system resources. Although there is no single solution for all the applications, the following code example can be used as a starting point for the implementation of a function that will allocate and lock memory buffers.

Example 2.2. Using mlock in an Application

#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

void *
alloc_workbuf(size_t size)
{
 void *ptr;
 int retval;
 /*
  * alloc memory aligned to a page, to prevent two mlock() in the
  * same page.
  */
 retval = posix_memalign(&ptr, (size_t) sysconf(_SC_PAGESIZE), size);

 /* return NULL on failure */
 if (retval)
  return NULL;

 /* lock this buffer into RAM */
 if (mlock(ptr, size)) {
  free(ptr);
  return NULL;
 }
 return ptr;
}

void
free_workbuf(void *ptr, size_t size)
{
 /* unlock the address range */
 munlock(ptr, size);

 /* free the memory */
 free(ptr);
}
The function alloc_workbuf dynamically allocates a memory buffer and locks it. The memory allocation is performed by posix_memalig in order to align the memory area to a page. If the size variable is smaller then a page size, regular malloc allocation will be able to use the remainder of the page. But, to safely use this method advantage, no mlock calls can be made on regular malloc allocations. This will prevent the double-lock/single-unlock problem. The function free_workbuf will unlock and free the memory area.
In addition to the use of mlock and mlockall, it is possible to allocate and lock a memory area using mmap with the MAP_LOCKED flag. The following example is the implementation of the aforementioned code using mmap.

Example 2.3. Using mmap in an Application

#include <sys/mman.h>
#include <stdlib.h>

void *
alloc_workbuf(size_t size)
{
 void *ptr;

 ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
            MAP_PRIVATE | MAP_ANONYMOUS | MAP_LOCKED, -1, 0);

 if (ptr == MAP_FAILED)
  return NULL;

 return ptr;
}

void
free_workbuf(void *ptr, size_t size)
{
 munmap(ptr, size);
}
As mmap allocates memory on a page basis, there are no two locks in the same page, helping to prevent the double-lock/single-unlock problem. On the other hand, if the size variable is not a multiple of the page size, the rest of the page is wasted. Furthermore, a call to munlockall unlocks the memory locked by mmap.
Another possibility for small-footprint applications is to call mlockall prior to entering a time-sensitive region of the code, followed by munlockall at the end of the time-sensitive region. This can reduce paging while in the critical section. Similarly, mlock can be used on a data region that is relatively static or that will grow slowly but needs to be accessed without page I/O.

Note

For more information, or for further reading, the following man pages are related to the information given in this section:
  • capabilities(7)
  • mlock(2)
  • mlock(3)
  • mlockall(2)
  • mmap(2)
  • move_pages(2)
  • posix_memalign(3)
  • posix_memalign(3p)