Best Practices / Deployment Guide SAP HANA on Red Hat Virtualization 4.2 and 4.3


The guide in this PDF attachment shows you how to deploy SAP HANA as a supported workload on Red Hat Virtualization (RHV) versions 4.2 and 4.3. This document contains information about SAP HANA hardware requirements and best practices. It includes examples of SAP HANA and RHV-specific configuration settings and deployment options to consider when using the two products together. You can also download the scripts associated with this guide.


12 Comments

For their application servers, SAP seems to suggest the noop scheduler for KVM VMs: https://launchpad.support.sap.com/#/notes/1400911. Is that suggestion outdated, or does HANA behave better with deadline?

I don't know why SAP suggests this. In general using deadline for both hypervisor and VM is typically most sensible from our experience. That is why we are doing the SAP HANA cert like this as well.
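To see which scheduler is currently active on the hypervisor or inside the VM, the bracketed entry in /sys/block/<device>/queue/scheduler can be read. The following is only a minimal sketch; the function name active_scheduler is made up for illustration, and the device names are whatever you pass on the command line.

```shell
#!/bin/bash
# Sketch: print the active I/O scheduler for the given block devices.
# The kernel marks the active scheduler with brackets, e.g. "noop [deadline] cfq".

active_scheduler() {
    # $1: path to a "scheduler" file (hypothetical helper, not part of the guide)
    local file="$1" line
    local re='\[([^]]+)\]'
    line=$(cat "$file") || return 1
    if [[ $line =~ $re ]]; then
        echo "${BASH_REMATCH[1]}"
    else
        # No brackets found: report the raw content instead of guessing.
        echo "$line"
    fi
}

# Usage: ./scheduler.sh sda sdb
for dev in "$@"; do
    echo "$dev: $(active_scheduler "/sys/block/$dev/queue/scheduler")"
done
```

Running it on both the hypervisor and the guest shows whether the two actually use the same scheduler.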

For the 4.1 virtualization we talked about the pinning, and someone from qemu/libvirtd suggested using 1:1 CPU pinning instead of pinning each vCPU to both the real core and the HT core. Did you test that and find it inferior, or did you forget to change it for the 4.2 guide? Currently: 0#4,116_1#4,116. Direct: 0#4_1#116.

No, the script still works the same way. What we do propose, though, is to disable HT completely because of security concerns; see chapter 5 in the guide or, as a direct link, MDS - Microarchitectural Data Sampling.

The doc itself shows the config used to reach the SAP performance goals.
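For readers comparing the two pinning strings quoted in the question above, here is a small sketch that builds both variants from core:sibling pairs. It is not the script shipped with the guide; the function name build_pinning and the mode names shared and direct are made up for illustration, and the pair 4:116 is taken from the example in the question.

```shell
#!/bin/bash
# Sketch: build an RHV-style CPU pinning string (vCPU#hostCPUs, joined by "_").
# Not the guide's script; build_pinning and its modes are hypothetical.

build_pinning() {
    # $1: "shared" (each vCPU may run on the core and its HT sibling)
    #     or "direct" (one vCPU per hardware thread)
    # remaining args: core:sibling pairs, e.g. 4:116
    local mode="$1"; shift
    local out="" vcpu=0 pair core sibling
    for pair in "$@"; do
        core="${pair%%:*}"
        sibling="${pair##*:}"
        case "$mode" in
            shared)
                out+="${out:+_}${vcpu}#${core},${sibling}_$((vcpu + 1))#${core},${sibling}"
                ;;
            direct)
                out+="${out:+_}${vcpu}#${core}_$((vcpu + 1))#${sibling}"
                ;;
            *)
                echo "unknown mode: $mode" >&2
                return 1
                ;;
        esac
        vcpu=$((vcpu + 2))
    done
    echo "$out"
}

# Usage: ./pinning.sh shared 4:116   or   ./pinning.sh direct 4:116
if [ "$#" -ge 2 ]; then
    build_pinning "$@"
fi
```

With the pair 4:116, the shared mode prints 0#4,116_1#4,116 and the direct mode prints 0#4_1#116.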

Hi, I still have the issue that booting with a large number of hugepages takes many minutes and just shows a blank screen. People actually thought the system was hanging and rebooted it during that phase. Recently I discovered that I can just have RHV allocate the hugepages dynamically. Would it be a good idea to add this to the guide? https://bugzilla.redhat.com/show_bug.cgi?id=1785507

I would advise against that. Mainly because the dynamic allocation is not necessarily even across all NUMA nodes - leading to a NUMA imbalance within the SAP HANA VM.

As such defining the hugepages at boot is most sensible.

Cheers, Martin
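To check whether hugepages ended up evenly distributed across NUMA nodes on the hypervisor, the per-node counters under /sys/devices/system/node can be compared. This is only a sketch under assumptions: it assumes 1 GiB hugepages (1048576 kB) unless HUGEPAGE_KB is set, and NODE_ROOT, HUGEPAGE_KB and the function name are hypothetical names introduced here.

```shell
#!/bin/bash
# Sketch: list hugepages per NUMA node and flag an uneven distribution.
# NODE_ROOT and HUGEPAGE_KB are hypothetical overrides; the default page
# size of 1048576 kB (1 GiB) is an assumption, adjust it to your setup.

check_hugepage_balance() {
    local node_root="${NODE_ROOT:-/sys/devices/system/node}"
    local size_kb="${HUGEPAGE_KB:-1048576}"
    local first="" uneven=0 node_dir file count

    for node_dir in "$node_root"/node[0-9]*; do
        file="$node_dir/hugepages/hugepages-${size_kb}kB/nr_hugepages"
        [ -r "$file" ] || continue
        count=$(cat "$file")
        echo "$(basename "$node_dir"): $count"
        if [ -z "$first" ]; then
            first="$count"
        elif [ "$count" != "$first" ]; then
            uneven=1
        fi
    done

    if [ -z "$first" ]; then
        echo "no hugepage counters found" >&2
        return 2
    fi
    # 0 = same count on every node, 1 = counts differ
    return "$uneven"
}

# Usage: ./hugepage-balance.sh check
if [ "${1:-}" == "check" ]; then
    check_hugepage_balance || echo "WARNING: hugepages missing or unevenly distributed" >&2
fi
```

Comparing the output after a boot-time allocation and after a dynamic allocation would make any NUMA imbalance visible.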

I am currently using dynamic hugepages for the SAP application servers after Red Hat support suggested using them because of performance issues. This led to several Bugzillas and issues on my end. I do not think that feature is mature enough for customer use. So I would say, yes, you are right, do not use dynamic hugepages :) One issue remains though: if I preallocate hugepages at boot, people who are not familiar with the systems think it's hanging. Allocation takes minutes, and during that allocation the console just shows a blank screen. A 'warning' on screen or something like that would be really helpful :)

I am not sure why, but on kernel 3.10.0-957.41.1.el7.x86_64 either the modprobe or the cd command failed, and the script created the two files in /usr/lib/tuned/sap-hana-kvm-guest. Maybe use set -e or explicitly check return values:

#!/bin/bash
set -e

if [ "$1" == "start" ]; then
    modprobe cpuidle-haltpoll
    cd /sys/module/cpuidle_haltpoll/parameters/
    echo 800000 > guest_halt_poll_ns
    echo 200000 > guest_halt_poll_grow_start
fi

modprobe cpuidle-haltpoll seems to kill my VMs on kernel 3.10.0-957.41.1.el7.x86_64 -- I'll open a support case

Never mind... I only looked at the kernel changelog, which states

- [x86] cpuidle-haltpoll: vcpu hotplug support (Marcelo Tosatti) [1776288 1771849] 

and I couldn't open any of the bugs because they are private, but https://access.redhat.com/errata/RHSA-2020:0179 is more specific

Guest crash after load cpuidle-haltpoll driver (BZ#1776288)

I'll test the latest kernel

Please note that the interface for the haltpoll driver has changed slightly. Use the following script, which handles both the old and the new location:

#!/bin/bash

guest_halt_poll_ns=800000
guest_halt_poll_grow_start=200000

if [ "$1" == "start" ]; then
    modprobe cpuidle-haltpoll
    if [ -e /sys/module/cpuidle_haltpoll/parameters/ ]; then
        echo $guest_halt_poll_ns > /sys/module/cpuidle_haltpoll/parameters/guest_halt_poll_ns
        echo $guest_halt_poll_grow_start > /sys/module/cpuidle_haltpoll/parameters/guest_halt_poll_grow_start
    elif [ -e /sys/module/haltpoll/parameters/ ]; then
        echo $guest_halt_poll_ns > /sys/module/haltpoll/parameters/guest_halt_poll_ns
        echo $guest_halt_poll_grow_start > /sys/module/haltpoll/parameters/guest_halt_poll_grow_start
    fi
fi

If your VM crashes when you load the driver, something else is wrong, and opening a bug is the right way to go.

The script should still fail, or at least emit a warning, if modprobe fails or if neither the if nor the elif condition matches. Otherwise issues could go unnoticed.
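A possible variant along those lines, as a sketch only and not the script shipped with the guide: it aborts when modprobe fails, when a write fails, or when neither parameters directory exists. SYSFS_ROOT and MODPROBE are hypothetical overrides added here so the logic can be tried without touching the real /sys; the parameter values and paths are the ones from the script above.

```shell
#!/bin/bash
# Sketch: haltpoll setup that fails loudly instead of silently doing nothing.
# SYSFS_ROOT and MODPROBE are hypothetical overrides for testing only.
set -euo pipefail

guest_halt_poll_ns=800000
guest_halt_poll_grow_start=200000

configure_haltpoll() {
    local sysfs_root="${SYSFS_ROOT:-/sys}"
    local dir

    # Stop right away if the driver cannot be loaded.
    if ! ${MODPROBE:-modprobe} cpuidle-haltpoll; then
        echo "ERROR: modprobe cpuidle-haltpoll failed" >&2
        return 1
    fi

    # Try the old location first, then the new one.
    for dir in "$sysfs_root/module/cpuidle_haltpoll/parameters" \
               "$sysfs_root/module/haltpoll/parameters"; do
        if [ -d "$dir" ]; then
            echo "$guest_halt_poll_ns" > "$dir/guest_halt_poll_ns" || return 1
            echo "$guest_halt_poll_grow_start" > "$dir/guest_halt_poll_grow_start" || return 1
            return 0
        fi
    done

    echo "ERROR: no haltpoll parameters directory found under $sysfs_root/module" >&2
    return 1
}

if [ "${1:-}" == "start" ]; then
    configure_haltpoll
fi
```

Because the values are only written into a directory that already exists, a failed modprobe can no longer leave stray files in the current working directory.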