Chapter 8. Hardening Infrastructure and Virtualization

Check with hardware and software vendors periodically to get available information about new vulnerabilities and security updates. Red Hat Product Security maintains advisory pages on the Red Hat Customer Portal that inform you of security updates.

Keep the following in mind as you regularly update your deployment of Red Hat OpenStack Platform.

  • Ensure all security updates are included.
  • Kernel updates require a reboot.
  • Update hosted Image service (glance) images to ensure that newly created instances have the latest updates.

8.1. Hypervisors

When you evaluate a hypervisor platform, consider the supportability of the hardware on which the hypervisor will run. Additionally, consider the features available in that hardware and how those features are supported by the hypervisor that you choose as part of the OpenStack deployment. To that end, each hypervisor has its own hardware compatibility list (HCL). When selecting compatible hardware, it is important to know in advance which hardware-based virtualization technologies are important from a security perspective.

8.1.1. Hypervisor versus bare metal

It is important to recognize the difference between using Linux containers or bare metal systems and using a hypervisor such as KVM. The focus of this security guide largely assumes a hypervisor and virtualization platform. However, if your implementation requires a bare metal or containerized environment, you must pay attention to the particular differences in how that environment is deployed.

For bare metal, make sure the node has been properly sanitized of data prior to re-provisioning and decommissioning. In addition, before reusing a node, you must provide assurances that the hardware has not been tampered with or otherwise compromised. For more information, see https://docs.openstack.org/ironic/queens/admin/cleaning.html

8.1.2. Hypervisor memory optimization

Certain hypervisors use memory optimization techniques that overcommit memory to guest virtual machines. This is a useful feature that allows you to deploy very dense compute clusters. One approach to this technique is through deduplication or sharing of memory pages: When two virtual machines have identical data in memory, there are advantages to having them reference the same memory. Typically this is performed through Copy-On-Write (COW) mechanisms, such as kernel same-page merging (KSM). These mechanisms are vulnerable to attack:

  • Memory deduplication systems are vulnerable to side-channel attacks. In academic studies, attackers were able to identify software packages and versions running on neighboring virtual machines, as well as software downloads and other sensitive information, by analyzing memory access times on the attacker VM. Consequently, one VM can infer something about the state of another, which might not be appropriate for multi-project environments where not all projects are trusted or share the same levels of trust.
  • More importantly, row-hammer type attacks have been demonstrated against KSM to enact cross-VM modification of executable memory. This means that a hostile instance can gain code-execution access to other instances on the same Compute host.

Deployers should disable KSM if they require strong project separation (as with public clouds and some private clouds).
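
One way to do this on Red Hat Enterprise Linux based Compute nodes is through systemd. This is a minimal sketch that assumes the standard ksm and ksmtuned services shipped with the KVM packages are present; verify the service names, and any TripleO or Compute parameters your release provides for KSM, before relying on it:

    # Stop and disable the KSM services on each Compute node
    $ sudo systemctl stop ksm.service ksmtuned.service
    $ sudo systemctl disable ksm.service ksmtuned.service
    # Optionally unmerge pages that are already shared
    $ echo 2 | sudo tee /sys/kernel/mm/ksm/run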

8.2. PCI Passthrough

PCI passthrough allows an instance to have direct access to a piece of hardware on the node. For example, this could be used to allow instances to access video cards or GPUs offering the compute unified device architecture (CUDA) for high performance computation. This feature carries two types of security risks: direct memory access and hardware infection.

Direct memory access (DMA) is a feature that permits certain hardware devices to access arbitrary physical memory addresses in the host computer. Often video cards have this capability. However, an instance should not be given arbitrary physical memory access because this would give it full view of both the host system and other instances running on the same node. Hardware vendors use an input/output memory management unit (IOMMU) to manage DMA access in these situations. You should confirm that the hypervisor is configured to use this hardware feature.
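
Before you enable PCI passthrough, you can run a quick check on the Compute node to see whether the kernel was booted with the IOMMU enabled. This is a sketch for x86 hosts; the exact boot options and kernel messages vary by platform and vendor:

    # Check for the IOMMU option on the kernel command line (Intel or AMD)
    $ grep -E 'intel_iommu=on|amd_iommu=on' /proc/cmdline
    # Look for IOMMU or DMAR initialization messages from the kernel
    $ sudo dmesg | grep -iE 'dmar|iommu'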

A hardware infection occurs when an instance makes a malicious modification to the firmware or some other part of a device. As this device is used by other instances or the host OS, the malicious code can spread into those systems. The end result is that one instance can run code outside of its security zone. This is a significant breach as it is harder to reset the state of physical hardware than virtual hardware, and can lead to additional exposure such as access to the management network.

Due to the risk and complexities associated with PCI passthrough, it should be disabled by default. If enabled for a specific need, you will need to have appropriate processes in place to help ensure the hardware is clean before reuse.

8.3. SELinux on Red Hat OpenStack Platform

Security-Enhanced Linux (SELinux) is an implementation of mandatory access control (MAC). MAC limits the impact of an attack by restricting what a process or application is permitted to do on a system. For more information on SELinux, see What is SELinux?.

SELinux policies have been pre-configured for Red Hat OpenStack Platform (RHOSP) services. On RHOSP, SELinux is configured to run each QEMU process under a separate security context. In RHOSP, SELinux policies help protect hypervisor hosts and instances against the following threats:

Hypervisor threats
A compromised application running within an instance attacks the hypervisor to access underlying resources. If an instance is able to access the hypervisor OS, physical devices and other applications can become targets. This threat represents considerable risk. A compromise on a hypervisor can also compromise firmware, other instances, and network resources.
Instance threats
A compromised application running within an instance attacks the hypervisor to access or control another instance and its resources, or instance file images. The administrative strategies for protecting real networks do not apply directly to virtual environments. Because every instance is a process labeled by SELinux, there is a security boundary around each instance, enforced by the Linux kernel.

In RHOSP, instance image files on disk are labeled with the SELinux data type svirt_image_t. When the instance is powered on, SELinux appends a random numerical identifier to the image. A random numerical identifier can prevent a compromised OpenStack instance from gaining unauthorized access to other instances. SELinux is capable of assigning up to 524,288 numeric identifiers on each hypervisor node.
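
You can observe this labeling on a Compute node. The following sketch assumes the default libvirt/QEMU driver and the default instance path; the MCS category pair in the process context (for example, c123,c456) is the random identifier appended to each instance:

    # Each running QEMU process carries its own svirt MCS category pair
    $ ps -eZ | grep qemu-kvm
    # Instance disk files are labeled with the svirt_image_t type
    $ ls -lZ /var/lib/nova/instances/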

8.4. Investigating containerized services

The OpenStack services that come with Red Hat OpenStack Platform run within containers. Containerization allows for the development and upgrade of services without dependency-related conflicts. When a service runs within a container, potential vulnerabilities to that service are also contained.

You can get information about the services that are running in your environment by using the following steps:

Procedure

  • Use `podman inspect` to get information, such as bind-mounted host directories:

    Example:

    $ sudo podman inspect <container_name> | less

    Replace <container_name> with the name of your container, for example, nova_compute. To list the container names on the node, see the sketch after this procedure.

  • Check the logs for the service located in /var/log/containers:

    Example:

    $ sudo less /var/log/containers/nova/nova-compute.log
  • Run an interactive CLI session within the container:

    Example:

    $ sudo podman exec -it nova_compute /bin/bash
    Note

    You can make changes to the service for testing purposes directly within the container. All changes are lost when the container is restarted.
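
If you are not sure which container names are in use on a node, you can list them first. This is a minimal sketch using the standard podman CLI:

    # List the names of all running containers on the node
    $ sudo podman ps --format "{{.Names}}"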

8.5. Making temporary changes to containerized services

You can make changes to containerized services that persist when the container is restarted, but that do not affect the permanent configuration of your Red Hat OpenStack Platform (RHOSP) cluster. This is useful for testing configuration changes, or enabling debug-level logs when troubleshooting. You can revert changes manually. Alternatively, running a redeploy on your RHOSP cluster resets all parameters to their permanent configurations.

Use configuration files that are located in /var/lib/config-data/puppet-generated/[service] to make temporary changes to a service. The following example enables debugging on the nova service:

Procedure

  1. Edit the nova.conf configuration file that is bind mounted to the nova_compute container. Set the value of the debug parameter to True:

    $ sudo sed -i 's/^debug=.*/debug=True/' \
    /var/lib/config-data/puppet-generated/nova/etc/nova/nova.conf
    Warning

    Configuration files for OpenStack services are INI files with multiple sections, such as [DEFAULT] and [database]. Parameters that are unique to each section might not be unique across the entire file. Use sed with caution. You can check whether a parameter appears more than once in a configuration file by running egrep -v "^$|^#" [configuration_file] | grep [parameter].

  2. Restart the nova container:

    $ sudo podman restart nova_compute
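
As a quick check that the temporary change took effect, you can confirm the value in the bind-mounted file and watch for DEBUG entries in the service log. This is a sketch that reuses the paths shown earlier in this section:

    # Confirm the setting in the bind-mounted configuration file
    $ sudo grep '^debug' /var/lib/config-data/puppet-generated/nova/etc/nova/nova.conf
    # Watch for DEBUG-level messages from the restarted service
    $ sudo tail -f /var/log/containers/nova/nova-compute.log | grep DEBUG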

8.6. Making permanent changes to containerized services

You can make permanent changes to containerized services in Red Hat OpenStack Platform (RHOSP) with heat. Use an existing template that you used when you first deployed RHOSP, or create a new template to add to your deployment script. In the following example, the private key size for libvirt is increased to 4096.

Procedure

  1. Create a new yaml template called libvirt-keysize.yaml, and use the LibvirtCertificateKeySize parameter to increase the default value from 2048 to 4096.

    cat > /home/stack/templates/libvirt-keysize.yaml <<EOF
    parameter_defaults:
      LibvirtCertificateKeySize: 4096
    EOF
  2. Add the libvirt-keysize.yaml configuration file to your deployment script:

    openstack overcloud deploy --templates \
    ...
    -e /home/stack/templates/libvirt-keysize.yaml
    ...
  3. Rerun the deployment script:

    ./deploy.sh

8.7. Firmware updates

Physical servers use complex firmware to enable and operate server hardware and lights-out management cards, which can have their own security vulnerabilities, potentially allowing system access and interruption. To address these vulnerabilities, hardware vendors issue firmware updates, which are installed separately from operating system updates. You need an operational security process that retrieves, tests, and implements these updates on a regular schedule, noting that firmware updates often require a reboot of physical hosts to become effective.

8.8. Use SSH banner text

You can set a banner that displays a console message to all users that connect over SSH. You can add banner text to /etc/issue using the following parameters in an environment file. Consider customizing this sample text to suit your requirements.

resource_registry:
  OS::TripleO::Services::Sshd:
    /usr/share/openstack-tripleo-heat-templates/deployment/sshd/sshd-baremetal-puppet.yaml

parameter_defaults:
  BannerText: |
   ******************************************************************
   * This system is for the use of authorized users only. Usage of  *
   * this system may be monitored and recorded by system personnel. *
   * Anyone using this system expressly consents to such monitoring *
   * and is advised that if such monitoring reveals possible        *
   * evidence of criminal activity, system personnel may provide    *
   * the evidence from such monitoring to law enforcement officials.*
   ******************************************************************

To apply this change to your deployment, save the settings as a file called ssh_banner.yaml, and then pass it to the overcloud deploy command as follows. The <full environment> indicates that you must still include all of your original deployment parameters. For example:

    openstack overcloud deploy --templates \
      -e <full environment> -e ssh_banner.yaml

8.9. Audit for system events

Maintaining a record of all audit events helps you establish a system baseline, perform troubleshooting, or analyze the sequence of events that led to a certain outcome. The audit system is capable of logging many types of events, such as changes to the system time, changes to Mandatory/Discretionary Access Control, and creating/deleting users or groups.

You can create rules using an environment file; director then injects them into /etc/audit/audit.rules. For example:

    resource_registry:
      OS::TripleO::Services::AuditD: /usr/share/openstack-tripleo-heat-templates/deployment/auditd/auditd-baremetal-puppet.yaml
    parameter_defaults:
      AuditdRules:
        'Record Events that Modify User/Group Information':
          content: '-w /etc/group -p wa -k audit_rules_usergroup_modification'
          order: 1
        'Collects System Administrator Actions':
          content: '-w /etc/sudoers -p wa -k actions'
          order: 2
        'Record Events that Modify the Systems Mandatory Access Controls':
          content: '-w /etc/selinux/ -p wa -k MAC-policy'
          order: 3
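
After deployment, you can confirm on an overcloud node that the rules were loaded, and query events by the keys defined above. This sketch assumes the standard audit userspace tools are installed:

    # List the currently loaded audit rules
    $ sudo auditctl -l
    # Search today's audit log for events recorded under one of the keys above
    $ sudo ausearch -k actions --start today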

8.10. Manage firewall rules

Firewall rules are automatically applied on overcloud nodes during deployment, and are intended to only expose the ports required to get OpenStack working. You can specify additional firewall rules as needed. For example, to add rules for a Zabbix monitoring system:

    parameter_defaults:
      ControllerExtraConfig:
        tripleo::firewall::firewall_rules:
          '301 allow zabbix':
            dport: 10050
            proto: tcp
            source: 10.0.0.8
            action: accept

You can also add rules that restrict access. The number used during rule definition determines the rule's precedence. For example, RabbitMQ's rule number is 109 by default. If you want to restrict access to RabbitMQ, create rules with lower numbers so that they are applied first:

    parameter_defaults:
      ControllerExtraConfig:
        tripleo::firewall::firewall_rules:
          '098 allow rabbit from internalapi network':
            dport: [4369,5672,25672]
            proto: tcp
            source: 10.0.0.0/24
            action: accept
          '099 drop other rabbit access':
            dport: [4369,5672,25672]
            proto: tcp
            action: drop

In this example, 098 and 099 are arbitrarily chosen numbers that are lower than RabbitMQ’s rule number 109. To determine a rule’s number, you can inspect the iptables rule on the appropriate node; for RabbitMQ, you would check the controller:

iptables-save
[...]
-A INPUT -p tcp -m multiport --dports 4369,5672,25672 -m comment --comment "109 rabbitmq" -m state --state NEW -j ACCEPT

Alternatively, you can extract the port requirements from the puppet definition. For example, RabbitMQ’s rules are stored in puppet/services/rabbitmq.yaml:

    tripleo.rabbitmq.firewall_rules:
      '109 rabbitmq':
        dport:
          - 4369
          - 5672
          - 25672

The following parameters can be set for a rule:

  • port: The port associated with the rule. Deprecated by puppetlabs-firewall.
  • dport: The destination port associated with the rule.
  • sport: The source port associated with the rule.
  • proto: The protocol associated with the rule. Defaults to tcp.
  • action: The action policy associated with the rule. Defaults to accept.
  • jump: The chain to jump to.
  • state: Array of states associated with the rule. Defaults to [NEW].
  • source: The source IP address associated with the rule.
  • iniface: The network interface associated with the rule.
  • chain: The chain associated with the rule. Defaults to INPUT.
  • destination: The destination CIDR associated with the rule.
  • extras: Hash of any additional parameters supported by the puppetlabs-firewall module.
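
The following is an illustrative sketch that combines several of these parameters in a single rule; the rule number, port, network values, and interface name are placeholders, not recommendations:

    parameter_defaults:
      ControllerExtraConfig:
        tripleo::firewall::firewall_rules:
          '120 allow snmp from monitoring host':
            dport: 161
            proto: udp
            source: 192.168.24.10
            iniface: eth0
            chain: INPUT
            state: [NEW]
            action: accept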

8.11. Intrusion detection with AIDE

AIDE (Advanced Intrusion Detection Environment) is a file and directory integrity checker. It is used to detect incidents of unauthorized file tampering or changes. For example, AIDE can alert you if system password files are changed.

AIDE works by analyzing system files and then compiling an integrity database of file hashes. The database then serves as a comparison point to verify the integrity of the files and directories and detect changes.

The director includes the AIDE service, allowing you to add entries into an AIDE configuration, which is then used by the AIDE service to create an integrity database. For example:

  resource_registry:
    OS::TripleO::Services::Aide:
      /usr/share/openstack-tripleo-heat-templates/deployment/aide/aide-baremetal-ansible.yaml

  parameter_defaults:
    AideRules:
      'TripleORules':
        content: 'TripleORules = p+sha256'
        order: 1
      'etc':
        content: '/etc/ TripleORules'
        order: 2
      'boot':
        content: '/boot/ TripleORules'
        order: 3
      'sbin':
        content: '/sbin/ TripleORules'
        order: 4
      'var':
        content: '/var/ TripleORules'
        order: 5
      'not var/log':
        content: '!/var/log.*'
        order: 6
      'not var/spool':
        content: '!/var/spool.*'
        order: 7
      'not nova instances':
        content: '!/var/lib/nova/instances.*'
        order: 8
Note

The above example is not actively maintained or benchmarked, so you should select the AIDE values that suit your requirements.

  1. An alias named TripleORules is declared to avoid having to repeatedly write out the same attributes each time.
  2. The alias receives the attributes of p+sha256. In AIDE terms, this reads as the following instruction: monitor all file permissions p with an integrity checksum of sha256.

For a complete list of attributes available for AIDE configuration files, see the AIDE man page at https://aide.github.io/.

Complete the following to apply changes to your deployment:

  1. Save the settings as a file called aide.yaml in the /home/stack/templates/ directory.
  2. Edit the aide.yaml environment file to have the parameters and values suitable for your environment.
  3. Include the /home/stack/templates/aide.yaml environment file in the openstack overcloud deploy command, along with all other necessary heat templates and environment files specific to your environment:

    openstack overcloud deploy --templates \
    ...
    -e /home/stack/templates/aide.yaml
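
After the overcloud is deployed, you can run an integrity check manually on a node to confirm that the rules behave as expected. This sketch assumes the standard command-line options of the aide package:

    # Run a one-off integrity check against the current AIDE database
    $ sudo aide --check
    # Reinitialize the database after intentional changes (verify the paths in /etc/aide.conf first)
    $ sudo aide --init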

8.11.1. Using complex AIDE rules

Complex rules can be created using the format described previously. For example:

    MyAlias = p+i+n+u+g+s+b+m+c+sha512

The above translates to the following instruction: monitor permissions, inodes, number of links, user, group, size, block count, mtime, ctime, using sha512 for checksum generation.

Note that the alias should always have an order position of 1, which means that it is positioned at the top of the AIDE rules and is applied recursively to all values below.

Following the alias are the directories to monitor. Regular expressions can be used. For example, monitoring is set for the /var directory, but the entries '!/var/log.*' and '!/var/spool.*' use the not clause (!) to exclude /var/log and /var/spool from monitoring.

8.11.2. Additional AIDE values

The following AIDE values are also available:

AideConfPath: The full POSIX path to the AIDE configuration file. This defaults to /etc/aide.conf. If there is no requirement to change the file location, it is recommended to keep the default path.

AideDBPath: The full POSIX path to the AIDE integrity database. This value is configurable so that operators can declare their own full path, because AIDE database files are often stored off node, for example on a read-only file mount.

AideDBTempPath: The full POSIX path to the AIDE temporary integrity database. This temporary file is created when AIDE initializes a new database.

AideHour: This value sets the hour attribute as part of the AIDE cron configuration.

AideMinute: This value sets the minute attribute as part of the AIDE cron configuration.

AideCronUser: This value sets the Linux user as part of the AIDE cron configuration.

AideEmail: This value sets the email address that receives AIDE reports each time a cron run is made.

AideMuaPath: This value sets the path to the Mail User Agent that is used to send AIDE reports to the email address set within AideEmail.
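
As an illustration only, the following sketch shows how several of these values might be combined in an environment file; the schedule, email address, and mail agent path are placeholders:

  parameter_defaults:
    AideHour: 3
    AideMinute: 0
    AideCronUser: root
    AideEmail: security-team@example.com
    AideMuaPath: /usr/sbin/sendmail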

8.11.3. Cron configuration for AIDE

The AIDE director service allows you to configure a cron job. By default, it will send reports to /var/log/audit/; if you want to use email alerts, then enable the AideEmail parameter to send the alerts to the configured email address. Note that a reliance on email for critical alerts can be vulnerable to system outages and unintentional message filtering.

8.11.4. Considering the effect of system upgrades

When an upgrade is performed, the AIDE service automatically regenerates a new integrity database to ensure that the checksums of all upgraded files are correctly recomputed.

If openstack overcloud deploy is called as a subsequent run to an initial deployment, and the AIDE configuration rules are changed, the director AIDE service will rebuild the database to ensure the new config attributes are encapsulated in the integrity database.

8.12. Review SecureTTY

SecureTTY allows you to disable root access for any console device (tty). This behavior is managed by entries in the /etc/securetty file. For example:

  resource_registry:
    OS::TripleO::Services::Securetty: ../puppet/services/securetty.yaml

  parameter_defaults:
    TtyValues:
      - console
      - tty1
      - tty2
      - tty3
      - tty4
      - tty5
      - tty6

8.13. CADF auditing for Identity Service

A thorough auditing process can help you review the ongoing security posture of your OpenStack deployment. This is especially important for keystone, due to its role in the security model.

Red Hat OpenStack Platform has adopted Cloud Auditing Data Federation (CADF) as the data format for audit events, with the keystone service generating CADF events for Identity and Token operations. You can enable CADF auditing for keystone using KeystoneNotificationFormat:

  parameter_defaults:
    KeystoneNotificationFormat: cadf
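
After you redeploy with this parameter, you can confirm the rendered setting in the keystone configuration that is bind mounted on the controller. This follows the same puppet-generated path convention shown earlier and is a sketch, not an exhaustive verification:

    # Confirm the notification format rendered for the keystone service
    $ sudo grep notification_format /var/lib/config-data/puppet-generated/keystone/etc/keystone/keystone.conf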

8.14. Review the login.defs values

To enforce password requirements for new system users (non-keystone), director can add entries to /etc/login.defs using the following example parameters:

  resource_registry:
    OS::TripleO::Services::LoginDefs: ../puppet/services/login-defs.yaml

  parameter_defaults:
    PasswordMaxDays: 60
    PasswordMinDays: 1
    PasswordMinLen: 5
    PasswordWarnAge: 7
    FailDelay: 4
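
These parameters map to the corresponding settings in /etc/login.defs on the overcloud nodes. As a sketch, assuming the conventional login.defs key names, you can verify the applied values after deployment:

    # Check the rendered password aging and delay settings on an overcloud node
    $ grep -E '^(PASS_MAX_DAYS|PASS_MIN_DAYS|PASS_MIN_LEN|PASS_WARN_AGE|FAIL_DELAY)' /etc/login.defs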