Chapter 6. Workflow and Playbook Deep Dive

This section provides a detailed walk-through of the Ansible playbooks created for this reference architecture and how the Ansible Tower Workflow ties it all together.

6.1. Overview

The playbooks were created with modularity in mind. Rather than creating one monolithic playbook to interact with the various services, a role was created for each major step. This makes debugging individual sections easier and allows certain roles to be reused in other infrastructures. For instance, the provisioning playbook could be used in another Ansible Tower workflow, with the OpenShift Container Platform deployment swapped out for another infrastructure stack.

The Red Hat OpenShift Container Platform installer is based on Ansible, and the advanced installation method exposes this to the systems administrator: the inventory and variables can be configured and the playbook run using the standard ansible-playbook command. The use of these playbooks is well documented on the OpenShift site; this reference architecture only covers integrating and running them in Tower.

Playbooks that handle all of the provisioning steps leading up to the OpenShift installation are available in a GitHub repository. Some modifications, outlined below, will be required for different environments.

6.2. Inventory and Variables

The reference architecture playbooks require an inventory file and environment-specific variables to be set, and these must be customized for every unique deployment. Since the openshift-ansible playbooks also require an inventory, that inventory is extended to provide the information required by the provisioning playbooks.

ansible-hosts

## required for provisioning
ocp-master1.hpecloud.test ansible_host=192.168.1.173 ilo_ip=192.168.1.136
ocp-master2.hpecloud.test ansible_host=192.168.1.174 ilo_ip=192.168.1.137
ocp-master3.hpecloud.test ansible_host=192.168.1.175 ilo_ip=192.168.1.138
ocp-infra0.hpecloud.test ansible_host=192.168.1.172 ilo_ip=192.168.1.135
ocp-infra1.hpecloud.test ansible_host=192.168.1.170 ilo_ip=192.168.1.133
ocp-infra2.hpecloud.test ansible_host=192.168.1.171 ilo_ip=192.168.1.134
ocp-cns1.hpecloud.test	ansible_host=192.168.1.176 ilo_ip=192.168.1.140
ocp-cns2.hpecloud.test 	ansible_host=192.168.1.177 ilo_ip=192.168.1.141
ocp-cns3.hpecloud.test 	ansible_host=192.168.1.178 ilo_ip=192.168.1.142
[cns]
ocp-cns1.hpecloud.test
ocp-cns2.hpecloud.test
ocp-cns3.hpecloud.test

## standard openshift-ansible inventory
[OSEv3:children]
masters
nodes
etcd
lb
nfs

[masters]
ocp-master1.hpecloud.test
ocp-master2.hpecloud.test
ocp-master3.hpecloud.test

[etcd]
ocp-master1.hpecloud.test
ocp-master2.hpecloud.test
ocp-master3.hpecloud.test


[lb]
ocp-infra0.hpecloud.test
[nfs]
ocp-infra0.hpecloud.test

[nodes]
ocp-master1.hpecloud.test
ocp-master2.hpecloud.test
ocp-master3.hpecloud.test
ocp-infra1.hpecloud.test openshift_node_labels="{'region': 'infra', 'zone': 'west'}"
ocp-infra2.hpecloud.test openshift_node_labels="{'region': 'infra', 'zone': 'east'}"
ocp-cns1.hpecloud.test 	openshift_node_labels="{'region': 'primary', 'zone': 'east'}"
ocp-cns2.hpecloud.test 	openshift_node_labels="{'region': 'primary', 'zone': 'west'}"
ocp-cns3.hpecloud.test 	openshift_node_labels="{'region': 'primary', 'zone': 'west'}"

The standard openshift-ansible inventory is extended by adding an entry for each host, with two variables that must be set:

  • ansible_host: the desired static IP of the deployed server
  • ilo_ip: the IP of the node's HPE Integrated Lights Out (ILO) interface

In addition, the nodes that are to be used for Red Hat Container-native storage are declared in a [cns] group.

There are also three files for variables:

  • group_vars/all: common variables for all nodes
  • group_vars/cns: specific disk configuration for storage nodes
  • group_vars/OSEv3: OpenShift variables

group_vars/all contains variables common across all nodes:

group_vars/all

---
## satellite vars (change to match target environment)
satellite_user: "admin"
satellite_url: "https://192.168.1.211"
location_id: 2
organization_id: 1
environment_id: 1
hostgroup_id: 1
ansible_domain: hpecloud.test
ntp_server: "clock.corp.redhat.com"
rndc_key: "r/24yYpTOcnIxvXul+xz3Q=="
ilo_username: "Administrator"
bios_settings:
- id: "CustomPostMessage"
  value: "!! automate the planet !!"
- id: "MinProcIdlePkgState"
  value: "NoState"
- id: "EnergyPerfBias"
  value: "MaxPerf"
- id: "PowerProfile"
  value: "MaxPerf"
- id: "IntelQpiPowerManagement"
  value: "Disabled"
- id: "PowerRegulator"
  value: "StaticHighPerf"
- id: "MinProcIdlePower"
  value: "NoCStates"
storage_cfg:
  controllers:
    - deviceSlot: "Embedded"
      importConfiguration: false
      initialize: true
      mode: "RAID"
      logicalDrives:
      - bootable: true
        driveNumber: 1
        driveTechnology: "SasHdd"
        name: "os"
        numPhysicalDrives: 2
        raidLevel: "RAID1"
## oneview vars
oneview_auth:
  ip: "192.168.1.233"
  username: "Administrator"
  api_version: 300

Notice that a storage_cfg is defined which specifies how the disks should be configured. The [cns] nodes have additional disks for gluster, so they have a different storage_cfg variable set to create the additional RAID volume. This is configured in the cns group variables file:

group_vars/cns

---
storage_cfg:
  controllers:
    - deviceSlot: "Embedded"
      importConfiguration: false
      initialize: true
      mode: "RAID"
      logicalDrives:
      - bootable: true
        driveNumber: 1
        driveTechnology: "SasSsd"
        name: "os"
        numPhysicalDrives: 2
        raidLevel: "RAID1"
      - bootable: false
        driveNumber: 2
        driveTechnology: "SasHdd"
        name: "cns"
        numPhysicalDrives: 12
        raidLevel: "RAID6"

A number of variables need to be configured for the Red Hat OpenShift Container Platform deployment. These can be customized for a system administrator's particular needs. The following are the settings used in this reference architecture:

group_vars/OSEv3

ansible_ssh_user: root
deployment_type: openshift-enterprise
openshift_master_identity_providers: [{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider', 'filename': '/etc/origin/master/htpasswd'}]
openshift_master_cluster_method: native
openshift_master_cluster_hostname: openshift-master.hpecloud.test
openshift_master_cluster_public_hostname: openshift-master.hpecloud.test
openshift_use_openshift_sdn: true
openshift_hosted_registry_storage_kind: nfs
openshift_hosted_registry_storage_access_modes: ['ReadWriteMany']
openshift_hosted_registry_storage_nfs_directory: /exports
openshift_hosted_registry_storage_nfs_options: '*(rw,root_squash)'
openshift_hosted_registry_storage_volume_name: registry
openshift_hosted_registry_storage_volume_size: 10Gi
openshift_hosted_metrics_deploy: false
openshift_hosted_logging_deploy: false
openshift_master_default_subdomain: paas.hpecloud.test
openshift_master_htpasswd_users: {'davidc': '$apr1$znPqASZl$WjOV1pFe4diJJPhjZgW2q1', 'kbell': '$apr1$.WBeJqbh$aic2L/5dxbnkdoEC0UWiT.', 'gluster': '$apr1$OMdwuyb6$bF2f3hSfwsE9XOyCaFEOP.'}

The openshift_master_htpasswd_users parameter determines the set of cluster users to create during the installation. It is a dict of username: password pairs. The passwords can be generated with htpasswd:

$ htpasswd -n gluster
New password:
Re-type new password:
gluster:$apr1$OMdwuyb6$bF2f3hSfwsE9XOyCaFEOP.

Finally, the provisioning playbooks rely on a number of passwords and keys:

  • ILO Administrator password on the target nodes
  • HPE OneView Administrator password
  • Satellite Administrator password
  • Root password on the target nodes
  • Gluster admin password created above
  • RNDC key for the DNS server

For security reasons, these passwords are encrypted using Ansible Vault. To create a vaulted password file:

  • modify passwords.yaml to match the target environment:

---
- name: set ilo password
  set_fact:
    ilo_pw: "P@ssw0rd"
- name: set oneview password
  set_fact:
    oneview_pw: "P@ssw0rd"
- name: set satellite password
  set_fact:
    satellite_pw: "P@ssw0rd"
- name: set root password
  set_fact:
    root_pw: "P@ssw0rd"
- name: set gluster admin password
  set_fact:
    gluster_pw: "n0tP@ssw0rd"
- name: set rndc_key
  set_fact:
    rndc_key: "r/42yYsCOcnIxvXul+xz3Q=="

  • encrypt the file with ansible-vault and store it under roles/passwords/tasks/main.yaml:

$ ansible-vault encrypt passwords.yaml --output=roles/passwords/tasks/main.yaml

This passwords role is then included by any playbook that requires these credentials.
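
When these playbooks are run outside of Ansible Tower, the vault password has to be supplied on the command line; within Tower it is provided through the credential attached to the job template. The following is a minimal example of a standalone run (the inventory and playbook paths assume the repository layout shown in this chapter):

$ ansible-playbook -i ansible-hosts playbooks/provisioning.yaml --ask-vault-pass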

6.3. Workflow

Ansible Tower supports the creation of workflows, allowing multiple job templates to be chained together. For this reference architecture, a workflow entitled hpe-end2end was created. This workflow calls the following job templates:

  • hpe-cleanup
  • hpe-provisioning
  • hpe-common
  • hpe-predeploy
  • hpe-openshift
  • hpe-cns

Each playbook is annotated below.
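
The same sequence can also be chained outside of Tower with a small wrapper playbook. The following is a minimal sketch (a hypothetical site.yaml, not part of the repository) that uses the playbook file names shown in the next section and assumes the playbook-level include syntax available in this Ansible release:

---
## hypothetical site.yaml: runs the same sequence as the hpe-end2end workflow
- include: playbooks/clean.yaml
- include: playbooks/provisioning.yaml
- include: playbooks/common.yaml
- include: playbooks/predeploy.yaml
## the openshift-ansible installation playbook runs at this point in the Tower workflow
- include: playbooks/cns.yaml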

6.4. Playbooks

hpe-provisioning

Although this is not the first playbook in the workflow, the cleanup playbook will make more sense after working through this one.

The provisioning playbook includes the provisioning role:

playbooks/provisioning.yaml

---
## bare metal provisioning

- name: get passwords
  hosts: all
  remote_user: root
  gather_facts: false
  roles:
    - ../roles/passwords

- name: baremetal provisioning
  hosts: all
  remote_user: root
  gather_facts: false
  roles:
    - ../roles/provisioning
  environment:
    PYTHONPATH: "$PYTHONPATH:/usr/share/ansible"
    ONEVIEWSDK_IP: "{{ oneview_auth.ip }}"
    ONEVIEWSDK_USERNAME: "{{ oneview_auth.username }}"
    ONEVIEWSDK_PASSWORD: "{{ oneview_pw }}"
    ONEVIEWSDK_API_VERSION: "{{ oneview_auth.api_version }}"

The following is the directory structure of that role:

hpe-provisioning

├── roles
│   └── provisioning
│       └── tasks
│           ├── addserver.yaml
│           ├── facts.yaml
│           ├── firmware.yaml
│           ├── main.yaml
│           ├── poweron.yaml
│           ├── profiles.yaml
│           └── satellite.yaml

main.yaml is the entry point for the role, which includes a set of playbooks to execute the required tasks:

roles/provisioning/tasks/main.yaml

---
# set some vars
- include: ./facts.yaml
# add the server to oneview
- include: ./addserver.yaml
# get firmware version
- include: ./firmware.yaml
# create profile templates -> profiles
- include: ./profiles.yaml
# create the host in satellite
- include: ./satellite.yaml
# power on server to kickstart over pxe
- include: ./poweron.yaml

The first included set of tasks, facts.yaml, sets a fact for the ILO hostname based on the node's hostname and uses the hpilo_facts module to gather the MAC addresses of the provisioning and bonded interfaces. In this case, eth0 is the interface used for provisioning, while eth4 and eth6 are 10GbE interfaces bonded together for all other traffic. Note that the task assumes the ILO hostname is the node's hostname with -ilo appended. For example, ocp-infra1's ILO hostname would be ocp-infra1-ilo.

roles/provisioning/tasks/facts.yaml

---
- set_fact:
    ilo_name: "{{ inventory_hostname.split('.').0 }}-ilo"
  tags:
  - ilo

- name: get facts from ilo
  hpilo_facts:
    host: "{{ ilo_ip }}"
    login: "{{ ilo_username }}"
    password: "{{ ilo_pw }}"
  register: ilo_facts
  delegate_to: localhost
  tags:
  - ilo

- name: set PXE MAC address
  set_fact:
    pxe_mac: "{{ ilo_facts.ansible_facts.hw_eth0.macaddress }}"
  tags:
  - ilo

## IP + 20, disabled after kickstart
- name: set PXE IP of new host
  set_fact:
    pxe_ip: "{{ ansible_host.split('.').0 }}.{{ ansible_host.split('.').1 }}.{{ ansible_host.split('.').2 }}.{{ ansible_host.split('.').3|int + 20 }}"

- name: set slave0 MAC address
  set_fact:
    slave0_mac: "{{ ilo_facts.ansible_facts.hw_eth4.macaddress }}"
  tags:
  - ilo

- name: set slave1 MAC address
  set_fact:
    slave1_mac: "{{ ilo_facts.ansible_facts.hw_eth6.macaddress }}"
  tags:
  - ilo

addserver.yaml first checks whether the host is already in HPE OneView via the oneview_server_hardware_facts module, and adds it with oneview_server_hardware if it does not exist. Later tasks rely on facts provided by the OneView modules: if the host already exists, those facts are looked up; otherwise, they are captured with the register argument when the server is added with the oneview_server_hardware module.

roles/provisioning/tasks/addserver.yaml

---
- name: check if server is in oneview
  oneview_server_hardware_facts:
    name: "{{ ilo_name }}"
  register: server_facts_exists
  delegate_to: localhost
  tags:
  - mkserver

- set_fact:
    server_exists: false
  when:
  - server_facts_exists.ansible_facts.server_hardwares|length == 0
  tags:
  - mkserver
- set_fact:
    server_exists: true
  when:
  - server_facts_exists.ansible_facts.server_hardwares|length > 0
  tags:
  - mkserver

- name: add a server to oneview
  oneview_server_hardware:
    state: present
    data:
      hostname: "{{ ilo_ip }}"
      username: "{{ ilo_username }}"
      password: "{{ ilo_pw }}"
      force: false
      licensingIntent: "OneViewNoiLO"
      configurationState: "Managed"
  register: server_facts_new
  delegate_to: localhost
  tags:
  - mkserver
  when:
  - server_exists == false
- set_fact:
    server_facts: "{{ server_facts_new.ansible_facts.server_hardware }}"
  tags:
  - mkserver
  when:
  - server_exists == false
- set_fact:
    server_facts: "{{ server_facts_exists.ansible_facts.server_hardwares.0 }}"
  tags:
  - mkserver
  when:
  - server_exists == true

The next playbook, firmware.yaml, simply queries HPE OneView with the oneview_firmware_driver_facts module to determine the latest available firmware. The URI for the firmware is saved so it can be used when applying HPE OneView server profiles.

roles/provisioning/tasks/firmware.yaml

---
- name: get fw driver info
  oneview_firmware_driver_facts:
    params:
      sort: 'releaseDate:descending'
      start: 0
      count: 1
  register: fw_info
  delegate_to: localhost
  tags:
  - firmware
- name: set fw version
  set_fact:
    fw_baseline_uri: "{{ fw_info['ansible_facts']['firmware_drivers'].0['uri'] }}"
  tags:
  - firmware

profiles.yaml takes all the facts generated along the way, in addition to some of the pre-defined variables, and applies a profile in HPE OneView. When the profile is applied, HPE OneView boots the server to apply any BIOS and storage settings, and checks the firmware level and updates it as required. This is one of the longer-running plays, taking approximately 20 minutes to complete.

roles/provisioning/tasks/profiles.yaml

---
- set_fact:
      model: "{{ server_facts.model }}"
      short_model: "{{ server_facts.shortModel }}"
      dl_model: "{{ server_facts.shortModel.split(' ').0 }}"
  tags:
  - model

- name: create a service profile and assign it to a node
  oneview_server_profile:
    state: present
    data:
      name: "{{ inventory_hostname.split('.').0 }}"
      server_hardware: "{{ ilo_name }}"
      description: "OpenShift Nodes - {{ short_model }}"
      serverHardwareTypeName: "{{ short_model }} 1"
      boot:
        manageBoot: true
        order: ["PXE", "CD", "USB", "HardDisk"]
      bootMode:
        manageMode: true
        mode: "BIOS"
        pxeBootPolicy: null
      bios:
        manageBios: true
        overriddenSettings: "{{ bios_settings }}"
      firmware:
        firmwareBaselineUri: "{{ fw_baseline_uri }}"
        #firmwareInstallType: "FirmwareOnly"
        firmwareInstallType: "FirmwareOnlyOfflineMode"
        forceInstallFirmware: false
        manageFirmware: true
      localStorage: "{{ storage_cfg }}"
  register: output
  delegate_to: localhost
  tags:
  - templates

Once the server is properly managed by HPE OneView, it is deployed against the Red Hat Satellite server. The uri module is used to interact with Red Hat Satellite's API: Satellite is first queried to check whether the host already exists, and the host is added if necessary. The pxe_mac fact generated earlier, along with the ansible_host IP defined in the inventory, is used to configure the NIC in Satellite and create a static DNS entry.

The URI call creates the host with one provisioning interface and a bonded pair of 10GbE NICs. If this configuration differs from the target environment, interfaces_attributes below should be updated to match.

roles/provisioning/tasks/satellite.yaml

- name: "Get Host ID from Satellite"
  uri:
    url: "{{ satellite_url }}/api/hosts/?search=name={{ inventory_hostname }}"
    user: "{{ satellite_user }}"
    password: "{{ satellite_pw }}"
    headers:
      Content-Type: "application/json"
      Accept: "application/json"
    force_basic_auth: yes
    validate_certs: False
    body_format: json
    return_content: yes
    status_code: 200
  ignore_errors: false
  delegate_to: localhost
  register: check_host_response
  tags:
  - satellite
- debug:
    var: check_host_response
    verbosity: 2

- set_fact:
    host_in_satellite: true
  when:
  - check_host_response.json.subtotal == 1
  tags:
  - satellite
- set_fact:
    host_in_satellite: false
  when:
  - check_host_response.json.subtotal == 0
  tags:
  - satellite

- name: "Add Host to Satellite"
  uri:
    url: "{{ satellite_url }}/api/hosts/"
    method: POST
    user: "{{ satellite_user }}"
    password: "{{ satellite_pw }}"
    headers:
      Content-Type: "application/json"
      Accept: "application/json"
    force_basic_auth: yes
    validate_certs: False
    return_content: yes
    body_format: json
    body:
      host:
        name: "{{ inventory_hostname }}"
        location_id: "{{ location_id }}"
        organization_id: "{{ organization_id }}"
        environment_id: "{{ environment_id }}"
        hostgroup_id: "{{ hostgroup_id }}"
        build: true
        enabled: true
        managed: true
        root_pass: "{{ root_pw }}"
        overwrite: true
        interfaces_attributes:
          -
            mac: "{{ pxe_mac }}"
            primary: false
            type: "interface"
            identifier: "pxe0"
            provision: true
            ip: "{{ pxe_ip }}"
            managed: true
            subnet_id: 1
            domain_id: 1
          -
            mac: "{{ slave0_mac }}"
            primary: true
            type: "bond"
            ip: "{{ ansible_host }}"
            mode: "active-backup"
            identifier: "bond0"
            attached_devices: "slave0,slave1"
          -
            mac: "{{ slave0_mac }}"
            primary: false
            type: "interface"
            identifier: "slave0"
          -
            mac: "{{ slave1_mac }}"
            primary: false
            type: "interface"
            identifier: "slave1"
    status_code: 201
  ignore_errors: false
  delegate_to: localhost
  register: add_host_response
  when:
  - host_in_satellite == false
  tags:
  - satellite

Finally, the server can be powered on. Since PXE was configured as the first boot option for each server in HPE OneView, it begins the kickstart process when first powered up. A wait is declared so the playbook does not exit until the server is online and responding to SSH, i.e. fully provisioned. This check ensures the subsequent playbooks are only executed once the host is installed and available:

roles/provisioning/tasks/poweron.yaml

---
- name: power on server
  oneview_server_hardware:
    state: power_state_set
    data:
      name: "{{ ilo_name }}"
      powerStateData:
        powerState: "On"
        powerControl: "MomentaryPress"
  register: output
  delegate_to: localhost
  tags:
  - poweron
- name: wait for server to become available
  local_action: wait_for port=22 host='{{ ansible_host }}' delay=30 timeout=3600
  tags:
  - poweron

hpe-common

hpe-common contains tasks that are common to any environment. It includes the type of things one would bake into a 'golden image', such as the creation of users or the installation of packages.

The common playbook calls the common role:

playbooks/common.yaml

---
## common tasks for all servers

- name: common
  hosts: all
  remote_user: root
  roles:
    - ../roles/common

The role is laid out as follows:

├── roles
│   ├── common
│   │   ├── handlers
│   │   │   └── main.yaml
│   │   ├── tasks
│   │   │   ├── main.yaml
│   │   │   ├── motd.yaml
│   │   │   ├── ntp.yaml
│   │   │   └── packages.yaml
│   │   └── templates
│   │       ├── chrony.j2
│   │       └── motd.j2

main.yaml includes the tasks in the desired order:

roles/common/tasks/main.yaml

---
- include: packages.yaml
  tags:
  - packages
- include: ntp.yaml
  tags:
  - ntp
- include: motd.yaml
  tags:
  - motd

packages.yaml installs some base packages:

roles/common/tasks/packages.yaml

---
- name: install common packages
  yum:
    name: '{{ item }}'
    state: latest
  tags:
    - rhn
  with_items:
  - screen
  - tmux
  - nfs-utils
  - sg3_utils
  - policycoreutils-python
  - '@network-file-system-client'

The NTP service is configured:

roles/common/tasks/ntp.yaml

---
- name: install chrony
  yum:
    name: chrony
    state: latest

- name: config chronyd
  template:
    src: chrony.j2
    dest: /etc/chrony.conf
  notify:
  - restart chronyd

The chronyd configuration file is generated via the template module. A base template is supplied to Ansible and variable substitution is performed on it. This is a simple example, only using the ntp_server variable defined previously:

roles/common/templates/chrony.j2

server {{ ntp_server }} iburst
stratumweight 0
driftfile /var/lib/chrony/drift
rtcsync
makestep 10 3
bindcmdaddress 127.0.0.1
bindcmdaddress ::1
keyfile /etc/chrony.keys
commandkey 1
generatecommandkey
noclientlog
logchange 0.5
logdir /var/log/chrony

The notify argument is used to trigger a handler. A handler is a special type of task that runs after another task reports a change. In this case, the chronyd daemon is enabled and restarted when the configuration file is created or changed:

roles/common/handlers/main.yaml

---
- name: restart chronyd
  service:
    name: chronyd
    state: restarted
    enabled: true

Finally, the role includes motd.yaml, which creates a customized message-of-the-day file:

roles/common/tasks/motd.yaml

---
- name: get build date
  stat:
    path: /root/anaconda-ks.cfg
  register: build_stat
- name: convert to a nice string for the template
  command: date --date='@{{ build_stat.stat.ctime }}' +'%Y.%m.%d @ %H:%M %Z'
  register: pretty_date
- name: create motd file
  template:
    src: motd.j2
    dest: /etc/motd

The stat module is used to get the details of a file created at build time, and the creation time is converted into a friendly format. That variable is then fed to another template, motd.j2:

roles/common/templates/motd.j2

welcome to {{ ansible_fqdn }} built on {{ pretty_date.stdout }}

hpe-predeploy

When deploying a complex infrastructure such as Red Hat OpenShift Container Platform, there are usually prerequisite steps to prepare the environment before installing the software itself.

The prerequisite tasks for this OpenShift reference architecture are captured in the hpe-predeploy job template, which calls the predeploy role.

playbooks/predeploy.yaml

---
- name: run predeployment tasks
  hosts: all
  gather_facts: true
  remote_user: root
  roles:
    - ../roles/passwords
    - ../roles/predeploy
  environment:
    PYTHONPATH: "$PYTHONPATH:/usr/share/ansible"

This role includes just three playbooks:

roles/predeploy/tasks/main.yaml

---
- include: dns.yaml
  tags:
  - dns
- include: packages.yaml
  tags:
  - packages
- include: docker.yaml
  tags:
  - docker

dns.yaml sets up the DNS entry for the publicly available openshift-master endpoint. It also creates the wildcard DNS entry under which deployed OpenShift applications are hosted. Dynamic DNS updates are made against the Red Hat Satellite server, which runs an integrated named server.

roles/predeploy/tasks/dns.yaml

---
- name: get IP of load balancer
  set_fact:
    lb_ip: "{{ hostvars[groups['lb'][0]]['ansible_default_ipv4']['address'] }}"
  run_once: true
  delegate_to: localhost

- name: get IP of DNS server via the load balancer
  set_fact:
    dns_server: "{{ hostvars[groups['lb'][0]]['ansible_dns']['nameservers'].0 }}"
  run_once: true
  delegate_to: localhost

- name: create DNS entry for master
  nsupdate:
    state: present
    key_name: "rndc-key"
    key_secret: "{{ rndc_key }}"
    server: "{{ dns_server }}"
    zone: "{{ ansible_domain }}"
    record: "{{ openshift_master_cluster_public_hostname.split('.').0 }}"
    value: "{{ lb_ip }}"
  run_once: true
  delegate_to: localhost

- set_fact:
    region: "{{ openshift_node_labels.region }}"
  when:
  - openshift_node_labels is defined

- name: set subdomain record
  set_fact:
    subdomain: "{{ openshift_master_default_subdomain.split('.').0 }}"
  run_once: true
  delegate_to: localhost

- name: define router IP
  set_fact:
    router_ip: '{{ ansible_default_ipv4.address }}'
  delegate_to: localhost
  when:
  - region|default('none') == 'infra'

- name: blacklist non-router IPs
  set_fact:
    router_ip: '0.0.0.0'
  delegate_to: localhost
  when:
  - region|default('none') != 'infra'

- name: create router IP list
  set_fact:
    router_ips: "{{ groups['all']|map('extract', hostvars, 'router_ip')|list|unique|difference(['0.0.0.0']) }}"
  delegate_to: localhost
  run_once: true

- name: update wildcard DNS entry
  nsupdate:
    state: present
    key_name: "rndc-key"
    key_secret: "r/42yYsCOcnIxvXul+xz3Q=="
    server: "{{ dns_server }}"
    zone: "{{ ansible_domain }}"
    record: "*.{{ subdomain }}"
    value: "{{ router_ips }}"
  delegate_to: localhost
  run_once: true

This leans heavily on a modified nsupdate module. Currently, the upstream module does not support one-to-many record mappings; a pull request has been submitted upstream to handle this use case. The modified module is also included in the git repo, under the library subdirectory.
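
Because the modified module lives in the repository's library subdirectory rather than in Ansible's default module path, Ansible may need to be pointed at it explicitly. One way to do this is an ansible.cfg entry at the root of the project checkout; this is a sketch and the exact setup depends on how the project is laid out in Tower:

[defaults]
# prefer the modified nsupdate module bundled in the repo
library = ./library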

packages.yaml ensures prerequisite packages for OpenShift are installed:

roles/predeploy/tasks/packages.yaml

---
- name: install ocp prereq packages
  package:
    name: '{{ item }}'
    state: latest
  with_items:
  - wget
  - git
  - net-tools
  - bind-utils
  - iptables-services
  - bridge-utils
  - bash-completion
  - kexec-tools
  - sos
  - psacct

docker.yaml installs the docker RPMs and configures docker storage:

roles/predeploy/tasks/docker.yaml

---
- name: configure docker-storage-setup
  lineinfile:
    path: /etc/sysconfig/docker-storage-setup
    line: "VG=docker-vg"
    create: yes
  register: dss

hpe-cleanup

With an understanding of hpe-provisioning, hpe-cleanup simply does the opposite: it removes the servers from Satellite and HPE OneView. Although not necessary on an initial run, having such a playbook in the workflow is handy for facilitating a CI/CD (continuous integration/continuous delivery) environment. When code is pushed to the git repo, or newer versions of the RPMs are available for testing, the entire workflow can be executed again, with the first step ensuring a clean build environment.

The clean playbook includes the cleanup role:

playbooks/clean.yaml

---
## clean up!

- name: get passwords
  hosts: all
  remote_user: root
  gather_facts: false
  roles:
    - ../roles/passwords

- name: delete nodes from satellite and oneview
  hosts: all
  remote_user: root
  gather_facts: false
  roles:
    - ../roles/cleanup
  environment:
    PYTHONPATH: "$PYTHONPATH:/usr/share/ansible"
    ONEVIEWSDK_IP: "{{ oneview_auth.ip }}"
    ONEVIEWSDK_USERNAME: "{{ oneview_auth.username }}"
    ONEVIEWSDK_PASSWORD: "{{ oneview_pw }}"
    ONEVIEWSDK_API_VERSION: "{{ oneview_auth.api_version }}"

Here’s the directory structure of that role:

hpe-cleanup

├── roles
│   ├── cleanup
│   │   └── tasks
│   │       ├── dns.yaml
│   │       ├── facts.yaml
│   │       ├── main.yaml
│   │       ├── poweroff.yaml
│   │       ├── profiles.yaml
│   │       ├── rmserver.yaml
│   │       └── satellite.yaml

main.yaml is the entry point for the playbook and includes the remaining files in the proper order:

roles/cleanup/tasks/main.yaml

---
- include: ./facts.yaml
- include: ./poweroff.yaml
- include: ./satellite.yaml
- include: ./dns.yaml
- include: ./profiles.yaml
- include: ./rmserver.yaml

facts.yaml generates all the required variables while also checking whether the server exists in HPE OneView:

roles/cleanup/tasks/facts.yaml

---
- set_fact:
    ilo_name: "{{ inventory_hostname.split('.').0 }}-ilo"
  tags:
  - ilo
- name: check if server is in oneview
  oneview_server_hardware_facts:
    name: "{{ ilo_name }}"
  register: server_facts_exists
  delegate_to: localhost
  tags:
  - mkserver
- set_fact:
    server_exists: false
  when:
  - server_facts_exists.ansible_facts.server_hardwares|length == 0
  tags:
  - mkserver
- set_fact:
    server_exists: true
  when:
  - server_facts_exists.ansible_facts.server_hardwares|length > 0
  tags:
  - mkserver
- set_fact:
    server_facts: "{{ server_facts_exists.ansible_facts.server_hardwares.0 }}"
  tags:
  - mkserver
  when:
  - server_exists == true

- set_fact:
      model: "{{ server_facts.model }}"
      short_model: "{{ server_facts.shortModel }}"
      dl_model: "{{ server_facts.shortModel.split(' ').0 }}"
  when:
  - server_exists == true

- name: get uri of profile template
  oneview_server_profile_template_facts:
    params:
      filter: name='OCP-{{ dl_model }}'
  register: profile_output
  delegate_to: localhost
  when:
  - server_exists == true

- set_fact:
    template_uri: "{{ profile_output.ansible_facts.server_profile_templates.0.uri }}"
  when:
  - server_exists == true
  - profile_output.ansible_facts.server_profile_templates|length > 0

poweroff.yaml shuts the node down:

roles/cleanup/tasks/poweroff.yaml

---
- name: power off server
  oneview_server_hardware:
    state: power_state_set
    data:
      name: "{{ ilo_name }}"
      powerStateData:
        powerState: "Off"
        powerControl: "PressAndHold"
  register: output
  delegate_to: localhost
  when:
  - server_exists == true

The host is then deleted from Satellite:

roles/cleanup/tasks/satellite.yaml

---
- name: "Get Host ID from Satellite"
  uri:
    url: "{{ satellite_url }}/api/hosts/?search=name={{ inventory_hostname }}"
    user: "{{ satellite_user }}"
    password: "{{ satellite_pw }}"
    headers:
      Content-Type: "application/json"
      Accept: "application/json"
    force_basic_auth: yes
    validate_certs: False
    return_content: yes
    status_code: 200
  ignore_errors: false
  register: check_host_response
  delegate_to: localhost

- name: set host ID
  set_fact:
    host_id: "{{ check_host_response.json.results.0.id }}"
  when:
  - check_host_response.json.subtotal == 1

- name: "delete host from satellite"
  uri:
    url: "{{ satellite_url }}/api/hosts/{{ host_id }}"
    user: "{{ satellite_user }}"
    password: "{{ satellite_pw }}"
    method: DELETE
    headers:
      Content-Type: "application/json"
      Accept: "application/json"
    force_basic_auth: yes
    validate_certs: False
    return_content: yes
    status_code: 200
  ignore_errors: false
  register: rm_host_response
  delegate_to: localhost
  when:
  - check_host_response.json.subtotal == 1

The OpenShift DNS entries are removed:

roles/cleanup/tasks/dns.yaml

---
- name: get IP of DNS server via the load balancer
  set_fact:
    dns_server: "{{ satellite_url.split('//').1 }}"
  run_once: true
  delegate_to: localhost

- name: set subdomain record
  set_fact:
    subdomain: "{{ openshift_master_default_subdomain.split('.').0 }}"
  run_once: true
  delegate_to: localhost

- name: remove DNS entry for master
  nsupdate:
    state: absent
    key_name: "rndc-key"
    key_secret: "{{ rndc_key }}"
    server: "{{ dns_server }}"
    zone: "{{ ansible_domain }}"
    record: "{{ openshift_master_cluster_public_hostname.split('.').0 }}"
  run_once: true
  delegate_to: localhost


- name: delete wildcard DNS entries
  nsupdate:
    state: absent
    key_name: "rndc-key"
    key_secret: "{{ rndc_key }}"
    server: "{{ dns_server }}"
    zone: "{{ ansible_domain }}"
    record: "*.{{ subdomain }}"
  run_once: true
  delegate_to: localhost

The profile is disassociated from the node and removed in HPE OneView:

roles/cleanup/tasks/profiles.yaml

---
- name: unassign server profile
  oneview_server_profile:
    state: absent
    data:
      name: "{{ inventory_hostname.split('.').0 }}"
      server_template: "OCP-{{ dl_model }}"
      server_hardware: "{{ ilo_name }}"
  register: output
  delegate_to: localhost
  when:
  - server_exists == true

Finally, the node is deleted from HPE OneView:

roles/cleanup/tasks/rmserver.yaml

---
- name: remove a server from oneview
  oneview_server_hardware:
    state: absent
    data:
      name: "{{ ilo_name }}"
      force: false
      licensingIntent: "OneView"
      configurationState: "Managed"
  register: output
  delegate_to: localhost
  tags:
  - mkserver
  when:
  - server_exists == true

hpe-openshift

The official Red Hat OpenShift Container Platform Ansible playbooks (openshift-ansible) were used for this reference architecture. Detailing these playbooks is out of scope for this document; they can be found on GitHub, and instructions on how to use them are available in the official OpenShift documentation.

The playbooks require no modification to work with Ansible Tower. As per the installation guide, a host file is created and a number of variables are defined, as documented at the beginning of this chapter.
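
For reference, the same advanced installation can be launched outside of Tower with the standard ansible-playbook command against the extended inventory. A minimal example, assuming the openshift-ansible playbooks are installed from the RPM at their default location:

$ ansible-playbook -i ansible-hosts /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml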

hpe-cns

With a properly running OpenShift Container Platform cluster, Red Hat Container-native storage can now be deployed. The following playbooks automate the tasks from the official Container-Native Storage documentation.

The cns playbook calls the cns role:

playbooks/cns.yaml

---
- name: deploy container native storage w/ gluster
  hosts: all
  gather_facts: true
  remote_user: root
  roles:
    - ../roles/passwords
    - ../roles/cns
  environment:
    PYTHONPATH: "$PYTHONPATH:/usr/share/ansible"

The role’s directory tree details the various files used:

├── roles
│   ├── cns
│   │   ├── tasks
│   │   │   ├── cnsdeploy.yaml
│   │   │   ├── disks.yaml
│   │   │   ├── heketitopo.yaml
│   │   │   ├── iptables.yaml
│   │   │   ├── main.yaml
│   │   │   ├── oc.yaml
│   │   │   └── packages.yaml
│   │   └── templates
│   │       └── topology-hpe.j2

The cns role includes a number of required tasks:

roles/cns/tasks/main.yaml

---
- include: disks.yaml
  when:
  - "'cns' in group_names"
  tags:
  - disks

- include: iptables.yaml
  when:
  - "'cns' in group_names"
  tags:
  - iptables

- include: packages.yaml
  when:
  - "ansible_fqdn == groups['masters'][0]"

- include: oc.yaml
  when:
   - "ansible_fqdn == groups['masters'][0]"

- include: heketitopo.yaml
  when:
  - "ansible_fqdn == groups['masters'][0]"
  tags:
  - json

- include: cnsdeploy.yaml
  when:
  - "ansible_fqdn == groups['masters'][0]"
  tags:
  - cnsdeploy

Note that, due to the when statements, some of these tasks only run on the nodes in the cns group, while others only run on the first OpenShift master node.

disks.yaml will prepare the RAID volume for gluster:

roles/cns/tasks/disks.yaml

---
- name: nuke sdb on cns nodes for gluster
  command: sgdisk -Z /dev/sdb

gluster requires a number of firewall ports to be opened, which is handled by iptables.yaml:

roles/cns/tasks/iptables.yaml

---
- name: required gluster ports
  lineinfile:
    dest: /etc/sysconfig/iptables
    state: present
    insertbefore: 'COMMIT'
    line: "{{ item }}"
  with_items:
  - '-A OS_FIREWALL_ALLOW -p tcp -m state --state NEW -m tcp --dport 24007 -j ACCEPT'
  - '-A OS_FIREWALL_ALLOW -p tcp -m state --state NEW -m tcp --dport 24008 -j ACCEPT'
  - '-A OS_FIREWALL_ALLOW -p tcp -m state --state NEW -m tcp --dport 2222 -j ACCEPT'
  - '-A OS_FIREWALL_ALLOW -p tcp -m state --state NEW -m multiport --dports 49152:49664 -j ACCEPT'

- name: reload iptables
  systemd:
    name: iptables
    state: reloaded

A couple of packages are required for CNS:

roles/cns/tasks/packages.yaml

---

- name: install cns and heketi packages
  package:
    name: '{{ item }}'
    state: latest
  with_items:
  - cns-deploy
  - heketi-client

The OpenShift environment is prepared by creating a project and setting some privileges:

roles/cns/tasks/oc.yaml

---
- name: log in as cluster admin
  command: oc login -u system:admin -n default

- name:  add cluster admin role for gluster
  command: oadm policy add-cluster-role-to-user cluster-admin gluster

- name: login as gluster user
  command: oc login -u gluster -p {{ gluster_pw }}

- name: create storage project
  command: oc new-project rhgs-cluster

- name: add scc policy default
  command: oadm policy add-scc-to-user privileged -z default

- name: add scc policy router
  command: oadm policy add-scc-to-user privileged -z router

Heketi is used to install gluster and requires a JSON file describing the topology of the storage nodes. heketitopo.yaml generates this topology file using the template module and facts from the hosts:

roles/cns/tasks/heketitopo.yaml

---
- template:
   src: topology-hpe.j2
   dest: /usr/share/heketi/topology-hpe.json
   owner: bin
   group: wheel
   mode: 0644

The template iterates over the cns group to define the hosts:

roles/cns/templates/topology-hpe.j2

{
  "clusters": [
    {
      "nodes": [
        {% for host in groups['cns'] %}
          {
          "node": {
            "hostnames": {
              "manage": [
                "{{ host }}"
              ],
              "storage": [
                "{{ hostvars[host]['ansible_default_ipv4']['address'] }}"
              ]
              },
              "zone": 1
              },
              "devices": [
                "/dev/sdb"
              ]
{% if loop.last %}
          }
{% else %}
          },
{% endif %}
{% endfor %}
        ]
    }
  ]
}

The generated file is then used to deploy the CNS solution:

roles/cns/tasks/cnsdeploy.yaml

---
- name: log in as gluster user
  command: oc login -u gluster -p {{ gluster_pw }}


- name: deploy cns
  command: cns-deploy -n rhgs-cluster -g /usr/share/heketi/topology-hpe.json -y  --admin-key {{ gluster_pw }} --user-key {{ gluster_pw }}
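
Once cns-deploy completes, a quick sanity check (not part of the playbooks) is to confirm that the glusterfs and heketi pods are running in the storage project:

$ oc get pods -n rhgs-cluster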