Red Hat Ceph Storage 5 rootless deployment utilizing ansible ad-hoc commands

Issue

  • Restricted and automated infrastructures limit the use of the superuser account to enhance security
  • Permissions must be delegated with the minimum set of privileges required

Environment

  • Red Hat Ceph Storage 5

Prerequisites for the Environment

  • Ansible for managing and configuring the target nodes
  • A distributed user with ssh key-based access
  • Password-less sudo permissions for the distributed user
  • Root ssh access revoked in the ssh daemon configuration

Resolution

  • Red Hat Ceph Storage 5 can be switched to a non-privileged user for deployment and orchestration of its components by deploying ssh key-based authentication and password-less sudo access.

  • Should issues occur in the environment, please open a new support case in the Red Hat Customer Portal referring to this article.

Note! At this point in time, Red Hat neither supports nor recommends limiting the sudo permissions to a specific set of commands.
Note! Red Hat strongly recommends verifying the described deployment in test or development environments.

Described scenarios


Creating a non-privileged distributed user for cephadm

  • Choose a unique user name that is not shared with other services to avoid conflicts
  • Limit the necessary command set to a minimum

    # ansible -b -i hosts \
         -m user \
         -a 'name=cephorch \
             comment="Cephadm orchestrator" \
             generate_ssh_key=true \
             ssh_key_comment="Cephadm orchestrator" \
             password_lock=true' \
             nodes
    
    node1.example.com | CHANGED => {
        "ansible_facts": {
            "discovered_interpreter_python": "/usr/libexec/platform-python"
        },
        "append": false,
        "changed": false,
        "comment": "Cephadm orchestrator",
        "group": 1001,
        "home": "/home/cephorch",
        "move_home": false,
        "name": "cephorch",
        "shell": "/bin/bash",
        "ssh_fingerprint": "3072 SHA256:T0...[output omitted]... Cephadm orchestrator (RSA)",
        "ssh_key_file": "/home/cephorch/.ssh/id_rsa",
        "ssh_public_key": "ssh-rsa AAA...[output omitted]... Cephadm orchestrator",
        "state": "present",
        "uid": 1001
    }
    ...[output omitted]...
    node2.example.com | CHANGED => {
    ...[output omitted]...
    node3.example.com | CHANGED => {
    ...[output omitted]...
    
  • Note that the output includes the ssh public key generated on the selected bootstrap node.

  • The value of this ssh public key must be distributed to all nodes for ssh key-based access.
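
The prerequisites list password-less sudo, but the steps above do not show how it is granted to the cephorch user. The following is a minimal sketch, assuming unrestricted NOPASSWD access is intended (in line with the note that limiting sudo to specific commands is not supported). The helper name sudoers_line and the file name cephorch.sudoers are hypothetical and not part of the original procedure.

```shell
#!/bin/sh
# Sketch only: sudoers_line and cephorch.sudoers are hypothetical names.
# Prints a sudoers rule granting password-less sudo to the given user.
sudoers_line() {
    printf '%s ALL=(ALL) NOPASSWD: ALL\n' "$1"
}

# Example use on the management node (assumes the inventory used above):
#   sudoers_line cephorch > cephorch.sudoers
#   ansible -b -i hosts -m copy \
#       -a 'src=cephorch.sudoers dest=/etc/sudoers.d/cephorch mode=0440 validate="/usr/sbin/visudo -cf %s"' \
#       nodes
```

The validate option of the copy module lets visudo check the file before it is moved into place, so a malformed rule should not end up in /etc/sudoers.d.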

Deploying ssh key-based access for the non-privileged distributed user

  • If necessary, retrieve the public key from the bootstrap node as follows and store it in a file on the management node

    # ansible -b -i hosts \
         -a 'cat /home/cephorch/.ssh/id_rsa.pub' \
         bootstrap | tail -1 > \
      cephorch.pub
    
    # ansible -b -i hosts \
        -m authorized_key \
        -a 'user=cephorch \
            state=present \
            key={{ lookup("file", "cephorch.pub") }}' \
        nodes
    
    node1.example.com | CHANGED => {
        "ansible_facts": {
            "discovered_interpreter_python": "/usr/libexec/platform-python"
        },
        "changed": true,
        "comment": null,
        "exclusive": false,
        "follow": false,
        "key": "ssh-rsa AAAAB3...[output omitted]... Cephadm orchestrator",
        "key_options": null,
        "keyfile": "/home/cephorch/.ssh/authorized_keys",
        "manage_dir": true,
        "path": null,
        "state": "present",
        "user": "cephorch",
        "validate_certs": true
    }
    node2.example.com | CHANGED => {
    ...[output omitted]...
    node3.example.com | CHANGED => {
    ...[output omitted]...
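
Because the public key is extracted with tail -1 from the Ansible output, a failed command would leave an error message in cephorch.pub instead of a key. The following sketch checks the file before it is distributed. The helper name looks_like_public_key is hypothetical, and the pattern only covers the common RSA, Ed25519 and ECDSA key types.

```shell
#!/bin/sh
# Sketch only: looks_like_public_key is a hypothetical helper name.
# Succeeds when the file holds exactly one line shaped like an OpenSSH
# public key: "<key type> <base64 data> [comment]".
looks_like_public_key() {
    keyfile="$1"
    [ "$(wc -l < "$keyfile")" -eq 1 ] || return 1
    grep -Eq '^(ssh-rsa|ssh-ed25519|ecdsa-sha2-nistp(256|384|521)) [A-Za-z0-9+/]+=*( .*)?$' "$keyfile"
}

# Example use on the management node:
#   looks_like_public_key cephorch.pub || echo "cephorch.pub does not contain a public key" >&2
```

Where OpenSSH is installed, ssh-keygen -l -f cephorch.pub is a stricter check, since it parses the key and prints its fingerprint.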
    

Verify connectivity to all nodes from the bootstrap node

  • Deploy an ansible.cfg and an inventory file (hosts) in the home directory of the newly created user.

    # ssh ansible@bootstrap
    Activate the web console with: systemctl enable --now cockpit.socket
    Last login: Sun Dec 19 08:28:41 2021 from 127.0.0.1
    $ sudo runuser -u cephorch /bin/bash
    [cephorch@bootstrap ~]$ cat <<EOF > ansible.cfg
    [defaults]
    inventory = ./hosts
    remote_user = cephorch
    EOF
    [cephorch@bootstrap ~]$ cat <<EOF > hosts
    bootstrap:
      hosts:
        node1.example.com:
    nodes:
      hosts:
        node1.example.com:
        node2.example.com:
        node3.example.com:
    EOF
    [cephorch@bootstrap ~]$ ansible -m ping nodes,bootstrap
    node1.example.com | SUCCESS => {
        "ansible_facts": {
            "discovered_interpreter_python": "/usr/libexec/platform-python"
        },
        "changed": false,
        "ping": "pong"
    }
    node2.example.com | SUCCESS => {
        "ansible_facts": {
            "discovered_interpreter_python": "/usr/libexec/platform-python"
        },
        "changed": false,
        "ping": "pong"
    }
    node3.example.com | SUCCESS => {
        "ansible_facts": {
            "discovered_interpreter_python": "/usr/bin/python3.6"
        },
        "changed": false,
        "ping": "pong"
    }
    

Bootstrap your Red Hat Ceph Storage 5

  • Disable root login in the ssh daemon by setting the PermitRootLogin option to no.

    [cephorch@bootstrap ~]$ ansible -b -m lineinfile \
        -a 'path=/etc/ssh/sshd_config \
            regexp="^PermitRootLogin yes" \
            line="PermitRootLogin no" \
            state=present' \
        nodes
    
  • Restart the sshd daemon for the changes to take effect.

    [cephorch@bootstrap ~]$ ansible -b -m service \
                                    -a 'name=sshd state=restarted' nodes
    
  • Update the cephadm-ansible ansible.cfg file with the remote_user option, as shown in the following command.

  • Alternatively, pass the remote user on the command line with the -u (--user) switch.

    [cephorch@bootstrap ~]$ ansible -b -m lineinfile \
        -a 'path=/usr/share/cephadm-ansible/ansible.cfg \
             line="remote_user = cephorch" \
             insertafter="\[defaults\]" \
             state=present' \
        localhost
    
    localhost | CHANGED => {
        "backup": "",
        "changed": true,
        "msg": "line added"
    }
    

Run the cephadm-preflight.yml playbook

  • Prepare the Red Hat Ceph Storage nodes by running the cephadm-preflight.yml playbook.

    [cephorch@bootstrap ~]$ cd /usr/share/cephadm-ansible
    [cephorch@bootstrap cephadm-ansible]$ ansible-playbook -b \
      -i /home/cephorch/hosts \
      --extra-vars='ceph_origin=' \
      cephadm-preflight.yml
    
    PLAY [all] ******************************************************************************
    
    TASK [enable red hat storage tools repository] ******************************************
    Sunday 19 December 2021  09:07:24 +0000 (0:00:00.024)       0:00:00.024 
    
    ...[output omitted]...
    
    PLAY RECAP ******************************************************************************
    node1.example.com            : ok=2    changed=1    unreachable=0    failed=0    skipped=8    rescued=0    ignored=0   
    node2.example.com            : ok=2    changed=1    unreachable=0    failed=0    skipped=8    rescued=0    ignored=0   
    node3.example.com            : ok=2    changed=1    unreachable=0    failed=0    skipped=8    rescued=0    ignored=0   
    
    Sunday 19 December 2021  09:09:20 +0000 (0:00:00.891)       0:01:56.104 
    =============================================================================== 
    install prerequisites packages ----------------------------------------------- 114.04s
    ensure chronyd is running ---------------------------------------------------- 0.89s
    configure red hat ceph stable community repository --------------------------- 0.20s
    remove ceph_stable repositories ---------------------------------------------- 0.15s
    configure red hat ceph stable noarch community repository -------------------- 0.15s
    configure red hat ceph community repository stable key ----------------------- 0.13s
    enable red hat storage tools repository -------------------------------------- 0.13s
    install epel-release --------------------------------------------------------- 0.13s
    fetch ceph red hat development repository ------------------------------------ 0.13s
    configure ceph red hat development repository -------------------------------- 0.12s
    

Bootstrapping the Red Hat Ceph Storage 5 cluster with cephadm

  • Extend the cephadm bootstrap command with the following three mandatory options to reflect the changes at the ssh transport level.

    • --ssh-private-key
    • --ssh-public-key
    • --ssh-user

    [cephorch@bootstrap ~]$ sudo cephadm bootstrap \
        --mon-ip 127.0.0.1 \
        --ssh-private-key /home/cephorch/.ssh/id_rsa \
        --ssh-public-key /home/cephorch/.ssh/id_rsa.pub \
        --ssh-user cephorch \
        --allow-fqdn-hostname
    
    Verifying podman|docker is present...
    Verifying lvm2 is present...
    Verifying time synchronization is in place...
    Unit chronyd.service is enabled and running
    Repeating the final host check...
    podman|docker (/bin/podman) is present
    systemctl is present
    lvcreate is present
    Unit chronyd.service is enabled and running
    Host looks OK
    Cluster fsid: ec428a82-60ac-11ec-8c29-52540059a157
    Verifying IP 127.0.0.1 port 3300 ...
    Verifying IP 127.0.0.1 port 6789 ...
    Mon IP `127.0.0.1` is in CIDR network `127.0.0.0/8`
    - internal network (--cluster-network) has not been provided, 
      OSD replication will default to the public_network
    
    ...[output omitted]...
    
    Ceph Dashboard is now available at:
    
             URL: https://node1.example.com:8443/
            User: admin
        Password: xxxxxxxxx
    
    Enabling client.admin keyring and conf on hosts with "admin" label
    You can access the Ceph CLI with:
    
        sudo /sbin/cephadm shell --fsid ec428a82-60ac-11ec-8c29-52540059a157 \
         -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring
    
    Please consider enabling telemetry to help improve Ceph:
    
        ceph telemetry on
    
    For more information see:
    
        https://docs.ceph.com/docs/pacific/mgr/telemetry/
    
    Bootstrap complete.
    
  • Verify that the bootstrapped Red Hat Ceph Storage 5 is up and is ready for orchestration.

    [cephorch@bootstrap ~]$ sudo cephadm shell 
    
    [ceph: root@node1 /]# ceph health
    HEALTH_WARN OSD count 0 < osd_pool_default_size 3
    
    [ceph: root@node1 /]# ceph -s
      cluster:
        id:     ec428a82-60ac-11ec-8c29-52540059a157
        health: HEALTH_WARN
                OSD count 0 < osd_pool_default_size 3
    
      services:
        mon: 1 daemons, quorum node1.example.com (age 40m)
        mgr: node1.example.com.vjhvdm(active, since 39m)
        osd: 0 osds: 0 up, 0 in
    
      data:
        pools:   0 pools, 0 pgs
        objects: 0 objects, 0 B
        usage:   0 B used, 0 B / 0 B avail
        pgs:     
    
    [ceph: root@node1 /]# ceph config-key get mgr/cephadm/ssh_user
    cephorch
    
  • Continue with the steps in Extending Red Hat Ceph Storage 5

Update an existing Red Hat Ceph Storage 5

An existing Red Hat Ceph Storage 5 cluster can be changed to an unprivileged user by reusing the ssh key generated by the cephadm bootstrap command.

Retrieve the cluster ssh key for the non-privileged distributed user

  • Retrieve the public key from the cluster configuration as follows and store it in a file on the management node

    # cephadm shell ceph config-key get \
         mgr/cephadm/ssh_identity_pub | \
         tail -1 > \
      cephorch.pub
    
  • Distribute the ssh public key to all nodes for the unprivileged user

    # ansible -b -i hosts \
        -m authorized_key \
        -a 'user=cephorch \
            state=present \
            key={{ lookup("file", "cephorch.pub") }}' \
        nodes
    
    node1.example.com | CHANGED => {
        "ansible_facts": {
            "discovered_interpreter_python": "/usr/libexec/platform-python"
        },
        "changed": true,
        "comment": null,
        "exclusive": false,
        "follow": false,
        "key": "ssh-rsa AAAAB3...[output omitted]... Cephadm orchestrator",
        "key_options": null,
        "keyfile": "/home/cephorch/.ssh/authorized_keys",
        "manage_dir": true,
        "path": null,
        "state": "present",
        "user": "cephorch",
        "validate_certs": true
    }
    node2.example.com | CHANGED => {
    ...[output omitted]...
    node3.example.com | CHANGED => {
    ...[output omitted]...
    
  • Verify that the nodes are accessible prior to changing the Red Hat Ceph Storage configuration

    [ceph: root@bootstrap /]# ceph config-key get \
        mgr/cephadm/ssh_identity_key > \
        /root/.ssh/id_rsa
    
    [ceph: root@bootstrap /]# chmod 0600 /root/.ssh/id_rsa
    
    [ceph: root@bootstrap /]# ssh cephorch@node1.example.com "date"
    Mon Dec 20 11:53:08 UTC 2021
    
    [ceph: root@bootstrap /]# ssh cephorch@node2.example.com "date"
    Mon Dec 20 11:53:08 UTC 2021
    
    [ceph: root@bootstrap /]# ssh cephorch@node3.example.com "date"
    Mon Dec 20 11:53:25 UTC 2021
    
  • Update the ssh user in the Red Hat Ceph Storage 5 configuration accordingly

    [ceph: root@bootstrap /]# ceph config-key \
                                   set mgr/cephadm/ssh_user cephorch
    
    
  • Disable root login in the ssh daemon by setting the PermitRootLogin option to no.

    [cephorch@bootstrap ~]$ ansible -b -m lineinfile \
        -a 'path=/etc/ssh/sshd_config \
            regexp="^PermitRootLogin yes" \
            line="PermitRootLogin no" \
            state=present' \
        nodes
    
  • Restart the sshd daemon for the changes to take effect.

    [cephorch@bootstrap ~]$ ansible -b -m service \
                                    -a 'name=sshd state=restarted' nodes
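
The lineinfile call above only replaces a line that matches "^PermitRootLogin yes"; for any other existing value the new line is appended at the end of the file. Since sshd uses the first value it obtains for each keyword, an earlier directive would still win in that case. The following sketch reports the first active PermitRootLogin directive of a configuration file. The helper name first_permit_root_login is hypothetical, and the sketch does not evaluate Include files or Match blocks.

```shell
#!/bin/sh
# Sketch only: first_permit_root_login is a hypothetical helper name.
# Prints the value of the first uncommented PermitRootLogin directive in
# the given file. Include files and Match blocks are not evaluated.
first_permit_root_login() {
    awk 'tolower($1) == "permitrootlogin" { print tolower($2); exit }' "$1"
}

# Example use as root on a node (expected to print "no" after the change):
#   first_permit_root_login /etc/ssh/sshd_config
```

For the configuration as sshd itself evaluates it, the extended test mode (sshd -T, run as root) prints the effective settings, including permitrootlogin.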
    

Extending Red Hat Ceph Storage 5

After updating or bootstrapping Red Hat Ceph Storage 5 with an unprivileged user, proceed with extending the cluster or with maintenance tasks on its configuration.

  • Use your favorite editor to write your cluster definition (a basic example is shown below)

    # Host definitions
    service_type: host
    addr: 127.0.0.1
    hostname: node1.example.com
    labels:
    - mon
    - mgr
    - osd
    - _admin
    ---
    service_type: host
    addr: 127.0.0.2
    hostname: node2.example.com
    labels:
    - mon
    - mgr
    - osd
    ---
    service_type: host
    addr: 127.0.0.3
    hostname: node3.example.com
    labels:
    - mon
    - mgr
    - osd
    ---
    # Monitor deployment definition
    service_type: mon
    placement:
      hosts:
        - node1.example.com
        - node2.example.com
        - node3.example.com
    ---
    # Manager deployment definition
    service_type: mgr
    placement:
      hosts:
        - node1.example.com
        - node2.example.com
        - node3.example.com
    ---
    # OSD deployment definition
    service_type: osd
    service_id: default_osd_group
    placement:
      host_pattern: "*"
    data_devices:
      paths:
        - /dev/vdb
        - /dev/vdc
        - /dev/vdd
    
  • Apply the definition to the Red Hat Ceph Storage 5 cluster

  • Verify that the cluster is healthy and the configuration is in place

    [ceph: root@node1 /]# ceph orch apply -i rollout.yml 
    Added host 'node1.example.com' with addr '127.0.0.1'
    Added host 'node2.example.com' with addr '127.0.0.2'
    Added host 'node3.example.com' with addr '127.0.0.3'
    Scheduled mon update...
    Scheduled mgr update...
    Scheduled osd.default_osd_group update...
    
    [ceph: root@node1 /]# ceph -s
      cluster:
        id:     ec428a82-60ac-11ec-8c29-52540059a157
        health: HEALTH_OK
    
      services:
        mon: 3 daemons, quorum node1.example.com,node2,node3 (age 5s)
        mgr: node1.example.com.vjhvdm(active, since 61m), standbys: node2.xebxcp, node3.slhagg
        osd: 7 osds: 6 up (since 78s), 7 in (since 1.50128s); 1 remapped pgs
    
      data:
        pools:   1 pools, 1 pgs
        objects: 0 objects, 0 B
        usage:   30 MiB used, 60 GiB / 60 GiB avail
        pgs:     1 active+clean+remapped
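
Before applying a definition such as the rollout.yml used in this section, it can help to confirm that every host named in a placement list is also declared as a host. The following sketch is not a YAML parser; it assumes the simple layout of the example definition (one "hostname:" key per host entry, placement hosts listed as "- name" items under "hosts:"). The helper name undefined_placement_hosts is hypothetical.

```shell
#!/bin/sh
# Sketch only: undefined_placement_hosts is a hypothetical helper name.
# Prints hostnames that appear in a "hosts:" placement list but have no
# matching "hostname:" entry in the same specification file.
undefined_placement_hosts() {
    awk '
        $1 == "hostname:"       { defined[$2] = 1 }
        $1 == "hosts:"          { in_hosts = 1; next }
        in_hosts && $1 == "-"   { used[$2] = 1; next }
                                { in_hosts = 0 }
        END { for (h in used) if (!(h in defined)) print h }
    ' "$1" | sort
}

# Example use inside the cephadm shell:
#   undefined_placement_hosts rollout.yml
```

An empty result means that all placement hosts are declared; any name printed points to a typo or a missing host entry.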