Container Guide
Deploying and Managing Red Hat Ceph Storage in Containers
Abstract
Chapter 1. Deploying Red Hat Ceph Storage in Containers
This chapter describes how to use the Ansible application with the ceph-ansible playbook to deploy Red Hat Ceph Storage 3 in containers.
- To install the Red Hat Ceph Storage, see Section 1.2, “Installing a Red Hat Ceph Storage Cluster in Containers”.
- To install the Ceph Object Gateway, see Section 1.3, “Installing the Ceph Object Gateway in a Container”.
- To install Metadata Servers, see Section 1.4, “Installing Metadata Servers”.
- To learn about the Ansible --limit option, see Section 1.5, “Understanding the limit option”.
1.1. Prerequisites
- Obtain a valid customer subscription.
- Prepare the cluster nodes. On each node:
1.1.1. Registering Red Hat Ceph Storage Nodes to the CDN and Attaching Subscriptions
Register each Red Hat Ceph Storage (RHCS) node to the Content Delivery Network (CDN) and attach the appropriate subscription so that the node has access to software repositories. Each RHCS node must be able to access the full Red Hat Enterprise Linux 7 base content and the extras repository content.
Prerequisites
- A valid Red Hat subscription
- RHCS nodes must be able to connect to the Internet.
For RHCS nodes that cannot access the internet during installation, you must first follow these steps on a system with internet access:
Start a local Docker registry:
# docker run -d -p 5000:5000 --restart=always --name registry registry:2
Pull the Red Hat Ceph Storage 3.x image from the Red Hat Customer Portal:
# docker pull registry.access.redhat.com/rhceph/rhceph-3-rhel7
Tag the image:
# docker tag registry.access.redhat.com/rhceph/rhceph-3-rhel7 <local-host-fqdn>:5000/cephimageinlocalreg
Replace <local-host-fqdn> with your local host FQDN.
Push the image to the local Docker registry you started:
# docker push <local-host-fqdn>:5000/cephimageinlocalreg
Replace <local-host-fqdn> with your local host FQDN.
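To confirm that the image was pushed successfully, you can query the local registry catalog endpoint (a quick check using the Docker Registry v2 API; replace <local-host-fqdn> with your local host FQDN). The output shown is illustrative:
# curl http://<local-host-fqdn>:5000/v2/_catalog
{"repositories":["cephimageinlocalreg"]}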
Procedure
Perform the following steps on all nodes in the storage cluster as the root user.
Register the node. When prompted, enter your Red Hat Customer Portal credentials:
# subscription-manager register
Pull the latest subscription data from the CDN:
# subscription-manager refresh
List all available subscriptions for Red Hat Ceph Storage:
# subscription-manager list --available --all --matches="*Ceph*"
Identify the appropriate subscription and retrieve its Pool ID.
Attach the subscription:
# subscription-manager attach --pool=$POOL_ID
Replace $POOL_ID with the Pool ID identified in the previous step.
Disable the default software repositories. Then, enable the Red Hat Enterprise Linux 7 Server and Red Hat Enterprise Linux 7 Server Extras repositories:
# subscription-manager repos --disable=*
# subscription-manager repos --enable=rhel-7-server-rpms
# subscription-manager repos --enable=rhel-7-server-extras-rpms
Update the system to receive the latest packages:
# yum update
Additional Resources
- See the Registering a System and Managing Subscriptions chapter in the System Administrator’s Guide for Red Hat Enterprise Linux 7.
1.1.2. Creating an Ansible user with sudo access
Ansible must be able to log into all the Red Hat Ceph Storage (RHCS) nodes as a user that has root privileges to install software and create configuration files without prompting for a password. You must create an Ansible user with password-less root access on all nodes in the storage cluster when deploying and configuring a Red Hat Ceph Storage cluster with Ansible.
Prerequisite
- Having root or sudo access to all nodes in the storage cluster.
Procedure
Log in to a Ceph node as the root user:
ssh root@$HOST_NAME
Replace $HOST_NAME with the host name of the Ceph node.
Example
# ssh root@mon01
Enter the root password when prompted.
Create a new Ansible user:
adduser $USER_NAME
Replace $USER_NAME with the new user name for the Ansible user.
Example
# adduser admin
Important: Do not use ceph as the user name. The ceph user name is reserved for the Ceph daemons. A uniform user name across the cluster can improve ease of use, but avoid obvious user names, because intruders typically use them for brute-force attacks.
Set a new password for this user:
# passwd $USER_NAME
Replace $USER_NAME with the new user name for the Ansible user.
Example
# passwd admin
Enter the new password twice when prompted.
Configure sudo access for the newly created user:
cat << EOF >/etc/sudoers.d/$USER_NAME
$USER_NAME ALL = (root) NOPASSWD:ALL
EOF
Replace $USER_NAME with the new user name for the Ansible user.
Example
# cat << EOF >/etc/sudoers.d/admin
admin ALL = (root) NOPASSWD:ALL
EOF
Assign the correct file permissions to the new file:
chmod 0440 /etc/sudoers.d/$USER_NAME
Replace $USER_NAME with the new user name for the Ansible user.
Example
# chmod 0440 /etc/sudoers.d/admin
Additional Resources
- The Adding a New User section in the System Administrator’s Guide for Red Hat Enterprise Linux 7.
1.1.3. Enabling Password-less SSH for Ansible
Generate an SSH key pair on the Ansible administration node and distribute the public key to each node in the storage cluster so that Ansible can access the nodes without being prompted for a password.
Prerequisites
Procedure
Perform the following steps from the Ansible administration node as the Ansible user.
Generate the SSH key pair, accept the default file name and leave the passphrase empty:
[user@admin ~]$ ssh-keygen
Copy the public key to all nodes in the storage cluster:
ssh-copy-id $USER_NAME@$HOST_NAME
Replace:
- $USER_NAME with the new user name for the Ansible user.
- $HOST_NAME with the host name of the Ceph node.
Example
[user@admin ~]$ ssh-copy-id ceph-admin@ceph-mon01
Create and edit the ~/.ssh/config file.
Important: By creating and editing the ~/.ssh/config file you do not have to specify the -u $USER_NAME option each time you execute the ansible-playbook command.
Create the SSH config file:
[user@admin ~]$ touch ~/.ssh/config
Open the config file for editing. Set the Hostname and User options for each node in the storage cluster:
Host node1
   Hostname $HOST_NAME
   User $USER_NAME
Host node2
   Hostname $HOST_NAME
   User $USER_NAME
...
Replace:
- $HOST_NAME with the host name of the Ceph node.
- $USER_NAME with the new user name for the Ansible user.
Example
Host node1
   Hostname monitor
   User admin
Host node2
   Hostname osd
   User admin
Host node3
   Hostname gateway
   User admin
Set the correct file permissions for the ~/.ssh/config file:
[admin@admin ~]$ chmod 600 ~/.ssh/config
Additional Resources
- The ssh_config(5) manual page
- The OpenSSH chapter in the System Administrator’s Guide for Red Hat Enterprise Linux 7
1.1.4. Configuring a firewall for Red Hat Ceph Storage
Red Hat Ceph Storage (RHCS) uses the firewalld service.
The Monitor daemons use port 6789 for communication within the Ceph storage cluster.
On each Ceph OSD node, the OSD daemons use several ports in the range 6800-7300:
- One for communicating with clients and monitors over the public network
- One for sending data to other OSDs over a cluster network, if available; otherwise, over the public network
- One for exchanging heartbeat packets over a cluster network, if available; otherwise, over the public network
The Ceph Manager (ceph-mgr) daemons use ports in the range 6800-7300. Consider colocating the ceph-mgr daemons with Ceph Monitors on the same nodes.
The Ceph Metadata Server nodes (ceph-mds) use port 6800.
The Ceph Object Gateway nodes use port 7480 by default. However, you can change the default port, for example to port 80.
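As an illustration, the gateway port is typically set in the Ansible configuration before deployment; a minimal sketch, assuming the ceph-ansible radosgw_civetweb_port parameter in the group_vars/all.yml file:
radosgw_civetweb_port: 80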
To use the SSL/TLS service, open port 443.
Prerequisite
- Network hardware is connected.
Procedure
On all RHCS nodes, start the firewalld service. Enable it to run on boot, and ensure that it is running:
# systemctl enable firewalld
# systemctl start firewalld
# systemctl status firewalld
On all Monitor nodes, open port 6789 on the public network:
[root@monitor ~]# firewall-cmd --zone=public --add-port=6789/tcp
[root@monitor ~]# firewall-cmd --zone=public --add-port=6789/tcp --permanent
To limit access based on the source address:
firewall-cmd --zone=public --add-rich-rule="rule family="ipv4" \
source address="$IP_ADDR/$NETMASK_PREFIX" port protocol="tcp" \
port="6789" accept"
firewall-cmd --zone=public --add-rich-rule="rule family="ipv4" \
source address="$IP_ADDR/$NETMASK_PREFIX" port protocol="tcp" \
port="6789" accept" --permanent
Replace:
- $IP_ADDR with the network address of the Monitor node.
- $NETMASK_PREFIX with the netmask in CIDR notation.
Example
[root@monitor ~]# firewall-cmd --zone=public --add-rich-rule="rule family="ipv4" \
source address="192.168.0.11/24" port protocol="tcp" \
port="6789" accept"
[root@monitor ~]# firewall-cmd --zone=public --add-rich-rule="rule family="ipv4" \
source address="192.168.0.11/24" port protocol="tcp" \
port="6789" accept" --permanent
On all OSD nodes, open ports 6800-7300 on the public network:
[root@osd ~]# firewall-cmd --zone=public --add-port=6800-7300/tcp
[root@osd ~]# firewall-cmd --zone=public --add-port=6800-7300/tcp --permanent
If you have a separate cluster network, repeat the commands with the appropriate zone.
On all Ceph Manager (ceph-mgr) nodes (usually the same nodes as the Monitors), open ports 6800-7300 on the public network:
[root@monitor ~]# firewall-cmd --zone=public --add-port=6800-7300/tcp
[root@monitor ~]# firewall-cmd --zone=public --add-port=6800-7300/tcp --permanent
If you have a separate cluster network, repeat the commands with the appropriate zone.
On all Ceph Metadata Server (ceph-mds) nodes, open port 6800 on the public network:
[root@monitor ~]# firewall-cmd --zone=public --add-port=6800/tcp
[root@monitor ~]# firewall-cmd --zone=public --add-port=6800/tcp --permanent
If you have a separate cluster network, repeat the commands with the appropriate zone.
On all Ceph Object Gateway nodes, open the relevant port or ports on the public network.
To open the default port 7480:
[root@gateway ~]# firewall-cmd --zone=public --add-port=7480/tcp
[root@gateway ~]# firewall-cmd --zone=public --add-port=7480/tcp --permanent
To limit access based on the source address:
firewall-cmd --zone=public --add-rich-rule="rule family="ipv4" \
source address="$IP_ADDR/$NETMASK_PREFIX" port protocol="tcp" \
port="7480" accept"
firewall-cmd --zone=public --add-rich-rule="rule family="ipv4" \
source address="$IP_ADDR/$NETMASK_PREFIX" port protocol="tcp" \
port="7480" accept" --permanent
Replace:
- $IP_ADDR with the network address of the object gateway node.
- $NETMASK_PREFIX with the netmask in CIDR notation.
Example
[root@gateway ~]# firewall-cmd --zone=public --add-rich-rule="rule family="ipv4" \
source address="192.168.0.31/24" port protocol="tcp" \
port="7480" accept"
[root@gateway ~]# firewall-cmd --zone=public --add-rich-rule="rule family="ipv4" \
source address="192.168.0.31/24" port protocol="tcp" \
port="7480" accept" --permanent
Optional. If you changed the default Ceph Object Gateway port, for example, to port 80, open this port:
[root@gateway ~]# firewall-cmd --zone=public --add-port=80/tcp
[root@gateway ~]# firewall-cmd --zone=public --add-port=80/tcp --permanent
To limit access based on the source address, run the following commands:
firewall-cmd --zone=public --add-rich-rule="rule family="ipv4" \
source address="$IP_ADDR/$NETMASK_PREFIX" port protocol="tcp" \
port="80" accept"
firewall-cmd --zone=public --add-rich-rule="rule family="ipv4" \
source address="$IP_ADDR/$NETMASK_PREFIX" port protocol="tcp" \
port="80" accept" --permanent
Replace:
- $IP_ADDR with the network address of the object gateway node.
- $NETMASK_PREFIX with the netmask in CIDR notation.
Example
[root@gateway ~]# firewall-cmd --zone=public --add-rich-rule="rule family="ipv4" \
source address="192.168.0.31/24" port protocol="tcp" \
port="80" accept"
[root@gateway ~]# firewall-cmd --zone=public --add-rich-rule="rule family="ipv4" \
source address="192.168.0.31/24" port protocol="tcp" \
port="80" accept" --permanent
Optional. To use SSL/TLS, open port 443:
[root@gateway ~]# firewall-cmd --zone=public --add-port=443/tcp
[root@gateway ~]# firewall-cmd --zone=public --add-port=443/tcp --permanent
To limit access based on the source address, run the following commands:
firewall-cmd --zone=public --add-rich-rule="rule family="ipv4" \
source address="$IP_ADDR/$NETMASK_PREFIX" port protocol="tcp" \
port="443" accept"
firewall-cmd --zone=public --add-rich-rule="rule family="ipv4" \
source address="$IP_ADDR/$NETMASK_PREFIX" port protocol="tcp" \
port="443" accept" --permanent
Replace:
- $IP_ADDR with the network address of the object gateway node.
- $NETMASK_PREFIX with the netmask in CIDR notation.
Example
[root@gateway ~]# firewall-cmd --zone=public --add-rich-rule="rule family="ipv4" \
source address="192.168.0.31/24" port protocol="tcp" \
port="443" accept"
[root@gateway ~]# firewall-cmd --zone=public --add-rich-rule="rule family="ipv4" \
source address="192.168.0.31/24" port protocol="tcp" \
port="443" accept" --permanent
Additional Resources
- For more information about public and cluster network, see Verifying the Network Configuration for Red Hat Ceph Storage.
- For additional details on firewalld, see the Using Firewalls chapter in the Security Guide for Red Hat Enterprise Linux 7.
1.2. Installing a Red Hat Ceph Storage Cluster in Containers
Use the Ansible application with the ceph-ansible playbook to install Red Hat Ceph Storage 3 in containers.
A Ceph cluster used in production usually consists of ten or more nodes. To deploy Red Hat Ceph Storage as a container image, Red Hat recommends using a Ceph cluster that consists of at least three OSD and three Monitor nodes.
Ceph can run with one monitor; however, to ensure high availability in a production cluster, Red Hat will only support deployments with at least three monitor nodes.
Prerequisites
On the Ansible administration node, enable the Red Hat Ceph Storage 3 Tools repository and Ansible repository:
[root@admin ~]# subscription-manager repos --enable=rhel-7-server-rhceph-3-tools-rpms --enable=rhel-7-server-ansible-2.4-rpms
On the Ansible administration node, install the ceph-ansible package:
[root@admin ~]# yum install ceph-ansible
Procedure
Use the following commands from the Ansible administration node if not instructed otherwise.
In the user’s home directory, create the ceph-ansible-keys directory where Ansible stores temporary values generated by the ceph-ansible playbook:
[user@admin ~]$ mkdir ~/ceph-ansible-keys
Create a symbolic link to the /usr/share/ceph-ansible/group_vars directory in the /etc/ansible/ directory:
[root@admin ~]# ln -s /usr/share/ceph-ansible/group_vars /etc/ansible/group_vars
Navigate to the /usr/share/ceph-ansible/ directory:
[user@admin ~]$ cd /usr/share/ceph-ansible
Create new copies of the yml.sample files:
[root@admin ceph-ansible]# cp group_vars/all.yml.sample group_vars/all.yml
[root@admin ceph-ansible]# cp group_vars/osds.yml.sample group_vars/osds.yml
[root@admin ceph-ansible]# cp site-docker.yml.sample site-docker.yml
Edit the copied files.
Edit the group_vars/all.yml file. See the table below for the most common required and optional parameters to uncomment. Note that the table does not include all parameters.

| Option | Value | Required | Notes |
|---|---|---|---|
| monitor_interface | The interface that the Monitor nodes listen to | monitor_interface, monitor_address, or monitor_address_block is required | |
| monitor_address | The address that the Monitor nodes listen to | monitor_interface, monitor_address, or monitor_address_block is required | |
| monitor_address_block | The subnet of the Ceph public network | monitor_interface, monitor_address, or monitor_address_block is required | Use when the IP addresses of the nodes are unknown, but the subnet is known |
| ip_version | ipv6 | Yes if using IPv6 addressing | |
| journal_size | The required size of the journal in MB | No | |
| public_network | The IP address and netmask of the Ceph public network | Yes | See the Verifying the Network Configuration for Red Hat Ceph Storage section in the Installation Guide for Red Hat Enterprise Linux |
| cluster_network | The IP address and netmask of the Ceph cluster network | No | |
| ceph_docker_image | rhceph/rhceph-3-rhel7, or cephimageinlocalreg if using a local Docker registry | Yes | |
| containerized_deployment | true | Yes | |
| ceph_docker_registry | registry.access.redhat.com, or <local-host-fqdn> if using a local Docker registry | Yes | |
An example of the all.yml file can look like this:
monitor_interface: eth0
journal_size: 5120
public_network: 192.168.0.0/24
ceph_docker_image: rhceph/rhceph-3-rhel7
containerized_deployment: true
ceph_docker_registry: registry.access.redhat.com
For additional details, see the all.yml file.
Edit the group_vars/osds.yml file. See the table below for the most common required and optional parameters to uncomment. Note that the table does not include all parameters.
Table 1.1. OSD Ansible Settings
| Option | Value | Required | Notes |
|---|---|---|---|
| osd_scenario | collocated to use the same device for journal and OSD data; non-collocated to use a dedicated device to store journal data; lvm to use the Logical Volume Manager to store OSD data | Yes | When using osd_scenario: non-collocated, ceph-ansible expects the variables devices and dedicated_devices to match. For example, if you specify 10 disks in devices, you must specify 10 entries in dedicated_devices. Currently, Red Hat only supports dedicated journals when using osd_scenario: lvm, not collocated journals. |
| osd_auto_discovery | true to automatically discover OSDs | Yes if using osd_scenario: collocated | Cannot be used when the devices setting is used |
| devices | List of devices where ceph data is stored | Yes to specify the list of devices | Cannot be used when the osd_auto_discovery setting is used |
| dedicated_devices | List of dedicated devices for non-collocated OSDs where ceph journal is stored | Yes if osd_scenario: non-collocated | Should be nonpartitioned devices |
| dmcrypt | true to encrypt OSDs | No | Defaults to false |
| lvm_volumes | a list of dictionaries | Yes if using osd_scenario: lvm | Each dictionary must contain data, journal, and data_vg keys. The data key must be a logical volume. The journal key can be a logical volume (LV), device, or partition, but do not use one journal for multiple data LVs. The data_vg key must be the volume group containing the data LV. Optionally, the journal_vg key can be used to specify the volume group containing the journal LV, if applicable. |
osds.ymlfile using these threeosd_scenario::collocated,non-collocated, andlvm.osd_scenario: non-collocated devices: - /dev/sda - /dev/sdb - /dev/sdc - /dev/sdd dedicated_devices: - /dev/nvme0n1 - /dev/nvme0n1 - /dev/nvme0n1 - /dev/nvme0n1
osd_scenario: non-collocated devices: - /dev/sda - /dev/sdb - /dev/sdc - /dev/sdd dedicated_devices: - /dev/nvme0n1 - /dev/nvme0n1 - /dev/nvme0n1 - /dev/nvme0n1
osd_scenario: lvm lvm_volumes: - data: data-lv1 data_vg: vg1 journal: journal-lv1 journal_vg: vg2 - data: data-lv2 journal: /dev/sda data_vg: vg1For additional details, see the comments in the
osds.ymlfile.NoteCurrently,
ceph-ansibledoes not create the volume groups or the logical volumes. This must be done before running the Anisble playbook.
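A minimal sketch of preparing the volume groups and logical volumes used in the lvm example above, assuming /dev/sdb and /dev/sdc are unused devices on the OSD node:
# pvcreate /dev/sdb /dev/sdc
# vgcreate vg1 /dev/sdb
# vgcreate vg2 /dev/sdc
# lvcreate -n data-lv1 -l 50%VG vg1
# lvcreate -n data-lv2 -l 50%VG vg1
# lvcreate -n journal-lv1 -L 5G vg2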
Edit the Ansible inventory file located by default at /etc/ansible/hosts. Remember to comment out example hosts.
Add the Monitor nodes under the [mons] section:
[mons]
<monitor-host-name>
<monitor-host-name>
<monitor-host-name>
Add OSD nodes under the [osds] section. If the nodes have sequential naming, consider using a range:
[osds]
<osd-host-name[1:10]>
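For example, three sequentially named OSD nodes (hypothetical host names) can be added as:
[osds]
osd-node[1:3]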
Alternatively, you can colocate Monitors with the OSD daemons on one node by adding the same node under the [mons] and [osds] sections. See Chapter 2, Colocation of Containerized Ceph Daemons for details.
Add the Ceph Manager (ceph-mgr) nodes under the [mgrs] section. Colocate the Ceph Manager daemon with Monitor nodes:
[mgrs]
<monitor-host-name>
<monitor-host-name>
<monitor-host-name>
As the Ansible user, ensure that Ansible can reach the Ceph hosts:
[user@admin ~]$ ansible all -m ping
As root, create the /var/log/ansible/ directory and assign the appropriate permissions for the ansible user:
[root@admin ceph-ansible]# mkdir /var/log/ansible
[root@admin ceph-ansible]# chown ansible:ansible /var/log/ansible
[root@admin ceph-ansible]# chmod 755 /var/log/ansible
Edit the /usr/share/ceph-ansible/ansible.cfg file, updating the log_path value as follows:
log_path = /var/log/ansible/ansible.log
As the Ansible user, run the ceph-ansible playbook:
[user@admin ceph-ansible]$ ansible-playbook site-docker.yml
Note: If you deploy Red Hat Ceph Storage to Red Hat Enterprise Linux Atomic Host hosts, use the --skip-tags=with_pkg option:
[user@admin ceph-ansible]$ ansible-playbook --skip-tags=with_pkg site-docker.yml
From a Monitor node, verify the status of the Ceph cluster.
docker exec ceph-<mon|mgr>-<id> ceph health
Replace:
- <id> with the host name of the Monitor node.
For example:
[root@monitor ~]# docker exec ceph-mon-mon0 ceph health
HEALTH_OK
Note: In addition to verifying the cluster status, you can use the ceph-medic utility to diagnose the overall health of the Ceph Storage Cluster. See the Installing and Using ceph-medic to Diagnose a Ceph Storage Cluster chapter in the Red Hat Ceph Storage 3 Troubleshooting Guide.
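For example, a basic health check with ceph-medic run from the Ansible administration node, assuming ceph-medic has been installed and uses the default /etc/ansible/hosts inventory:
[root@admin ~]# ceph-medic check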
1.3. Installing the Ceph Object Gateway in a Container
Use the Ansible application with the ceph-ansible playbook to install the Ceph Object Gateway in a container.
Prerequisites
- A working Red Hat Ceph Storage cluster. See Section 1.2, “Installing a Red Hat Ceph Storage Cluster in Containers” for details.
Procedure
Use the following commands from the Ansible administration node.
Navigate to the /usr/share/ceph-ansible/ directory:
[user@admin ~]$ cd /usr/share/ceph-ansible/
Uncomment the radosgw_interface parameter in the group_vars/all.yml file:
radosgw_interface: <interface>
Replace:
- <interface> with the interface that the Ceph Object Gateway nodes listen to.
For additional details, see the all.yml file.
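For example, assuming the gateway nodes listen on eth0:
radosgw_interface: eth0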
Create a new copy of the rgws.yml.sample file located in the group_vars directory:
[root@admin ceph-ansible]# cp group_vars/rgws.yml.sample group_vars/rgws.yml
Optional. Edit the group_vars/rgws.yml file. For additional details, see the rgws.yml file.
Add the host name of the Ceph Object Gateway node to the [rgws] section of the Ansible inventory file located by default at /etc/ansible/hosts:
[rgws]
gateway01
Alternatively, you can colocate the Ceph Object Gateway with the OSD daemon on one node by adding the same node under the [osds] and [rgws] sections. See Chapter 2, Colocation of Containerized Ceph Daemons for details.
Run the ceph-ansible playbook:
[user@admin ceph-ansible]$ ansible-playbook site-docker.yml --limit rgws
Note: If you deploy Red Hat Ceph Storage to Red Hat Enterprise Linux Atomic Host hosts, use the --skip-tags=with_pkg option:
[user@admin ceph-ansible]$ ansible-playbook --skip-tags=with_pkg site-docker.yml
Verify that the Ceph Object Gateway node was deployed successfully.
Connect to a Monitor node:
ssh <hostname>
Replace <hostname> with the host name of the Monitor node, for example:
[user@admin ~]$ ssh root@monitor
Verify that the Ceph Object Gateway pools were created properly:
[root@monitor ~]# docker exec ceph-mon-mon1 rados lspools
rbd
cephfs_data
cephfs_metadata
.rgw.root
default.rgw.control
default.rgw.data.root
default.rgw.gc
default.rgw.log
default.rgw.users.uid
From any client on the same network as the Ceph cluster, for example the Monitor node, use the curl command to send an HTTP request on port 8080 using the IP address of the Ceph Object Gateway host:
curl http://<ip-address>:8080
Replace:
- <ip-address> with the IP address of the Ceph Object Gateway node. To determine the IP address of the Ceph Object Gateway host, use the ifconfig or ip commands.
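For example, assuming the Ceph Object Gateway host uses the IP address 192.168.0.31 from the earlier firewall examples:
[root@monitor ~]# curl http://192.168.0.31:8080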
List buckets:
[root@monitor ~]# docker exec ceph-mon-mon1 radosgw-admin bucket list
Additional Resources
- The Red Hat Ceph Storage 3 Ceph Object Gateway Guide for Red Hat Enterprise Linux
- Section 1.5, “Understanding the limit option”
1.4. Installing Metadata Servers
Use the Ansible automation application to install a Ceph Metadata Server (MDS). Metadata Server daemons are necessary for deploying a Ceph File System.
Procedure
Perform the following steps on the Ansible administration node.
Add a new section [mdss] to the /etc/ansible/hosts file:
[mdss]
<hostname>
<hostname>
<hostname>
Replace <hostname> with the host names of the nodes where you want to install the Ceph Metadata Servers.
Alternatively, you can colocate the Metadata Server with the OSD daemon on one node by adding the same node under the [osds] and [mdss] sections. See Chapter 2, Colocation of Containerized Ceph Daemons for details.
Navigate to the /usr/share/ceph-ansible directory:
[root@admin ~]# cd /usr/share/ceph-ansible
Create a copy of the group_vars/mdss.yml.sample file named mdss.yml:
[root@admin ceph-ansible]# cp group_vars/mdss.yml.sample group_vars/mdss.yml
Optionally, edit parameters in mdss.yml. See mdss.yml for details.
Run the Ansible playbook:
[user@admin ceph-ansible]$ ansible-playbook site-docker.yml --limit mdss
- After installing Metadata Servers, configure them. For details, see the Configuring Metadata Server Daemons chapter in the Ceph File System Guide for Red Hat Ceph Storage 3.
Additional Resources
- The Ceph File System Guide for Red Hat Ceph Storage 3
- Section 1.5, “Understanding the limit option”
1.5. Understanding the limit option
This section contains information about the Ansible --limit option.
Ansible supports the --limit option that enables you to use the site, site-docker, and rolling_update Ansible playbooks for a particular section of the inventory file.
$ ansible-playbook site.yml|rolling_update.yml|site-docker.yml --limit osds|rgws|clients|mdss|nfss
For example, to redeploy only OSDs:
$ ansible-playbook /usr/share/ceph-ansible/site.yml --limit osds
If you colocate Ceph components on one node, Ansible applies a playbook to all components on the node even though only one component type was specified with the limit option. For example, if you run the rolling_update playbook with the --limit osds option on a node that contains OSDs and Metadata Servers (MDS), Ansible upgrades both components, OSDs and MDSs.
1.6. Additional Resources
- The Getting Started with Containers guide for Red Hat Enterprise Linux Atomic Host
Chapter 2. Colocation of Containerized Ceph Daemons
This section describes:
- How colocation works and its advantages, see Section 2.1, “How colocation works and its advantages”.
- How to set dedicated resources for colocated daemons, see Section 2.2, “Setting Dedicated Resources for Colocated Daemons”.
2.1. How colocation works and its advantages
You can colocate containerized Ceph daemons on the same node. Here are the advantages of colocating some of Ceph’s services:
- Significant improvement in total cost of ownership (TCO) at small scale
- Reduction from six nodes to three for the minimum configuration
- Easier upgrade
- Better resource isolation
How Colocation Works
You can colocate one daemon from the following list with an OSD daemon by adding the same node to the appropriate sections in the Ansible inventory file.
- The Ceph Object Gateway (radosgw)
- Metadata Server (MDS)
- RBD mirror (rbd-mirror)
- Monitor and the Ceph Manager daemon (ceph-mgr)
- NFS Ganesha
The following example shows what the inventory file with colocated daemons can look like:
Example 2.1. Ansible inventory file with colocated daemons
[mons]
<hostname1>
<hostname2>
<hostname3>

[mgrs]
<hostname1>
<hostname2>
<hostname3>

[osds]
<hostname4>
<hostname5>
<hostname6>

[rgws]
<hostname4>
<hostname5>
The Figure 2.1, “Colocated Daemons” and Figure 2.2, “Non-colocated Daemons” images show the difference between clusters with colocated and non-colocated daemons.
Figure 2.1. Colocated Daemons

Figure 2.2. Non-colocated Daemons

When you colocate two containerized Ceph daemons on the same node, the ceph-ansible playbook reserves dedicated CPU and RAM resources for each. By default, ceph-ansible uses values listed in the Recommended Minimum Hardware chapter in the Red Hat Ceph Storage Hardware Selection Guide 3. To learn how to change the default values, see the Setting Dedicated Resources for Colocated Daemons section.
2.2. Setting Dedicated Resources for Colocated Daemons
When colocating two Ceph daemons on the same node, the ceph-ansible playbook reserves CPU and RAM resources for each. By default, ceph-ansible uses values listed in the Recommended Minimum Hardware chapter in the Red Hat Ceph Storage Hardware Selection Guide. This section describes how to change the default values.
Procedure
To change the default RAM and CPU limit for a daemon, set the ceph_<daemon-type>_docker_memory_limit and ceph_<daemon-type>_docker_cpu_limit parameters in the appropriate .yml configuration file when deploying the daemon.
For example, to change the default RAM limit to 2 GB and the CPU limit to 2 for the Ceph Object Gateway, edit the /usr/share/ceph-ansible/group_vars/rgws.yml file as follows:
ceph_rgw_docker_memory_limit: 2g
ceph_rgw_docker_cpu_limit: 2
Additional Resources
- The sample configuration files in the /usr/share/ceph-ansible/group_vars/ directory
2.3. Additional Resources
Chapter 3. Administering Ceph Clusters That Run in Containers
This chapter describes basic administration tasks to perform on Ceph clusters that run in containers, such as:
3.1. Starting, Stopping, and Restarting Ceph Daemons That Run in Containers
This section describes how to start, stop, or restart Ceph daemons that run in containers.
Procedure
To start, stop, or restart a Ceph daemon running in a container:
systemctl <action> ceph-<daemon>@<ID>
Where:
- <action> is the action to perform: start, stop, or restart
- <daemon> is the daemon: osd, mon, mds, or rgw
- <ID> is either:
  - The device name that the ceph-osd daemon uses
  - The short host name where the ceph-mon, ceph-mds, or ceph-rgw daemons are running
For example, to restart a ceph-osd daemon that uses the /dev/sdb device:
# systemctl restart ceph-osd@sdb
To start a ceph-mon daemon that runs on the ceph-monitor01 host:
# systemctl start ceph-mon@ceph-monitor01
To stop a ceph-rgw daemon that runs on the ceph-rgw01 host:
# systemctl stop ceph-rgw@ceph-rgw01
Additional Resources
- The Running Ceph as a systemd Service section in the Administration Guide for Red Hat Ceph Storage 3.
3.2. Viewing Log Files of Ceph Daemons That Run in Containers
Use the journald daemon from the container host to view a log file of a Ceph daemon from a container.
Procedure: Viewing Log Files of Ceph Daemons That Run in Containers
To view the entire Ceph log file:
journalctl -u ceph-<daemon>@<ID>
Where:
- <daemon> is the Ceph daemon: osd, mon, or rgw
- <ID> is either:
  - The device name that the ceph-osd daemon uses
  - The short host name where the ceph-mon or ceph-rgw daemons are running
For example, to view the entire log for the ceph-osd daemon that uses the /dev/sdb device:
# journalctl -u ceph-osd@sdb
To show only the recent journal entries, use the -f option:
journalctl -fu ceph-<daemon>@<ID>
For example, to view only recent journal entries for the ceph-mon daemon that runs on the ceph-monitor01 host:
# journalctl -fu ceph-mon@ceph-monitor01
You can also use the sosreport utility to view the journald logs. For more details about SOS reports, see the What is a sosreport and how to create one in Red Hat Enterprise Linux 4.6 and later? solution on the Red Hat Customer Portal.
Additional Resources
- The journalctl(1) manual page
3.3. Purging Clusters Deployed by Ansible
If you no longer want to use a Ceph cluster, use the purge-docker-cluster.yml playbook to purge the cluster. Purging a cluster is also useful when the installation process failed and you want to start over.
After purging a Ceph cluster, all data on the OSDs is lost.
Prerequisites
- Ensure that the /var/log/ansible.log file is writable.
Procedure
Use the following commands from the Ansible administration node.
Navigate to the /usr/share/ceph-ansible/ directory:
[user@admin ~]$ cd /usr/share/ceph-ansible
Copy the purge-docker-cluster.yml playbook from the /usr/share/ceph-ansible/infrastructure-playbooks/ directory to the current directory:
[root@admin ceph-ansible]# cp infrastructure-playbooks/purge-docker-cluster.yml .
Use the purge-docker-cluster.yml playbook to purge the Ceph cluster.
To remove all packages, containers, configuration files, and all the data created by the ceph-ansible playbook:
[user@admin ceph-ansible]$ ansible-playbook purge-docker-cluster.yml
To specify a different inventory file than the default one (/etc/ansible/hosts), use the -i parameter:
ansible-playbook purge-docker-cluster.yml -i [inventory-file]
Replace [inventory-file] with the path to the inventory file.
For example:
[user@admin ceph-ansible]$ ansible-playbook purge-docker-cluster.yml -i ~/ansible/hosts
To skip the removal of the Ceph container image, use the --skip-tags="remove_img" option:
[user@admin ceph-ansible]$ ansible-playbook --skip-tags="remove_img" purge-docker-cluster.yml
To skip the removal of the packages that were installed during the installation, use the --skip-tags="with_pkg" option:
[user@admin ceph-ansible]$ ansible-playbook --skip-tags="with_pkg" purge-docker-cluster.yml
3.4. Upgrading a Red Hat Ceph Storage Cluster That Runs in Containers
This section describes how to upgrade to a newer minor or major version of the Red Hat Ceph Storage container image.
Contact Red Hat support prior to upgrading if you have a large Ceph Object Gateway storage cluster with millions of objects present in buckets.
For more details, refer to the Red Hat Ceph Storage 3.0 Release Notes, under the Slow OSD startup after upgrading to Red Hat Ceph Storage 3.0 heading.
Use the Ansible rolling_update.yml playbook located in the /usr/share/ceph-ansible/infrastructure-playbooks/ directory from the administration node to upgrade between two major or minor versions of Red Hat Ceph Storage, or to apply asynchronous updates.
Ansible upgrades the Ceph nodes in the following order:
- Monitor nodes
- MGR nodes
- OSD nodes
- MDS nodes
- Ceph Object Gateway nodes
- All other Ceph client nodes
Red Hat Ceph Storage 3 introduces several changes in Ansible configuration files located in the /usr/share/ceph-ansible/group_vars/ directory; certain parameters were renamed or removed. Therefore, make backup copies of the all.yml and osds.yml files before creating new copies from the all.yml.sample and osds.yml.sample files after upgrading to version 3. For more details about the changes, see Appendix A, Changes in Ansible Variables Between Version 2 and 3.
Red Hat Ceph Storage 3.1 introduces new Ansible playbooks to optimize storage for performance when using Object Gateway and high speed NVMe based SSDs (and SATA SSDs). The playbooks do this by placing journals and bucket indexes together on SSDs, which can increase performance compared to having all journals on one device. These playbooks are designed to be used when installing Ceph. Existing OSDs continue to work and need no extra steps during an upgrade. There is no way to upgrade a Ceph cluster while simultaneously reconfiguring OSDs to optimize storage in this way. To use different devices for journals or bucket indexes requires reprovisioning OSDs. For more information see Using NVMe with LVM optimally in Ceph Object Gateway for Production.
The rolling_update.yml playbook includes the serial variable that adjusts the number of nodes to be updated simultaneously. Red Hat strongly recommends using the default value (1), which ensures that Ansible upgrades cluster nodes one by one.
When using the rolling_update.yml playbook to upgrade to Red Hat Ceph Storage 3.0 and from version 3.0 to other zStream releases of 3.0, users who use the Ceph File System (CephFS) must manually update the Metadata Server (MDS) cluster. This is due to a known issue.
Comment out the MDS hosts in /etc/ansible/hosts before upgrading the entire cluster using the ceph-ansible rolling_update.yml playbook, and then upgrade the MDS cluster manually. In the /etc/ansible/hosts file:
#[mdss]
#host-abc
For more details about this known issue, including how to update the MDS cluster, refer to the Red Hat Ceph Storage 3.0 Release Notes.
Prerequisites
On all nodes in the cluster, enable the rhel-7-server-extras-rpms repository:
# subscription-manager repos --enable=rhel-7-server-extras-rpms
If upgrading from Red Hat Ceph Storage 2.x to 3.x, on the Ansible administration node and the RBD mirroring node, enable the Red Hat Ceph Storage 3 Tools repository and Ansible repository:
[root@admin ~]# subscription-manager repos --enable=rhel-7-server-rhceph-3-tools-rpms --enable=rhel-7-server-ansible-2.4-rpms
If upgrading from Red Hat Ceph Storage 3.0 to 3.1 and using Red Hat Ceph Storage Dashboard, before upgrading the cluster, purge the old cephmetrics installation from the cluster. This avoids an issue where the dashboard won’t display data after upgrade.
If the cephmetrics-ansible package isn’t already updated, update it:
[root@admin ~]# yum update cephmetrics-ansible
Change to the /usr/share/cephmetrics-ansible/ directory.
[root@admin ~]# cd /usr/share/cephmetrics-ansible
Purge the existing cephmetrics installation.
[root@admin cephmetrics-ansible]# ansible-playbook -v purge.yml
Install the updated Red Hat Ceph Storage Dashboard
[root@admin cephmetrics-ansible]# ansible-playbook -v playbook.yml
On the Ansible administration node, ensure the latest versions of the ansible and ceph-ansible packages are installed:
[root@admin ~]# yum update ansible ceph-ansible
Procedure
Use the following commands from the Ansible administration node.
Navigate to the /usr/share/ceph-ansible/ directory:
[user@admin ~]$ cd /usr/share/ceph-ansible/
Back up the group_vars/all.yml and group_vars/osds.yml files. Skip this step when upgrading from version 3.x to the latest version.
[root@admin ceph-ansible]# cp group_vars/all.yml group_vars/all_old.yml
[root@admin ceph-ansible]# cp group_vars/osds.yml group_vars/osds_old.yml
Create new copies of the group_vars/all.yml.sample and group_vars/osds.yml.sample files, named group_vars/all.yml and group_vars/osds.yml respectively, and edit them according to your deployment. Skip this step when upgrading from version 3.x to the latest version. For details, see Appendix A, Changes in Ansible Variables Between Version 2 and 3 and Section 1.2, “Installing a Red Hat Ceph Storage Cluster in Containers”.
[root@admin ceph-ansible]# cp group_vars/all.yml.sample group_vars/all.yml
[root@admin ceph-ansible]# cp group_vars/osds.yml.sample group_vars/osds.yml
When upgrading from 2.x to 3.x, in the group_vars/all.yml file, change the ceph_docker_image parameter to point to the Ceph 3 container version:
ceph_docker_image: rhceph/rhceph-3-rhel7
Add the fetch_directory parameter to the group_vars/all.yml file:
fetch_directory: <full_directory_path>
Replace:
- <full_directory_path> with a writable location, such as the Ansible user’s home directory.
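For example, assuming the admin Ansible user and the ceph-ansible-keys directory created earlier:
fetch_directory: /home/admin/ceph-ansible-keys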
If the cluster you want to upgrade contains any Ceph Object Gateway nodes, add the radosgw_interface parameter to the group_vars/all.yml file:
radosgw_interface: <interface>
Replace:
- <interface> with the interface that the Ceph Object Gateway nodes listen to.
In the Ansible inventory file located at /etc/ansible/hosts, add the Ceph Manager (ceph-mgr) nodes under the [mgrs] section. Colocate the Ceph Manager daemon with Monitor nodes. Skip this step when upgrading from version 3.x to the latest version.
[mgrs]
<monitor-host-name>
<monitor-host-name>
<monitor-host-name>
Copy rolling_update.yml from the infrastructure-playbooks directory to the current directory:
[root@admin ceph-ansible]# cp infrastructure-playbooks/rolling_update.yml .
Create the /var/log/ansible/ directory and assign the appropriate permissions for the ansible user:
[root@admin ceph-ansible]# mkdir /var/log/ansible
[root@admin ceph-ansible]# chown ansible:ansible /var/log/ansible
[root@admin ceph-ansible]# chmod 755 /var/log/ansible
Edit the /usr/share/ceph-ansible/ansible.cfg file, updating the log_path value as follows:
log_path = /var/log/ansible/ansible.log
Run the playbook:
[user@admin ceph-ansible]$ ansible-playbook rolling_update.yml
To use the playbook only for a particular group of nodes in the Ansible inventory file, use the --limit option. For details, see Section 1.5, “Understanding the limit option”.
From the RBD mirroring daemon node, upgrade rbd-mirror manually:
# yum upgrade rbd-mirror
Restart the daemon:
# systemctl restart ceph-rbd-mirror@<client-id>
Verify that the cluster health is OK.
From a Monitor node, list all running containers.
[root@monitor ~]# docker ps
Verify that the cluster health is OK.
[root@monitor ~]# docker exec ceph-mon-<mon-id> ceph -s
Replace:
- <mon-id> with the name of the Monitor container found in the first step.
For example:
[root@monitor ~]# docker exec ceph-mon-monitor ceph -s
Chapter 4. Monitoring Ceph Clusters Running in Containers with the Red Hat Ceph Storage Dashboard
The Red Hat Ceph Storage Dashboard provides a monitoring dashboard to visualize the state of a Ceph Storage Cluster. Also, the Red Hat Ceph Storage Dashboard architecture provides a framework for additional modules to add functionality to the storage cluster.
- To learn about the Dashboard, see Section 4.1, “The Red Hat Ceph Storage Dashboard”.
- To install the Dashboard, see Section 4.2, “Installing the Red Hat Ceph Storage Dashboard”.
- To access the Dashboard, see Section 4.3, “Accessing the Red Hat Ceph Storage Dashboard”.
- To change the default password after installing the Dashboard, see Section 4.4, “Changing the default Red Hat Ceph Storage dashboard password”.
- To learn about the Prometheus plugin, see Section 4.5, “The Prometheus plugin for Red Hat Ceph Storage”.
- To learn about the Red Hat Ceph Storage Dashboard alerts and how to configure them, see Section 4.6, “The Red Hat Ceph Storage Dashboard alerts”.
Prerequisites
- A Red Hat Ceph Storage cluster running in containers
4.1. The Red Hat Ceph Storage Dashboard
The Red Hat Ceph Storage Dashboard provides a monitoring dashboard for Ceph clusters to visualize the storage cluster state. The dashboard is accessible from a web browser and provides a number of metrics and graphs about the state of the cluster, Monitors, OSDs, Pools, or the network.
With the previous releases of Red Hat Ceph Storage, monitoring data was sourced through a collectd plugin, which sent the data to an instance of the Graphite monitoring utility. Starting with Red Hat Ceph Storage 3.1, monitoring data is sourced directly from the ceph-mgr daemon, using the ceph-mgr Prometheus plugin.
The introduction of Prometheus as the monitoring data source simplifies deployment and operational management of the Red Hat Ceph Storage Dashboard solution, along with reducing the overall hardware requirements. By sourcing the Ceph monitoring data directly, the Red Hat Ceph Storage Dashboard solution is better able to support Ceph clusters deployed in containers.
With this change in architecture, there is no migration path for monitoring data from Red Hat Ceph Storage 2.x and 3.0 to Red Hat Ceph Storage 3.1.
The Red Hat Ceph Storage Dashboard uses the following utilities:
- The Ansible automation application for deployment.
- The embedded Prometheus ceph-mgr plugin.
- The Prometheus node-exporter daemon, running on each node of the storage cluster.
- The Grafana platform to provide a user interface and alerting.
The Red Hat Ceph Storage Dashboard supports the following features:
- General Features
- Support for Red Hat Ceph Storage 3.1 and higher
- SELinux support
- Support for FileStore and BlueStore OSD back ends
- Support for encrypted and non-encrypted OSDs
- Support for Monitor, OSD, the Ceph Object Gateway, and iSCSI roles
- Initial support for the Metadata Servers (MDS)
- Drill down and dashboard links
- 15 second granularity
- Support for Hard Disk Drives (HDD), Solid-state Drives (SSD), Non-volatile Memory Express (NVMe) interface, and Intel® Cache Acceleration Software (Intel® CAS)
- Node Metrics
- CPU and RAM usage
- Network load
- Configurable Alerts
- Out-of-Band (OOB) alerts and triggers
- Notification channel is automatically defined during the installation
- The Ceph Health Summary dashboard created by default
See the Red Hat Ceph Storage Dashboard Alerts section for details.
- Cluster Summary
- OSD configuration summary
- OSD FileStore and BlueStore summary
- Cluster versions breakdown by role
- Disk size summary
- Host size by capacity and disk count
- Placement Groups (PGs) status breakdown
- Pool counts
- Device class summary, HDD vs. SSD
- Cluster Details
- Cluster flags status (noout, nodown, and others)
- OSD or Ceph Object Gateway hosts up and down status
- Per pool capacity usage
- Raw capacity utilization
- Indicators for active scrub and recovery processes
- Growth tracking and forecast (raw capacity)
- Information about OSDs that are down or near full, including the OSD host and disk
- Distribution of PGs per OSD
- OSDs by PG counts, highlighting the over or under utilized OSDs
- OSD Performance
- Information about I/O operations per second (IOPS) and throughput by pool
- OSD performance indicators
- Disk statistics per OSD
- Cluster wide disk throughput
- Read/write ratio (client IOPS)
- Disk utilization heat map
- Network load by Ceph role
- The Ceph Object Gateway Details
- Aggregated load view
- Per host latency and throughput
- Workload breakdown by HTTP operations
- The Ceph iSCSI Gateway Details
- Aggregated views
- Configuration
- Performance
- Per Gateway resource utilization
- Per client load and configuration
- Per Ceph Block Device image performance
4.2. Installing the Red Hat Ceph Storage Dashboard
The Red Hat Ceph Storage Dashboard provides a visual dashboard to monitor various metrics in a running Ceph Storage Cluster.
Prerequisites
- A Ceph Storage cluster running in containers deployed with the Ansible automation application.
The storage cluster nodes use Red Hat Enterprise Linux 7.
For details, see Section 1.1.1, “Registering Red Hat Ceph Storage Nodes to the CDN and Attaching Subscriptions”.
- A separate node, the Red Hat Ceph Storage Dashboard node, for receiving data from the cluster nodes and providing the Red Hat Ceph Storage Dashboard.
Prepare the Red Hat Ceph Storage Dashboard node:
- Register the system with the Red Hat Content Delivery Network (CDN), attach subscriptions, and enable Red Hat Enterprise Linux repositories. For details, see Section 1.1.1, “Registering Red Hat Ceph Storage Nodes to the CDN and Attaching Subscriptions”.
Enable the Tools repository.
[root@admin ~]# subscription-manager repos --enable=rhel-7-server-rhceph-3-tools-rpms --enable=rhel-7-server-ansible-2.4-rpms
If using a firewall, then ensure that the following TCP ports are open:
Table 4.1. TCP Port Requirements
| Port | Use | Where? |
|---|---|---|
| 3000 | Grafana | The Red Hat Ceph Storage Dashboard node. |
| 9090 | Basic Prometheus graphs | The Red Hat Ceph Storage Dashboard node. |
| 9100 | Prometheus' node-exporter daemon | All storage cluster nodes. |
| 9283 | Gathering Ceph data | All ceph-mgr nodes. |
| 9287 | Ceph iSCSI gateway data | All Ceph iSCSI gateway nodes. |
For more details see the Using Firewalls chapter in the Security Guide for Red Hat Enterprise Linux 7.
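As an illustration, a minimal sketch of opening these ports with firewalld; the host names in the prompts are placeholders, and the 9100 command must run on every storage cluster node and the 9283 command on every ceph-mgr node:
[root@cephmetrics ~]# firewall-cmd --zone=public --add-port=3000/tcp --permanent
[root@cephmetrics ~]# firewall-cmd --zone=public --add-port=9090/tcp --permanent
[root@cephmetrics ~]# firewall-cmd --reload
[root@node1 ~]# firewall-cmd --zone=public --add-port=9100/tcp --permanent
[root@node1 ~]# firewall-cmd --reload
[root@monitor ~]# firewall-cmd --zone=public --add-port=9283/tcp --permanent
[root@monitor ~]# firewall-cmd --reload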
Procedure
Use the following commands on the Ansible administration node as the root user.
Install the cephmetrics-ansible package:
[root@admin ~]# yum install cephmetrics-ansible
Using the Ceph Ansible inventory as a base, add the Red Hat Ceph Storage Dashboard node under the [ceph-grafana] section of the Ansible inventory file, by default located at /etc/ansible/hosts:
[ceph-grafana]
$HOST_NAME
Replace:
- $HOST_NAME with the name of the Red Hat Ceph Storage Dashboard node.
For example:
[ceph-grafana]
node0
Change to the /usr/share/cephmetrics-ansible/ directory:
[root@admin ~]# cd /usr/share/cephmetrics-ansible
Optional. Set a custom password for the Grafana admin user. Replace <password> with the new password:
grafana:
  admin_password: <password>
For example:
grafana:
  admin_password: CGqf5HhUaZ
Run the Ansible playbook:
[root@admin cephmetrics-ansible]# ansible-playbook -v playbook.yml
Note: The cephmetrics Ansible playbook does the following actions:
- Updates the ceph-mgr instance to enable the prometheus plugin and opens TCP port 9283.
- Deploys the Prometheus node-exporter daemon to each node in the storage cluster:
  - Opens TCP port 9100.
  - Starts the node-exporter daemon.
- Deploys Grafana and Prometheus containers under Docker/systemd on the Red Hat Ceph Storage Dashboard node:
  - Prometheus is configured to gather data from the ceph-mgr nodes and the node-exporters running on each Ceph host.
  - Opens TCP port 3000.
  - The dashboards, themes, and user accounts are all created in Grafana.
  - Outputs the URL of Grafana for the administrator.
Important: Every time you update the cluster configuration, for example, when you add a MON or OSD node, you must re-run the cephmetrics Ansible playbook.
4.3. Accessing the Red Hat Ceph Storage Dashboard
Accessing the Red Hat Ceph Storage Dashboard gives you access to the web-based management tool for administrating Red Hat Ceph Storage clusters.
Prerequisites
Procedure
Enter the following URL into a web browser:
http://$HOST_NAME:3000
Replace:
- $HOST_NAME with the name of the Red Hat Ceph Storage Dashboard node.
For example:
http://cephmetrics:3000
Enter the password for the admin user. If you did not set the password during the installation, use admin, which is the default password.
Once logged in, you are automatically placed on the Ceph At a Glance dashboard. The Ceph At a Glance dashboard provides a high-level overview of capacity, performance, and node-level performance information.
Additional Resources
- See the Changing the Default Red Hat Ceph Storage Dashboard Password section in the Red Hat Ceph Storage Administration Guide.
4.4. Changing the default Red Hat Ceph Storage dashboard password
The default user name and password for accessing the Red Hat Ceph Storage Dashboard is set to admin and admin. For security reasons, you might want to change the password after the installation.
If you redeploy the Red Hat Ceph Storage dashboard using Ceph Ansible, then the password will be reset to the default value. Update the Ceph Ansible inventory file (/etc/ansible/hosts) with the custom password to prevent the password from resetting to the default value.
Prerequisites
Procedure
- Click the Grafana icon in the upper-left corner.
- Hover over the user name you want to modify the password for, in this case admin.
- Click Profile.
- Click Change Password.
- Enter the new password twice and click Change Password.
Additional Resource
- If you forgot the password, follow the Reset admin password procedure on the Grafana web pages.
4.5. The Prometheus plugin for Red Hat Ceph Storage
As a storage administrator, you can gather performance data, export that data using the Prometheus plugin module for the Red Hat Ceph Storage Dashboard, and then perform queries on this data. The Prometheus module allows ceph-mgr to expose Ceph related state and performance data to a Prometheus server.
4.5.1. Prerequisites
- Running Red Hat Ceph Storage 3.1 or higher.
- Installation of the Red Hat Ceph Storage Dashboard.
4.5.2. The Prometheus plugin
The Prometheus plugin provides an exporter to pass on Ceph performance counters from the collection point in ceph-mgr. The Red Hat Ceph Storage Dashboard receives MMgrReport messages from all MgrClient processes, such as Ceph Monitors and OSDs. A circular buffer of the last number of samples contains the performance counter schema data and the actual counter data. This plugin creates an HTTP endpoint and retrieves the latest sample of every counter when polled. The HTTP path and query parameters are ignored; all extant counters for all reporting entities are returned in a text exposition format.
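For illustration, the exposed output resembles the following text exposition format; the metric names and values are representative examples, not taken from a specific cluster:
# HELP ceph_osd_up OSD status up
# TYPE ceph_osd_up untyped
ceph_osd_up{ceph_daemon="osd.0"} 1.0
ceph_osd_up{ceph_daemon="osd.1"} 1.0
ceph_health_status 0.0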
Additional Resources
- See the Prometheus documentation for more details on the text exposition format.
4.5.3. Managing the Prometheus environment
To monitor a Ceph storage cluster with Prometheus you can configure and enable the Prometheus exporter so the metadata information about the Ceph storage cluster can be collected.
Prerequisites
- A running Red Hat Ceph Storage 3.1 cluster
- Installation of the Red Hat Ceph Storage Dashboard
Procedure
Open and edit the /etc/prometheus/prometheus.yml file.
Under the global section, set the scrape_interval and evaluation_interval options to 15 seconds.
Example
global:
  scrape_interval: 15s
  evaluation_interval: 15s
Under the scrape_configs section, add the honor_labels: true option, and edit the targets and instance options for each of the ceph-mgr nodes.
Example
scrape_configs:
  - job_name: 'node'
    honor_labels: true
    static_configs:
    - targets: [ 'node1.example.com:9100' ]
      labels:
        instance: "node1.example.com"
    - targets: ['node2.example.com:9100']
      labels:
        instance: "node2.example.com"
Note: Using the honor_labels option enables Ceph to output properly-labelled data relating to any node in the Ceph storage cluster. This allows Ceph to export the proper instance label without Prometheus overwriting it.
To add a new node, simply add the targets and instance options in the following format:
Example
- targets: [ 'new-node.example.com:9100' ]
  labels:
    instance: "new-node"
Note: The instance label has to match what appears in Ceph’s OSD metadata instance field, which is the short host name of the node. This helps to correlate Ceph stats with the node’s stats.
Add Ceph targets to the /etc/prometheus/ceph_targets.yml file in the following format.
Example
[
    {
        "targets": [ "cephnode1.example.com:9283" ],
        "labels": {}
    }
]
Enable the Prometheus module:
# ceph mgr module enable prometheus
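To verify that the module is exporting data, you can poll the ceph-mgr endpoint on TCP port 9283; the host name below matches the ceph_targets.yml example and is only illustrative:
# curl http://cephnode1.example.com:9283/metrics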
4.5.4. Working with the Prometheus data and queries
The statistic names are exactly as Ceph names them, with illegal characters translated to underscores, and ceph_ prefixed to all names. All Ceph daemon statistics have a ceph_daemon label that identifies the type and ID of the daemon they come from, for example: osd.123. Some statistics can come from different types of daemons, so when querying you will want to filter on Ceph daemons starting with osd to avoid mixing in the Ceph Monitor and RocksDB stats. The global Ceph storage cluster statistics have labels appropriate to what they report on. For example, metrics relating to pools have a pool_id label. The long running averages that represent the histograms from core Ceph are represented by a pair of sum and count performance metrics.
The following example queries can be used in the Prometheus expression browser:
Show the physical disk utilization of an OSD
(irate(node_disk_io_time_ms[1m]) /10) and on(device,instance) ceph_disk_occupation{ceph_daemon="osd.1"}
Show the physical IOPS of an OSD as seen from the operating system
irate(node_disk_reads_completed[1m]) + irate(node_disk_writes_completed[1m]) and on (device, instance) ceph_disk_occupation{ceph_daemon="osd.1"}
Pool and OSD metadata series
Special data series are output to enable the displaying and the querying on certain metadata fields. Pools have a ceph_pool_metadata field, for example:
ceph_pool_metadata{pool_id="2",name="cephfs_metadata_a"} 1.0
OSDs have a ceph_osd_metadata field, for example:
ceph_osd_metadata{cluster_addr="172.21.9.34:6802/19096",device_class="ssd",ceph_daemon="osd.0",public_addr="172.21.9.34:6801/19096",weight="1.0"} 1.0
Correlating drive statistics with node_exporter
The Prometheus output from Ceph is designed to be used in conjunction with the generic node monitoring from the Prometheus node exporter. To correlate Ceph OSD statistics with the generic node monitoring drive statistics, special data series are output, for example:
ceph_disk_occupation{ceph_daemon="osd.0",device="sdd", exported_instance="node1"}
To get disk statistics by an OSD ID, use either the and operator or the asterisk (*) operator in the Prometheus query. All metadata metrics have the value of 1, so they act neutrally with the asterisk operator. Using the asterisk operator allows the use of the group_left and group_right grouping modifiers, so that the resulting metric has additional labels from one side of the query. For example:
rate(node_disk_bytes_written[30s]) and on (device,instance) ceph_disk_occupation{ceph_daemon="osd.0"}
Using label_replace
The label_replace function can add a label to, or alter a label of, a metric within a query. To correlate an OSD and its disks write rate, the following query can be used:
label_replace(rate(node_disk_bytes_written[30s]), "exported_instance", "$1", "instance", "(.*):.*") and on (device,exported_instance) ceph_disk_occupation{ceph_daemon="osd.0"}
Additional Resources
- See Prometheus querying basics for more information on constructing queries.
- See Prometheus' label_replace documentation for more information.
4.5.5. Using the Prometheus expression browser
Use the builtin Prometheus expression browser to run queries against the collected data.
Prerequisites
- A running Red Hat Ceph Storage 3.1 cluster
- Installation of the Red Hat Ceph Storage Dashboard
Procedure
Enter the URL for the Prometheus expression browser into a web browser:
http://$DASHBOARD_SERVER_NAME:9090/graph
Replace:
- $DASHBOARD_SERVER_NAME with the name of the Red Hat Ceph Storage Dashboard server.
Click on Graph, then type in or paste the query into the query window and press the Execute button.
- View the results in the console window.
- Click on Graph to view the rendered data.
Additional Resources
- See the Prometheus expression browser documentation on the Prometheus web site for more information.
4.5.6. Additional Resources
4.6. The Red Hat Ceph Storage Dashboard alerts
This section includes information about alerting in the Red Hat Ceph Storage Dashboard.
- To learn about the Red Hat Ceph Storage Dashboard alerts, see Section 4.6.2, “About Alerts”.
- To view the alerts, see Section 4.6.3, “Accessing the Alert Status dashboard”.
- To configure the notification target, see Section 4.6.4, “Configuring the Notification Target”.
- To change the default alerts or add new ones, see Section 4.6.5, “Changing the Default Alerts and Adding New Ones”.
4.6.1. Prerequisites
4.6.2. About Alerts
The Red Hat Ceph Storage Dashboard supports an alerting mechanism provided by the Grafana platform. You can configure the dashboard to send you a notification when a metric that you are interested in reaches a certain value. Such metrics are in the Alert Status dashboard.
By default, Alert Status already includes certain metrics, such as Overall Ceph Health, OSDs Down, or Pool Capacity. You can add metrics that you are interested in to this dashboard or change their trigger values.
Here is a list of the pre-defined alerts that are included with Red Hat Ceph Storage Dashboard:
- Overall Ceph Health
- Disks Near Full (>85%)
- OSD Down
- OSD Host Down
- PG’s Stuck Inactive
- OSD Host Less - Free Capacity Check
- OSD’s With High Response Times
- Network Errors
- Pool Capacity High
- Monitors Down
- Overall Cluster Capacity Low
- OSDs With High PG Count
4.6.3. Accessing the Alert Status dashboard
Certain Red Hat Ceph Storage Dashboard alerts are configured by default in the Alert Status dashboard. This section shows two ways to access it.
Procedure
To access the dashboard:
- In the main Ceph At a Glance dashboard, click the Active Alerts panel in the upper-right corner.
Or:
- Click the dashboard menu in the upper-left corner next to the Grafana icon. Select Alert Status.
4.6.4. Configuring the Notification Target
A notification channel called cephmetrics is automatically created during installation. All preconfigured alerts reference the cephmetrics channel, but before you can receive the alerts, complete the notification channel definition by selecting the desired notification type. The Grafana platform supports a number of different notification types, including email, Slack, and PagerDuty.
Procedure
- To configure the notification channel, follow the instructions in the Alert Notifications section on the Grafana web page.
4.6.5. Changing the Default Alerts and Adding New Ones
This section explains how to change the trigger value on already configured alerts and how to add new alerts to the Alert Status dashboard.
Procedure
To change the trigger value on alerts or to add new alerts, follow the Alerting Engine & Rules Guide on the Grafana web pages.
Important: If you change the trigger values or add new alerts, the Alert Status dashboard is not updated when you upgrade the Red Hat Ceph Storage Dashboard packages, to prevent overriding your custom alerts.
Additional Resources
- The Grafana web page
Appendix A. Changes in Ansible Variables Between Version 2 and 3
With Red Hat Ceph Storage 3, certain variables in the configuration files located in the /usr/share/ceph-ansible/group_vars/ directory have changed or have been removed. The following table lists all the changes. After upgrading to version 3, copy the all.yml.sample and osds.yml.sample files again to reflect these changes. See Section 3.4, “Upgrading a Red Hat Ceph Storage Cluster That Runs in Containers” for details.
| Old Option | New Option | File |
|---|---|---|