Chapter 3. Deployment of the Ceph File System

As a storage administrator, you can deploy Ceph File Systems (CephFS) in a storage environment and have clients mount those Ceph File Systems to meet your storage needs.

The deployment workflow has three steps:

  1. Create a Ceph File System on a Ceph Monitor node.
  2. Create a Ceph client user with the appropriate capabilities, and make the client key available on the node where the Ceph File System will be mounted.
  3. Mount CephFS on a dedicated node, using either a kernel client or a File System in User Space (FUSE) client.

3.1. Prerequisites

  • A running and healthy Red Hat Ceph Storage cluster.
  • Installation and configuration of the Ceph Metadata Server daemon (ceph-mds).

3.2. Layout, quota, snapshot, and network restrictions

These user capabilities help you restrict access to a Ceph File System (CephFS) according to your requirements.

Important

All user capability flags, except rw, must be specified in alphabetical order.

Layouts and Quotas

To set layouts or quotas, clients require the p flag in addition to rw capabilities. This requirement covers every attribute that is set through a special extended attribute, that is, one with a ceph. prefix. It also covers other means of setting these fields, such as openc operations with layouts.

Example

client.0
    key: AQAz7EVWygILFRAAdIcuJ10opU/JKyfFmxhuaw==
    caps: [mds] allow rwp
    caps: [mon] allow r
    caps: [osd] allow rw tag cephfs data=cephfs_a

client.1
    key: AQAz7EVWygILFRAAdIcuJ11opU/JKyfFmxhuaw==
    caps: [mds] allow rw
    caps: [mon] allow r
    caps: [osd] allow rw tag cephfs data=cephfs_a

In this example, client.0 can modify layouts and quotas on the file system cephfs_a, but client.1 cannot.

Snapshots

When creating or deleting snapshots, clients require the s flag, in addition to rw capabilities. When the capability string also contains the p flag, the s flag must appear after it.

Example

client.0
    key: AQAz7EVWygILFRAAdIcuJ10opU/JKyfFmxhuaw==
    caps: [mds] allow rw, allow rws path=/temp
    caps: [mon] allow r
    caps: [osd] allow rw tag cephfs data=cephfs_a

In this example, client.0 can create or delete snapshots in the temp directory of file system cephfs_a.
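
The ordering rule for capability flags can be checked mechanically before you create a user. The following is a minimal sketch and not part of Ceph: mds_flags_ok is a hypothetical helper name, and it checks only that the flags following r or rw are in alphabetical order, not whether Ceph accepts the flags themselves.

```shell
# Hypothetical helper, not a Ceph command: succeed only if the flags that
# follow "rw" (or "r") in an MDS capability string are in alphabetical order.
mds_flags_ok() (
    flags=$1
    case $flags in
        rw*) rest=${flags#rw} ;;
        r*)  rest=${flags#r} ;;
        *)   exit 1 ;;           # the string must start with r or rw
    esac
    # Split the remaining flags into one letter per line, sort, and rejoin.
    sorted=$(printf '%s\n' "$rest" | fold -w 1 | LC_ALL=C sort | tr -d '\n')
    [ "$rest" = "$sorted" ]
)

mds_flags_ok rwps && echo "rwps: ordered"
mds_flags_ok rwsp || echo "rwsp: not ordered"
```

The function body runs in a subshell, so its variables do not leak into the calling shell.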

Network

You can restrict clients to connecting from a particular network.

Example

client.0
  key: AQAz7EVWygILFRAAdIcuJ10opU/JKyfFmxhuaw==
  caps: [mds] allow r network 10.0.0.0/8, allow rw path=/bar network 10.0.0.0/8
  caps: [mon] allow r network 10.0.0.0/8
  caps: [osd] allow rw tag cephfs data=cephfs_a network 10.0.0.0/8

The optional network and prefix length are in CIDR notation, for example, 10.3.0.0/16.
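
To illustrate what a CIDR block covers, the following sketch tests whether an IPv4 address falls inside a network such as 10.3.0.0/16. It only illustrates the notation: the helper names ip_to_int and in_cidr are hypothetical, the sketch handles IPv4 only, and Ceph performs its own matching when it evaluates the network restriction.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1                      # split the address on dots
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed if address $1 falls inside the CIDR block $2, for example 10.3.0.0/16.
in_cidr() (
    net=${2%/*}                    # network part, before the slash
    len=${2#*/}                    # prefix length, after the slash
    mask=$(( (0xFFFFFFFF << (32 - $len)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$1") & $mask )) -eq $(( $(ip_to_int "$net") & $mask )) ]
)

in_cidr 10.3.7.20 10.3.0.0/16 && echo "10.3.7.20 is inside 10.3.0.0/16"
in_cidr 10.4.0.1 10.3.0.0/16 || echo "10.4.0.1 is outside 10.3.0.0/16"
```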

3.3. Creating a Ceph File System

You can create a Ceph File System (CephFS) on a Ceph Monitor node.

Important

By default, you can create only one CephFS per Ceph Storage cluster.

Prerequisites

  • A running and healthy Red Hat Ceph Storage cluster.
  • Installation and configuration of the Ceph Metadata Server daemon (ceph-mds).
  • Root-level access to a Ceph monitor node.

Procedure

  1. Create two pools, one for storing data and one for storing metadata:

    Syntax

    ceph osd pool create NAME PG_NUM

    Example

    [root@mon ~]# ceph osd pool create cephfs_data 64
    [root@mon ~]# ceph osd pool create cephfs_metadata 64

    Typically, the metadata pool can start with a conservative number of Placement Groups (PGs) because it generally has far fewer objects than the data pool. You can increase the number of PGs later if needed. Recommended metadata pool sizes range from 64 PGs to 512 PGs. Size the data pool in proportion to the number and sizes of files you expect in the file system.

    Important

    For the metadata pool, consider using:

    • A higher replication level because any data loss to this pool can make the whole file system inaccessible.
    • Storage with lower latency such as Solid-State Drive (SSD) disks because this directly affects the observed latency of file system operations on clients.
  2. Create the CephFS:

    Syntax

    ceph fs new NAME METADATA_POOL DATA_POOL

    Example

    [root@mon ~]# ceph fs new cephfs cephfs_metadata cephfs_data

  3. Verify that one or more MDSs enter the active state, based on your configuration:

    Syntax

    ceph fs status NAME

    Example

    [root@mon ~]# ceph fs status cephfs
    cephfs - 0 clients
    ======
    +------+--------+-------+---------------+-------+-------+
    | Rank | State  |  MDS  |    Activity   |  dns  |  inos |
    +------+--------+-------+---------------+-------+-------+
    |  0   | active | node1 | Reqs:    0 /s |   10  |   12  |
    +------+--------+-------+---------------+-------+-------+
    +-----------------+----------+-------+-------+
    |       Pool      |   type   |  used | avail |
    +-----------------+----------+-------+-------+
    | cephfs_metadata | metadata | 4638  | 26.7G |
    |   cephfs_data   |   data   |    0  | 26.7G |
    +-----------------+----------+-------+-------+
    
    +-------------+
    | Standby MDS |
    +-------------+
    |    node3    |
    |    node2    |
    +-------------+
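
The procedure above can be condensed into a short command plan. The following is a sketch, assuming you want to review the commands before running them: plan_cephfs is a hypothetical helper name, and it only prints the commands from the steps above with your pool and file system names filled in. It uses the same PG count for both pools, as in the example.

```shell
# Print, without running them, the commands for the procedure above.
# Arguments: file system name, metadata pool, data pool, PG count for both pools.
plan_cephfs() {
    printf 'ceph osd pool create %s %s\n' "$3" "$4"    # data pool
    printf 'ceph osd pool create %s %s\n' "$2" "$4"    # metadata pool
    printf 'ceph fs new %s %s %s\n' "$1" "$2" "$3"     # metadata pool comes first
    printf 'ceph fs status %s\n' "$1"                  # verify the MDS state
}

plan_cephfs cephfs cephfs_metadata cephfs_data 64
```

After you review the printed commands, you can run them one at a time on the Ceph Monitor node.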

3.4. Creating Ceph File Systems with erasure coding (Technology Preview)

By default, Ceph uses replicated pools for data pools. You can also add an erasure-coded data pool, if needed. Ceph File Systems (CephFS) backed by erasure-coded pools use less overall storage than those backed by replicated pools, but they use more memory and processor resources.

Important

The Ceph File System using erasure-coded pools is a Technology Preview feature. Technology Preview features are not supported with Red Hat production service level agreements (SLAs), might not be functionally complete, and Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process. See the support scope for Red Hat Technology Preview features for more details.

Important

For production environments, Red Hat recommends using a replicated pool as the default data pool.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • A running CephFS environment.
  • Pools using BlueStore OSDs.
  • User-level access to a Ceph Monitor node.

Procedure

  1. Create a replicated metadata pool for CephFS metadata:

    Syntax

    ceph osd pool create METADATA_POOL PG_NUM

    Example

    [root@mon ~]# ceph osd pool create cephfs-metadata 64

    This example creates a pool named cephfs-metadata with 64 placement groups.

  2. Create a default replicated data pool for CephFS:

    Syntax

    ceph osd pool create DATA_POOL PG_NUM

    Example

    [root@mon ~]# ceph osd pool create cephfs-data 64

    This example creates a replicated pool named cephfs-data with 64 placement groups.

  3. Create an erasure-coded data pool for CephFS:

    Syntax

    ceph osd pool create DATA_POOL PG_NUM erasure

    Example

    [root@mon ~]# ceph osd pool create cephfs-data-ec 64 erasure

    This example creates an erasure-coded pool named cephfs-data-ec with 64 placement groups.

  4. Enable overwrites on the erasure-coded pool:

    Syntax

    ceph osd pool set DATA_POOL allow_ec_overwrites true

    Example

    [root@mon ~]# ceph osd pool set cephfs-data-ec allow_ec_overwrites true

    This example enables overwrites on an erasure-coded pool named cephfs-data-ec.

  5. Add the erasure-coded data pool to the CephFS Metadata Server (MDS):

    Syntax

    ceph fs add_data_pool FS_NAME DATA_POOL

    Example

    [root@mon ~]# ceph fs add_data_pool cephfs-ec cephfs-data-ec

    1. Optionally, verify the data pool was added:

      [root@mon ~]# ceph fs ls
  6. Create the CephFS:

    Syntax

    ceph fs new FS_NAME METADATA_POOL DATA_POOL

    Example

    [root@mon ~]# ceph fs new cephfs cephfs-metadata cephfs-data

    Important

    Using an erasure-coded pool for the default data pool is not recommended.

  7. Create the CephFS using erasure coding:

    Syntax

    ceph fs new FS_NAME METADATA_POOL DATA_POOL

    Example

    [root@mon ~]# ceph fs new cephfs-ec cephfs-metadata cephfs-data-ec

  8. Verify that one or more CephFS Metadata Servers (MDS) enter the active state:

    Syntax

    ceph fs status FS_EC

    Example

    [root@mon ~]# ceph fs status cephfs-ec
    cephfs-ec - 0 clients
    ======
    +------+--------+-------+---------------+-------+-------+
    | Rank | State  |  MDS  |    Activity   |  dns  |  inos |
    +------+--------+-------+---------------+-------+-------+
    |  0   | active | node1 | Reqs:    0 /s |   10  |   12  |
    +------+--------+-------+---------------+-------+-------+
    +-----------------+----------+-------+-------+
    |       Pool      |   type   |  used | avail |
    +-----------------+----------+-------+-------+
    | cephfs-metadata | metadata | 4638  | 26.7G |
    |  cephfs-data    |   data   |    0  | 26.7G |
    |  cephfs-data-ec |   data   |    0  | 26.7G |
    +-----------------+----------+-------+-------+
    
    +-------------+
    | Standby MDS |
    +-------------+
    |    node3    |
    |    node2    |
    +-------------+

  9. To add a new erasure-coded data pool to an existing file system, complete the following substeps:

    1. Create an erasure-coded data pool for CephFS:

      Syntax

      ceph osd pool create DATA_POOL PG_NUM erasure

      Example

      [root@mon ~]# ceph osd pool create cephfs-data-ec1 64 erasure

    2. Enable overwrites on the erasure-coded pool:

      Syntax

      ceph osd pool set DATA_POOL allow_ec_overwrites true

      Example

      [root@mon ~]# ceph osd pool set cephfs-data-ec1 allow_ec_overwrites true

    3. Add the erasure-coded data pool to the CephFS Metadata Server (MDS):

      Syntax

      ceph fs add_data_pool FS_NAME DATA_POOL

      Example

      [root@mon ~]# ceph fs add_data_pool cephfs-ec cephfs-data-ec1

  10. Create the CephFS using erasure coding:

    Syntax

    ceph fs new FS_NAME METADATA_POOL DATA_POOL

    Example

    [root@mon ~]# ceph fs new cephfs-ec cephfs-metadata cephfs-data-ec1
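
The substeps for adding an erasure-coded data pool to an existing file system can be sketched as a command plan. This is a sketch only: plan_add_ec_pool is a hypothetical helper name, and it prints the three commands in the order given in the procedure rather than running them.

```shell
# Print, without running them, the commands that add an erasure-coded data
# pool to an existing file system.
# Arguments: file system name, new erasure-coded pool name, PG count.
plan_add_ec_pool() {
    printf 'ceph osd pool create %s %s erasure\n' "$2" "$3"
    printf 'ceph osd pool set %s allow_ec_overwrites true\n' "$2"
    printf 'ceph fs add_data_pool %s %s\n' "$1" "$2"
}

plan_add_ec_pool cephfs-ec cephfs-data-ec1 64
```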

3.5. Creating client users for a Ceph File System

Red Hat Ceph Storage uses cephx for authentication, which is enabled by default. To use cephx with the Ceph File System, create a user with the correct authorization capabilities on a Ceph Monitor node and make its key available on the node where the Ceph File System will be mounted.

Prerequisites

  • A running Red Hat Ceph Storage cluster.
  • Installation and configuration of the Ceph Metadata Server daemon (ceph-mds).
  • Root-level access to a Ceph monitor node.
  • Root-level access to a Ceph client node.

Procedure

  1. On a Ceph Monitor node, create a client user:

    Syntax

    ceph fs authorize FILE_SYSTEM_NAME client.CLIENT_NAME /DIRECTORY CAPABILITY [/DIRECTORY CAPABILITY] ...

    • To restrict the client to only writing in the temp directory of file system cephfs_a:

      Example

      [root@mon ~]# ceph fs authorize cephfs_a client.1 / r /temp rw
      
      client.1
        key: AQBSdFhcGZFUDRAAcKhG9Cl2HPiDMMRv4DC43A==
        caps: [mds] allow r, allow rw path=/temp
        caps: [mon] allow r
        caps: [osd] allow rw tag cephfs data=cephfs_a

    • To completely restrict the client to the temp directory, remove the root (/) directory:

      Example

      [root@mon ~]# ceph fs authorize cephfs_a client.1 /temp rw

    Note

    Supplying all or an asterisk (*) as the file system name grants access to every file system. Typically, you must quote the asterisk to protect it from the shell.

  2. Verify the created key:

    Syntax

    ceph auth get client.ID

    Example

    [root@mon ~]# ceph auth get client.1

  3. Copy the keyring to the client.

    1. On the Ceph Monitor node, export the keyring to a file:

      Syntax

      ceph auth get client.ID -o ceph.client.ID.keyring

      Example

      [root@mon ~]# ceph auth get client.1 -o ceph.client.1.keyring
      exported keyring for client.1

    2. Copy the client keyring from the Ceph Monitor node to the /etc/ceph/ directory on the client node:

      Syntax

      scp root@MONITOR_NODE_NAME:/root/ceph.client.ID.keyring /etc/ceph/

      Replace MONITOR_NODE_NAME with the Ceph Monitor node name or IP address, and ID with the client ID.

      Example

      [root@client ~]# scp root@mon:/root/ceph.client.1.keyring /etc/ceph/ceph.client.1.keyring

  4. Set the appropriate permissions for the keyring file:

    Syntax

    chmod 644 KEYRING

    Example

    [root@client ~]# chmod 644 /etc/ceph/ceph.client.1.keyring
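
Because ceph fs authorize takes its directories and capabilities as pairs, it is easy to drop one half of a pair. The following sketch assembles the command line and rejects incomplete pairs. The helper name authorize_cmd is hypothetical, and the check that each directory starts with a slash is an assumption based on the examples above; the sketch prints the command rather than running it.

```shell
# Hypothetical helper: build a "ceph fs authorize" command line from a file
# system name, a client name, and one or more DIRECTORY CAPABILITY pairs.
authorize_cmd() (
    fs=$1
    client=$2
    shift 2
    # Require at least one complete DIRECTORY CAPABILITY pair.
    [ $# -ge 2 ] && [ $(( $# % 2 )) -eq 0 ] || exit 1
    cmd="ceph fs authorize $fs client.$client"
    while [ $# -gt 0 ]; do
        case $1 in
            /*) ;;               # directory given as an absolute path
            *)  exit 1 ;;
        esac
        cmd="$cmd $1 $2"
        shift 2
    done
    printf '%s\n' "$cmd"
)

authorize_cmd cephfs_a 1 / r /temp rw
```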

Additional Resources

  • See the User Management chapter in the Red Hat Ceph Storage Administration Guide for more details.

3.6. Mounting the Ceph File System as a kernel client

You can mount the Ceph File System (CephFS) as a kernel client, either manually or automatically on system boot.

Important

Clients running on other Linux distributions, aside from Red Hat Enterprise Linux, are permitted but not supported. If issues are found in the CephFS Metadata Server or other parts of the storage cluster when using these clients, Red Hat will address them. If the cause is found to be on the client side, then the issue will have to be addressed by the kernel vendor of the Linux distribution.

Prerequisites

  • Root-level access to a Linux-based client node.
  • User-level access to a Ceph Monitor node.
  • An existing Ceph File System.

Procedure

  1. Configure the client node to use the Ceph storage cluster.

    1. Enable the Red Hat Ceph Storage 4 Tools repository:

      Red Hat Enterprise Linux 7

      [root@client ~]# subscription-manager repos --enable=rhel-7-server-rhceph-4-tools-rpms

      Red Hat Enterprise Linux 8

      [root@client ~]# subscription-manager repos --enable=rhceph-4-tools-for-rhel-8-x86_64-rpms

    2. Install the ceph-common package:

      Red Hat Enterprise Linux 7

      [root@client ~]# yum install ceph-common

      Red Hat Enterprise Linux 8

      [root@client ~]# dnf install ceph-common

    3. Copy the Ceph client keyring from the Ceph Monitor node to the client node:

      Syntax

      scp root@MONITOR_NODE_NAME:/etc/ceph/KEYRING_FILE /etc/ceph/

      Replace MONITOR_NODE_NAME with the Ceph Monitor host name or IP address.

      Example

      [root@client ~]# scp root@192.168.0.1:/etc/ceph/ceph.client.1.keyring /etc/ceph/

    4. Copy the Ceph configuration file from a Ceph Monitor node to the client node:

      Syntax

      scp root@MONITOR_NODE_NAME:/etc/ceph/ceph.conf /etc/ceph/ceph.conf

      Replace MONITOR_NODE_NAME with the Ceph Monitor host name or IP address.

      Example

      [root@client ~]# scp root@192.168.0.1:/etc/ceph/ceph.conf /etc/ceph/ceph.conf

    5. Set the appropriate permissions for the configuration file:

      [root@client ~]# chmod 644 /etc/ceph/ceph.conf
    6. Choose to mount the file system either manually or automatically.

Manually Mounting

  1. Create a mount directory on the client node:

    Syntax

    mkdir -p MOUNT_POINT

    Example

    [root@client ~]# mkdir -p /mnt/cephfs

  2. Mount the Ceph File System. In the mount command, list the Ceph Monitor addresses separated by commas, specify the mount point, and set the client name:

    Note

    As of Red Hat Ceph Storage 4.1, mount.ceph can read keyring files directly. As such, a secret file is no longer necessary. Just specify the client ID with name=CLIENT_ID, and mount.ceph will find the right keyring file.

    Syntax

    mount -t ceph MONITOR-1_NAME:6789,MONITOR-2_NAME:6789,MONITOR-3_NAME:6789:/ MOUNT_POINT -o name=CLIENT_ID

    Example

    [root@client ~]# mount -t ceph mon1:6789,mon2:6789,mon3:6789:/ /mnt/cephfs -o name=1

    Note

    You can configure a DNS server so that a single host name resolves to multiple IP addresses. Then you can use that single host name with the mount command, instead of supplying a comma-separated list.

    Note

    You can also replace the Monitor host names with the string :/ and mount.ceph will read the Ceph configuration file to determine which Monitors to connect to.

  3. Verify that the file system is successfully mounted:

    Syntax

    stat -f MOUNT_POINT

    Example

    [root@client ~]# stat -f /mnt/cephfs
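
The device string in the mount command is the comma-separated Monitor list followed by :/. The following sketch builds that string from a list of Monitor host names. The helper name ceph_mount_source is hypothetical, and the sketch assumes that every Monitor listens on port 6789, as in the example above. The final mount command is printed, not run.

```shell
# Hypothetical helper: join Monitor host names into the device string used by
# "mount -t ceph", assuming port 6789 for every Monitor.
ceph_mount_source() (
    src=
    for mon in "$@"; do
        src="${src:+$src,}$mon:6789"    # add a comma before all but the first
    done
    printf '%s:/\n' "$src"
)

src=$(ceph_mount_source mon1 mon2 mon3)
echo "mount -t ceph $src /mnt/cephfs -o name=1"
```

With no arguments, the helper prints :/ alone, which is the form that lets mount.ceph read the Monitor addresses from the Ceph configuration file.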

Automatically Mounting

  1. On the client host, create a new directory for mounting the Ceph File System.

    Syntax

    mkdir -p MOUNT_POINT

    Example

    [root@client ~]# mkdir -p /mnt/cephfs

  2. Edit the /etc/fstab file as follows:

    Syntax

    #DEVICE                 PATH           TYPE     OPTIONS               DUMP  FSCK
    HOST_NAME:PORT,         MOUNT_POINT    ceph     name=CLIENT_ID,       0     0
    HOST_NAME:PORT,                                 ceph.client_mountpoint=/VOL/SUB_VOL_GROUP/SUB_VOL/UID_SUB_VOL,
    HOST_NAME:PORT:/                                [ADDITIONAL_OPTIONS]

    The first column sets the Ceph Monitor host names and the port number.

    The second column sets the mount point.

    The third column sets the file system type, in this case, ceph, for CephFS.

    The fourth column sets the various options, such as the user name and the secret file, using the name and secretfile options, respectively. You can also set specific volumes, sub-volume groups, and sub-volumes using the ceph.client_mountpoint option.

    Set the _netdev option to ensure that the file system is mounted after the networking subsystem starts to prevent hanging and networking issues. If you do not need access time information, then setting the noatime option can increase performance.

    Set the fifth and sixth columns to zero.

    Example

    #DEVICE         PATH                   TYPE    OPTIONS         DUMP  FSCK
    mon1:6789,      /mnt/cephfs            ceph    name=1,            0     0
    mon2:6789,                                     ceph.client_mountpoint=/my_vol/my_sub_vol_group/my_sub_vol/0,
    mon3:6789:/                                    _netdev,noatime

    The Ceph File System will be mounted on the next system boot.

    Note

    As of Red Hat Ceph Storage 4.1, mount.ceph can read keyring files directly. As such, a secret file is no longer necessary. Just specify the client ID with name=CLIENT_ID, and mount.ceph will find the right keyring file.

    Note

    You can also replace the Monitor host names with the string :/ and mount.ceph will read the Ceph configuration file to determine which Monitors to connect to.
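
The Syntax and Example blocks above wrap a single entry across several rows for readability. In the /etc/fstab file, each entry occupies one line of whitespace-separated fields. The following sketch prints such a one-line entry for a kernel client mount. The helper name cephfs_fstab_line is hypothetical, and the option list in the usage line is only an illustration.

```shell
# Hypothetical helper: print a single-line /etc/fstab entry for a CephFS
# kernel client mount, with the dump and fsck fields set to zero.
# Arguments: device string, mount point, comma-separated mount options.
cephfs_fstab_line() {
    printf '%s %s ceph %s 0 0\n' "$1" "$2" "$3"
}

cephfs_fstab_line mon1:6789,mon2:6789,mon3:6789:/ /mnt/cephfs name=1,_netdev,noatime
```

Review the printed line before you add it to the /etc/fstab file.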

Additional Resources

  • See the mount(8) manual page.
  • See the Ceph user management chapter in the Red Hat Ceph Storage Administration Guide for more details on creating a Ceph user.
  • See the Creating a Ceph File System section of the Red Hat Ceph Storage File System Guide for details.

3.7. Mounting the Ceph File System as a FUSE client

You can mount the Ceph File System (CephFS) as a File System in User Space (FUSE) client, either manually or automatically on system boot.

Prerequisites

  • Root-level access to a Linux-based client node.
  • User-level access to a Ceph Monitor node.
  • An existing Ceph File System.

Procedure

  1. Configure the client node to use the Ceph storage cluster.

    1. Enable the Red Hat Ceph Storage 4 Tools repository:

      Red Hat Enterprise Linux 7

      [root@client ~]# subscription-manager repos --enable=rhel-7-server-rhceph-4-tools-rpms

      Red Hat Enterprise Linux 8

      [root@client ~]# subscription-manager repos --enable=rhceph-4-tools-for-rhel-8-x86_64-rpms

    2. Install the ceph-fuse package:

      Red Hat Enterprise Linux 7

      [root@client ~]# yum install ceph-fuse

      Red Hat Enterprise Linux 8

      [root@client ~]# dnf install ceph-fuse

    3. Copy the Ceph client keyring from the Ceph Monitor node to the client node:

      Syntax

      scp root@MONITOR_NODE_NAME:/etc/ceph/KEYRING_FILE /etc/ceph/

      Replace MONITOR_NODE_NAME with the Ceph Monitor host name or IP address.

      Example

      [root@client ~]# scp root@192.168.0.1:/etc/ceph/ceph.client.1.keyring /etc/ceph/

    4. Copy the Ceph configuration file from a Ceph Monitor node to the client node:

      Syntax

      scp root@MONITOR_NODE_NAME:/etc/ceph/ceph.conf /etc/ceph/ceph.conf

      Replace MONITOR_NODE_NAME with the Ceph Monitor host name or IP address.

      Example

      [root@client ~]# scp root@192.168.0.1:/etc/ceph/ceph.conf /etc/ceph/ceph.conf

    5. Set the appropriate permissions for the configuration file:

      [root@client ~]# chmod 644 /etc/ceph/ceph.conf
    6. Choose to mount the file system either manually or automatically.

Manually Mounting

  1. On the client node, create a directory for the mount point:

    Syntax

    mkdir PATH_TO_MOUNT_POINT

    Example

    [root@client ~]# mkdir /mnt/mycephfs

    Note

    If you used the path option with MDS capabilities, then the mount point must be within what is specified by path.

  2. Use the ceph-fuse utility to mount the Ceph File System.

    Syntax

    ceph-fuse -n client.CLIENT_ID MOUNT_POINT

    Example

    [root@client ~]# ceph-fuse -n client.1 /mnt/mycephfs

    Note

    If you do not use the default name and location of the user keyring, that is /etc/ceph/ceph.client.CLIENT_ID.keyring, then use the --keyring option to specify the path to the user keyring, for example:

    Example

    [root@client ~]# ceph-fuse -n client.1 --keyring=/etc/ceph/client.1.keyring /mnt/mycephfs

    Note

    Use the -r option to instruct the client to treat the specified path as its root:

    Syntax

    ceph-fuse -n client.CLIENT_ID MOUNT_POINT -r PATH

    Example

    [root@client ~]# ceph-fuse -n client.1 /mnt/cephfs -r /home/cephfs

  3. Verify that the file system is successfully mounted:

    Syntax

    stat -f MOUNT_POINT

    Example

    [user@client ~]$ stat -f /mnt/mycephfs
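
The ceph-fuse variants shown in the notes above differ only in their optional arguments. The following sketch assembles the command line from a client ID, a mount point, and, optionally, a keyring path and a root path. The helper name ceph_fuse_cmd is hypothetical, and the sketch prints the command rather than running it.

```shell
# Hypothetical helper: assemble a ceph-fuse command line.
# Arguments: client ID, mount point, optional keyring path, optional root path.
# Pass an empty string for the keyring to skip it when a root path is given.
ceph_fuse_cmd() (
    id=$1
    mnt=$2
    keyring=$3
    root=$4
    cmd="ceph-fuse -n client.$id"
    [ -n "$keyring" ] && cmd="$cmd --keyring=$keyring"
    cmd="$cmd $mnt"
    [ -n "$root" ] && cmd="$cmd -r $root"
    printf '%s\n' "$cmd"
)

ceph_fuse_cmd 1 /mnt/mycephfs
ceph_fuse_cmd 1 /mnt/mycephfs /etc/ceph/client.1.keyring
ceph_fuse_cmd 1 /mnt/cephfs "" /home/cephfs
```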

Automatically Mounting

  1. On the client node, create a directory for the mount point:

    Syntax

    mkdir PATH_TO_MOUNT_POINT

    Example

    [root@client ~]# mkdir /mnt/mycephfs

    Note

    If you used the path option with MDS capabilities, then the mount point must be within what is specified by path.

  2. Edit the /etc/fstab file as follows:

    Syntax

    #DEVICE                 PATH           TYPE          OPTIONS                  DUMP  FSCK
    HOST_NAME:PORT,         MOUNT_POINT    fuse.ceph     ceph.id=CLIENT_ID,       0     0
    HOST_NAME:PORT,                                      ceph.client_mountpoint=/VOL/SUB_VOL_GROUP/SUB_VOL/UID_SUB_VOL,
    HOST_NAME:PORT:/                                     [ADDITIONAL_OPTIONS]

    The first column sets the Ceph Monitor host names and the port number.

    The second column sets the mount point.

    The third column sets the file system type, in this case, fuse.ceph, for CephFS.

    The fourth column sets the various options, such as the client ID, using the ceph.id option. You can also set specific volumes, sub-volume groups, and sub-volumes using the ceph.client_mountpoint option. Set the _netdev option to ensure that the file system is mounted after the networking subsystem starts, which prevents hanging and networking issues. If you do not need access time information, then setting the noatime option can increase performance.

    Set the fifth and sixth columns to zero.

    Example

    #DEVICE         PATH              TYPE         OPTIONS         DUMP  FSCK
    mon1:6789,      /mnt/mycephfs     fuse.ceph    ceph.id=1,         0     0
    mon2:6789,                                     ceph.client_mountpoint=/my_vol/my_sub_vol_group/my_sub_vol/0,
    mon3:6789:/                                    _netdev,defaults

    The Ceph File System will be mounted on the next system boot.

Additional Resources

  • See the ceph-fuse(8) manual page.
  • See the Ceph user management chapter in the Red Hat Ceph Storage Administration Guide for more details on creating a Ceph user.
  • See the Creating a Ceph File System section of the Red Hat Ceph Storage File System Guide for details.

3.8. Additional Resources