Chapter 6. Create a Cluster
If at any point you run into trouble and you want to start over, execute the following to purge the configuration:
ceph-deploy purgedata <ceph-node> [<ceph-node>]
ceph-deploy forgetkeys
To purge the Ceph packages as well, execute:
ceph-deploy purge <ceph-node> [<ceph-node>]
If you execute purge, you must re-install Ceph.
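For example, a hypothetical purge of the three nodes used throughout this Quick Start (substitute your own node names) might look like this:
ceph-deploy purgedata node1 node2 node3
ceph-deploy forgetkeys
ceph-deploy purge node1 node2 node3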
On your Calamari admin node, from the directory you created for holding your configuration details, perform the following steps using ceph-deploy.
Create the cluster:
ceph-deploy new <initial-monitor-node(s)>
For example:
ceph-deploy new node1
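If you plan to run more than one initial monitor, you may list several nodes on the same command line. For example, a sketch assuming three monitor nodes named node1, node2, and node3:
ceph-deploy new node1 node2 node3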
Check the output of ceph-deploy with ls and cat in the current directory. You should see a Ceph configuration file, a monitor secret keyring, and a log file of the ceph-deploy procedures. At this stage, you may begin editing your Ceph configuration file.
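For instance, a minimal check of those files might look like this (the exact file names can vary by ceph-deploy release):
ls
cat ceph.conf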
Note: If you choose not to use ceph-deploy, you will have to deploy Ceph manually, or refer to the Ceph manual deployment documentation and configure a deployment tool (e.g., Chef, Juju, Puppet, etc.) to perform each operation ceph-deploy performs for you.
Add the public_network and cluster_network settings under the [global] section of your Ceph configuration file.
public_network = <ip-address>/<netmask>
cluster_network = <ip-address>/<netmask>
These settings distinguish which network is public (front-side) and which network is for the cluster (back-side). Ensure that your nodes have interfaces configured for these networks. We do not recommend using the same NIC for the public and cluster networks.
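For example, a sketch using two hypothetical subnets, 192.168.0.0/24 for the public network and 192.168.1.0/24 for the cluster network (substitute your own addressing):
public_network = 192.168.0.0/24
cluster_network = 192.168.1.0/24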
Turn on IPv6 if you intend to use it.
ms_bind_ipv6 = true
Add or adjust the osd_journal_size setting under the [global] section of your Ceph configuration file.
osd_journal_size = 10000
We recommend a general setting of 10 GB. Ceph's default osd_journal_size is 0, so you will need to set this in your ceph.conf file. To size the journal, find the product of the filestore_max_sync_interval and the expected throughput, and multiply that product by two (2). The expected throughput number should include the expected disk throughput (i.e., sustained data transfer rate) and network throughput. For example, a 7200 RPM disk will likely provide approximately 100 MB/s. Taking the min() of the disk and network throughput should provide a reasonable expected throughput.
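As a worked example, assuming the default filestore_max_sync_interval of 5 seconds and an expected throughput of 100 MB/s (the min() of disk and network throughput), the minimum journal size would be 2 x 5 x 100 MB = 1000 MB; the recommended setting leaves comfortable headroom:
osd_journal_size = 10000    # 10 GB, expressed in MB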
Set the number of copies to store (default is 3) and the minimum number of copies required to write data when in a degraded state (default is 2) under the [global] section of your Ceph configuration file. We recommend the default values for production clusters.
osd_pool_default_size = 3
osd_pool_default_min_size = 2
For a quick start, you may wish to set osd_pool_default_size to 2 and osd_pool_default_min_size to 1 so that you can achieve an active+clean state with only two OSDs. These settings establish the networking bandwidth requirements for the cluster network, and the ability to write data with eventual consistency (i.e., you can write data to a cluster in a degraded state if it has min_size copies of the data already).
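For instance, a minimal sketch of the quick-start values described above:
osd_pool_default_size = 2     # Two copies of each object
osd_pool_default_min_size = 1 # Allow writes with a single copy while degraded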
Set the maximum number of placement groups per OSD. The Ceph Storage Cluster has a default maximum value of 300 placement groups per OSD. You can set a different maximum value in your Ceph configuration file (where n is the maximum number of PGs per OSD).
mon_pg_warn_max_per_osd = n
Multiple pools can use the same CRUSH ruleset. When an OSD has too many placement groups associated to it, Ceph performance may degrade due to resource use and load. This setting warns you, but you may adjust it to your needs and the capabilities of your hardware.
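For example, to raise the warning threshold to a hypothetical value of 400 placement groups per OSD:
mon_pg_warn_max_per_osd = 400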
Set a CRUSH leaf type to the largest serviceable failure domain for your replicas under the [global] section of your Ceph configuration file. The default value is 1, or host, which means that CRUSH will map replicas to OSDs on separate hosts. For example, if you want to make three object replicas and you have three racks of chassis/hosts, you can set osd_crush_chooseleaf_type to 3, and CRUSH will place each copy of an object on OSDs in different racks (see the sketch after the list of hierarchy types below). For example:
osd_crush_chooseleaf_type = 3
The default CRUSH hierarchy types are:
- type 0 osd
- type 1 host
- type 2 chassis
- type 3 rack
- type 4 row
- type 5 pdu
- type 6 pod
- type 7 room
- type 8 datacenter
- type 9 region
- type 10 root
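A minimal sketch tying the setting to the types above: the default keeps replicas on separate hosts, while a value of 3 separates them by rack.
osd_crush_chooseleaf_type = 1   # host (default): replicas on OSDs in separate hosts
#osd_crush_chooseleaf_type = 3  # rack: replicas on OSDs in separate racks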
Set max_open_files so that Ceph will set the maximum open file descriptors at the OS level to help prevent Ceph OSD Daemons from running out of file descriptors.
max_open_files = 131072
In summary, your initial Ceph configuration file should have at least the following settings with appropriate values assigned after the = sign:
[global]
fsid = <cluster-id>
mon_initial_members = <hostname>[, <hostname>]
mon_host = <ip-address>[, <ip-address>]
public_network = <network>[, <network>]
cluster_network = <network>[, <network>]
ms_bind_ipv6 = [true | false]
max_open_files = 131072
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd_journal_size = <n>
filestore_xattr_use_omap = true
osd_pool_default_size = <n>      # Write an object n times.
osd_pool_default_min_size = <n>  # Allow writing n copies in a degraded state.
osd_crush_chooseleaf_type = <n>
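For illustration only, here is a hypothetical ceph.conf with placeholder values (the fsid and monitor entries are generated by ceph-deploy new; the host name and addresses shown are examples to be replaced with your own):
[global]
fsid = <generated-by-ceph-deploy>
mon_initial_members = node1
mon_host = 192.168.0.11
public_network = 192.168.0.0/24
cluster_network = 192.168.1.0/24
ms_bind_ipv6 = false
max_open_files = 131072
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd_journal_size = 10000
filestore_xattr_use_omap = true
osd_pool_default_size = 3
osd_pool_default_min_size = 2
osd_crush_chooseleaf_type = 1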
