Part IV. Erasure Code Pools

Ceph storage strategies involve defining data durability requirements, that is, the ability to sustain the loss of one or more OSDs without losing data. Ceph uses replicated pools by default, meaning Ceph copies every object to a primary OSD and to one or more secondary OSDs. To save storage space, you can create an erasure-coded pool (pool type erasure) instead. Erasure coding reduces the amount of disk space required to ensure data durability, but it is computationally more expensive than replication.

Erasure coding is a method of storing an object durably in the Ceph Storage cluster: the erasure code algorithm breaks the object into data chunks (k) and coding chunks (m), and stores those chunks on different OSDs. If an OSD fails, Ceph can retrieve the remaining data (k) and coding (m) chunks from the other OSDs, and the erasure code algorithm can restore the object from those chunks. Erasure coding uses storage capacity more efficiently than replication. The n-replication approach maintains n copies of an object (3x by default in Ceph), whereas erasure coding maintains only k + m chunks (e.g., 3 data and 2 coding chunks use about 1.67x the storage space of the original object) and has durability characteristics similar to those of multiple full copies of an object. Although erasure coding has less storage overhead than replication, the erasure code algorithm uses more RAM and CPU than replication when it accesses or recovers objects. Erasure coding is advantageous when data storage must be durable and fault tolerant but does not require fast read performance (e.g., cold storage, historical records).
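
The storage overhead comparison above comes down to simple arithmetic: replication stores n full copies, while erasure coding stores (k + m) / k times the original object size. The following minimal Python sketch illustrates that calculation. The function names are hypothetical and are not part of any Ceph API, and the sketch ignores padding, metadata, and allocation overhead.

```python
def ec_overhead(k: int, m: int) -> float:
    """Raw-to-usable storage ratio for k data chunks and m coding chunks.

    Illustrative helper only (hypothetical name, not a Ceph API).
    """
    if k < 1 or m < 0:
        raise ValueError("k must be >= 1 and m must be >= 0")
    return (k + m) / k


def replication_overhead(n: int) -> float:
    """Raw-to-usable storage ratio for n full copies of every object."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return float(n)


if __name__ == "__main__":
    print(f"3x replication: {replication_overhead(3):.2f}x")
    print(f"k=3, m=2:       {ec_overhead(3, 2):.2f}x")
    print(f"k=2, m=1:       {ec_overhead(2, 1):.2f}x")
```

For a fixed m, a larger k lowers the overhead, but it also spreads each object across more OSDs.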

For a detailed, mathematical explanation of how erasure coding works in Ceph, see the Erasure Coded I/O section of the Red Hat Ceph Storage 1.3 Architecture Guide.

When initializing a cluster, Ceph creates a default erasure code profile with k=2 and m=1, meaning Ceph spreads the object data over three OSDs (k + m = 3) and can lose one of those OSDs without losing data. For more information about erasure code profiles, see Erasure Code Profiles.
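
To make the k=2, m=1 case concrete, the following Python sketch splits an object into two data chunks and computes a single coding chunk as their bytewise XOR, then rebuilds any one lost chunk from the two survivors. This is a toy illustration only: it is not how Ceph's erasure code plugins are implemented, the function names are hypothetical, and the zero-byte padding is a simplification (a real system would record the original object size).

```python
from typing import List, Optional


def encode(obj: bytes) -> List[bytes]:
    """Split obj into 2 equal data chunks and add 1 XOR coding chunk (k=2, m=1)."""
    if len(obj) % 2:
        obj += b"\0"  # toy padding so both data chunks have equal length
    half = len(obj) // 2
    d0, d1 = obj[:half], obj[half:]
    coding = bytes(a ^ b for a, b in zip(d0, d1))
    return [d0, d1, coding]


def recover(chunks: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing chunk (marked None) from the other two."""
    missing = [i for i, c in enumerate(chunks) if c is None]
    if len(missing) > 1:
        raise ValueError("m=1 tolerates the loss of only one chunk")
    restored = list(chunks)
    if missing:
        a, b = [c for c in chunks if c is not None]
        # XOR of any two of the three chunks yields the third.
        restored[missing[0]] = bytes(x ^ y for x, y in zip(a, b))
    return restored


if __name__ == "__main__":
    chunks: List[Optional[bytes]] = list(encode(b"ABCDEFGH"))
    chunks[1] = None  # simulate losing the OSD that held the second data chunk
    d0, d1, _ = recover(chunks)
    print(d0 + d1)
```

Losing two of the three chunks raises an error in this sketch, which mirrors the statement that a k=2, m=1 profile can lose only one OSD without losing data.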

Note

Erasure-coded pools are supported only with the RADOS Gateway (RGW). Using erasure-coded pools with a RADOS Block Device (RBD) is not supported. For more details on erasure coding and cache tiering, see Erasure-coded Pools and Cache Tiering.