- etcd is sensitive to disk latency, so it is often necessary to confirm that etcd can write to its backing storage fast enough for production workloads.
etcd alerts in the web console, or frequent error messages such as the following, may suggest that writes are taking too long:
2020-10-21T09:56:00.246667768Z 2020-10-21 09:56:00.246542 W | etcdserver: read-only range request "key:\"/kubernetes.io/serviceaccounts/openshift-kube-scheduler/localhost-recovery-client\" " with result "range_response_count:1 size:407" took too long (113.372697ms) to execute
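As a rough illustration, the duration reported in a message like the one above can be pulled out with a short script, which helps when counting how often and how badly requests are delayed. This is a minimal sketch: the helper name `slow_request_ms` and the set of unit suffixes handled are assumptions made for this example, not part of etcd or OpenShift tooling.

```python
import re

# Matches the trailing "took too long (<duration>) to execute" part of the warning.
# Assumption: the duration is a single number followed by s, ms, us, or µs.
TOOK_TOO_LONG = re.compile(
    r"took too long \((?P<value>[0-9.]+)(?P<unit>ms|s|µs|us)\) to execute"
)

# Conversion factors from each unit to milliseconds.
UNIT_TO_MS = {"s": 1000.0, "ms": 1.0, "us": 0.001, "µs": 0.001}


def slow_request_ms(line):
    """Return the reported duration in milliseconds, or None if the line does not match."""
    match = TOOK_TOO_LONG.search(line)
    if match is None:
        return None
    return float(match.group("value")) * UNIT_TO_MS[match.group("unit")]


# The sample warning quoted above, kept verbatim in a raw string.
sample = r'2020-10-21T09:56:00.246667768Z 2020-10-21 09:56:00.246542 W | etcdserver: read-only range request "key:\"/kubernetes.io/serviceaccounts/openshift-kube-scheduler/localhost-recovery-client\" " with result "range_response_count:1 size:407" took too long (113.372697ms) to execute'

print(slow_request_ms(sample))  # 113.372697
```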
The etcd performance documentation suggests that, for production workloads, the p99 of wal_fsync_duration_seconds should be less than 10ms to confirm the disk is reasonably fast.
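To make the 10ms p99 guideline concrete, the sketch below times a series of small write-plus-fsync operations and reports their 99th percentile. It is only a rough probe under stated assumptions: the function names, the 2048-byte write size, and the sample count are hypothetical choices for this example, and the result is not a substitute for the fsync metric that etcd itself reports.

```python
import math
import os
import tempfile
import time

P99_LIMIT_MS = 10.0  # guideline quoted from the etcd performance documentation


def fsync_latencies_ms(directory=None, writes=200, size=2048):
    """Time `writes` small append+fsync operations in `directory`, in milliseconds."""
    payload = b"\0" * size
    latencies = []
    with tempfile.NamedTemporaryFile(dir=directory) as handle:
        for _ in range(writes):
            handle.write(payload)
            handle.flush()  # push Python's buffer to the OS before timing fsync
            start = time.perf_counter()
            os.fsync(handle.fileno())
            latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Assumption: pass the directory that sits on the disk being evaluated;
    # None falls back to the system temporary directory.
    p99 = percentile(fsync_latencies_ms(directory=None), 99)
    verdict = "within" if p99 < P99_LIMIT_MS else "above"
    print(f"p99 fsync latency: {p99:.3f} ms ({verdict} the {P99_LIMIT_MS:.0f} ms guideline)")
```

The probe should be pointed at the same device that backs the etcd data directory; a result from a different disk says nothing about etcd's storage.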
- Depending on the severity of disk speed issues, impact can range from frequent alerting to overall cluster instability.
- For more general information regarding infrastructure requirements, please see etcd backend performance requirements.
- Red Hat OpenShift Container Platform (RHOCP, OCP)