Large-scale Red Hat OpenStack Telemetry Tuning for Ceph Storage
Issue
Large-scale OpenStack environments may require additional tuning of the Telemetry service so that it functions properly without overstressing the Ceph nodes.
Running 1,000 instances in a Red Hat OpenStack cloud with the Telemetry service configured to use the Ceph storage driver was successful after tuning only two parameters (see the configuration sketch after this list):
* Increasing the number of metricd workers deployed on each Controller from 6 to 48
* Reducing metric_processing_delay from 60s to 30s
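As a rough illustration of where these two settings live, the sketch below shows the corresponding options in /etc/gnocchi/gnocchi.conf on a Controller. The section placement of metric_processing_delay reflects the Gnocchi release shipped with this OpenStack version and should be treated as an assumption; in a director-based deployment these values would normally be applied through Heat environment files rather than edited in place.

```ini
# /etc/gnocchi/gnocchi.conf -- illustrative sketch only; in a director-based
# deployment, apply these values via Heat environment files instead of
# editing the file directly.

[metricd]
# Number of metricd worker processes per Controller (raised from 6 to 48).
workers = 48

[storage]
# Seconds metricd waits between metric processing passes
# (reduced from the 60s default to 30s). Section placement assumes the
# Gnocchi release shipped with OSP 10 (Newton).
metric_processing_delay = 30
```

After changing these options, the openstack-gnocchi-metricd service on each Controller would need to be restarted for the new worker count to take effect.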
Environment
Red Hat OpenStack Platform 10 (Newton)
Storage backends: Ceph & Swift