Chapter 6. Conclusion
Red Hat solutions involving Red Hat Enterprise Linux OpenStack Platform are created to deliver a production-ready foundation that simplifies the deployment process, shares the latest best practices, and gets the utmost performance and scale for a public or private OpenStack cloud. The steps and procedures covered within this reference architecture provide system, storage, and cloud administrators with the blueprint required to use the different open source benchmarking tools to understand the performance and scale potential of an existing cloud environment.
Successfully benchmarking and scaling an existing RHEL-OSP environment consists of the following:
- Capturing baseline results using the benchmarking tool Rally
- Running performance checks on the RHEL-OSP environment using Browbeat
- Modifying the RHEL-OSP environment based upon the performance check findings from Browbeat
- Re-running the Rally scenarios to capture the latest results with the tuning changes
- Analyzing and comparing the results
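The final comparison step comes down to a percentage decrease between a baseline measurement and a tuned measurement. The following is a minimal sketch of that arithmetic; the function name and the sample boot times are hypothetical and are not taken from the reference environment.

```python
def percent_decrease(baseline: float, tuned: float) -> float:
    """Return how much smaller `tuned` is than `baseline`, as a percentage.

    A negative result means the tuned run was slower than the baseline.
    """
    if baseline <= 0:
        raise ValueError("baseline must be a positive number")
    return (baseline - tuned) / baseline * 100.0


if __name__ == "__main__":
    # Hypothetical average boot times in seconds, for illustration only.
    baseline_avg = 50.0
    tuned_avg = 40.0
    print(f"{percent_decrease(baseline_avg, tuned_avg):.1f}% decrease in average boot time")
```

Applying the same calculation to the average boot times from the baseline and re-run Rally reports keeps the comparison consistent across scenarios.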
The Rally test results for the reference environment demonstrated that vif_plugging_is_fatal parameter values play a critical role in nova guest instance boot times. When the vif_plugging_timeout value was decreased, guest instance boot times in the RHEL-OSP environment dropped by as much as 51.7%.
As performance and scalability were measured across the different scenarios, the amount of RAM and the available spindles within the Ceph nodes limited the performance and scalability of the environment. With regard to performance, as the Rally scenarios increased the concurrency value, CPU wait times rose because the Ceph nodes could not keep up with the number of guest instances being booted simultaneously, at times leading to nova guest instance boot failures. To increase performance, besides adding spindles, adding higher speed drives (SSD, NVMe) for journals and/or data could also alleviate the Ceph node bottleneck.
With regard to scalability, the maximum number of guests that can be launched depends directly on the amount of RAM available on each compute node. The maximum number of guest instances per compute node is easily calculated using the nova hypervisor-stats command when launching X guest instances that all take the same amount of RAM.
When the Rally scenarios were re-run with the Browbeat recommendations applied, specifically the tuned profiles for each node in the RHEL-OSP environment, boot times improved drastically even when the Ceph nodes were the bottleneck. The max-guest tests running with 2 compute nodes using a low concurrency value showed a decrease in average boot times of 25%, while boot-storm tests with a high concurrency value showed a decrease in average boot times of 13.7%. These small changes in the environment show the importance of implementing the Browbeat recommendations.
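The per-compute guest limit mentioned above can be estimated with simple integer arithmetic. The sketch below assumes that the free_ram_mb and count values are read from the output of nova hypervisor-stats, that all compute nodes are identical, and that every guest uses the same flavor. It does not model RAM overcommit, and the function name and sample numbers are hypothetical.

```python
def max_guests_per_compute(free_ram_mb: int, hypervisor_count: int, flavor_ram_mb: int) -> int:
    """Estimate how many same-sized guests fit on one compute node.

    free_ram_mb       -- aggregate free RAM across all hypervisors
                         (assumed to come from `nova hypervisor-stats`)
    hypervisor_count  -- number of compute nodes (the `count` field)
    flavor_ram_mb     -- RAM consumed by each guest instance

    Assumes homogeneous compute nodes and no RAM overcommit.
    """
    if hypervisor_count <= 0 or flavor_ram_mb <= 0:
        raise ValueError("hypervisor_count and flavor_ram_mb must be positive")
    free_per_node_mb = max(free_ram_mb, 0) // hypervisor_count
    return free_per_node_mb // flavor_ram_mb


if __name__ == "__main__":
    # Hypothetical values for illustration only, not measured results.
    per_node = max_guests_per_compute(free_ram_mb=126976, hypervisor_count=2, flavor_ram_mb=2048)
    print(f"Estimated guests per compute node: {per_node}")
    print(f"Estimated guests across the environment: {per_node * 2}")
```

Because the estimate rounds down per node, it stays conservative when free RAM is not an exact multiple of the flavor size.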
The tools described within this reference architecture help the administrator determine the physical limits of the configuration under test and the tuning changes appropriate for it. Monitoring these resources in production should allow the administrator to tweak settings as needed, add compute nodes to extend VM capacity, and add storage nodes to improve I/O operations per second.