Heat stack failures, Timeouts during port creation.

Solution Verified - Updated -

Environment

  • Red Hat OpenStack Platform 10.0

Issue

  • AMQP server is getting a timeout during port creation.
  • Port creation is failed during heat stack-create
  • NTP not running and unsynced.

    2018-04-02 13:45:22.797 7214 INFO heat.engine.resource [req-b075a642-555d-4b72-82e2-c9f804c861a5 2f88197fca8c4af78db7fadfaea55352 cd3df30e98d94cb084626c5ca5a9ab8a - - -] creating Port "port1" Stack "stack-0" [389d748d-f8ea-4d78-ab1f-3f5c7349527e]
    2018-04-02 13:45:22.806 7237 INFO heat.engine.resource [req-b075a642-555d-4b72-82e2-c9f804c861a5 2f88197fca8c4af78db7fadfaea55352 cd3df30e98d94cb084626c5ca5a9ab8a - - -] creating Port "mgmt" Stack "stack-0" [389d748d-f8ea-4d78-ab1f-3f5c7349527e]
    2018-04-02 13:45:22.808 7252 INFO heat.engine.resource [req-b075a642-555d-4b72-82e2-c9f804c861a5 2f88197fca8c4af78db7fadfaea55352 cd3df30e98d94cb084626c5ca5a9ab8a - - -] creating Port "port3" Stack "stack-0" [389d748d-f8ea-4d78-ab1f-3f5c7349527e]
    2018-04-02 13:45:41.588 7237 ERROR oslo.messaging._drivers.impl_rabbit [req-b075a642-555d-4b72-82e2-c9f804c861a5 2f88197fca8c4af78db7fadfaea55352 cd3df30e98d94cb084626c5ca5a9ab8a - - -] [7d301a4f-a5d5-4c18-98ba-601285f76342] AMQP server on 10.1.1.17:5672 is unreachable: timed out. Trying again in 1 seconds. Client port: None
    2018-04-02 13:45:47.599 7237 ERROR oslo.messaging._drivers.impl_rabbit [req-b075a642-555d-4b72-82e2-c9f804c861a5 2f88197fca8c4af78db7fadfaea55352 cd3df30e98d94cb084626c5ca5a9ab8a - - -] [7d301a4f-a5d5-4c18-98ba-601285f76342] AMQP server on 10.1.1.17:5672 is unreachable: timed out. Trying again in 1 seconds. Client port: None
    

Resolution

Root Cause

  • There is no NTP configured.

Diagnostic Steps

  • Logs from /var/log/heat/heat-engine.log

    overcloud-controller-0.localdomain/var/log/heat/heat-engine.log:10609:2018-04-05 23:32:16.721 630319 DEBUG oslo_messaging._drivers.amqpdriver [req-2a1971b2-032c-47fb-b6a1-c775e069e96d 2f88197fca8c4af78db7fadfaea55352 cd3df30e98d94cb084626c5ca5a9ab8a - - -] CAST unique_id: 8be124cbbf6b44588ce5b7ff2fa66543 NOTIFY exchange 'heat' topic 'notifications.error' _send /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py:432
    overcloud-controller-0.localdomain/var/log/heat/heat-engine.log:10610:2018-04-05 23:32:16.733 630319 INFO heat.engine.stack [req-2a1971b2-032c-47fb-b6a1-c775e069e96d 2f88197fca8c4af78db7fadfaea55352 cd3df30e98d94cb084626c5ca5a9ab8a - - -] Stack DELETE FAILED (stack-0): Timed out
    
    overcloud-controller-2.localdomain/var/log/heat/heat-engine.log:9501:2018-04-05 23:32:23.372 468593 INFO heat.engine.service [req-2a1971b2-032c-47fb-b6a1-c775e069e96d 2f88197fca8c4af78db7fadfaea55352 cd3df30e98d94cb084626c5ca5a9ab8a - - -] Deleting stack stack-0
    overcloud-controller-2.localdomain/var/log/heat/heat-engine.log:9502:2018-04-05 23:32:23.388 468593 DEBUG oslo_policy._cache_handler [req-2a1971b2-032c-47fb-b6a1-c775e069e96d 2f88197fca8c4af78db7fadfaea55352 cd3df30e98d94cb084626c5ca5a9ab8a - - -] Reloading cached file /etc/heat/policy.json read_cached_file /usr/lib/python2.7/site-packages/oslo_policy/_cache_handler.py:38
    overcloud-controller-2.localdomain/var/log/heat/heat-engine.log:9503:2018-04-05 23:32:23.420 468593 DEBUG oslo_policy.policy [req-2a1971b2-032c-47fb-b6a1-c775e069e96d 2f88197fca8c4af78db7fadfaea55352 cd3df30e98d94cb084626c5ca5a9ab8a - - -] Reloaded policy file: /etc/heat/policy.json _load_policy_file /usr/lib/python2.7/site-packages/oslo_policy/policy.py:584
    overcloud-controller-2.localdomain/var/log/heat/heat-engine.log:9504:2018-04-05 23:32:23.432 468593 INFO heat.engine.worker [req-2a1971b2-032c-47fb-b6a1-c775e069e96d 2f88197fca8c4af78db7fadfaea55352 cd3df30e98d94cb084626c5ca5a9ab8a - - -] [stack-0(884b140d-d098-4ddc-aafe-ea032a546d03)] Stopped all active workers for stack CREATE
    overcloud-controller-2.localdomain/var/log/heat/heat-engine.log:9505:2018-04-05 23:32:23.437 468593 DEBUG oslo_messaging._drivers.amqpdriver [req-2a1971b2-032c-47fb-b6a1-c775e069e96d 2f88197fca8c4af78db7fadfaea55352 cd3df30e98d94cb084626c5ca5a9ab8a - - -] sending reply msg_id: 4c4ec60a64614566b7ca7057e5ae28e3 reply queue: reply_fdd3e22d3de9442ca0aa9de7011dfdf6 time elapsed: 0.0752787880047s _send_reply /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py:73
    overcloud-controller-2.localdomain/var/log/heat/heat-engine.log:9506:2018-04-05 23:32:23.470 468593 DEBUG oslo_messaging._drivers.amqpdriver [req-2a1971b2-032c-47fb-b6a1-c775e069e96d 2f88197fca8c4af78db7fadfaea55352 cd3df30e98d94cb084626c5ca5a9ab8a - - -] CAST unique_id: d6f775448ed04c9b9b9d9330a598dade NOTIFY exchange 'heat' topic 'notifications.info' _send /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py:432
    overcloud-controller-2.localdomain/var/log/heat/heat-engine.log:9507:2018-04-05 23:32:23.484 468593 INFO heat.engine.stack [req-2a1971b2-032c-47fb-b6a1-c775e069e96d 2f88197fca8c4af78db7fadfaea55352 cd3df30e98d94cb084626c5ca5a9ab8a - - -] Stack DELETE IN_PROGRESS (stack-0): Stack DELETE started
    overcloud-controller-2.localdomain/var/log/heat/heat-engine.log:9508:2018-04-05 23:32:23.510 468593 INFO heat.engine.stack [req-2a1971b2-032c-47fb-b6a1-c775e069e96d 2f88197fca8c4af78db7fadfaea55352 cd3df30e98d94cb084626c5ca5a9ab8a - - -] convergence_dependencies: (Dependencies([((2321, False), (2318, False)), ((2324, False), (2318, False)), ((2327, False), (2318, False))]),)
    overcloud-controller-2.localdomain/var/log/heat/heat-engine.log:9509:2018-04-05 23:32:23.533 468593 INFO heat.engine.stack [req-2a1971b2-032c-47fb-b6a1-c775e069e96d 2f88197fca8c4af78db7fadfaea55352 cd3df30e98d94cb084626c5ca5a9ab8a - - -] Triggering resource 2318 for cleanup
    overcloud-controller-2.localdomain/var/log/heat/heat-engine.log:9510:2018-04-05 23:32:23.534 468593 DEBUG oslo_messaging._drivers.amqpdriver [req-2a1971b2-032c-47fb-b6a1-c775e069e96d 2f88197fca8c4af78db7fadfaea55352 cd3df30e98d94cb084626c5ca5a9ab8a - - -] CAST unique_id: be796ca1706f459585a214948cff3339 exchange 'heat' topic 'engine_worker' _send /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py:448
    
  • From the above logs the first request went to controller-2 where it started the deletion of stack stack-0. At this point the time-stamp at node is 2018-04-05 23:32:23

  • Then the request goes to controller-0 where the times-stamp was 2018-04-05 23:32:16, and the request timed out.
  • This seems to be caused because the servers are not in sync with each other.

    [root@overcloud-controller-0 ~]# ntpq -p 
    ntpq: read: Connection refused
    ntpq: read: Connection refused
    ntpq: read: Connection refused
    
    [root@overcloud-controller-0 ~]# ntpstat 
    Unable to talk to NTP daemon. Is it running?
    Unable to talk to NTP daemon. Is it running?
    Unable to talk to NTP daemon. Is it running?
    

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.

Comments