Lost network port access when nova-compute restarted, along with DB errors in log

Solution In Progress - Updated -

Issue

  • We have been trying to isolate a recurring problem and believe we are closer to identifying it. We were hoping you could review our logs to see whether anything stands out. We upgraded rabbitmq on our controller cluster today, and that went very well. However, when we restarted openstack-nova-compute, one of the virtual machines lost network connectivity and required an admin disable/enable of its network port. This keeps happening: during rolling updates of compute nodes (with instances properly live-migrated) and similar operations, a VM loses network connectivity until its network port is set admin down and then up again.

  • Can you look specifically at the nova-compute log around 2020-04-15 16:55? That is where you'll see the VIFs come up, and eed87775-9db6-4434-9132-5231ea89e5d6 is the one that went down.

  • We also noticed a bunch of errors about inconsistencies in the DB related to:

2020-04-15 16:55:39.109 7646 INFO nova.compute.resource_tracker [req-0f65dfe5-2bb8-4452-aa00-d664c3d33709 - - - - -] Instance a1fbad6a-8f54-4364-807a-08c3568a84e1 has allocations against this compute host but is not found in the database.
2020-04-15 16:55:39.135 7646 INFO nova.compute.resource_tracker [req-0f65dfe5-2bb8-4452-aa00-d664c3d33709 - - - - -] Instance 20d2a3c9-163f-415b-b0d7-f1dbf3ee99b4 has allocations against this compute host but is not found in the database.
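
To see which instances are affected, the resource tracker messages above can be pulled out of a nova-compute log with a short script. The following is a minimal sketch, not a supported tool: the function name find_orphaned_allocations and the regular expression are illustrative assumptions based only on the two log lines quoted above, and the pattern may need adjusting for other log formats.

```python
import re

# Assumed pattern, derived from the two resource tracker lines quoted above.
ORPHAN_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) .*"
    r"Instance (?P<uuid>[0-9a-f-]{36}) has allocations against this "
    r"compute host but is not found in the database\."
)


def find_orphaned_allocations(lines):
    """Return (timestamp, instance_uuid) pairs, one per matching log line."""
    hits = []
    for line in lines:
        match = ORPHAN_RE.search(line)
        if match:
            hits.append((match.group("ts"), match.group("uuid")))
    return hits


# The two lines reported in this issue, used here as sample input.
sample = [
    "2020-04-15 16:55:39.109 7646 INFO nova.compute.resource_tracker "
    "[req-0f65dfe5-2bb8-4452-aa00-d664c3d33709 - - - - -] Instance "
    "a1fbad6a-8f54-4364-807a-08c3568a84e1 has allocations against this "
    "compute host but is not found in the database.",
    "2020-04-15 16:55:39.135 7646 INFO nova.compute.resource_tracker "
    "[req-0f65dfe5-2bb8-4452-aa00-d664c3d33709 - - - - -] Instance "
    "20d2a3c9-163f-415b-b0d7-f1dbf3ee99b4 has allocations against this "
    "compute host but is not found in the database.",
]

for timestamp, instance_uuid in find_orphaned_allocations(sample):
    print(timestamp, instance_uuid)
```

To run it against a real log, pass an open file object instead of the sample list; the log file location depends on the deployment and is not stated in this issue.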

Environment

  • Red Hat OpenStack Platform 13.0 (RHOSP)
