Spawned instances not obtaining DHCP address
We currently have a multi-node stack (controller, network, block storage, and 3 compute nodes).
Upon spawning an instance, it resizes to match the flavor, then attempts to obtain an IP address for eth0 prior to cloud-init. The process fails.
The network node (rdo3) /var/log/nova/scheduler.log is reporting...
2014-03-20 12:57:45.265 26619 ERROR nova.scheduler.filter_scheduler [req-d371cbd0-2a55-451a-ae5a-59ba30137ab8 a85956a2380648259bfc13293376847c 5942763cd7e34ee0bfd792300fbdc019] [instance: e9c85cad-7a60-4a31-a2f2-5747f302d24a] Error from last host: rdo4.ciso.leidos.com (node rdo4.ciso.leidos.com): [u'Traceback (most recent call last):\n', u' File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1043, in build_instance\n set_access_ip=set_access_ip)\n', u' File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1426, in _spawn\n LOG.exception((\'Instance failed to spawn\'), instance=instance)\n', u' File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 1423, in _spawn\n block_device_info)\n', u' File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/driver.py", line 2091, in spawn\n block_device_info, context=context)\n', u' File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/driver.py", line 3249, in _create_domain_and_network\n domain = self._create_domain(xml, instance=instance, power_on=power_on)\n', u' File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/driver.py", line 3192, in _create_domain\n domain.XMLDesc(0))\n', u' File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/driver.py", line 3187, in _create_domain\n domain.createWithFlags(launch_flags)\n', u' File "/usr/lib/python2.6/site-packages/eventlet/tpool.py", line 179, in doit\n result = proxy_call(self._autowrap, f, *args, kwargs)\n', u' File "/usr/lib/python2.6/site-packages/eventlet/tpool.py", line 139, in proxy_call\n rv = execute(f,*args,kwargs)\n', u' File "/usr/lib/python2.6/site-packages/eventlet/tpool.py", line 77, in tworker\n rv = meth(*args,**kwargs)\n', u' File "/usr/lib64/python2.6/site-packages/libvirt.py", line 708, in createWithFlags\n if ret == -1: raise libvirtError (\'virDomainCreateWithFlags() failed\', dom=self)\n', u"libvirtError: internal error Process exited while reading console log output: char device redirected to /dev/pts/1\nqemu-kvm: -netdev tap,ifname=tap8ebf8a93-46,script=,id=hostnet0: Device 'tap' could not be initialized\n\n"]
The compute nodes (rdo1/3/4) are reporting the following in /var/log/neutron/openvswitch-agent.log...
2014-03-20 14:41:06.289 3695 ERROR neutron.agent.linux.ovsdb_monitor [-] Error received from ovsdb monitor: ovsdb-client: unix:/var/run/openvswitch/db.sock: receive failed (End of file)
Thank you.
Responses
Hi Patrick,
I suspect the issue is OpenStack-related, and I see KVM in the output; is it virtualized? (And sorry, I don't know much about OpenStack.)
Hopefully someone with OpenStack experience will step in, but here are a few initial thoughts.
I ran some of the output you posted through Google. What version of OpenStack do you have?
Just for fun, check this out:
- the Red Hat documentation on configuring the DHCP agent (with OpenStack)
- perhaps this: http://openvswitch.org/pipermail/discuss/2012-March/006638.html and look at the part about "... executable flags on this /etc/ovs-ifup file" (see the sketch after this list)
- or this: hopefully it helps
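
If the ovs-ifup angle from that mailing-list thread applies here, a quick check might look like this (a sketch; /etc/ovs-ifup is the path from that thread, not something from your logs):

# check whether the script exists and is executable
ls -l /etc/ovs-ifup
# if it exists but is not executable, qemu-kvm cannot run it when wiring up the tap device
chmod +x /etc/ovs-ifup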
Lastly, find the PID that is going out of control (if it is just one PID) and run 'strace -p xxxx', where 'xxxx' is the PID number.
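
For example, if the ovsdb-client from your openvswitch-agent.log turns out to be the runaway process (just an assumption; substitute whatever process is actually misbehaving):

# find the PID of the suspect process
pgrep -f ovsdb-client
# attach strace to it, replacing 12345 with the PID found above
strace -p 12345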
Hi Patrick, two things here:
* You may want to ask in the RDO forum, as that is the product you are using, and there are some really talented people with more hands-on RDO expertise who keep an eye on that forum.
* Second, looking at the error you are getting, with a little research I found the same error at answers.launchpad.net. Can you verify that openvswitch is running on the network node and your compute nodes? (See the commands below.)
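
A quick way to verify, on the network node and on each compute node (a sketch; the service name openvswitch matches RHEL 6-era RDO, adjust if yours differs):

# is the openvswitch service up?
service openvswitch status
# does ovsdb answer, and do the expected bridges (e.g. br-int) exist?
ovs-vsctl show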
Dear Patrick,
If I am understanding this correctly, it looks like your instance is failing to get an IP address because your DHCP agent's alive status is xxx.
If you check using the command "neutron agent-list", you can see the alive status of your agents.
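
For example (the alive markers are what neutron prints; ':-)' means the agent is reporting in, 'xxx' means it is not):

# run on the controller with admin credentials sourced
neutron agent-list
# check the "alive" column for each agent, e.g. the DHCP agent and the
# Open vSwitch agents on rdo1/3/4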
This is all down to the RPC communication between the agents and the neutron server.
So you will have to change the setting below, which is suggested by an article provided by Red Hat. To make this change you will have to change parameters in the neutron.conf file and restart your neutron server and all agents:
https://access.redhat.com/site/solutions/698373
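
I can't quote the article's exact values here, but the usual knobs for agent liveness over RPC are report_interval and agent_down_time; treat the snippet below as a sketch and take the exact parameters and values from the linked solution:

# /etc/neutron/neutron.conf (illustrative values, not from the article)
[DEFAULT]
# how long the server waits before marking an agent dead
agent_down_time = 75
[AGENT]
# how often each agent reports state to the server; keep it well under agent_down_time
report_interval = 30

# then restart the server and the agents, for example:
service neutron-server restart
service neutron-openvswitch-agent restart
service neutron-dhcp-agent restart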
I hope it helps.
