When rebooting one of our networkers, the second begins logging messages of the form "haproxy-metadata-proxy-75b51f21-bcc6-46e0-8576-6cfcf82dd7c6[52355]: Proxy listener reached system FD limit at 11. Please check system tunables." and becomes unusable
Issue
- One of our OSP 13 network nodes became unresponsive this morning. On the console, we saw messages of the form:
audit: backlog limit exceeded
and AVC messages are piling up quickly in /var/log/audit/audit.log*:
[root@overcloud-controller-0 audit]# ls -tlr
total 38452
-r--------. 1 root root 8388727 Mar 23 10:21 audit.log.4
-r--------. 1 root root 8388685 Mar 23 11:44 audit.log.3
-r--------. 1 root root 8388749 Mar 23 13:26 audit.log.2
-r--------. 1 root root 8388872 Mar 23 14:16 audit.log.1
-rw-------. 1 root root 5802944 Mar 23 14:45 audit.log
[root@overcloud-controller-0 audit]# wc -l *
26883 audit.log
28869 audit.log.1
35549 audit.log.2
37034 audit.log.3
39155 audit.log.4
167490 total
- We were not able to log into the server. After a reboot it came up fine, but while it was rebooting the second networker began to fail with messages of the form:
haproxy-metadata-proxy-75b51f21-bcc6-46e0-8576-6cfcf82dd7c6[52355]: Proxy listener reached system FD limit at 11. Please check system tunables
- At this point, it is no longer possible to connect to the host via ssh, and the volume of logging on the console makes the console unusable. This is a critical problem: either network node should be able to operate when the other node is offline.
- Note that due to another issue, we are running a custom kernel and have perf tracing enabled.
- Errors similar to this are seen in /var/log/containers/neutron/dhcp-agent.log:
2020-03-23 12:01:44.621 28367 ERROR neutron.agent.dhcp.agent [-] Unable to reload_allocations dhcp for b7607204-3717-4293-9565-e9edfa29a01b.: OSError: [Errno 23] Too many open files in system: '/var/lib/neutron/dhcp/b7607204-3717-4293-9565-e9edfa29a01b/tmpRsbj4W'
2020-03-23 12:01:44.621 28367 ERROR neutron.agent.dhcp.agent Traceback (most recent call last):
2020-03-23 12:01:44.621 28367 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/dhcp/agent.py", line 144, in call_driver
2020-03-23 12:01:44.621 28367 ERROR neutron.agent.dhcp.agent getattr(driver, action)(**action_kwargs)
2020-03-23 12:01:44.621 28367 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py", line 530, in reload_allocations
2020-03-23 12:01:44.621 28367 ERROR neutron.agent.dhcp.agent self._spawn_or_reload_process(reload_with_HUP=True)
2020-03-23 12:01:44.621 28367 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py", line 467, in _spawn_or_reload_process
2020-03-23 12:01:44.621 28367 ERROR neutron.agent.dhcp.agent self._output_config_files()
2020-03-23 12:01:44.621 28367 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py", line 511, in _output_config_files
2020-03-23 12:01:44.621 28367 ERROR neutron.agent.dhcp.agent self._output_opts_file()
2020-03-23 12:01:44.621 28367 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py", line 946, in _output_opts_file
2020-03-23 12:01:44.621 28367 ERROR neutron.agent.dhcp.agent file_utils.replace_file(name, '\n'.join(options))
2020-03-23 12:01:44.621 28367 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron_lib/utils/file.py", line 60, in replace_file
2020-03-23 12:01:44.621 28367 ERROR neutron.agent.dhcp.agent File "/usr/lib64/python2.7/tempfile.py", line 458, in NamedTemporaryFile
2020-03-23 12:01:44.621 28367 ERROR neutron.agent.dhcp.agent File "/usr/lib64/python2.7/tempfile.py", line 239, in _mkstemp_inner
2020-03-23 12:01:44.621 28367 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/eventlet/green/os.py", line 109, in open
2020-03-23 12:01:44.621 28367 ERROR neutron.agent.dhcp.agent OSError: [Errno 23] Too many open files in system: '/var/lib/neutron/dhcp/b7607204-3717-4293-9565-e9edfa29a01b/tmpRsbj4W'
2020-03-23 12:01:44.621 28367 ERROR neutron.agent.dhcp.agent
- Errors similar to this are seen in /var/log/containers/neutron/metadata-agent.log:
2020-03-23 07:08:16.753 19504 ERROR neutron.agent.metadata.agent [-] Unexpected error.: ConnectionError: HTTPConnectionPool(host='overcloud.internalapi.localdomain', port=8775): Max retries exceeded with url: /2009-04-04/meta-data/local-ipv4 (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f54014b0490>: Failed to establish a new connection: [Errno 23] Too many open files in system',))
2020-03-23 07:08:16.753 19504 ERROR neutron.agent.metadata.agent Traceback (most recent call last):
2020-03-23 07:08:16.753 19504 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/site-packages/neutron/agent/metadata/agent.py", line 91, in __call__
2020-03-23 07:08:16.753 19504 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/site-packages/neutron/agent/metadata/agent.py", line 207, in _proxy_request
2020-03-23 07:08:16.753 19504 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/site-packages/requests/api.py", line 58, in request
2020-03-23 07:08:16.753 19504 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 518, in request
2020-03-23 07:08:16.753 19504 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 639, in send
2020-03-23 07:08:16.753 19504 ERROR neutron.agent.metadata.agent File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 502, in send
2020-03-23 07:08:16.753 19504 ERROR neutron.agent.metadata.agent ConnectionError: HTTPConnectionPool(host='overcloud.internalapi.localdomain', port=8775): Max retries exceeded with url: /2009-04-04/meta-data/local-ipv4 (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f54014b0490>: Failed to establish a new connection: [Errno 23] Too many open files in system',))
2020-03-23 07:08:16.753 19504 ERROR neutron.agent.metadata.agent
- Errors similar to this are seen in /var/log/containers/neutron/l3-agent.log:
2020-03-23 12:03:01.377 28366 ERROR neutron.agent.linux.utils [-] Rootwrap error running command: ['ip', 'netns', 'exec', 'qrouter-5f85a98f-bf26-4b39-8e3d-d4b26abad392', 'keepalived', '-P', '-f', '/var/lib/neutron/ha_confs/5f85a98f-bf26-4b39-8e3d-d4b26abad392/keepalived.conf', '-p', '/var/lib/neutron/ha_confs/5f85a98f-bf26-4b39-8e3d-d4b26abad392.pid', '-r', '/var/lib/neutron/ha_confs/5f85a98f-bf26-4b39-8e3d-d4b26abad392.pid-vrrp']: OSError: [Errno 23] Too many open files in system
2020-03-23 12:04:01.377 28366 ERROR neutron.agent.linux.external_process [-] keepalived for router with uuid 5f85a98f-bf26-4b39-8e3d-d4b26abad392 not found. The process should not have died
2020-03-23 12:04:01.378 28366 WARNING neutron.agent.linux.external_process [-] Respawning keepalived for uuid 5f85a98f-bf26-4b39-8e3d-d4b26abad392
- Errors similar to this are seen in /var/log/containers/neutron/openvswitch-agent.log:
2020-03-23 10:23:57.402 28499 ERROR oslo.messaging._drivers.impl_rabbit [-] [38bfac9a-5756-4b39-aaf8-d8dfad1360e3] AMQP server on rabbitmq.internalapi.localdomain:5672 is unreachable: [Errno 23] Too many open files in system. Trying again in 1 seconds.: error: [Errno 23] Too many open files in system
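All of the agent errors above report "[Errno 23] Too many open files in system". On Linux this is ENFILE, which refers to the system-wide file handle limit rather than a per-process limit. The following is a minimal sketch, not part of the original report, of how the system-wide usage could be inspected on an affected node. It assumes a Linux host exposing /proc/sys/fs/file-nr, whose three fields are the allocated handles, the allocated-but-unused handles, and the maximum. The names `FileNr`, `parse_file_nr`, `usage_ratio`, and `read_file_nr` are hypothetical helpers introduced for illustration.

```python
from collections import namedtuple

# Hypothetical container for the three fields of /proc/sys/fs/file-nr.
FileNr = namedtuple("FileNr", ["allocated", "unused", "maximum"])


def parse_file_nr(text):
    """Parse the three whitespace-separated integers from file-nr content."""
    allocated, unused, maximum = (int(field) for field in text.split())
    return FileNr(allocated, unused, maximum)


def usage_ratio(file_nr):
    """Return the fraction of the system-wide handle limit that is allocated."""
    return file_nr.allocated / float(file_nr.maximum)


def read_file_nr(path="/proc/sys/fs/file-nr"):
    """Read and parse file-nr from the running system (Linux only)."""
    with open(path) as handle:
        return parse_file_nr(handle.read())
```

Calling `usage_ratio(read_file_nr())` on a node would give a value close to 1.0 if the system-wide limit is nearly exhausted, which is the condition ENFILE reports. The sketch only observes the counters; it does not identify which process is holding the handles.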
Environment
- Red Hat OpenStack Platform 13.0 (RHOSP)