RHEL 6.2: gnome-terminal crashes when communication with GConf fails


Environment

  • Red Hat Enterprise Linux 6.2 (x86_64)
  • kernel 2.6.32-220.4.2.el6.x86_64
  • gnome-terminal-2.31.3-6.el6.x86_64

Issue

  • gnome-terminal crashes when communication with GConf fails, reporting one of the following errors. Either:
Failed to summon the GConf demon; exiting.  Failed to contact configuration server; some possible causes are that you need to enable TCP/IP networking for ORBit, or you have stale NFS locks due to a system crash. See http://projects.gnome.org/gconf/ for information.
(Details -  1: Could not send message to GConf daemon: Message did not receive a reply (timeout by message bus))

or:

Failed to summon the GConf demon; exiting.  Failed to contact configuration server; some possible causes are that you need to enable TCP/IP networking for ORBit, or you have stale NFS locks due to a system crash. See http://projects.gnome.org/gconf/ for information.
(Details -  1: Ping to server error IDL:omg.org/CORBA/COMM_FAILURE:1.0

Resolution

  • Install the updated ORBit2 package from this erratum: https://access.redhat.com/errata/RHBA-2012:1457
  • If the problem persists after installing the erratum and you start gnome-terminal with the --display=DISPLAY option, try setting the display through the environment instead: DISPLAY=:0 /usr/bin/gnome-terminal
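
The two resolution steps might look like this from a shell. This is only a sketch: the function names update_orbit2 and launch_terminal_on and the TERMINAL_BIN override are hypothetical, introduced here for illustration. It assumes the system can reach a repository that carries the erratum, and it uses :0 as the default display only because that is the example given above.

```shell
#!/bin/sh
# Hypothetical helper: show the installed ORBit2 build, then update it.
# Assumes root privileges and a repository that carries RHBA-2012:1457.
update_orbit2() {
    rpm -q ORBit2
    yum -y update ORBit2
}

# Hypothetical helper: start gnome-terminal with DISPLAY set in the
# environment instead of passing --display=DISPLAY on the command line.
# $1 is the display (defaults to :0); any further arguments are passed through.
# TERMINAL_BIN is an assumed override that allows a dry run with another binary.
launch_terminal_on() {
    display="${1:-:0}"
    [ "$#" -gt 0 ] && shift
    DISPLAY="$display" "${TERMINAL_BIN:-/usr/bin/gnome-terminal}" "$@"
}
```

For example, run update_orbit2 as root, then launch_terminal_on :0 from the affected user's session.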

Diagnostic Steps

How to reproduce the symptoms:

  1. Open four gnome-terminal windows on the X console.

  2. Run the following commands in each window, using a different number in place of [1-n] for each window:

# export DISPLAY=unix:0
# while true; do strace -ttTfo test[1-n].trc gnome-terminal --disable-factory -x ./ls.sh > test[1-n].log 2>&1 ; [ $? -ne 0 ] && export LANG=C && sosreport --report --batch && break; done

The error with message #1 (timeout) occurs after about 30 minutes.
The error with message #2 (ping failure) occurs after about 4 hours.

Although the customer raised somaxconn to its maximum value (2147483647),
the symptom occurred again.
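
The reproduction loop above can be written out in a more readable form. The following is a sketch for a single window, assuming strace, gnome-terminal, sosreport and ./ls.sh are available as in the original command. The function name repro_loop is hypothetical, and its argument is the window number that replaces [1-n].

```shell
#!/bin/sh
# Hypothetical helper: run the reproduction loop for one terminal window.
# $1 is the window number, used to name the trace and log files.
repro_loop() {
    n="$1"
    export DISPLAY=unix:0
    while true; do
        # Trace gnome-terminal and its children with timestamps (-tt) and
        # per-syscall durations (-T), writing the trace to test<n>.trc.
        strace -ttTfo "test${n}.trc" gnome-terminal --disable-factory \
            -x ./ls.sh > "test${n}.log" 2>&1
        status=$?
        if [ "$status" -ne 0 ]; then
            # On failure, collect a sosreport and stop once it succeeds.
            export LANG=C
            sosreport --report --batch && break
        fi
    done
}
```

For example, repro_loop 2 in the second window writes test2.trc and test2.log. As in the original one-liner, each iteration overwrites the previous trace, so the files left behind belong to the failing run.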

# tar tf trace.tar
ping_case/
ping_case/sosreport-sth09u03-20120709212911-1ee3.tar.xz
ping_case/strace_error.tgz
timeout_case/
timeout_case/strace_error.tgz
timeout_case/sosreport-sth09u03-20120709225350-1bb6.tar.xz

Problem Analysis:
For error message #1 (timeout case):

When the symptom occurred, read() failed with an EAGAIN error.

// from test2.trc

9062    22:52:07.668982 poll([{fd=6, events=POLLIN}], 1, 25000) = 1 ([{fd=6, revents=POLLIN}]) <0.048326>
9062    22:52:07.717361 read(6, "l\3\1\1=\0\0\0\3\0\0\0q\0\0\0\5\1u\0\2\0\0\0\4\1s\0\"\0\0\0"..., 2048) = 197 <0.000007>
9062    22:52:07.717417 read(6, 0x1951c60, 2048) = -1 EAGAIN (Resource temporarily unavailable) <0.000010>

For error message #2 (ping failure case):

When the symptom occurred, connect() failed with an EAGAIN error.

// from strace.30.log

18077 21:27:27.981668 connect(11, {sa_family=AF_FILE, path="/tmp/orbit-root/linc-28c3-0-29706e2876b6e"}, 44) = -1 EAGAIN (Resource temporarily unavailable) <0.000011>
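
To find failures like the two shown above in a set of trace files, the traces can be filtered for read() and connect() calls that returned EAGAIN. The following is a minimal sketch using grep. The function name find_eagain is hypothetical, and the pattern assumes each system call sits on a single line of the strace output file, so calls split by strace into "unfinished" and "resumed" parts are not matched.

```shell
#!/bin/sh
# Hypothetical helper: list read() and connect() calls that failed with
# EAGAIN in the given strace output files. Each hit is prefixed with the
# file name and line number.
find_eagain() {
    grep -nHE '(read|connect)\(.*= -1 EAGAIN' "$@"
}
```

For example, find_eagain test*.trc scans all traces written by the reproduction loop. An EAGAIN from read() on a non-blocking descriptor can also occur during normal operation, so the hits are a starting point for inspection rather than proof of the fault.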

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.