postmaster dead but pid file exists
Environment
- Red Hat Satellite or Proxy 5.6
Issue
Our Satellite server appears to have a problem with the PostgreSQL service: it is stopped and cannot be restarted.
[root@example ~]# rhn-satellite status
postmaster dead but pid file exists
router (pid 2106) is running...
sm (pid 2123) is running...
c2s (pid 2131) is running...
s2s (pid 2139) is running...
tomcat6 (pid 2043) is running... [ OK ]
httpd (pid 2093) is running...
osa-dispatcher (pid 2160) is running...
rhn-search is running (2193).
cobblerd (pid 2260) is running...
RHN Taskomatic is running (2292).
[root@example tmp]# rhn-satellite stop
Shutting down spacewalk services...
Stopping RHN Taskomatic...
RHN Taskomatic was not running.
Stopping cobbler daemon: [ OK ]
Stopping rhn-search...
Stopped rhn-search.
Stopping MonitoringScout ...
[ OK ]
Stopping Monitoring ...
[ OK ]
Shutting down osa-dispatcher: [ OK ]
Stopping httpd: [ OK ]
Stopping tomcat6: [ OK ]
Terminating jabberd processes ...
Stopping s2s: [ OK ]
Stopping c2s: [ OK ]
Stopping sm: [ OK ]
Stopping router: [ OK ]
Stopping postgresql service: [FAILED]
Done.
[root@example tmp]# rhn-satellite start
Starting spacewalk services...
Starting postgresql service: [FAILED]
Initializing jabberd processes ...
Starting router: [ OK ]
Starting sm: [ OK ]
Starting c2s: [ OK ]
Starting s2s: [ OK ]
Starting tomcat6: [ OK ]
Waiting for tomcat to be ready ...
Starting httpd: [ OK ]
Starting osa-dispatcher: Spacewalk 4287 2014/06/04 10:31:53 +11:00: ('Error caught:',)
Spacewalk 4287 2014/06/04 10:31:53 +11:00: ('Traceback (most recent call last):\n File "/usr/share/rhn/osad/jabber_lib.py", line 117, in main\n self.setup_config(config)\n File "/usr/share/rhn/osad/osa_dispatcher.py", line 112, in setup_config\n rhnSQL.initDB()\n File "/usr/lib/python2.6/site-packages/spacewalk/server/rhnSQL/__init__.py", line 102, in initDB\n __init__DB(backend, host, port, username, password, database)\n File "/usr/lib/python2.6/site-packages/spacewalk/server/rhnSQL/__init__.py", line 55, in __init__DB\n __DB.connect()\n File "/usr/lib/python2.6/site-packages/spacewalk/server/rhnSQL/driver_postgresql.py", line 174, in connect\n return self.connect(reconnect=0)\n File "/usr/lib/python2.6/site-packages/spacewalk/server/rhnSQL/driver_postgresql.py", line 163, in connect\n password=str(self.password))\nSQLConnectError: (None, None, \'rhnschema\', \'Attempting Re-Connect to the database failed\')\n',)
[ OK ]
Starting Monitoring ...
[ OK ]
Starting MonitoringScout ...
[ OK ]
Starting rhn-search...
Starting cobbler daemon: [ OK ]
Starting RHN Taskomatic...
Done.
Resolution
- First, check the PostgreSQL startup log in /var/lib/pgsql/pgstartup.log:
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale en_US.UTF-8.
The default database encoding has accordingly been set to UTF8.
The default text search configuration will be set to "english".
fixing permissions on existing directory /var/lib/pgsql/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 32MB
creating configuration files ... ok
creating template1 database in /var/lib/pgsql/data/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... ok
creating system views ... ok
loading system objects' descriptions ... ok
creating conversions ... ok
creating dictionaries ... ok
setting privileges on built-in objects ... ok
creating information schema ... ok
vacuuming database template1 ... ok
copying template1 to template0 ... ok
copying template1 to postgres ... ok
Success. You can now start the database server using:
/usr/bin/postgres -D /var/lib/pgsql/data
or
/usr/bin/pg_ctl -D /var/lib/pgsql/data -l logfile start
2014-05-12 01:28:35.579 GMT FATAL: lock file "postmaster.pid" already exists
2014-05-12 01:28:35.579 GMT HINT: Is another postmaster (PID 5291) running in data directory "/var/lib/pgsql/data"?
2014-05-13 01:51:10.490 GMT FATAL: lock file "postmaster.pid" already exists
2014-05-13 01:51:10.490 GMT HINT: Is another postmaster (PID 1890) running in data directory "/var/lib/pgsql/data"?
2014-06-04 09:39:04.101 EST FATAL: could not remove old lock file "/tmp/.s.PGSQL.5432.lock": Permission denied
2014-06-04 09:39:04.101 EST HINT: The file seems accidentally left over, but it could not be removed. Please remove the file by hand and try again.
2014-06-04 09:46:42.492 EST FATAL: could not remove old lock file "/tmp/.s.PGSQL.5432.lock": Permission denied
2014-06-04 09:46:42.492 EST HINT: The file seems accidentally left over, but it could not be removed. Please remove the file by hand and try again.
2014-06-04 09:50:03.330 EST FATAL: could not create lock file "/tmp/.s.PGSQL.5432.lock": Permission denied
2014-06-04 10:05:57.877 EST FATAL: could not create lock file "/tmp/.s.PGSQL.5432.lock": Permission denied
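Most of a startup log like this one is routine initdb output, so it can help to filter it down to the error lines. The following is a minimal sketch, assuming the default log path named in this article; show_pg_errors is a hypothetical helper name, not a Satellite or PostgreSQL command.

```shell
# Print only the FATAL and HINT lines from a PostgreSQL startup log.
# show_pg_errors is a hypothetical helper name used for illustration.
show_pg_errors() {
    logfile="${1:-/var/lib/pgsql/pgstartup.log}"
    if [ ! -r "$logfile" ]; then
        echo "cannot read $logfile" >&2
        return 1
    fi
    # Keep only the most recent 20 matches so repeated failures stay readable.
    grep -E 'FATAL:|HINT:' "$logfile" | tail -n 20
}

# Example (run as root on the Satellite server):
# show_pg_errors /var/lib/pgsql/pgstartup.log
```

The most recent entries are usually the ones that matter: older FATAL lines may come from earlier, unrelated failures.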
- The log shows PostgreSQL failing with "Permission denied" when it tries to remove or create its lock file in /tmp.
- Check the permissions of the /tmp directory.
Example:
drxrxrt. 28 root root 20480 Jun 4 11:21 tmp <--incorrect permissions
drwxrwxrwt. 28 root root 20480 Jun 4 11:21 tmp <--correct permissions
- Make sure the permissions on /tmp are correct and fix them if necessary, then try to restart the Satellite services with the following command:
rhn-satellite start
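The permission check and fix can be sketched as follows. This is a minimal example, assuming GNU stat (as shipped on Red Hat Enterprise Linux) and that the correct mode is 1777, which ls displays as drwxrwxrwt; check_tmp_mode is a hypothetical helper name.

```shell
# Report whether a directory has the mode expected for /tmp (1777:
# world-writable with the sticky bit, shown by ls as drwxrwxrwt).
# check_tmp_mode is a hypothetical helper name used for illustration.
check_tmp_mode() {
    dir="${1:-/tmp}"
    # stat -c '%a' prints the octal mode (GNU coreutils).
    mode=$(stat -c '%a' "$dir") || return 2
    if [ "$mode" = "1777" ]; then
        echo "$dir: mode $mode is correct"
        return 0
    fi
    echo "$dir: mode $mode, expected 1777"
    return 1
}

# Example (run as root):
# check_tmp_mode /tmp || chmod 1777 /tmp
# rhn-satellite start
```

The helper only reports the mode. The chmod is left as an explicit step so that nothing on the system changes until the current state has been reviewed.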
- If the issue persists after this change and the message "Attempting Re-Connect to the database failed" still appears, please contact Red Hat Technical Support for further assistance.
Root Cause
- The /tmp directory has incorrect permissions.
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.