Sat 6.2.4 - Last Checkin date/time not updating
Systems were registered using activation keys, and the katello-agent is installed and running on the systems. However, the last checkin day/time has not been updated since the systems were registered. Should they not be checking in every 30 minutes or so?
Responses
Gary,
Apologies for the late reply. Is this problem still occurring?
If so, when you say "last checkin day/time", are you referring to the "Last Report" column for all hosts when you open the Satellite web UI and navigate to Hosts > All hosts?
The frequency with which a host reports to either a Satellite Server or Capsule Server is determined by the host's Puppet agent's configuration. The default frequency is between 30 minutes and 60 minutes.
Gary,
I'll check with an engineer, but it's my understanding that the "Last Checkin" reflects the most recent date and time on which the host's Puppet agent "checked in" with the Puppet Master. Since you're not installing the Puppet agent on hosts, that would explain the last check-in information not being updated.
Russell--
Is that new behavior with 6.2? In the past (at least in my experience up to Satellite 6.1.x), "Last Checkin" referred to the Katello agent (goferd) checking in, i.e. the "Pulp" side of Satellite 6's split personality, not the Puppet/config management side. So checking goferd status and connectivity on the client would be my first guess.
Gary,
Thanks for clarifying which field you were looking at. I realise, after reading your comment above, that I was looking at the wrong place in the Satellite web UI. I was looking at Hosts > All hosts, but you were looking at Hosts > Content Hosts, which is a different view. Apologies for confusing the discussion.
I will check and confirm just where the "Last Checkin" date comes from.
Gary,
Can you please check something on 1 or 2 of those hosts for which the "Last Checkin" date and time is not being updated? Check whether the Red Hat Subscription Manager daemon, rhsmcertd, is running. I was just reading about a customer case where hosts were migrated from Satellite 5 to Satellite 6. The customer had disabled the rhsmcertd daemon when the hosts were under Satellite 5 management, and they were successfully migrated to Satellite 6 management. However, without rhsmcertd running, they were not reporting their status to the Satellite 6 server, and so their "Last Checkin" date and time was not being updated.
The rhsmcertd daemon's log file is located at /var/log/rhsm/rhsmcertd.log. Before enabling the daemon, it would be interesting to check the log file and see if the time stamps recorded there match with the time stamps noted in the Satellite web UI.
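As a rough sketch of that log check, the helper below prints the most recent "Certificates updated" entry from an rhsmcertd log so it can be compared with the time shown in the web UI. The function name last_cert_update is hypothetical, and the match string assumes the log wording quoted elsewhere in this thread.

```shell
#!/usr/bin/env bash
# last_cert_update [LOGFILE]
# Prints the newest "Certificates updated" line from an rhsmcertd log.
# Returns non-zero if the log contains no such line (or cannot be read).
# The function name is hypothetical; the default path is the one given above.
last_cert_update() {
    local log="${1:-/var/log/rhsm/rhsmcertd.log}"
    # grep -F matches the literal text; "grep ." fails when nothing was found
    grep -F 'Certificates updated' "$log" | tail -n 1 | grep .
}
```

On a host, running "service rhsmcertd status" first shows whether the daemon is running at all; calling last_cert_update with no argument then reads the default log path.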
Unless there is something strange going on here, Puppet should not be related to rhsm (the Red Hat Subscription Manager system). I'm quite sure Last Checkin has to do with the Katello/Candlepin part of Satellite that you find under Hosts > Content Hosts, and not with the Foreman/Puppet part that is found under Hosts > All Hosts.
So:
Puppet -> Last report under Hosts > All Hosts.
rhsm -> Last Checkin under Hosts > Content Hosts
It might be useful to see what it looks like on a working system. Checking on our test system, I see that a host has a last checkin of 14:10:06 UTC.
On the host I see this in the logs (note that this is the +0100 timezone):
/var/log/rhsm/rhsm.log
2017-01-13 15:10:06,353 [INFO] rhsmcertd-worker:3171:MainThread @connection.py:830 - Connection built: host=satellite01.example.com port=443 handler=/rhsm auth=identity_cert ca_dir=/etc/rhsm/ca/ verify=False
2017-01-13 15:10:06,604 [INFO] rhsmcertd-worker:3171:MainThread @entcertlib.py:131 - certs updated:
Total updates: 0
...etc. Continues with the certificate numbers and repo list etc.
/var/log/rhsm/rhsmcertd.log
Fri Jan 13 07:10:07 2017 [INFO] (Cert Check) Certificates updated.
Fri Jan 13 11:10:04 2017 [INFO] (Auto-attach) Certificates updated.
Fri Jan 13 11:10:07 2017 [INFO] (Cert Check) Certificates updated.
Fri Jan 13 15:10:07 2017 [INFO] (Cert Check) Certificates updated
...so every 4 hours.
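To read the interval off a longer log without doing the arithmetic by eye, a small sketch like the following prints the gap in minutes between consecutive "(Cert Check)" entries. The function name cert_check_gaps is hypothetical. It assumes the line layout shown in the excerpt above and that all entries fall within the same month, so day-of-month plus time of day is enough to compare them.

```shell
#!/usr/bin/env bash
# cert_check_gaps LOGFILE
# Prints the gap, in minutes, between consecutive "(Cert Check)" entries.
# Assumes lines look like:
#   Fri Jan 13 07:10:07 2017 [INFO] (Cert Check) Certificates updated.
# and that all entries are from the same month (hypothetical helper).
cert_check_gaps() {
    awk '/\(Cert Check\)/ {
        split($4, t, ":")                       # $4 is HH:MM:SS
        now = $3 * 86400 + t[1] * 3600 + t[2] * 60 + t[3]
        if (seen) printf "%d\n", (now - prev) / 60
        prev = now; seen = 1
    }' "$1"
}
```

Run against the four lines quoted above, it skips the Auto-attach entry and prints 240 twice, which matches the 4-hour interval.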
On the Satellite server I see this when searching for the subscription ID (the UUID from the Content Host page):
root@satellite01:~> grep 4cc65712-53c8-442c-9ee9-5eb23c60820d/certificates /var/log/foreman/production.log
2017-01-13 07:10:06 [app] [I] Started GET "/rhsm/consumers/4cc65712-53c8-442c-9ee9-5eb23c60820d/certificates/serials" for 10.236.7.132 at 2017-01-13 07:10:06 +0100
2017-01-13 11:10:04 [app] [I] Started GET "/rhsm/consumers/4cc65712-53c8-442c-9ee9-5eb23c60820d/certificates/serials" for 10.236.7.132 at 2017-01-13 11:10:04 +0100
2017-01-13 11:10:06 [app] [I] Started GET "/rhsm/consumers/4cc65712-53c8-442c-9ee9-5eb23c60820d/certificates/serials" for 10.236.7.132 at 2017-01-13 11:10:06 +0100
2017-01-13 15:10:06 [app] [I] Started GET "/rhsm/consumers/4cc65712-53c8-442c-9ee9-5eb23c60820d/certificates/serials" for 10.236.7.132 at 2017-01-13 15:10:06 +0100
Gary and Terje,
Thanks for that additional information. It seems we don't yet have a definitive answer. I have asked the Satellite 6 engineers to confirm the source of the last checkin date and time.
Gary,
In looking over the content of the example rhsmcertd.log file, I'm concerned about the presence of the message "...unable to get lock". This may indicate that one or more of the Satellite background tasks is paused, and has a lock on resources required by RHSM.
Fri Jan 13 10:02:29 2017 [INFO] Starting rhsmcertd...
Fri Jan 13 10:02:29 2017 [INFO] Auto-attach interval: 1440.0 minute(s) [86400 second(s)]
Fri Jan 13 10:02:29 2017 [INFO] Cert check interval: 240.0 minute(s) [14400 second(s)]
Fri Jan 13 10:02:29 2017 [INFO] Waiting 120 second(s) [2.0 minute(s)] before running updates.
Fri Jan 13 10:04:32 2017 [INFO] (Auto-attach) Certificates updated.
Fri Jan 13 10:04:34 2017 [INFO] (Cert Check) Certificates updated.
Fri Jan 13 10:05:43 2017 [ERROR] unable to get lock, exiting
To check this, open the Satellite 6 web UI and navigate to Monitor > Tasks. All tasks are listed in descending order by date and time, so the latest is at the top. If you can't see any in the paused state, enter the following search criteria in the Search field: state = paused
I'm still waiting on an answer to my question about the "Last checkin" field, but checking for paused tasks in the meantime would be a useful step.
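On the host side, the lock error quoted above can be picked out of a log with a sketch like this. The function name lock_errors is hypothetical, and the match string is taken from the log excerpt in this post.

```shell
#!/usr/bin/env bash
# lock_errors [LOGFILE]
# Prints every "unable to get lock" entry found in an rhsmcertd log.
# Returns non-zero when the log contains none (hypothetical helper).
lock_errors() {
    local log="${1:-/var/log/rhsm/rhsmcertd.log}"
    grep -F 'unable to get lock' "$log"
}
```

The timestamps it prints can then be compared with the times of any paused tasks listed under Monitor > Tasks.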
Gary,
OK - it's good to hear that the hosts are now checking in, and the Satellite web UI reflects current timestamps. I'm a little concerned about that paused task, and would suggest you raise a support case with Red Hat to get that resolved.
Regarding the "Last Checkin" time listed for hosts in the Satellite web UI, I confirmed with an engineer that it is the last date and time the rhsmcertd daemon reported its status to the Satellite Server. It seems that starting that service, and stopping the paused task autoreloading, has fixed the hosts' checkins.
Gary,
Thanks for your reply. It's great to know that all systems are checking in at the expected interval.
As to your question about the state of rhsmcertd, it definitely should have been running, so why it was not is a mystery, and one I'd love to resolve. The method of registration should not, I believe, have any effect on the rhsmcertd service. Can you please confirm: are these systems new? What version of Red Hat Enterprise Linux are they running? Is there any other management-oriented software installed on them?
As an FYI, I opened bz1414993 to add to bootstrap.py the capability to start rhsmcertd (and enable it on startup).
Gary,
Thanks for that reply. It's great to know that you now have the hosts successfully reporting, and have found the root cause of the problem.
Since you're migrating hosts from Satellite 5 to Satellite 6, I would recommend you use the bootstrap script, a CLI tool which you run on individual hosts. It is available in Satellite 6.2+ and was created solely for the purpose of migrating hosts from either RHSM or Satellite 5 to Satellite 6.
For full details of the bootstrap tool, including example uses, see the KBase article Red Hat Satellite 6.2 Feature Overview: Importing Existing Hosts via the Bootstrap Script.
As you noted above, you have had to manually install the subscription-manager package on the hosts to be migrated. The bootstrap script completes this step for you. As you don't wish to use Puppet, note that the example usages include "Registering a system to Satellite 6, omitting puppet setup".
Near the bottom of the article is a video which describes the script and provides a demonstration of its use.
Gary,
You're welcome. I'm a technical writer, assigned to Red Hat Satellite. Although I'd prefer that customers didn't encounter these sorts of problems, I learn a lot in working through them. It helps us better understand what aspects of the product customers have difficulty with, and so which need refinement.
Hi Guys,
Sorry for joining the party late, but I have the same issue: the "Last Report" for every client does not appear in the web GUI.
I tried restarting rhsmcertd as per the previous post, and no error appears when I run the command "rhsmcertd -n".
But the "Last Report" still did not show up, and I found the error below in rhsmcertd.log:
"[ERROR] unable to get lock, exiting"
I did find the solution here: https://access.redhat.com/solutions/3132321
But I would really appreciate a solution that does not require rebooting the server first.
Please help me.
Thanks.
Hey guys,
Please ignore the previous question as I managed to get it done.
I just realized that I need to stop the rhsmcertd daemon first and then run rhsmcertd -n. It seems that when Subscription Manager is already scheduled, we receive this kind of alert, especially when we want to run it immediately.
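A minimal sketch of that stop-then-run sequence is below. The function name force_checkin and the RUN dry-run variable are hypothetical. Using the "service" wrapper to stop the daemon is an assumption about the host; the "rhsmcertd -n" call is the one quoted in this post.

```shell
#!/usr/bin/env bash
# force_checkin
# Stops the already-running rhsmcertd daemon, then starts it again with -n
# so that it runs its checks immediately.
# Set RUN=echo to print the commands instead of executing them (dry run).
# Both the function name and the RUN variable are hypothetical.
force_checkin() {
    ${RUN:-} service rhsmcertd stop &&
    ${RUN:-} rhsmcertd -n
}
```

Running "RUN=echo force_checkin" prints the two commands without touching the daemon; running "force_checkin" as root executes them.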
Anyway, I have another question. I ran rhsmcertd -n hoping that "Last Report" under Hosts > All Hosts would appear, but it doesn't.
May I know what else I can check?
