Pushing scheduled tasks out to endpoints with osad

Hello,

I was wondering if anyone might know what else we might need to look at to resolve this issue. All of our RHEL systems are managed by Satellite and run osad. For most of them, osad picks up scheduled jobs almost immediately when they are scheduled. Not all of them do this, however, and for those systems we end up having to wait for the hourly check-in for the job to be picked up.

All of the servers have access to port 5222 on the Satellite server, which I understood to be the only requirement. We have verified that this port is open to these endpoints and that osad is enabled and running.
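
For anyone who wants to repeat the reachability check from an affected client, here is a rough sketch. It assumes bash and the coreutils `timeout` command; `check_port` and the hostname `satellite.example.com` are illustrative names, not part of osad. If a client registers through a proxy, the proxy's hostname is probably the one to test. An open port only shows that TCP gets through, not that the osad session itself is healthy.

```shell
#!/bin/bash
# Hypothetical helper: report whether a TCP port answers from this client.
# Usage: check_port HOST [PORT]   (PORT defaults to 5222)
check_port() {
    local host="$1" port="${2:-5222}"
    # /dev/tcp is a bash feature; timeout keeps a filtered port from hanging.
    if timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
        echo "open"
    else
        echo "unreachable"
    fi
}

# Example (placeholder hostname, replace with your own):
#   check_port satellite.example.com 5222
```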

Is there something else we should be looking at to determine why osad is not picking up jobs right away?

Thanks in advance!

Responses

I am running into a similar situation. Our issue, however, is that the RHEL servers fail to reconnect to the Satellite server if it goes down for any reason. Once the osad daemon is restarted, the server picks up the scheduled events without a problem.

So, just to be clear, it's working as expected, but when Satellite becomes unavailable, the osad agent disconnects. Then, when Satellite is back up again, the osad agent does not automatically re-establish connection with Satellite unless the osad service is restarted on each client. Is that correct?
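
One rough way to spot a client in that state is to look for an established TCP session to port 5222. The sketch below is only a heuristic under that assumption: `jabber_session_state` is a made-up helper that reads `netstat -tn` style output on stdin, and a half-dead connection could still show as ESTABLISHED, so a "connected" result does not prove that osad is receiving pushes.

```shell
#!/bin/bash
# Hypothetical helper: read `netstat -tn` output on stdin and report whether
# any ESTABLISHED connection to a remote port 5222 is present.
# netstat -tn columns: Proto Recv-Q Send-Q Local-Address Foreign-Address State
jabber_session_state() {
    awk '$6 == "ESTABLISHED" && $5 ~ /:5222$/ { found = 1 }
         END { print (found ? "connected" : "disconnected") }'
}

# Example, run as root on a client (could be called from cron):
#   if [ "$(netstat -tn | jabber_session_state)" = "disconnected" ]; then
#       service osad restart
#   fi
```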

Enrique,

I think I am now at the same point you are. After working on this with support, at some point everything started working correctly and scheduled jobs were being picked up right away. Things had been working great until I patched my Satellite server last night; now jobs are no longer being picked up right away.

Today, as I was looking into it, I found that osa-dispatcher was not running. I tried running 'service osa-dispatcher start' and it would give me an 'OK' but it didn't actually start.

Next, I ran:

service osa-dispatcher stop
service jabberd restart
service osa-dispatcher start

After that, my jobs are being picked up again. I'm wondering if osa-dispatcher is trying to start too soon and jabberd isn't running yet. I'll probably try changing some of the boot priority settings (I'm on RHEL6, so not using systemd). I'll post results when I have them.
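
If the start-order theory holds, one workaround to experiment with is to wait until jabberd is accepting connections before starting osa-dispatcher. This is only a sketch under that assumption: `wait_for_port` is a hypothetical helper, and it assumes jabberd listens on port 5222 on localhost of the Satellite server.

```shell
#!/bin/bash
# Hypothetical helper: poll a TCP port once per second until it accepts a
# connection or the attempts run out.
# Usage: wait_for_port HOST PORT [TRIES]   (TRIES defaults to 30)
wait_for_port() {
    local host="$1" port="$2" tries="${3:-30}"
    local i
    for ((i = 0; i < tries; i++)); do
        if timeout 2 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
            return 0    # port is accepting connections
        fi
        sleep 1
    done
    return 1            # gave up
}

# Example, run as root on the Satellite server:
#   service osa-dispatcher stop
#   service jabberd restart
#   wait_for_port localhost 5222 && service osa-dispatcher start
#   service osa-dispatcher status   # confirm it really started
```

The current boot order on RHEL6 can be inspected by listing the S-numbered links for both services under /etc/rc3.d.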

Dan,

Thanks in advance. I am looking forward to hearing your results. Our Satellite server is on RHEL6 as well, so I will check this out.

Thank you again,
Enrique Sanchez.

OK... It may be too soon to call it "resolved", but here's what I've done thus far:

On Satellite:

  1. service osa-dispatcher stop
  2. service jabberd stop
  3. cp -rp /var/lib/jabberd/db /var/lib/jabberd/db-bak
  4. rm -rf /var/lib/jabberd/db/*
  5. service jabberd start
  6. service osa-dispatcher start
  7. rhn-satellite restart

On proxies:

  1. service jabberd stop
  2. cp -rp /var/lib/jabberd/db /var/lib/jabberd/db-bak
  3. rm -rf /var/lib/jabberd/db/*
  4. service jabberd start
  5. rhn-proxy restart

After doing this, I was able to schedule a task to run a simple command ('ls /tmp') and all 178 endpoints completed in less than 10 minutes.

If you only do the steps on the Satellite server, the jobs still seem to take a long time. After going through this process, I restarted the Satellite server and the same job took almost an hour. Then I ran 'rhn-proxy restart' on both of my proxies, ran the test again, and it completed in 10 minutes.

Not sure this will resolve your problem, but I'll keep hope alive!
