Deleting Tasks in Satellite 6.1


I'm setting up a new Satellite 6.1 server and in my enthusiasm I started a bunch of repo resync tasks for RH 5, 6, and 7. Bad idea as it was taking forever to retrieve the information. Since I have a load of ISOs already, I cancelled the tasks and planned on following the install docs to simply move the rpms from the ISOs into the appropriate directories and then let Satellite sync a much smaller number of rpms. Unfortunately something got stuck and I have two tasks that have errored out waiting on the completion of two other tasks so the locks can be removed.

Problem:

Task 1 is waiting on Task 2 to complete to release the lock. Task 2 has been canceled but didn't delete the lock. So Task 1 can't be restarted.

Task 3 is waiting on Task 4 to complete to release the lock. But Task 4 has completed successfully.

Is there a way to remove tasks from the system? I'd like to clean up the canceled ones (just delete them), delete the others, and then, once I get the rpms in place, simply start up new tasks to sync them.

I can certainly blow the system away and start over since this is a new install, but I'm concerned that six months from now, for example, if a task gets stuck I won't be able to blow the system away.

Thanks for any help.

Carl

Responses

Hello, you can do that. I think it is under Content -> Sync Status. Expand the "Product" and look at the Result column. If you click on the result entry you will see options such as resume or delete.

Try also: Monitor --> Tasks --> Running Steps; "Cancel" will appear in the top left if available.

Both show only Resume. No Delete option in either place.

Carl

Stuck tasks: shut down Satellite, then restart katello, apache, mongo, and pulp.

Yes, that also helped me:

# katello-service restart

should be enough
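
If katello-service isn't available or you want to bounce the pieces individually, something along these lines covers the apache/mongo/pulp set mentioned above. This is only a sketch; the unit names are assumptions and differ between 6.x releases.

# Restart the main backend units one by one (sketch; adjust the list to your install)
for svc in httpd mongod qpidd pulp_workers pulp_celerybeat pulp_resource_manager foreman-tasks; do
    systemctl restart "$svc"
done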

Not sure what this means (stuck tasks). I did reboot the system, though, which should do the same thing.

Do not delete the task "Listen on candlepin events"; it needs to run all the time.

Not sure folks understand here. There is no option to Delete a task. I have 'Start auto-reloading', 'Dynflow console', and 'Resume' which is greyed out.

Carl

Unix Admin - At this point I would recommend you raise a support case with Red Hat. It's likely the quickest means of getting this issue resolved.

Okay thanks. We have L3 support with Red Hat so I'm working through our vendor (Yasir latif above there :) ). I was exploring other avenues as well.

Carl

Refer to this note: https://access.redhat.com/solutions/1554783 (Satellite 6: How to stop all paused/running synchronization processes at once using "foreman-rake console").

This is how I remove failed tasks from Satellite 6.1:

su - postgres
psql foreman
-- remove paused tasks that ended in error
delete from foreman_tasks_tasks where state = 'paused' and result = 'error';
-- remove stopped tasks that ended in error
delete from foreman_tasks_tasks where state = 'stopped' and result = 'error';
\q

This will release the locks also.
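
Before running those deletes, it can be worth previewing exactly which tasks they will hit. Here is a hedged one-liner form of the same query; the label column is an assumption about the foreman_tasks_tasks schema.

# Preview the errored paused/stopped tasks that the deletes above would remove (sketch)
su - postgres -c "psql -d foreman -c \"select id, label, state, result from foreman_tasks_tasks where result = 'error' and state in ('paused', 'stopped');\""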

You can remove tasks in any state with any result. I usually clean up the whole database on a regular basis, since we have 2500+ content hosts and there is no housekeeping in place.

After the DB cleanup, don't forget to clean up the backend objects and reindex the ES (Elasticsearch) DB:

foreman-rake katello:clean_backend_objects --trace
foreman-rake katello:reindex --trace

It will speed up Satellite.

But I really recommend upgrading to Satellite 6.2. We are in the middle of the migration now and it's a big improvement: more stable, faster, lots of new features.

Franky

This is still handy. One update: katello:reindex is now katello:reimport; katello:reindex will be removed in 6.3.
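
If you're scripting this across versions, one hedged way to handle the rename is to try the new task and fall back to the old one:

# Try katello:reimport first and fall back to katello:reindex where it doesn't exist yet (sketch)
foreman-rake katello:reimport --trace || foreman-rake katello:reindex --trace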

Satellite is a very good reason to dump Red Hat and look for another operating system. So huge, so complicated, and so limiting. I'm aware that my answer has nothing to do with this thread; I simply had to get rid of the frustration.

Maybe I can help? Versions 6.0 and 6.1 will haunt Red Hat; they should never have been released. Have you tried version 6.2? It does work and takes about an hour to install.

I dunno about dumping Red Hat, but dumping Satellite is not a bad idea. It's hard to imagine that Red Hat in good conscience takes money from customers for this disaster masquerading as software. It's the most complex, bloated, buggy, slug-slow, and woefully unstable piece of garbage I've seen since Internet Explorer 4 alpha. Every time it's "have you upgraded to the latest patches?" YES, we've upgraded, and upgraded, and upgraded. Guess what? It's still absolutely worthless. There has been enough time to right this ship that the only conclusion to be drawn is that Red Hat's business direction does not include providing adequate resources to remediate this stain on their name. It's a shame, since they've worked so hard to become a leader in the space.

I'm starting to have the same experience at work. I can't even get any Errata tasks to run anymore. The Errata list shows many available packages, but when I commit the task they never show in the Tasks for the hosts. It's weird because the "last contact time" for all the servers is reasonable. Both of my support cases on the access portal are still open even after being open for about six months. One was Red Hat Satellite not honoring installonly_limit (the /boot would fill up with obsolete kernels); the second is the default kernel not updating in grub.conf to the new kernel. Both of these force me to manually visit hosts and fix the errors, making RHS pointless for me.

I agree, Satellite must be one of the worst SW ever created.

I disagree.

I have been using Satellite since the v4 days and it has done what I have needed it to do. Sure there are bugs, but they get worked through and squashed. Satellite v6 has been very good for us, again with the understanding that bugs happen and you work through them. These kinds of posts typically sound to me like you need training / professional services to help plan your Satellite 6 deployment (software like Satellite 6 is not a ./run && done kind of thing, and you might have structural issues with your deployment / workload).

BTW - Satellite 6 is the commercially supported version of the upstream/open-source Foreman project. If you aren't getting the kind of assistance you need from Red Hat, try reaching out to those folks by filing a bug report or joining their IRC channel on Freenode.

This whole thread is awesome. But Greg, I have to disagree. I have been using Spacewalk and home-grown patching tools for years, and they do what I need them to do. Satellite 6 has been a pain. We have support and have used it. It's not the bugs that are the issue for me; it's the ridiculous way you create repos, lifecycles, content views, etc. And pushing patches is also horrible. To each their own, but this has not been a pleasant experience for my team. Due to corporate policies we have to use it, but if I could, I would dump it.

I think they went this route in order to make sure that people were buying entitlements. I get it.

I agree. I wound up mostly using it to create a local mirror of repositories for CentOS systems. For Ubuntu, apt-mirror works like a champ: simple to configure, set-it-and-forget-it functionality. With Satellite there is always some care and feeding needed to keep it working. I just wish my company would allow rsync through the firewall; then I'd dump this beast in a heartbeat.

Stephen Wadeley - or anyone - has anyone been through the following and found a solution? I found and tried part of the first solution (the third one may not fit).

And so I'm facing this with a disconnected Red Hat Satellite server and the DynFlow console for 11 of the channels I'm attempting to ingest is hung at 63% completion.

I'm having the issue with "17: Actions::Pulp::Repository::RegenerateApplicability (suspended) [350626.28s/2095.07s]".

https://access.redhat.com/solutions/2420841
https://access.redhat.com/solutions/1381053
and
https://access.redhat.com/solutions/1489373 (this might not be an exact fit for my situation)

I noted 2 years ago that Red-Hatter Stephen Wadeley mentioned:

"..but not always. See Red Hat Satellite6 repository sync pending on Actions::Pulp::Repository::RegenerateApplicability on the Red Hat Customer Portal." (https://access.redhat.com/solutions/1381053)

Now the less-than-amusing thing is that the content view I imported WORKED on a different/separate disconnected Red Hat Satellite server that is ALSO on version 6.2.11. So I don't believe the content view I am importing is to blame, since it worked fine, first try on the other disconnected satellite server.

So my scenario: I have ONE Satellite 6.2.12 server that faces the public internet. I have numerous entitled, disconnected Satellite servers that I feed updates to using a method Red-Hatter Rich Jerrido devised, based on a dump of a content view and an import on the gaining disconnected satellites. That generally works great, except for now, and on only one of my multiple Red Hat Satellite servers.

Only 11 channels have an issue (below), and I've typed out a Dynflow output where it shows the failure.

======================================

    The 11 channels affected are:
    Red Hat Enterprise Linux Workstation
    -> Red Hat Enterprise Linux 7 Workstation - Optional RPMs x86_64 7Workstation EDITED (New Packages 35 (335 MB))
    -> Red Hat Enterprise Linux 7 Workstation - RH Common RPMs x86_64 7Workstation EDITED "No new packages"
    -> Red Hat Enterprise Linux 7 Workstation - Supplementary RPMs x86_64 7Workstation EDITED "No new packages"
    -> Red Hat Enterprise Linux 7 Workstation RPMs x86_64 7Workstation EDITED (New Packages 232 (572 MB))
    -> Red Hat Enterprise Linux 7 Workstation - Extras RPMs x86_64 EDITED "No new packages"
    -> Red Hat Satellite Tools 6.2 for RHEL 7 Workstation RPMs x86_64 EDITED (New Packages 8 (435 KB))
    Oracle Java for RHEL Workstation
    -> Red Hat Enterprise Linux 7 Workstation - Oracle Java RPMs x86_64 7Workstation EDITED "No new packages"
    Red Hat Enterprise Linux Server
    -> Red Hat Enterprise Linux 6 Server - RH Common RPMs x86_64 6Server EDITED "No new packages"
    -> Red Hat Enterprise Linux 6 Server - Supplementary RPMs x86_64 6Server EDITED "No new packages"
    7Server
    -> RHN Tools for Red Hat Enterprise Linux 7 Server RPMs x86_64 7Server EDITED "No new packages"
    -> Red Hat Satellite Tools 6.2 for RHEL 7 Server RPMs x86_64 EDITED "No new packages"

It is still hung at 63%. Here's an example Dynflow output:

############################ DynFlow output
    Go to the "Run" tab in Dynflow
    I see the following:

    "3: Actions::Pulp::Repository::Sync (success) [time-in-seconds]"
    "6: Actions::Katello::Repository::IndexContent (success) [time-in-seconds]"
    "14: Actions::Katello::Repository::ErrataMail (success) [time-in-seconds]"
    "17: Actions::Pulp::Repository::RegenerateApplicability (suspended) [350626.28s/2095.07s]"   Could be the issue!?
    "18: Actions::Katello::Repository::Sync (pending)"  ("pending" probably because the line above is still in the "suspended" state)
    "20: Actions::Katello::Repository::ImportApplicability (pending)"  ("pending" again, probably because the step two lines up is in a "suspended" state)

Note: this is pretty much what I get for all the failed channels listed earlier; the only things that differ are in the Dynflow portion:

    the "pulp_id", the "group_id" and the "total" under "Actions::Pulp::Repository::RegenerateApplicability (suspended)"

======================================

So I've raised a ticket with Red Hat, but thought I'd post here just in case anyone else has suffered and solved this.

Appreciate any assistance

I've had some success with different invocations of foreman-rake foreman_tasks:cleanup.

Consider something like (YMMV):

foreman-rake foreman_tasks:cleanup TASK_SEARCH='label = "That::type::of::thing"' verbose=true STATES=paused NOOP=true

(Remove NOOP=true to actually do the cleanup.)

I know that when I have needed to wipe out errored or stuck tasks this tool has helped quite a bit. You may need to do some digging in Postgres (su - postgres; psql -d foreman; then select data from dynflow_actions; and poke around there to get the label/state).
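
As a concrete, hedged example for the sync/applicability hang discussed above (the label is taken from the Dynflow output earlier in this thread, and the column names in the query are assumptions; keep NOOP=true until the output looks right):

# Dry-run cleanup of paused sync tasks (sketch)
foreman-rake foreman_tasks:cleanup TASK_SEARCH='label = "Actions::Katello::Repository::Sync"' STATES=paused verbose=true NOOP=true

# One way to dig out the labels/states present on your system (column names assumed)
su - postgres -c "psql -d foreman -c \"select label, state, count(*) from foreman_tasks_tasks group by label, state order by count(*) desc;\""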

If you are facing this issue, run katello-service stop and then katello-service start on your Satellite. This places a list of the applicable Satellite services into an array and runs systemctl stop [@array1] and then systemctl start [@array1] (where "@array1" stands in for the applicable services in question).

The recommendation afterwards was to run katello-service status.

That takes the same array and runs systemctl status @array1 and so forth.
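
If the wrapper output is too noisy, you can also look at the suspect units directly. The unit names here are assumptions; adjust to whatever katello-service lists on your box.

# Inspect the pulp workers and resource manager without the wrapper (sketch)
systemctl status pulp_resource_manager pulp_workers pulp_celerybeat --no-pager -l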

In ==my== specific case, this seems likely to be a related issue: this one service was not functioning, compared to my known-good Satellite server.

For pulp_resource_manager.service - Pulp Resource Manager, I see the following, which is not consistent with my FUNCTIONING satellite server:

[root@redacted.someplace.com] # katello-service status (OUTPUT IS ISOLATED TO pulp_resource_manager.service)
● pulp_resource_manager.service - Pulp Resource Manager
   Loaded: loaded (/usr/lib/systemd/system/pulp_resource_manager.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2017-11-20 15:09:10 EST; 2min 16s ago
 Main PID: 22352 (celery)
   CGroup: /system.slice/pulp_resource_manager.service
           ├─2047 /usr/bin/python /usr/bin/celery worker -A pulp.server.async.app -n resource_manager@%h -Q resource_manager -c 1 --events --umask 18 --pidfile=/var/run/pulp/resource_manager.pid --heartbeat-interval=30
           └─2937 /usr/bin/python /usr/bin/celery worker -A pulp.server.async.app -n resource_manager@%h -Q resource_manager -c 1 --events --umask 18 --pidfile=/var/run/pulp/resource_manager.pid --heartbeat-interval=30
Nov 20 15:09:12 redacted.someplace.com celery[22352]: - ** ---------- .> transport     qpid://redacted.someplace.com:5671//
Nov 20 15:09:12 redacted.someplace.com celery[22352]: - ** ---------- .> disabled
Nov 20 15:09:12 redacted.someplace.com celery[22352]: - *** --- * --- .> concurrency: 1 (prefork)
Nov 20 15:09:12 redacted.someplace.com celery[22352]: -- ******* ----
Nov 20 15:09:12 redacted.someplace.com celery[22352]: --- ***** ----- [queues]
Nov 20 15:09:12 redacted.someplace.com celery[22352]: -------------- .> resource_manager exchange=resource_manager(direct) key=resource_manager@redacted.someplace.com
Nov 20 15:09:12 redacted.someplace.com pulp[22352]: kombu.transport.qpid:INFO: Connected to qpid with SASL mechanism ANONYMOUS
Nov 20 15:09:12 redacted.someplace.com pulp[22352]: celery.worker.consumer:INFO: Connected to qpid with SASL mechanism ANONYMOUS

This seems to follow other issues I've noted with journalctl output:
Nov 20 09:54:53 redacted.someplace.com pulp[1310]: celery.beat:INFO: Scheduler: Sending due task download_deferred_content (pulp.server.controllers.repository.queue_download_deferred)
Nov 20 09:54:53 redacted.someplace.com pulp[1460]: celery.worker.strategy:INFO: Received task: pulp.server.controllers.repository.queue_download_deferred[b40a71ae-f0f1-488d-b680-57aedbc61041]
Nov 20 09:54:53 redacted.someplace.com pulp[2207]: kombu.transport.qpid:INFO: Connected to qpid with SASL mechanism ANONYMOUS
Nov 20 09:54:53 redacted.someplace.com pulp[1448]: celery.worker.strategy:INFO: Received task: pulp.server.controllers.repository.download_deferred[a2c11b76-0cf9-4ac-8c69-8691b5ff76b7]
Nov 20 09:54:53 redacted.someplace.com pulp[2207]: py.warnings:WARNING: (2207-75936) /usr/lib/python2.7/site-packages/pulp/server/db/model/__init__.py:536: DeprecationWarning: update is deprecated. use replace_one, update_one or update_many instead.

I also see pulp errors about MongoClient being opened before fork, and they say to "Create MongoClient with connect=False, or create client after forking."

There are other complaints of "MongoClient opened before fork. Create MongoClient ..." as well.
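
To get a feel for how often those warnings show up, a quick hedged journalctl filter over the pulp units can help (the unit names are assumptions):

# Count recent MongoClient-before-fork warnings from the pulp services (sketch)
journalctl -u pulp_resource_manager -u pulp_workers -u pulp_celerybeat --since "1 hour ago" | grep -c "MongoClient opened before fork"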