How to Cancel Running Workflow Jobs in Bulk in Ansible Automation Platform

Solution Verified

Environment

  • Ansible Automation Platform 1.2.X.
  • Ansible Automation Platform 2.X.X.

Issue

  • A user mistakenly created 5700+ workflow jobs through the API and was not able to cancel them using the tower-cli command.

Resolution

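  • Run the following commands as the awx user to cancel the running workflow jobs. Note that the filter matches every unified job whose status is running, not only workflow jobs: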

    # su - awx
    
    $ echo "UnifiedJob.objects.filter(status='running').update(status='canceled')" | awx-manage shell_plus
    
  • After all the workflow jobs are canceled, restart the services on all Ansible Automation Platform nodes:

    * For Ansible Tower (AAP 1.2), use the following command:
    $ ansible-tower-service restart
    
    * For automation controller nodes (AAP 2.x), use the following command:
    $ automation-controller-service restart
    

Root Cause

  • Running workflow jobs cannot be canceled quickly one at a time from awx-manage shell_plus.
    Updating a single job takes several minutes.

    Example:
    ```
    >>> import time
    >>> start = time.time(); UnifiedJob.objects.filter(id=1096679).update(status='canceled'); end = time.time(); print(end - start)
    1
    263.3683326244354
    ```

    At this rate (roughly 263 seconds per job), canceling 4500+ running workflow jobs one at a time would take approximately 2 weeks.
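
The 2-week estimate can be checked with simple arithmetic. The sketch below is only an illustration: it assumes the jobs are canceled strictly one after another at the measured rate, and the helper name `estimate_days` is hypothetical.

```python
# Back-of-the-envelope estimate using the figures quoted in this article.
SECONDS_PER_JOB = 263.3683326244354  # measured duration of one update() call
RUNNING_JOBS = 4500                  # lower bound taken from "4500+"


def estimate_days(seconds_per_job: float, jobs: int) -> float:
    """Return the total sequential cancellation time in days."""
    return seconds_per_job * jobs / 86400  # 86400 seconds per day


days = estimate_days(SECONDS_PER_JOB, RUNNING_JOBS)
print(f"{days:.1f} days")  # prints "13.7 days"
```

Because 4500 is a lower bound, the real figure would be somewhat higher, which is consistent with "approximately 2 weeks".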

Diagnostic Steps

  • The tower-cli cancel command returned OK for each workflow job, but it had no effect: the workflow jobs remained in the running state.

    # tower-cli workflow_job list -W 2389 --status running
    ======= ===================== =========================== =======
      id    workflow_job_template           created           status
    ======= ===================== =========================== =======
    1096571                  2389 2022-09-26T13:20:39.150265Z running
    1096573                  2389 2022-09-26T13:20:43.415625Z running
    1096574                  2389 2022-09-26T13:20:47.948050Z running
    1096579                  2389 2022-09-26T13:20:52.186344Z running
    1096580                  2389 2022-09-26T13:20:56.741558Z running
    1096583                  2389 2022-09-26T13:21:00.827540Z running
    1096584                  2389 2022-09-26T13:21:05.154718Z running
    1096585                  2389 2022-09-26T13:21:09.275868Z running
    1096588                  2389 2022-09-26T13:21:13.725400Z running
    1096590                  2389 2022-09-26T13:21:18.275462Z running
    1096594                  2389 2022-09-26T13:21:22.411275Z running
    1096595                  2389 2022-09-26T13:21:26.738721Z running
    1096599                  2389 2022-09-26T13:21:31.279409Z running
    1096603                  2389 2022-09-26T13:21:35.386969Z running
    1096604                  2389 2022-09-26T13:21:39.628363Z running
    1096607                  2389 2022-09-26T13:21:43.820806Z running
    1096608                  2389 2022-09-26T13:21:47.898231Z running
    1096612                  2389 2022-09-26T13:21:52.032458Z running
    1096613                  2389 2022-09-26T13:21:56.383005Z running
    1096614                  2389 2022-09-26T13:22:00.596535Z running
    1096618                  2389 2022-09-26T13:22:05.834738Z running
    1096619                  2389 2022-09-26T13:22:09.969598Z running
    1096623                  2389 2022-09-26T13:22:14.080368Z running
    1096627                  2389 2022-09-26T13:22:18.412507Z running
    1096628                  2389 2022-09-26T13:22:22.671783Z running
    ======= ===================== =========================== ======= (Page 1 of 193.)
    
    
    #  for ID in $(tower-cli workflow_job list -W 2389 --status running | awk '/running/ {print $1}'); do tower-cli workflow_job cancel $ID; done
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    OK. (changed: true)
    
    # tower-cli workflow_job list -W 2389 --status running
    ======= ===================== =========================== =======
      id    workflow_job_template           created           status
    ======= ===================== =========================== =======
    1096571                  2389 2022-09-26T13:20:39.150265Z running
    1096573                  2389 2022-09-26T13:20:43.415625Z running
    1096574                  2389 2022-09-26T13:20:47.948050Z running
    1096579                  2389 2022-09-26T13:20:52.186344Z running
    1096580                  2389 2022-09-26T13:20:56.741558Z running
    1096583                  2389 2022-09-26T13:21:00.827540Z running
    1096584                  2389 2022-09-26T13:21:05.154718Z running
    1096585                  2389 2022-09-26T13:21:09.275868Z running
    1096588                  2389 2022-09-26T13:21:13.725400Z running
    1096590                  2389 2022-09-26T13:21:18.275462Z running
    1096594                  2389 2022-09-26T13:21:22.411275Z running
    1096595                  2389 2022-09-26T13:21:26.738721Z running
    1096599                  2389 2022-09-26T13:21:31.279409Z running
    1096603                  2389 2022-09-26T13:21:35.386969Z running
    1096604                  2389 2022-09-26T13:21:39.628363Z running
    1096607                  2389 2022-09-26T13:21:43.820806Z running
    1096608                  2389 2022-09-26T13:21:47.898231Z running
    1096612                  2389 2022-09-26T13:21:52.032458Z running
    1096613                  2389 2022-09-26T13:21:56.383005Z running
    1096614                  2389 2022-09-26T13:22:00.596535Z running
    1096618                  2389 2022-09-26T13:22:05.834738Z running
    1096619                  2389 2022-09-26T13:22:09.969598Z running
    1096623                  2389 2022-09-26T13:22:14.080368Z running
    1096627                  2389 2022-09-26T13:22:18.412507Z running
    1096628                  2389 2022-09-26T13:22:22.671783Z running
    ======= ===================== =========================== ======= (Page 1 of 193.)
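
One way to confirm that the cancel loop had no effect is to compare the running job IDs in the listings taken before and after it. The following is a minimal Python sketch, assuming the four-column table layout shown above. The function name `running_ids` and the shortened `SAMPLE` text are illustrative and are not part of tower-cli.

```python
from typing import List

# Shortened copy of the listing above; in practice, capture the real
# tower-cli output before and after the cancel loop.
SAMPLE = """\
======= ===================== =========================== =======
  id    workflow_job_template           created           status
======= ===================== =========================== =======
1096571                  2389 2022-09-26T13:20:39.150265Z running
1096573                  2389 2022-09-26T13:20:43.415625Z running
======= ===================== =========================== ======= (Page 1 of 193.)
"""


def running_ids(table_text: str) -> List[int]:
    """Extract job IDs from rows whose status column is 'running'."""
    ids = []
    for line in table_text.splitlines():
        fields = line.split()
        # Data rows have exactly four columns and a numeric ID first;
        # header and separator rows fail one of these checks.
        if len(fields) == 4 and fields[0].isdigit() and fields[3] == "running":
            ids.append(int(fields[0]))
    return ids


before = running_ids(SAMPLE)
after = running_ids(SAMPLE)  # the listing was identical after the loop
still_running = sorted(set(before) & set(after))
print(still_running)  # prints [1096571, 1096573]
```

Any ID that appears in both lists was reported as canceled by the loop but is still running.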
    

