Chapter 5. Troubleshooting cluster deployments

This document describes how to troubleshoot cluster deployment errors.

5.1. Obtaining information on a failed cluster

If a cluster deployment fails, the cluster is put into an "error" state.

Procedure

Run the following command to get detailed information about the failed cluster:

$ rosa describe cluster -c <my_cluster_name> --debug
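
To check the state from a script, you can filter the command output. The following is a minimal sketch, not part of the rosa CLI. The function name cluster_state is hypothetical, and the sketch assumes that the output contains a line of the form "State: <value>".

```shell
# Print the value of the "State:" field from `rosa describe cluster`
# output read on stdin.
# Assumption: the output includes a line of the form "State:   <value>".
cluster_state() {
  awk '/^State:/ { sub(/^State:[ \t]*/, ""); print; exit }'
}

# Example (replace the placeholder with your cluster name):
#   rosa describe cluster -c <my_cluster_name> | cluster_state
```

If the function prints nothing, the assumed "State:" line was not found and you should inspect the full command output instead.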

5.2. Failing to create a cluster with an osdCcsAdmin error

If cluster creation fails, you might receive the following error message.

Example output

Failed to create cluster: Unable to create cluster spec: Failed to get access keys for user 'osdCcsAdmin': NoSuchEntity: The user with name osdCcsAdmin cannot be found.

Procedure

To fix this issue:

  1. Delete the stack that was applied to your AWS account during initialization:

    $ rosa init --delete

  2. Reinitialize your account:

    $ rosa init
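
The two steps above can be combined so that the account is reinitialized only if the delete step succeeds. This is a sketch under stated assumptions: the function names are hypothetical, and the error check matches only the message text shown in the example output above.

```shell
# Return success if the text on stdin contains the osdCcsAdmin error
# shown in the example output.
has_osdccsadmin_error() {
  grep -q "Failed to get access keys for user 'osdCcsAdmin'"
}

# Delete the stack, then reinitialize the account.
# The second command runs only if the first one succeeds.
reinitialize_rosa_account() {
  rosa init --delete && rosa init
}

# Example, assuming the failed output was saved to a file named create.log:
#   has_osdccsadmin_error < create.log && reinitialize_rosa_account
```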

5.3. Creating the Elastic Load Balancing (ELB) service-linked role

If you have not created a load balancer in your AWS account, the service-linked role for Elastic Load Balancing (ELB) might not exist yet. In that case, you might receive the following error:

Error: Error creating network Load Balancer: AccessDenied: User: arn:aws:sts::xxxxxxxxxxxx:assumed-role/ManagedOpenShift-Installer-Role/xxxxxxxxxxxxxxxxxxx is not authorized to perform: iam:CreateServiceLinkedRole on resource: arn:aws:iam::xxxxxxxxxxxx:role/aws-service-role/elasticloadbalancing.amazonaws.com/AWSServiceRoleForElasticLoadBalancing

Procedure

To resolve this issue, ensure that the role exists in your AWS account. The following command checks for the role and creates it only if the check fails:

$ aws iam get-role --role-name "AWSServiceRoleForElasticLoadBalancing" || aws iam create-service-linked-role --aws-service-name "elasticloadbalancing.amazonaws.com"

Note

You only need to run this command once per AWS account.
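
The same check-then-create logic can be written as a function that reports which path it took. This is a minimal sketch, and the function name is hypothetical. Like the one-line command, it treats any failure of get-role, including a credentials problem, as "role does not exist".

```shell
# Create the ELB service-linked role only if it does not already exist.
# Caveat: any get-role failure (not only a missing role) triggers the
# create step, which matches the behavior of the one-line command.
ensure_elb_service_linked_role() {
  if aws iam get-role --role-name "AWSServiceRoleForElasticLoadBalancing" >/dev/null 2>&1; then
    echo "Service-linked role already exists"
  else
    aws iam create-service-linked-role --aws-service-name "elasticloadbalancing.amazonaws.com"
  fi
}

# Example:
#   ensure_elb_service_linked_role
```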

5.4. Repairing a cluster that cannot be deleted

In specific cases, the following error appears in OpenShift Cluster Manager Hybrid Cloud Console if you attempt to delete your cluster.

Error deleting cluster
CLUSTERS-MGMT-400: Failed to delete cluster <hash>: sts_user_role is not linked to your account. sts_ocm_role is linked to your organization <org number> which requires sts_user_role to be linked to your Red Hat account <account ID>.Please create a user role and link it to the account: User Account <account ID> is not authorized to perform STS cluster operations

Operation ID: b0572d6e-fe54-499b-8c97-46bf6890011c

If you try to delete your cluster from the CLI, the following error appears.

E: Failed to delete cluster <hash>: sts_user_role is not linked to your account. sts_ocm_role is linked to your organization <org_number> which requires sts_user_role to be linked to your Red Hat account <account_id>.Please create a user role and link it to the account: User Account <account ID> is not authorized to perform STS cluster operations

This error occurs when the user-role is unlinked or deleted.

Procedure

  1. Run the following command to create the user-role IAM resource:

    $ rosa create user-role

  2. After the role is created, you can delete the cluster. The following output confirms that the role was created and linked:

    I: Successfully linked role ARN <user role ARN> with account <account ID>
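
If you keep the command output in a log, you can confirm the link by extracting the role ARN from the confirmation line. This is a sketch only: the function name and the log file name are hypothetical, and it assumes the confirmation line has exactly the wording shown above.

```shell
# Print the role ARN from a line of the form
# "I: Successfully linked role ARN <user role ARN> with account <account ID>"
# read on stdin. Prints nothing if no such line is present.
linked_role_arn() {
  sed -n 's/.*Successfully linked role ARN \(.*\) with account .*/\1/p'
}

# Example, assuming the output was saved to a file named user-role.log:
#   linked_role_arn < user-role.log
```

An empty result means the confirmation line was not found, so verify the role before you retry the cluster deletion.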