In an RHV IPI environment, a newly added node is stuck in the 'Provisioned' state and never joins the cluster.

Solution In Progress

Issue

  • Initially, the kubelet logs showed:
kubelet_node_status.go:92] Unable to register node "failing-worker0-node" with API server: Post https://api-int.<cluster-name>.<subdomain>:6443/api/v1/nodes: dial tcp: lookup api-int.<cluster-name>.<subdomain> on <external-dns-server-IP>:53: no such host
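The failure above indicates that the node cannot resolve the internal API endpoint. A minimal sketch for confirming this from the affected node, assuming the placeholder domain from the log (substitute your cluster's actual domain):

```shell
# Resolve the internal API endpoint the kubelet uses to register the node.
# <cluster-name>.<subdomain> is a placeholder; use your cluster's domain.
dig +short api-int.<cluster-name>.<subdomain>

# If the lookup returns nothing, check which nameserver the node is querying;
# the kubelet error above shows the server that returned "no such host".
cat /etc/resolv.conf
```

If `dig` returns no address, the external DNS server listed in `/etc/resolv.conf` is missing the `api-int` record for the cluster.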
  • After the DNS resolution issue was fixed, the node remained in the 'Provisioned' state, with the following errors:
$ oc logs machine-api-controllers-xxxxx -c machine-controller | grep failing-worker0-node
{"level":"error","ts":1601989711.119791,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"machine_controller","request":"openshift-machine-api/failing-worker0-node","error":"Aborting reconciliation while VM failing-worker0-node  state is reboot_in_progress","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/cluster-api-provider-ovirt/vendor/github.com/go-logr/zapr/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/cluster-api-provider-ovirt/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:218\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/cluster-api-provider-ovirt/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:192\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/cluster-api-provider-ovirt/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:171\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/cluster-api-provider-ovirt/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/cluster-api-provider-ovirt/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/cluster-api-provider-ovirt/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}
I1006 13:08:51.600245       1 controller.go:164] Reconciling Machine "failing-worker0-node"
I1006 13:08:51.600367       1 controller.go:376] Machine "failing-worker0-node" in namespace "openshift-machine-api" doesn't specify "cluster.k8s.io/cluster-name" label, assuming nil cluster
I1006 13:08:51.649498       1 controller.go:284] Reconciling machine "failing-worker0-node" triggers idempotent update
E1006 13:08:51.694577       1 controller.go:286] Error updating machine "openshift-machine-api/failing-worker0-node": Aborting reconciliation while VM failing-worker0-node  state is reboot_in_progress
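The controller aborts reconciliation while the oVirt provider reports the VM state as `reboot_in_progress`. A hedged sketch for inspecting the stuck Machine object, using the example machine name from the logs above:

```shell
# List all machines and their phases in the machine-api namespace.
oc get machines -n openshift-machine-api

# Show only the phase of the stuck machine (expected to read "Provisioned").
oc get machine failing-worker0-node -n openshift-machine-api \
  -o jsonpath='{.status.phase}{"\n"}'

# Full details and events for the stuck machine.
oc describe machine failing-worker0-node -n openshift-machine-api
```

Since the reported VM state comes from the oVirt API, checking whether the VM is genuinely mid-reboot (or wedged in that state) in the RHV Administration Portal is a reasonable next diagnostic step.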

Environment

  • Red Hat OpenShift Container Platform 4.5
  • IPI-based installation on Red Hat Virtualization (RHV)
