CRI-O service constantly crashes in an endless loop in Red Hat OpenShift 4

Solution Verified

Issue

  • The infra or worker node does not boot and is not part of the cluster.
  • The worker node has scheduling disabled and stays in the NotReady state.
  • The kubelet service restarts continuously on the worker node.
  • CRI-O is repeatedly killed by SIGABRT and produces the following stack trace:

    May 24 07:22:01 odf-01.qa.ocp.example.com systemd[1]: crio.service: Main process exited, code=killed, status=6/ABRT
    May 24 07:22:01 odf-01.qa.ocp.example.com systemd[1]: crio.service: Failed with result 'signal'.
    May 24 07:22:01 odf-01.qa.ocp.example.com systemd[1]: crio.service: Consumed 861ms CPU time
    May 24 07:22:01 odf-01.qa.ocp.example.com systemd-coredump[1276211]: Process 1276062 (crio) of user 0 dumped core.
       Stack trace of thread 1276205:
       #0  0x000055f0c63e7961 runtime.raise (crio)
       #1  0x000055f0c63c35f1 runtime.sigfwdgo (crio)
       #2  0x000055f0c63c1df4 runtime.sigtrampgo (crio)
       #3  0x000055f0c63e7ce3 runtime.sigtramp (crio)
       #4  0x00007fc0a0899b20 __restore_rt (libpthread.so.0)
       #5  0x000055f0c63e7961 runtime.raise (crio)
       #6  0x000055f0c63ab62e runtime.fatalpanic (crio)
       #7  0x000055f0c63aaf65 runtime.gopanic (crio)
       #8  0x000055f0c6ea0e4b github.com/cri-o/cri-o/vendor/go.etcd.io/bbolt.(*freelist).read (crio)
       #9  0x000055f0c6eab597 github.com/cri-o/cri-o/vendor/go.etcd.io/bbolt.(*DB).loadFreelist.func1 (crio)
       #10 0x000055f0c63ff1ce sync.(*Once).doSlow (crio)
       #11 0x000055f0c6e9b68c github.com/cri-o/cri-o/vendor/go.etcd.io/bbolt.(*DB).loadFreelist (crio)
       #12 0x000055f0c6e9b12f github.com/cri-o/cri-o/vendor/go.etcd.io/bbolt.Open (crio)
       #13 0x000055f0c6eadf95 github.com/cri-o/cri-o/vendor/github.com/containers/image/v5/pkg/blobinfocache/boltdb.(*cache).update (crio)
       #14 0x000055f0c6eae68f github.com/cri-o/cri-o/vendor/github.com/containers/image/v5/pkg/blobinfocache/boltdb.(*cache).RecordKnownLocation (crio)
       #15 0x000055f0c7195849 github.com/cri-o/cri-o/vendor/github.com/containers/image/v5/docker.(*dockerImageSource).GetBlob (crio)
       #16 0x000055f0c70e354a github.com/cri-o/cri-o/vendor/github.com/containers/image/v5/copy.(*imageCopier).copyLayer (crio)
       #17 0x000055f0c70ebda5 github.com/cri-o/cri-o/vendor/github.com/containers/image/v5/copy.(*imageCopier).copyLayers.func1 (crio)
       #18 0x000055f0c63e6141 runtime.goexit (crio)
    May 24 07:22:01 odf-01.qa.ocp.example.com systemd[1]: crio.service: Service RestartSec=100ms expired, scheduling restart.
    May 24 07:22:01 odf-01.qa.ocp.example.com systemd[1]: crio.service: Scheduled restart job, restart counter is at 54727.
    May 24 07:22:01 odf-01.qa.ocp.example.com systemd[1]: Stopping Kubernetes Kubelet...
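
To confirm that a node is hitting this same signature, the log above can be checked for three markers: the `status=6/ABRT` exit, the `bbolt.(*freelist).read` frame in the stack trace, and the systemd restart counter. The following is a minimal sketch, not a supported tool. The function name `summarize_crio_journal` and the returned keys are hypothetical, and it assumes the journal text has already been saved to a list of lines (for example from `journalctl -u crio` on the affected node).

```python
import re

# Patterns copied from the log excerpt in this article.
ABRT_RE = re.compile(
    r"crio\.service: Main process exited, code=killed, status=6/ABRT"
)
BBOLT_RE = re.compile(r"go\.etcd\.io/bbolt\.\(\*freelist\)\.read")
COUNTER_RE = re.compile(
    r"crio\.service: Scheduled restart job, restart counter is at (\d+)\."
)


def summarize_crio_journal(lines):
    """Hypothetical helper: scan journal lines for the crash-loop markers."""
    abrt_exits = 0
    bbolt_freelist_panic = False
    restart_counter = None
    for line in lines:
        if ABRT_RE.search(line):
            abrt_exits += 1
        if BBOLT_RE.search(line):
            bbolt_freelist_panic = True
        match = COUNTER_RE.search(line)
        if match:
            # Keep the most recent counter value seen in the log.
            restart_counter = int(match.group(1))
    return {
        "abrt_exits": abrt_exits,
        "bbolt_freelist_panic": bbolt_freelist_panic,
        "restart_counter": restart_counter,
    }


# A few lines taken from the excerpt above, used here as sample input.
SAMPLE = [
    "May 24 07:22:01 odf-01.qa.ocp.example.com systemd[1]: crio.service: "
    "Main process exited, code=killed, status=6/ABRT",
    "   #8  0x000055f0c6ea0e4b github.com/cri-o/cri-o/vendor/"
    "go.etcd.io/bbolt.(*freelist).read (crio)",
    "May 24 07:22:01 odf-01.qa.ocp.example.com systemd[1]: crio.service: "
    "Scheduled restart job, restart counter is at 54727.",
]

if __name__ == "__main__":
    print(summarize_crio_journal(SAMPLE))
```

A non-zero `abrt_exits` together with `bbolt_freelist_panic` set to true suggests that the node matches the stack trace shown here. The check only matches text and does not inspect the database file itself.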
    

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4.9
  • Container runtime
    • CRI-O
