AMQ212037 AMQ222010: Critical IO Error, shutting down the server on OCP over Azure deployment

Solution Verified

Environment

  • Microsoft Azure Red Hat OpenShift 4.6.15
  • AMQ 7.8

Issue

  • Brokers restart repeatedly and log the following messages:
AMQ212037: Connection failure to /server:port has been detected: readAddress(..) failed: Connection reset by peer [code=GENERIC_EXCEPTION]  ...
WARN [org.apache.activemq.artemis.core.client] AMQ212037: Connection failure to /server:port has been detected: readAddress(..) failed: Connection reset by peer [code=GENERIC_EXCEPTION] ...
WARN [org.apache.activemq.artemis.core.client] AMQ212037: Connection failure to /server:port has been detected: readAddress(..) failed: Connection reset by peer [code=GENERIC_EXCEPTION] ...
WARN [org.apache.activemq.artemis.core.client] AMQ212037: Connection failure to /server:port has been detected: readAddress(..) failed: Connection reset by peer [code=GENERIC_EXCEPTION] ...
WARN [org.apache.activemq.artemis.core.server] AMQ222010: Critical IO Error, shutting down the server. file=NIOSequentialFile /opt/broker-amq/data/paging/file-name.page, message=Bad file descriptor: ActiveMQIOErrorException[errorType=IO_ERROR message=Bad file descriptor]
    at org.apache.activemq.artemis.core.io.nio.NIOSequentialFile.close(NIOSequentialFile.java:226) [artemis-journal-2.16.0.redhat-00012.jar:2.16.0.redhat-00012]
    at org.apache.activemq.artemis.core.paging.impl.Page.close(Page.java:511) [artemis-server-2.16.0.redhat-00012.jar:2.16.0.redhat-00012]
    at org.apache.activemq.artemis.core.paging.impl.Page.close(Page.java:490) [artemis-server-2.16.0.redhat-00012.jar:2.16.0.redhat-00012]
    at org.apache.activemq.artemis.core.paging.impl.PagingStoreImpl.openNewPage(PagingStoreImpl.java:1107) [artemis-server-2.16.0.redhat-00012.jar:2.16.0.redhat-00012]
    at org.apache.activemq.artemis.core.paging.impl.PagingStoreImpl.page(PagingStoreImpl.java:859) [artemis-server-2.16.0.redhat-00012.jar:2.16.0.redhat-00012]
    at org.apache.activemq.artemis.core.persistence.impl.journal.AbstractJournalStorageManager.addToPage(AbstractJournalStorageManager.java:2235) [artemis-server-2.16.0.redhat-00012.jar:2.16.0.redhat-00012]
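
To confirm that a broker pod shows this pattern, its log can be filtered for the two message IDs above. The following is a minimal sketch, not an official procedure: the function name count_amq_errors is hypothetical, <broker-pod> is a placeholder for the real pod name, and a logged-in oc session is assumed.

```shell
# Hypothetical helper: reads broker log text on stdin and prints how many
# lines carry each of the two message IDs described in this article.
count_amq_errors() {
  awk '
    /AMQ212037/ { reset++ }
    /AMQ222010/ { critical++ }
    END { printf "AMQ212037=%d AMQ222010=%d\n", reset, critical }
  '
}

# Usage against a live cluster (replace <broker-pod> with the real pod name):
#   oc logs <broker-pod> | count_amq_errors
#   oc logs --previous <broker-pod> | count_amq_errors   # log of the container before its last restart
```

A non-zero AMQ222010 count in the previous container's log suggests the restart was triggered by the critical IO error rather than by another cause.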

Resolution

  • Change the storage class of the broker's Persistent Volume to "managed-premium"
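
Before switching, it may help to confirm that the "managed-premium" storage class is actually defined in the cluster. A minimal sketch, assuming a logged-in oc session; the helper name has_storage_class is hypothetical.

```shell
# Hypothetical helper: reads `oc get storageclass -o name` output on stdin
# and succeeds only if the storage class named in $1 is present.
has_storage_class() {
  # Strip the "<resource>/" prefix, then look for an exact, literal match.
  sed 's|.*/||' | grep -qxF "$1"
}

# Usage against a live cluster:
#   if oc get storageclass -o name | has_storage_class managed-premium; then
#     echo "managed-premium is available"
#   else
#     echo "managed-premium is not defined in this cluster"
#   fi
```

Note that the storage class of an existing PVC cannot be edited in place, so switching classes generally means creating a new claim with the desired class.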

Root Cause

  • At the time of writing, NFS 4.1 support in Azure is still a technology preview, so the "managed-premium" storage class should be used for Persistent Volumes
  • The broker's Persistent Volume uses a storage class other than "managed-premium"

Diagnostic Steps

  • Check the storage class of the PVC. In an affected deployment it is something other than "managed-premium" ("azure-file" in this example):
$ oc get pvc
NAME                         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
broker-amq-broker-amq-ss-0   Bound    pvc-544e53ea-d2d2-42f0-9611-35eec486b4c4   5Gi        RWO            azure-file     17d
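
When a namespace holds many claims, the same check can be scripted. The sketch below is an illustration under assumptions: the function name flag_wrong_storage_class is hypothetical, and it expects two-column input (claim name, storage class) such as the custom-columns output shown in the usage comment.

```shell
# Hypothetical helper: reads "NAME STORAGECLASS" pairs on stdin and reports
# every PVC whose storage class differs from the expected one
# (default: managed-premium).
flag_wrong_storage_class() {
  expected="${1:-managed-premium}"
  awk -v want="$expected" \
    'NF >= 2 && $2 != want { print $1 ": " $2 " (expected " want ")" }'
}

# Usage against a live cluster (requires a logged-in oc session):
#   oc get pvc --no-headers \
#     -o custom-columns=NAME:.metadata.name,STORAGECLASS:.spec.storageClassName \
#     | flag_wrong_storage_class managed-premium
```

The function prints nothing when every claim already uses the expected class.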

