AMQ Streams crashes when deleting log
Issue
-
AMQ Streams crashed when deleting logs and unable to start cleanly. The Kafka pods ends with crash/loop backoff status, and raise the following exception in the log:
09:02:07.274 kafka 2019-12-11 09:02:07,274 INFO [ReplicaAlterLogDirsManager on broker 0] Removed fetcher for partitions __consumer_offsets-22,__consumer_offsets-30,bridge-alfa-0,__consumer_offsets-8,__consumer_offsets-21,bulki-feedback-store-0,__consumer_offsets-4,event-notification-0,bridge-ca-0,__consumer_offsets-27,bulki-event-log-0,__consumer_offsets-7,bulki-user-access-0,__consumer_offsets-9,__consumer_offsets-46,event-notification-2-0,__consumer_offsets-25,__consumer_offsets-35,feed-data-0,grants-redemption-replay-0,bulki-access-bulki-user-access-store-changelog-0,__consumer_offsets-41,__consumer_offsets-33,__consumer_offsets-23,__consumer_offsets-49,__consumer_offsets-47,__consumer_offsets-16,__consumer_offsets-28,bulki-access-updates-0,__consumer_offsets-31,__consumer_offsets-36,__consumer_offsets-42,__consumer_offsets-3,__consumer_offsets-18,__consumer_offsets-37,feeds-audit-jobs-0,__consumer_offsets-15,__consumer_offsets-24,message-stats-0,grants-settlement-replay-0,command-log-0,__consumer_offsets-38,__consumer_offsets-17,__consumer_offsets-48,bulki-account-bulki-user-account-store-changelog-0,bulki-account-updates-0,bulki-chunk-updates-3,__consumer_offsets-19,__consumer_offsets-11,asset-notification-0,rsam-live-replay-0,feeds-event-log-0,__consumer_offsets-13,__consumer_offsets-2,__consumer_offsets-43,__consumer_offsets-6,__consumer_offsets-14,bulki-chunk-updates-1,__consumer_offsets-20,__consumer_offsets-0,__consumer_offsets-44,__consumer_offsets-39,bulki-file-data-0,__consumer_offsets-12,__consumer_offsets-45,__consumer_offsets-1,__consumer_offsets-5,__consumer_offsets-26,__consumer_offsets-29,__consumer_offsets-34,__consumer_offsets-10,__consumer_offsets-32,__consumer_offsets-40,bulki-chunk-updates-2 (kafka.server.ReplicaAlterLogDirsManager) [LogDirFailureHandler] 09:02:07.269 kafka at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262) 09:02:07.269 kafka at java.nio.file.Files.move(Files.java:1395) 09:02:07.269 kafka at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:786) 09:02:07.269 kafka at org.apache.kafka.common.record.FileRecords.renameTo(FileRecords.java:211) 09:02:07.269 kafka at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:488) 09:02:07.269 kafka at kafka.log.Log.$anonfun$replaceSegments$6(Log.scala:1814) 09:02:07.269 kafka at kafka.log.Log.$anonfun$replaceSegments$6$adapted(Log.scala:1814) 09:02:07.269 kafka at scala.collection.immutable.List.foreach(List.scala:389) 09:02:07.269 kafka at kafka.log.Log.replaceSegments(Log.scala:1814) 09:02:07.269 kafka at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:533) 09:02:07.269 kafka at kafka.log.Cleaner.$anonfun$doClean$6(LogCleaner.scala:465) 09:02:07.269 kafka at kafka.log.Cleaner.$anonfun$doClean$6$adapted(LogCleaner.scala:464) 09:02:07.269 kafka at scala.collection.immutable.List.foreach(List.scala:389) 09:02:07.269 kafka at kafka.log.Cleaner.doClean(LogCleaner.scala:464) 09:02:07.269 kafka at kafka.log.Cleaner.clean(LogCleaner.scala:442) 09:02:07.269 kafka at kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:303) 09:02:07.269 kafka at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:289) 09:02:07.269 kafka at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82) 09:02:07.269 kafka Suppressed: java.nio.file.FileAlreadyExistsException: /var/lib/kafka/kafka-log0/__consumer_offsets-16/00000000000000000000.log.swap -> /var/lib/kafka/kafka-log0/__consumer_offsets-16/00000000000000000000.log 09:02:07.269 kafka at sun.nio.fs.UnixException.translateToIOException(UnixException.java:88) 09:02:07.269 kafka at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) 09:02:07.269 kafka at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:396) 09:02:07.269 kafka at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262) 09:02:07.269 kafka at java.nio.file.Files.move(Files.java:1395) 09:02:07.269 kafka at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:783) 09:02:07.269 kafka ... 15 more 09:02:07.269 kafka 2019-12-11 09:02:07,269 INFO [ReplicaManager broker=0] Stopping serving replicas in dir /var/lib/kafka/kafka-log0 (kafka.server.ReplicaManager) [LogDirFailureHandler] 09:02:07.268 kafka java.nio.file.FileAlreadyExistsException: /var/lib/kafka/kafka-log0/__consumer_offsets-16/00000000000000000000.log.swap -> /var/lib/kafka/kafka-log0/__consumer_offsets-16/00000000000000000000.log 09:02:07.268 kafka at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) 09:02:07.268 kafka 2019-12-11 09:02:07,268 ERROR Failed to clean up log for __consumer_offsets-16 in dir /var/lib/kafka/kafka-log0 due to IOException (kafka.server.LogDirFailureChannel) [kafka-log-cleaner-thread-0] 09:02:07.268 kafka at sun.nio.fs.UnixException.translateToIOException(UnixException.java:88) 09:02:07.268 kafka at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:457) 09:02:07.095 kafka 2019-12-11 09:02:07,095 INFO Cleaner 0: Swapping in cleaned segment LogSegment(baseOffset=0, size=1612) for segment(s) List(LogSegment(baseOffset=0, size=1619), LogSegment(baseOffset=715905, size=16049337)) in log Log(/var/lib/kafka/kafka-log0/__consumer_offsets-16) (kafka.log.LogCleaner) [kafka-log-cleaner-thread-0] 09:02:06.986 kafka 2019-12-11 09:02:06,985 INFO Cleaner 0: Cleaning segment 0 in log __consumer_offsets-16 (largest timestamp Wed Dec 04 09:01:45 UTC 2019) into 0, retaining deletes. (kafka.log.LogCleaner) [kafka-log-cleaner-thread-0] 09:02:06.986 kafka 2019-12-11 09:02:06,986 INFO Cleaner 0: Cleaning segment 715905 in log __consumer_offsets-16 (largest timestamp Wed Dec 11 09:01:48 UTC 2019) into 0, retaining deletes. (kafka.log.LogCleaner) [kafka-log-cleaner-thread-0] 09:02:06.958 kafka 2019-12-11 09:02:06,958 INFO Cleaner 0: Cleaning log __consumer_offsets-16 (cleaning prior to Wed Dec 11 09:01:48 UTC 2019, discarding tombstones prior to Tue Dec 03 09:01:45 UTC 2019)... (kafka.log.LogCleaner) [kafka-log-cleaner-thread-0] 09:02:06.957 kafka 2019-12-11 09:02:06,957 INFO Cleaner 0: Offset map for log __consumer_offsets-16 complete. (kafka.log.LogCleaner) [kafka-log-cleaner-thread-0] 09:02:06.909 kafka 2019-12-11T09:02:06.865+0000: 57766.127: [GC pause (G1 Evacuation Pause) (young), 0.0431825 secs]
Environment
- AMQ Streams 1.0.0
- OpenShift Container 3.11
- Storage: Gluster File Storage
Subscriber exclusive content
A Red Hat subscription provides unlimited access to our knowledgebase, tools, and much more.