System crashed while performing LPAR memory reconfiguration

Solution In Progress - Updated -

Issue

  • While performing an LPAR memory reconfiguration, the host crashed with the following kernel oops:
[19454848.744890] Offlined Pages 4096
[19454848.803812] Offlined Pages 4096
[19454848.808600] Offlined Pages 4096
[19454848.809205] Unable to handle kernel paging request for data at address 0x00008398
[19454848.809211] Faulting instruction address: 0xc000000000367710
[19454848.809217] Oops: Kernel access of bad area, sig: 11 [#1]
[19454848.809220] SMP NR_CPUS=2048 NUMA pSeries
[19454848.809227] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache rpadlpar_io rpaphp tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag sg pseries_rng binfmt_misc ip_tables ext4 mbcache jbd2 dm_service_time sd_mod sr_mod crc_t10dif crct10dif_generic cdrom crct10dif_common ibmvfc ibmvscsi scsi_transport_fc scsi_transport_srp ibmveth scsi_tgt dm_multipath dm_mirror dm_region_hash dm_log dm_mod
[19454848.809280] CPU: 3 PID: 55842 Comm: kworker/u448:3 Kdump: loaded Tainted: G    B   W      ------------   3.10.0-1160.36.2.el7.ppc64le #1
[19454848.809290] Workqueue: pseries hotplug workque pseries_hp_work_fn
[19454848.809295] task: c000000fe6809a00 ti: c000000fe4cec000 task.ti: c000000fe4cec000
[19454848.809300] NIP: c000000000367710 LR: c00000000039506c CTR: 000000000000a6a4
[19454848.809304] REGS: c000000fe4cef740 TRAP: 0300   Tainted: G    B   W      ------------    (3.10.0-1160.36.2.el7.ppc64le)
[19454848.809309] MSR: 8000000100009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 280f8d84  XER: 20000000
[19454848.809322] CFAR: c0000000000093ec DAR: 0000000000008398 DSISR: 40000000 SOFTE: 1
GPR00: c00000000039506c c000000fe4cef9c0 c00000000148c000 0000000000000007
GPR04: 0000000000000800 c000000fe4cef880 ffffffff00000000 ffffffffffffffff
GPR08: c00000000183c000 c0000000016379d8 0000000000010000 0000000000000040
GPR12: 0000000000000000 c000000007ad1c80 c00000000013a5a8 c000000fe7cefb40
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 00000000003fffff 0000000000000001 c000001fe8546000
GPR24: 0000000000000007 c0000000013a1bd0 c0000000013a1c08 0000001010000000
GPR28: 0000000010000000 c000000001861a80 0000000000000000 0000000000000007
[19454848.809384] NIP [c000000000367710] try_offline_node+0x40/0x210
[19454848.809390] LR [c00000000039506c] remove_memory+0x1bc/0x280
[19454848.809394] Call Trace:
[19454848.809399] [c000000fe4cef9c0] [c000000fe4cefa00] 0xc000000fe4cefa00 (unreliable)
[19454848.809405] [c000000fe4cefa00] [c00000000039506c] remove_memory+0x1bc/0x280
[19454848.809411] [c000000fe4cefab0] [c0000000000ba57c] dlpar_remove_lmb+0x1cc/0x230
[19454848.809417] [c000000fe4cefb00] [c0000000000bb1a0] dlpar_memory+0x760/0xe40
[19454848.809422] [c000000fe4cefbc0] [c0000000000b11e8] pseries_hp_work_fn+0x78/0x220
[19454848.809428] [c000000fe4cefc40] [c00000000012d71c] process_one_work+0x1dc/0x680
[19454848.809434] [c000000fe4cefce0] [c00000000012dd60] worker_thread+0x1a0/0x520
[19454848.809439] [c000000fe4cefd80] [c00000000013a694] kthread+0xf4/0x100
[19454848.809445] [c000000fe4cefe30] [c00000000000a62c] ret_from_kernel_thread+0x5c/0x70
[19454848.809449] Instruction dump:
[19454848.809453] 7c0802a6 fba1ffe8 fbc1fff0 fbe1fff8 f8010010 f821ffc1 786a1f24 7c7f1b78
[19454848.809462] 3d22001b 3929b9d8 7fc9502a 3d5e0001 <e92a8398> e8aa83a8 7ca92a14 7fa92840
[19454848.809476] ---[ end trace 1acaf4ccbaeeed95 ]---
[19454848.816836]
[19454848.816858] Sending IPI to other CPUs
[19454848.817950] IPI complete
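
As a reading aid, the fault address in the oops can be cross-checked against the instruction dump and register values. The sketch below decodes the two instruction words at the end of the dump (the faulting word is the one in angle brackets) and recomputes the effective address from GPR30 as printed above. The helper names `sign_extend` and `decode_d_form` are hypothetical and written only for this illustration. The result is consistent with a load through a zero base register, which would suggest (but does not by itself confirm) a NULL pointer dereference in try_offline_node.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def decode_d_form(word):
    """Split a 32-bit D/DS-form PowerPC instruction word into its fields."""
    return {
        "opcode": word >> 26,
        "rt": (word >> 21) & 0x1F,
        "ra": (word >> 16) & 0x1F,
        "imm": word & 0xFFFF,
    }


# Values copied from the oops above.
gpr30 = 0x0000000000000000          # GPR30 from the register dump
dar = 0x0000000000008398            # DAR (faulting data address)
addis = decode_d_form(0x3D5E0001)   # word just before the faulting one
load = decode_d_form(0xE92A8398)    # faulting word, shown as <e92a8398>

# Primary opcode 15 is addis: r10 = r30 + (imm << 16).
assert addis["opcode"] == 15 and addis["rt"] == 10 and addis["ra"] == 30
r10 = gpr30 + (sign_extend(addis["imm"], 16) << 16)

# Primary opcode 58 with the low two bits clear is ld: r9 = *(r10 + disp).
assert load["opcode"] == 58 and load["imm"] & 0x3 == 0
assert load["rt"] == 9 and load["ra"] == 10
disp = sign_extend(load["imm"] & ~0x3, 16)

effective_address = (r10 + disp) & 0xFFFFFFFFFFFFFFFF

print(f"r10 = {r10:#x}, disp = {disp:#x}, EA = {effective_address:#x}")
print("matches DAR" if effective_address == dar else "does not match DAR")
```

The computed r10 also agrees with the GPR10 value in the register dump (0x10000), so the decoded sequence and the logged registers are mutually consistent.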

Environment

  • Red Hat Enterprise Linux 7.9
  • PPC64LE
