RHEL 5.8 Occasional panic during tape write (backups)
Issue
- A server periodically crashes while running a backup to an FC-attached tape drive.
- Something called by st_do_scsi overwrites the return instruction pointer on the kernel stack with a bad value (000000000000005f).
- The bad value is the same in all dumps.
- Because the function returns through a corrupted return instruction pointer, the code that was executing at the time of the crash cannot be determined.
- The problem occurs about once a month.
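As a rough illustration of why 000000000000005f cannot be a valid return address for kernel code, the sketch below checks a value against the kernel half of the x86_64 address space. It assumes the usual 48-bit canonical layout, where kernel addresses start at 0xffff800000000000; the function and constant names are hypothetical and are not part of any crash analysis tool.

```python
# Minimal sketch: classify an address taken from a crash dump.
# Assumption: standard 48-bit canonical x86_64 layout, where the kernel
# half of the address space begins at 0xffff800000000000.
KERNEL_HALF_START = 0xFFFF800000000000
MAX_ADDRESS = 0xFFFFFFFFFFFFFFFF


def looks_like_kernel_address(addr: int) -> bool:
    """Return True if addr falls in the kernel half of the address space."""
    return KERNEL_HALF_START <= addr <= MAX_ADDRESS


# The bad return instruction pointer reported in the dumps.
bad_return_ip = int("000000000000005f", 16)

if __name__ == "__main__":
    verdict = looks_like_kernel_address(bad_return_ip)
    print(f"{bad_return_ip:#018x} plausible kernel address: {verdict}")
```

A value this small lies in the low (user) half of the address space, so a kernel function returning to it faults immediately.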
Environment
- Red Hat Enterprise Linux 5 Update 8
- x86_64 RHEL 5.8 kernel version 2.6.18-308.8.2.el5
- The customer has a systemtap script loaded (iostat_60s)
- Dell Inc./PowerEdge R715
- The tape drives are:
Vendor: HP Model: Ultrium 5-SCSI Rev: I57Z
Type: Sequential-Access ANSI SCSI revision: 06
Vendor: HP Model: Ultrium 5-SCSI Rev: I57Z
Type: Sequential-Access ANSI SCSI revision: 06
Vendor: HP Model: Ultrium 5-SCSI Rev: I57Z
Type: Sequential-Access ANSI SCSI revision: 06
Vendor: HP Model: Ultrium 5-SCSI Rev: I57Z
Type: Sequential-Access ANSI SCSI revision: 06
Vendor: HP Model: Ultrium 5-SCSI Rev: I57Z
Type: Sequential-Access ANSI SCSI revision: 06
Vendor: HP Model: Ultrium 5-SCSI Rev: I57Z
Type: Sequential-Access ANSI SCSI revision: 06
Vendor: ADIC Model: Scalar i6000 Rev: 615Q
Type: Medium Changer ANSI SCSI revision: 03
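The device listing above can be summarized with a short script. This is only a sketch: it assumes the two-line "Vendor/Model/Rev" and "Type/ANSI SCSI revision" layout shown here, and the names parse_devices, DRIVE, and CHANGER are hypothetical helpers introduced for illustration.

```python
import re
from collections import Counter

# Sample input rebuilt from the listing in this article:
# six identical tape drives followed by one medium changer.
DRIVE = (
    "Vendor: HP Model: Ultrium 5-SCSI Rev: I57Z\n"
    "Type: Sequential-Access ANSI SCSI revision: 06\n"
)
CHANGER = (
    "Vendor: ADIC Model: Scalar i6000 Rev: 615Q\n"
    "Type: Medium Changer ANSI SCSI revision: 03\n"
)
LISTING = DRIVE * 6 + CHANGER

VENDOR_RE = re.compile(r"Vendor:\s*(.+?)\s+Model:\s*(.+?)\s+Rev:\s*(\S+)")
TYPE_RE = re.compile(r"Type:\s*(.+?)\s+ANSI SCSI revision:\s*(\S+)")


def parse_devices(text):
    """Pair each Vendor line with the Type line that follows it."""
    devices = []
    current = None
    for line in text.splitlines():
        vendor_match = VENDOR_RE.search(line)
        if vendor_match:
            current = {
                "vendor": vendor_match.group(1),
                "model": vendor_match.group(2),
                "rev": vendor_match.group(3),
            }
            devices.append(current)
            continue
        type_match = TYPE_RE.search(line)
        if type_match and current is not None:
            current["type"] = type_match.group(1)
            current["scsi_rev"] = type_match.group(2)
    return devices


devices = parse_devices(LISTING)
# Count devices by (vendor, model, type).
counts = Counter((d["vendor"], d["model"], d["type"]) for d in devices)

if __name__ == "__main__":
    for (vendor, model, dev_type), n in counts.items():
        print(f"{n} x {vendor} {model} ({dev_type})")
```

Grouping by vendor, model, and type makes it easy to confirm that all six tape drives report the same model and firmware revision.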
