System hangs due to blocked tasks of IBM InfoSphere DataStage Engine

Solution Verified - Updated -

Issue

  • High utilization by the jbd2/dm-11-8 kernel thread; sosreport cannot be run
  • osh processes of IBM InfoSphere DataStage Engine parallel jobs enter the uninterruptible "D" state

  • When the VM guest is migrated with vMotion to a different hypervisor, the issue goes away for a few hours and then recurs: the osh tasks (of IBM InfoSphere DataStage Engine) enter the uninterruptible D state, and the system hangs and becomes unresponsive.

crash> ps -m | grep UN 
[0 00:00:00.012] [UN]  PID: 23191  TASK: ffff88057b2bb520  CPU: 2   COMMAND: "chgrp"
[0 00:00:00.013] [UN]  PID: 2106   TASK: ffff8809289a2040  CPU: 1   COMMAND: "jbd2/dm-12-8"
[0 00:00:00.013] [UN]  PID: 30054  TASK: ffff88092bf7cab0  CPU: 1   COMMAND: "flush-253:12"
[0 00:00:04.164] [UN]  PID: 17412  TASK: ffff88089e7c4040  CPU: 1   COMMAND: "osh"
[0 00:00:04.696] [UN]  PID: 20503  TASK: ffff8808771a0040  CPU: 1   COMMAND: "osh"
[0 00:00:04.743] [UN]  PID: 20630  TASK: ffff880825b68ab0  CPU: 1   COMMAND: "osh"
[0 00:00:22.565] [UN]  PID: 27524  TASK: ffff88092cc5b520  CPU: 3   COMMAND: "osh"
[0 00:00:26.847] [UN]  PID: 27443  TASK: ffff88091e97a040  CPU: 2   COMMAND: "osh"
[0 00:00:27.753] [UN]  PID: 20031  TASK: ffff8804c194c040  CPU: 2   COMMAND: "osh"
[0 00:00:34.031] [UN]  PID: 20421  TASK: ffff8805f6bb9520  CPU: 0   COMMAND: "osh"
[0 00:00:35.467] [UN]  PID: 18698  TASK: ffff88011e6de040  CPU: 3   COMMAND: "osh"
[0 00:00:39.865] [UN]  PID: 19404  TASK: ffff88071947b520  CPU: 0   COMMAND: "osh"
[0 00:00:46.310] [UN]  PID: 20336  TASK: ffff880731ae3520  CPU: 1   COMMAND: "osh"
[0 00:00:50.430] [UN]  PID: 17656  TASK: ffff88011e6df520  CPU: 1   COMMAND: "osh"
[0 00:00:54.730] [UN]  PID: 13791  TASK: ffff880737d5b520  CPU: 1   COMMAND: "osh"
[0 00:01:09.968] [UN]  PID: 1137   TASK: ffff8808872e1520  CPU: 1   COMMAND: "osh"
[0 00:01:18.562] [UN]  PID: 26692  TASK: ffff880102becab0  CPU: 3   COMMAND: "osh"
[0 00:01:30.395] [UN]  PID: 22222  TASK: ffff880923f64040  CPU: 1   COMMAND: "osh"
[0 00:01:33.840] [UN]  PID: 12621  TASK: ffff880146eeeab0  CPU: 2   COMMAND: "osh"
[0 00:01:36.298] [UN]  PID: 11269  TASK: ffff8805f6a2a040  CPU: 1   COMMAND: "osh"
[0 00:01:41.918] [UN]  PID: 20785  TASK: ffff88091cb1eab0  CPU: 0   COMMAND: "osh"
[0 00:01:53.106] [UN]  PID: 24394  TASK: ffff8808a4696ab0  CPU: 2   COMMAND: "osh"
[0 00:01:58.707] [UN]  PID: 18383  TASK: ffff88092e9c0040  CPU: 2   COMMAND: "osh"
crash> 
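
A listing like the one above can be summarized per command to see which tasks are blocked and for how long. The following is a minimal sketch, not part of the original solution. It assumes the bracketed field is the days hh:mm:ss.ms value that crash's ps -m prints for the time since the task last ran, and that the output has been saved as text. The names summarize_blocked and SAMPLE are hypothetical; the sample lines are copied from the output above.

```python
import re
from collections import defaultdict

# Matches one line of crash "ps -m" output, for example:
# [0 00:00:04.164] [UN]  PID: 17412  TASK: ffff88089e7c4040  CPU: 1   COMMAND: "osh"
LINE_RE = re.compile(
    r'\[\s*(\d+) (\d+):(\d+):(\d+)\.(\d+)\]\s+\[(\w+)\]\s+PID:\s*(\d+)\s+'
    r'TASK:\s*([0-9a-f]+)\s+CPU:\s*(\d+)\s+COMMAND:\s*"([^"]*)"'
)


def summarize_blocked(text):
    """Return {command: (count, longest_seconds)} for tasks in the UN state."""
    summary = defaultdict(lambda: [0, 0.0])
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if not m or m.group(6) != "UN":
            continue  # skip prompts, blank lines, and tasks in other states
        days, hours, mins, secs, msecs = (int(m.group(i)) for i in range(1, 6))
        # Assumes the fractional part is always three digits (milliseconds).
        elapsed = ((days * 24 + hours) * 60 + mins) * 60 + secs + msecs / 1000.0
        entry = summary[m.group(10)]
        entry[0] += 1
        entry[1] = max(entry[1], elapsed)
    return {cmd: (count, longest) for cmd, (count, longest) in summary.items()}


# Three lines taken from the listing above, used only as sample input.
SAMPLE = '''\
[0 00:00:00.013] [UN]  PID: 2106   TASK: ffff8809289a2040  CPU: 1   COMMAND: "jbd2/dm-12-8"
[0 00:00:04.164] [UN]  PID: 17412  TASK: ffff88089e7c4040  CPU: 1   COMMAND: "osh"
[0 00:01:58.707] [UN]  PID: 18383  TASK: ffff88092e9c0040  CPU: 2   COMMAND: "osh"
'''

for command, (count, longest) in sorted(summarize_blocked(SAMPLE).items()):
    print(f"{command}: {count} blocked task(s), longest {longest:.3f} s")
```

Grouping by command makes it easier to see at a glance that the osh tasks dominate the blocked set alongside the jbd2 journal thread and the flush thread for the same device-mapper volume.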

Environment

  • Red Hat Enterprise Linux 6.7
  • VMware
  • IBM InfoSphere DataStage software (high-performance parallel ETL job processing engine)
