System hung due to blocked tasks of IBM InfoSphere DataStage Engine

Solution Verified - Updated -

Issue

  • High utilization by jbd2/dm-11-8; sosreport cannot be run
  • osh processes of IBM InfoSphere DataStage Engine parallel jobs enter uninterruptible sleep ("D" state)

  • When the VM guest is migrated (vMotion) to a different hypervisor, the issue goes away for a few hours. It recurs when the osh tasks (of IBM InfoSphere DataStage Engine) enter uninterruptible (D) state, after which the system hangs and becomes unresponsive.

crash> ps -m | grep UN 
[0 00:00:00.012] [UN]  PID: 23191  TASK: ffff88057b2bb520  CPU: 2   COMMAND: "chgrp"
[0 00:00:00.013] [UN]  PID: 2106   TASK: ffff8809289a2040  CPU: 1   COMMAND: "jbd2/dm-12-8"
[0 00:00:00.013] [UN]  PID: 30054  TASK: ffff88092bf7cab0  CPU: 1   COMMAND: "flush-253:12"
[0 00:00:04.164] [UN]  PID: 17412  TASK: ffff88089e7c4040  CPU: 1   COMMAND: "osh"
[0 00:00:04.696] [UN]  PID: 20503  TASK: ffff8808771a0040  CPU: 1   COMMAND: "osh"
[0 00:00:04.743] [UN]  PID: 20630  TASK: ffff880825b68ab0  CPU: 1   COMMAND: "osh"
[0 00:00:22.565] [UN]  PID: 27524  TASK: ffff88092cc5b520  CPU: 3   COMMAND: "osh"
[0 00:00:26.847] [UN]  PID: 27443  TASK: ffff88091e97a040  CPU: 2   COMMAND: "osh"
[0 00:00:27.753] [UN]  PID: 20031  TASK: ffff8804c194c040  CPU: 2   COMMAND: "osh"
[0 00:00:34.031] [UN]  PID: 20421  TASK: ffff8805f6bb9520  CPU: 0   COMMAND: "osh"
[0 00:00:35.467] [UN]  PID: 18698  TASK: ffff88011e6de040  CPU: 3   COMMAND: "osh"
[0 00:00:39.865] [UN]  PID: 19404  TASK: ffff88071947b520  CPU: 0   COMMAND: "osh"
[0 00:00:46.310] [UN]  PID: 20336  TASK: ffff880731ae3520  CPU: 1   COMMAND: "osh"
[0 00:00:50.430] [UN]  PID: 17656  TASK: ffff88011e6df520  CPU: 1   COMMAND: "osh"
[0 00:00:54.730] [UN]  PID: 13791  TASK: ffff880737d5b520  CPU: 1   COMMAND: "osh"
[0 00:01:09.968] [UN]  PID: 1137   TASK: ffff8808872e1520  CPU: 1   COMMAND: "osh"
[0 00:01:18.562] [UN]  PID: 26692  TASK: ffff880102becab0  CPU: 3   COMMAND: "osh"
[0 00:01:30.395] [UN]  PID: 22222  TASK: ffff880923f64040  CPU: 1   COMMAND: "osh"
[0 00:01:33.840] [UN]  PID: 12621  TASK: ffff880146eeeab0  CPU: 2   COMMAND: "osh"
[0 00:01:36.298] [UN]  PID: 11269  TASK: ffff8805f6a2a040  CPU: 1   COMMAND: "osh"
[0 00:01:41.918] [UN]  PID: 20785  TASK: ffff88091cb1eab0  CPU: 0   COMMAND: "osh"
[0 00:01:53.106] [UN]  PID: 24394  TASK: ffff8808a4696ab0  CPU: 2   COMMAND: "osh"
[0 00:01:58.707] [UN]  PID: 18383  TASK: ffff88092e9c0040  CPU: 2   COMMAND: "osh"
crash> 
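
When a vmcore contains many blocked tasks, a listing like the one above can be summarized with a short script. The following is a minimal sketch in Python and is not part of the original solution. The names `parse_ps_m` and `summarize` are hypothetical, and the sketch assumes the bracketed value printed by `ps -m` is days, then hours:minutes:seconds.milliseconds since the task last ran on a CPU. The sample input is copied from the output above.

```python
import re
from collections import Counter

# One line of `crash> ps -m` output looks like:
# [0 00:00:04.164] [UN]  PID: 17412  TASK: ffff88089e7c4040  CPU: 1   COMMAND: "osh"
LINE_RE = re.compile(
    r'\[(\d+) (\d+):(\d+):(\d+)\.(\d+)\]\s+\[(\w+)\]\s+PID:\s*(\d+)\s+'
    r'TASK:\s*([0-9a-f]+)\s+CPU:\s*(\d+)\s+COMMAND:\s*"([^"]*)"'
)

def parse_ps_m(text):
    """Parse `ps -m` lines into dicts; lines that do not match are skipped."""
    tasks = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue
        days, hh, mm, ss, ms = (int(m.group(i)) for i in range(1, 6))
        # Assumption: the field is days hh:mm:ss.milliseconds since last run.
        idle = ((days * 24 + hh) * 60 + mm) * 60 + ss + ms / 1000.0
        tasks.append({
            "idle_seconds": idle,
            "state": m.group(6),
            "pid": int(m.group(7)),
            "task": m.group(8),
            "cpu": int(m.group(9)),
            "command": m.group(10),
        })
    return tasks

def summarize(tasks, state="UN"):
    """Return (per-command counts, longest-waiting task) for the given state."""
    blocked = [t for t in tasks if t["state"] == state]
    counts = Counter(t["command"] for t in blocked)
    longest = max(blocked, key=lambda t: t["idle_seconds"], default=None)
    return counts, longest

# A few lines taken from the crash output in this article.
SAMPLE = '''
[0 00:00:00.012] [UN]  PID: 23191  TASK: ffff88057b2bb520  CPU: 2   COMMAND: "chgrp"
[0 00:00:00.013] [UN]  PID: 2106   TASK: ffff8809289a2040  CPU: 1   COMMAND: "jbd2/dm-12-8"
[0 00:00:04.164] [UN]  PID: 17412  TASK: ffff88089e7c4040  CPU: 1   COMMAND: "osh"
[0 00:01:58.707] [UN]  PID: 18383  TASK: ffff88092e9c0040  CPU: 2   COMMAND: "osh"
'''

if __name__ == "__main__":
    counts, longest = summarize(parse_ps_m(SAMPLE))
    for command, n in counts.most_common():
        print(f"{n:3d}  {command}")
    if longest is not None:
        print(f'longest wait: PID {longest["pid"]} ({longest["command"]}), '
              f'{longest["idle_seconds"]:.3f} s')
```

To use it on a full listing, save the `ps -m` output from the crash session to a file and pass the file contents to `parse_ps_m` in place of `SAMPLE`.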

Environment

  • Red Hat Enterprise Linux 6.7
  • VMware
  • IBM InfoSphere DataStage software (high-performance parallel ETL job processing engine)
