vLLM Inference Server Hangs on A+ Server 4125GS-TNRT System with NVIDIA L40 GPUs during RHEL AI Certification

Solution In Progress

Issue

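The vLLM inference server hangs during initialization. The system journal shows repeated AMD-Vi IO_PAGE_FAULT events for the NVIDIA GPU at PCI address 0000:c3:00.0, followed by a shared memory broadcast timeout from the vLLM engine core: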

Jan 22 03:29:05 localhost.localdomain kernel: AMD-Vi: IOMMU Event log restarting
Jan 22 03:29:05 localhost.localdomain kernel: nvidia 0000:c3:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0x1a000000000 flags=0x0030]
Jan 22 03:29:05 localhost.localdomain kernel: nvidia 0000:c3:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0xb6139070 flags=0x0020]
Jan 22 03:29:06 localhost.localdomain kernel: nvidia 0000:c3:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0xb6139070 flags=0x0020]
Jan 22 03:29:06 localhost.localdomain kernel: nvidia 0000:c3:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0x1a000000000 flags=0x0030]
Jan 22 03:29:06 localhost.localdomain kernel: nvidia 0000:c3:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0xb6139068 flags=0x0020]
Jan 22 03:29:06 localhost.localdomain kernel: nvidia 0000:c3:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0x1a000000000 flags=0x0030]
Jan 22 03:29:06 localhost.localdomain kernel: nvidia 0000:c3:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0xb6139068 flags=0x0020]
Jan 22 03:29:06 localhost.localdomain kernel: nvidia 0000:c3:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0xb6139070 flags=0x0020]
Jan 22 03:29:06 localhost.localdomain kernel: nvidia 0000:c3:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0x1a000000000 flags=0x0030]
Jan 22 03:29:06 localhost.localdomain kernel: nvidia 0000:c3:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0012 address=0xb6139068 flags=0x0020]
Jan 22 03:29:45 localhost.localdomain rhaiis[65872]: (EngineCore_DP0 pid=168) INFO 01-22 03:29:45 [shm_broadcast.py:466] No available shared memory broadcast block found in 60 seconds. This typically happens when some processes are hanging or doing some time-consuming work (e.g. compilation).
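To check whether a system shows the same pattern, the IO_PAGE_FAULT events can be grouped by device, address, and flags. The following is a minimal Python sketch, not part of the original solution. The regular expression is derived only from the sample lines above and may need adjusting for other kernel versions; the function name summarize_faults and the SAMPLE list are hypothetical names introduced for illustration.

```python
import re
from collections import Counter

# Field layout assumed from the AMD-Vi sample lines in this article only.
FAULT_RE = re.compile(
    r"nvidia (?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"AMD-Vi: Event logged \[IO_PAGE_FAULT "
    r"domain=(?P<domain>0x[0-9a-f]+) "
    r"address=(?P<address>0x[0-9a-f]+) "
    r"flags=(?P<flags>0x[0-9a-f]+)\]"
)


def summarize_faults(lines):
    """Count IO_PAGE_FAULT events per (PCI device, address, flags)."""
    counts = Counter()
    for line in lines:
        match = FAULT_RE.search(line)
        if match:
            key = (match["bdf"], match["address"], match["flags"])
            counts[key] += 1
    return counts


# Three lines copied from the log excerpt above (hostname prefix shortened).
SAMPLE = [
    "kernel: nvidia 0000:c3:00.0: AMD-Vi: Event logged "
    "[IO_PAGE_FAULT domain=0x0012 address=0x1a000000000 flags=0x0030]",
    "kernel: nvidia 0000:c3:00.0: AMD-Vi: Event logged "
    "[IO_PAGE_FAULT domain=0x0012 address=0xb6139070 flags=0x0020]",
    "kernel: nvidia 0000:c3:00.0: AMD-Vi: Event logged "
    "[IO_PAGE_FAULT domain=0x0012 address=0xb6139070 flags=0x0020]",
]

if __name__ == "__main__":
    # Print the most frequent faulting addresses first.
    for (bdf, address, flags), n in summarize_faults(SAMPLE).most_common():
        print(f"{bdf} address={address} flags={flags} count={n}")
```

To run this against a real system, pass the lines of the kernel journal (for example, the saved output of journalctl -k) to summarize_faults instead of SAMPLE. A small set of addresses repeating on a single GPU, as in the excerpt above, matches the symptom described in this article.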

Environment

  • Red Hat Enterprise Linux AI 3.0
  • AMD EPYC processors with NVIDIA L40 GPUs
  • Super Micro Computer, Inc. A+ Server 4125GS-TNRT
