After installing NVIDIA and CUDA drivers on RHEL 7.6, the boot process is very slow

Hello Everyone,

It appears something got broken in the process of installing the NVIDIA drivers on RHEL 7.6, and I'm wondering what I can do to investigate or improve this. The last time I installed the drivers, on CentOS 7.5, they seemed fine, but I switched to RHEL 7.6 yesterday.

Responses

If I install the NVIDIA driver as root, I see this error: "An incomplete installation of libglvnd was found. All of the essential libglvnd libraries are present, but one or more optional components are missing. Do you want to install a full copy of libglvnd? This will overwrite any existing libglvnd libraries."

It doesn't matter whether I choose to install and overwrite, though. If the overwrite had succeeded, I would expect this message to disappear when I run the installer again, but it always shows up.

Hi Andrew,

You need to install libglvnd-devel before you install the NVIDIA drivers ... :)
Running sudo yum install libglvnd-devel first prevents the installer from overwriting the existing libglvnd libraries.

Regards,
Christian
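
A minimal sketch of the "install the package first" step, assuming an RPM-based system with yum and root privileges. The helper name ensure_pkg is made up for illustration; it only wraps rpm -q and yum install, so the package is installed once and skipped on later runs.

```shell
#!/bin/sh
# Hypothetical helper (not part of any NVIDIA or Red Hat tooling):
# install a package with yum only if rpm does not already report it.
ensure_pkg() {
    pkg=$1
    if rpm -q "$pkg" >/dev/null 2>&1; then
        echo "$pkg already installed"
    else
        echo "installing $pkg"
        yum -y install "$pkg"
    fi
}
```

Calling ensure_pkg libglvnd-devel as root before starting the NVIDIA installer would cover the step described above.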

Thanks Christian, I'll give that a shot!

You're welcome, Andrew! :) It should work, I've done that several times. Please let us know how it goes.

Regards,
Christian

This round (after reimaging to the pre-CUDA state), I was able to build and test the CUDA examples after installation. Good sign. However, when I reboot and try to log in, it displays a console for a moment and then asks me to log in again.

I followed these steps, with alterations: https://access.redhat.com/solutions/1453633

I ran these commands instead of the wget instructions to get the EPEL release, and also installed libglvnd-devel to stop the NVIDIA driver complaining about it.

sudo yum install epel-release
sudo yum install libglvnd-devel

When installing the CUDA driver, I did not allow it to replace my X config. I also said yes to creating the symlinks. I installed the NVIDIA driver with CUDA.

After installing the CUDA driver, I ran nvidia-smi straight away to initialise the device files under /dev/nvidia*.

Also, on the instruction point below, it's not specified that the version placeholder should be replaced with the actual version (e.g. 10). Is it possible to omit the version altogether, though, since the CUDA install creates symlinks from /usr/local/cuda/ to /usr/local/cuda-10/? That's what I did:

export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

I added these changes to the /etc/environment file to make them permanent.

I tried another iteration with some success. This time I did allow the CUDA installer to update the X config. I also decided not to touch any environment variables, since I suspected they were causing problems, and now I can boot with the NVIDIA driver functioning. So the next thing I have to figure out is: what steps do I take to set the environment permanently and correctly?

Thanks again for any help.
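
One possible answer to the question above, offered as a sketch rather than a confirmed fix. As far as I know, /etc/environment is read by pam_env as plain KEY=VALUE lines and is not run as a shell script, so a $PATH reference written there is not expanded. That may be why the login session broke. A common alternative is a small file under /etc/profile.d, which login shells source. The function name write_cuda_profile and the file name cuda.sh are assumptions for illustration; the default paths come from the posts above.

```shell
#!/bin/sh
# Sketch: write a profile.d snippet that puts CUDA on PATH and LD_LIBRARY_PATH.
# write_cuda_profile and cuda.sh are illustrative names, not an official convention.
write_cuda_profile() {
    profile_dir=${1:-/etc/profile.d}   # where login shells look for snippets
    cuda_home=${2:-/usr/local/cuda}    # the unversioned symlink from the install
    cat > "$profile_dir/cuda.sh" <<EOF
# Added for CUDA; sourced by login shells via /etc/profile.
export PATH=$cuda_home/bin:\$PATH
export LD_LIBRARY_PATH=$cuda_home/lib64\${LD_LIBRARY_PATH:+:\$LD_LIBRARY_PATH}
EOF
}
```

Running write_cuda_profile as root with no arguments would create /etc/profile.d/cuda.sh. The \${LD_LIBRARY_PATH:+...} form avoids leaving a trailing colon when the variable was previously unset. Any lines added to /etc/environment earlier would need to be removed separately.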

Hi Andrew,

Good to read that you've got the drivers running properly ... which was the most important part.
Unfortunately I don't have enough experience with CUDA to give you the best possible instructions.
But I think you will get that figured out on your own - right by testing and trying things out ... :)

Regards,
Christian

Cheers Christian, I'll get there :)

I don't work for RHT support, but I thought I'd post my procedure for installing CUDA and the NVIDIA drivers on a RHEL 7.6 server (minimal install).

# yum -y update

# reboot 

# yum -y install kernel-devel-$(uname -r) kernel-headers-$(uname -r) pciutils

# yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

# yum -y install https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-repo-rhel7-10.0.130-1.x86_64.rpm

# yum clean all

# yum -y install cuda

Thanks for sharing your approach, I'll keep that in mind for next time. At what stage is nouveau disabled there?

I suspect the nouveau kernel module is unloaded when yum installs cuda and the nvidia drivers. At least I did not have to manually disable or remove nouveau.
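
To check this rather than rely on a suspicion, something like the sketch below could be run after the install and a reboot. It only reads /proc/modules, where the first field of each line is a module name. The function name module_loaded is made up for illustration.

```shell
#!/bin/sh
# Report (via exit status) whether a kernel module appears in a
# /proc/modules-style listing. module_loaded is an illustrative name.
module_loaded() {
    mod=$1
    modules_file=${2:-/proc/modules}
    # The first field of each line is the module name; exit 0 if it matches.
    awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$modules_file"
}
```

For example, module_loaded nouveau && echo "nouveau is still loaded" prints a warning only if the module is present, and module_loaded nvidia confirms the proprietary driver is in use.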

In RHEL 7.5, after installing the EPEL repo, I can't install libglvnd-devel for some reason.

[root@workstation tmp]# sudo yum install libglvnd-devel
Loaded plugins: langpacks, product-id, search-disabled-repos, subscription-manager
No package libglvnd-devel available.
Error: Nothing to do
[root@workstation tmp]#

Hi Andrew,

The package is available in the rhel-7-server-rpms repository. You may need to enable the Extended Update Support repository, rhel-7-server-eus-rpms, to gain access to the libglvnd-devel package for RHEL 7.5. Please check whether that works. :)

Regards,
Christian

Thanks Christian. Does this look correct to you? The package appears not to be available.

[user@workstation ~]$ sudo yum install yum-utils -y
Loaded plugins: langpacks, product-id, search-disabled-repos, subscription-manager
Package yum-utils-1.1.31-46.el7_5.noarch already installed and latest version
Nothing to do
[user@workstation ~]$ sudo yum-config-manager --enable rhel-7-server-eus-rpms
Loaded plugins: langpacks, product-id, subscription-manager
[user@workstation ~]$ sudo yum install libglvnd-devel
Loaded plugins: langpacks, product-id, search-disabled-repos, subscription-manager
No package libglvnd-devel available.
Error: Nothing to do
[user@workstation ~]$

Hi Andrew, are you using the Workstation edition? If so, please check in the Red Hat Package Browser whether the package is available there too. On my server system the libglvnd-devel package is available. :)

$ sudo yum list libglvnd-devel
Loaded plugins: langpacks, product-id, search-disabled-repos, subscription-manager
Available Packages
libglvnd-devel.i686          1:1.0.1-0.8.git5baa1e5.el7         rhel-7-server-rpms
libglvnd-devel.x86_64        1:1.0.1-0.8.git5baa1e5.el7         rhel-7-server-rpms  

Regards,
Christian
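
The listing above has one row per package, with the repo id in the third column. A small filter like the sketch below could pull out just the repo ids, to see which repository (if any) offers a package on a given system. The function name repos_for_pkg is made up for illustration, and the parsing assumes each package row sits on a single line, as in the output above.

```shell
#!/bin/sh
# Read `yum list <pkg>` output on stdin and print the repo ids offering <pkg>.
# repos_for_pkg is an illustrative name; it assumes unwrapped three-column rows.
repos_for_pkg() {
    pkg=$1
    awk -v p="$pkg" '
        # Package rows look like: name.arch  version  repo
        NF == 3 && index($1, p ".") == 1 { print $3 }
    ' | sort -u
}
```

A possible use is sudo yum --enablerepo='*' list libglvnd-devel | repos_for_pkg libglvnd-devel, which temporarily searches every repository configured on the machine. Empty output would suggest that none of the repositories attached to the subscription carries the package.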

I'm using RHEL Workstation. A while back I was on Server because I downloaded the wrong ISO, but I'm on Workstation now.

[user@workstation ~]$ rpm -q libglvnd-devel
package libglvnd-devel is not installed
[user@workstation ~]$ sudo yum list libglvnd-devel
[sudo] password for user:
Loaded plugins: langpacks, product-id, search-disabled-repos, subscription-manager
Error: No matching Packages to list

Did you check with the Package Browser? The package might not be available for Workstation.
Workstation is only a subset of Server. Can you attach a valid Server subscription, Andrew? :)

Regards,
Christian

Hi Andrew,

Did you install RHEL server 7.5 or RHEL workstation 7.5?

In the latter case you should not activate any server repositories.

Still the installation should work.

Maybe the rpm is already installed, what does the following command show?

rpm -q libglvnd-devel

Regards,

Jan Gerrit Kootstra

Hey Jan, I get this on RHEL Workstation 7.5:

[user@workstation ~]$ rpm -q libglvnd-devel
package libglvnd-devel is not installed

I had to remove the NVIDIA card completely because it would periodically freeze up. I originally installed it on a 7.5 workstation. Have you had any luck since installing the libglvnd package? I might give it another shot if that's the case.

My problem on RHEL 7.6 is that even though I blacklisted nouveau both in GRUB and in the blacklist conf file, it still boots into graphical mode instead of text (multi-user.target) mode. I will try the guide that Andrew got working.

Hi Jatin,

Execute sudo systemctl set-default multi-user.target - reboot and you reach run level 3 ("text mode") :)

Regards,
Christian
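
The suggestion above can be wrapped in a small check so the default target is changed only when needed. This is a sketch assuming a systemd-based system and root privileges; the function name ensure_text_boot is made up for illustration, while get-default and set-default are standard systemctl subcommands.

```shell
#!/bin/sh
# Switch the default boot target to text mode unless it already is.
# ensure_text_boot is an illustrative name; run as root.
ensure_text_boot() {
    current=$(systemctl get-default)
    if [ "$current" = "multi-user.target" ]; then
        echo "default target already multi-user.target"
    else
        systemctl set-default multi-user.target
    fi
}
```

After the driver work is done, sudo systemctl set-default graphical.target restores the graphical login on the next boot.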

Now, after following the CUDA environment page, I am stuck at GRUB: when I boot, I get a black screen :( Please make the RPM Fusion driver work. I know it doesn't work with RHEL, only with Fedora, but please :(

Hi Jatin,

You may want to check out the original NVIDIA drivers ... probably they work better than the RPM Fusion drivers. :)

Regards,
Christian
