'Server with GUI' Failing After NVIDIA Driver Install

Hello,

I am trying to install NVIDIA graphics drivers 390.59 on my RHEL 7.5 Dell laptop. After installing them, the laptop fails to boot. It gets to the gray background screen after the spinning wheel but then freezes. After modifying the xorg.conf file I have eliminated all of the errors in Xorg.0.log, but it still does not boot (there are some warnings that I cannot work out how to eliminate).

Can anyone please suggest what else I can investigate to find out what is preventing the GUI from booting properly? I do not know which other logs might show the failure.

Thanks!

Responses

Hey Stuart

Red Hat released an article (linked below) that explains the current NVIDIA proprietary driver issue in RHEL 7.5. However, I have found a workaround that involves using lightdm in place of gdm; for some reason the two do not want to talk nicely to one another. I've put the following workaround in place:

  • Enabled epel
  • Installed lightdm, lightdm-settings, slick-greeter
  • systemctl disable gdm
  • systemctl mask gdm
  • systemctl enable lightdm
  • Edit /etc/lightdm/lightdm.conf: uncomment the line that contains "example-gtk" and change that value to slick-greeter, with no spaces after the =
  • Reboot the PC
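
The steps above can be sketched as shell commands. This is only a rough sketch, not a tested procedure: it assumes EPEL is already enabled, that everything is run as root, and that the packages are installed with yum install -y. The function names set_greeter and apply_lightdm_workaround are made up for illustration.

```shell
# Sketch only: assumes EPEL is already enabled and that you are root.

# Point the greeter-session line in a lightdm.conf at slick-greeter.
# Handles both "#greeter-session=example-gtk" and "...=example-gtk-gnome",
# whether or not the line is still commented out.
set_greeter() {
    # $1: path to lightdm.conf
    sed -i -E 's/^#?greeter-session=.*/greeter-session=slick-greeter/' "$1"
}

# The full workaround from the list; defined here but deliberately not called.
apply_lightdm_workaround() {
    yum install -y lightdm lightdm-settings slick-greeter
    systemctl disable gdm
    systemctl mask gdm
    systemctl enable lightdm
    set_greeter /etc/lightdm/lightdm.conf
}
```

Calling apply_lightdm_workaround as root and then rebooting should mirror the list. It may be worth trying set_greeter on a copy of lightdm.conf first to see what it changes.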

Link Below

https://access.redhat.com/solutions/3438891

Hi Ryan,

I tried these steps and my display seems to be working now. You said to uncomment the "example-gtk" line and change it to slick-greeter, with no spaces after the =. Should I uncomment the line and append slick-greeter to it? Please let me know.

Best, Prem Rao

Hi Prem

You would uncomment the line and replace "example-gtk" with "slick-greeter", so the end of the line should read "=slick-greeter" (without the quotation marks) :)

Hi Ryan,

Thanks for the reply. It works. However, my restart/shutdown is no longer working, so I am shutting down and starting the system manually. Do you have a workaround for this? It all started after installing NVIDIA 390.87 and CUDA 9.1 and enabling lightdm. All three are working fine now, but I need a proper shutdown rather than shutting down manually.

Did you have the same issue after the installation?

Best, Prem Rao.

Hi Prem,

Do you mean a "hard shutdown" by pressing the power button? That would be a terrible option. Does the system shut down properly when you execute sudo poweroff? That could serve as a workaround. But even though you say that it works, there's still something not set up properly. :)

Regards,
Christian

Yes, a hard shutdown. sudo shutdown doesn't turn off my processor and GPU. Only my screen turns off, immediately after sudo shutdown. The shutdown issue started after installing CUDA on the machine.

Then this is definitely not a solution, Prem ... you are about to ruin your whole operating system.
Better revert everything you did, especially uninstall CUDA and if necessary the drivers as well. :)

Regards,
Christian

Hi Prem

Yes, I did have that issue, but it was hit and miss. Try systemctl reboot and systemctl poweroff. The other step I took was to use a graphics driver a version or two below the latest, to get a more stable platform.

Make sure your BIOS is set to discrete only as well.

Kind Regards

Ryan

Hi Ryan,

The possibility to disable NVIDIA Optimus in the BIOS is not supported by every vendor.
On some machines you can set the NVIDIA GPU to be used exclusively, on others not. :)

Regards,
Christian

Hi Ryan,

I have used systemctl reboot and systemctl poweroff, but these commands just make my screen go to sleep. My processor and GPU are still on. I tried with the latest 410 drivers as well. It's still the same.

Best, Prem Rao

NVIDIA drivers have been an issue for me as well since version 7.4. Another suggestion I can make is to install ELRepo, enable its kernel repository, and try a later kernel than the stock one.

It's not officially supported, but it could be a helpful troubleshooting method. Let me know how you go :)

Hi Stuart,

Please check whether Secure Boot is enabled (in case you have a machine with an EFI based BIOS) - if it is, disable it. :)

Regards,
Christian

Thanks very much Ryan and Christian for both of your suggestions.

I checked the 'Secure Boot' and unfortunately it was already disabled.

I followed Ryan's suggested workaround and it all seemed to go correctly (except I had 'example-gtk-gnome' instead of 'example-gtk'), but it still does not boot past the gray screen. It did remove all of the 'MIT-MAGIC-COOKIE-1' disconnections from the Xorg log, but nothing else seems to have changed.

Unfortunately, I really need to use NVIDIA for my system, so perhaps my only option is to reformat and install 7.4.

Hi Stuart,

How did you install the drivers? Did you use the .run file from the NVIDIA website? If yes, completely remove everything related to the currently installed NVIDIA drivers and check whether the NVIDIA drivers from RPM Fusion work - good luck! :)

Regards,
Christian

Hi Christian,

I was using the .run file from NVIDIA. I have removed it and installed the driver from RPM Fusion (including vulkan-filesystem, as described here), but it did not seem to change the error. In fact, I have even reinstalled 7.4 and I am getting the exact same failure.

I wonder if this is just a problem with the hardware configuration. The laptop has an integrated Intel GPU as well as the NVIDIA GPU (Optimus), and perhaps this is the source of the problem. I have tried changing several things in xorg.conf, but without any error logs to tell me why it is failing, I am not sure where to go.

Hi Stuart,

Please check whether enabling Direct Rendering Manager Kernel Mode Setting (DRM-KMS) solves your problem.
The NVIDIA driver’s PRIME Synchronization support relies on DRM-KMS, which is disabled by default.

Execute sudo vi /etc/default/grub and add nvidia-drm.modeset=1 to the GRUB_CMDLINE_LINUX line.
Save the change, then execute sudo grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg.
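
For anyone who prefers to script this rather than edit by hand, here is a rough sketch. It assumes GRUB_CMDLINE_LINUX is a single double-quoted line in /etc/default/grub. The helper names add_kernel_arg and grub_cfg_path are made up for illustration. The path given above applies to UEFI installs; on a legacy BIOS install of RHEL 7 the generated file is /boot/grub2/grub.cfg, which the second helper tries to account for.

```shell
# Append a kernel argument to GRUB_CMDLINE_LINUX in the given file,
# unless the argument is already present on that line.
add_kernel_arg() {
    file=$1
    arg=$2
    grep '^GRUB_CMDLINE_LINUX=' "$file" | grep -qF -- "$arg" && return 0
    # Insert the argument just before the closing double quote.
    sed -i "s/^\(GRUB_CMDLINE_LINUX=\".*\)\"\$/\1 $arg\"/" "$file"
}

# Print the grub.cfg location: the EFI path when the firmware directory
# exists (UEFI boot), otherwise the legacy BIOS path.
grub_cfg_path() {
    # $1: directory to probe (defaults to /sys/firmware/efi)
    if [ -d "${1:-/sys/firmware/efi}" ]; then
        echo /boot/efi/EFI/redhat/grub.cfg
    else
        echo /boot/grub2/grub.cfg
    fi
}

# Typical use, as root (not executed here):
#   add_kernel_arg /etc/default/grub nvidia-drm.modeset=1
#   grub2-mkconfig -o "$(grub_cfg_path)"
```

After the reboot, cat /proc/cmdline should show nvidia-drm.modeset=1 if the change took effect.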

Now reboot the operating system and post back whether it works now or not - I wish you good luck ! :)

Regards,
Christian

Hi Stuart

Have you disabled Optimus in the BIOS and set it to discrete only? At the GRUB menu press e, put 3 at the end of the linux line, and hit Ctrl+X; this will boot you into runlevel 3. After you've done that, run lspci | grep VGA. If you can see both the Intel GPU and the NVIDIA one, chances are your BIOS is set for Optimus and not the discrete card only.
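
The lspci check can be wrapped in a small helper, sketched below. One assumption beyond what is written above: on some Optimus laptops the NVIDIA chip is listed as a "3D controller" rather than a "VGA compatible controller", so the sketch matches both. The function name gpu_summary and its messages are made up for illustration.

```shell
# Summarise which GPU vendors appear in lspci output read from stdin.
gpu_summary() {
    # Keep only display-class devices; "|| true" avoids a failure status
    # when nothing matches.
    gpus=$(grep -Ei 'VGA compatible controller|3D controller' || true)
    intel=$(printf '%s\n' "$gpus" | grep -ci 'intel' || true)
    nvidia=$(printf '%s\n' "$gpus" | grep -ci 'nvidia' || true)
    if [ "$intel" -gt 0 ] && [ "$nvidia" -gt 0 ]; then
        echo "hybrid: both Intel and NVIDIA GPUs are visible"
    elif [ "$nvidia" -gt 0 ]; then
        echo "discrete: only the NVIDIA GPU is visible"
    else
        echo "no NVIDIA GPU visible"
    fi
}

# Typical use (not executed here):
#   lspci | gpu_summary
```

A "hybrid" result suggests the BIOS is still in Optimus mode.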

Let me know how you go :)

PS: I use the negativo17 repo for my NVIDIA and CUDA packages. I've found them to be more reliable, and the maintainer is a Fedora packager, so you can be relatively sure it's safe. Just make sure you enable EPEL.

https://negativo17.org/nvidia-driver/

To all who struggle with this issue,

If you encounter the problem where you're unable to get to the GNOME login screen, have no fear. When you get to the point where your system does not boot any further, press Ctrl+Alt+F2 to switch to a text console and log in. Change to the directory in which your NVIDIA driver installer is saved. Issue the following command as root (ignore the quotation marks): "sh NVIDIA-Linux-x86_64-version.xx.run -a -s"

The NVIDIA installer will run the installation again. After completion, run the following command: "nvidia-xconfig". This rewrites the NVIDIA X configuration file and backs up the existing one. Once completed, reboot and you should be able to reach the GUI login.
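
As a rough sketch, the two commands above could be wrapped like this. The helper names latest_nvidia_installer and reinstall_nvidia are made up for illustration. Picking the highest-versioned .run file in a directory is an assumption added for convenience, as is the note about stopping the display manager first.

```shell
# Print the highest-versioned NVIDIA .run installer found in a directory
# (prints nothing if there is none).
latest_nvidia_installer() {
    # $1: directory holding the downloaded .run file(s)
    ls "$1"/NVIDIA-Linux-x86_64-*.run 2>/dev/null | sort -V | tail -n 1
}

# Re-run the installer and regenerate the X configuration.
# Defined here but deliberately not called; run it as root.
# Assumption: the installer may refuse to run while an X server is active,
# so the display manager may need to be stopped beforehand.
reinstall_nvidia() {
    installer=$(latest_nvidia_installer "${1:-.}")
    if [ -z "$installer" ]; then
        echo "no NVIDIA .run installer found" >&2
        return 1
    fi
    sh "$installer" -a -s && nvidia-xconfig
}
```

From the text console, reinstall_nvidia /path/to/downloads followed by a reboot would correspond to the procedure described above.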

Cheers.

Ryan Coe,

Thanks man. You pointed me in the right direction. The problem was that in EFI BIOS mode, the integrated graphics was running. What gave it away was that I ran nvidia-detect from ELRepo and it warned me that integrated graphics was running. I went into the BIOS and disabled hybrid graphics and it worked!

I got into this mess by trying to downgrade from CUDA 10.1 to CUDA 9.2 on RHEL 7.6. I'm still fighting it, but something useful to know is that if Ctrl+Alt+F2 doesn't work, you can still boot into runlevel 1. When the GRUB menu comes up (the list of OSes to boot from), press "e" to edit the entry you normally use. On the screen that comes up, there will be a really long line starting with linuxefi. Move the cursor to the very end of it and add " 1", then press Ctrl+X to boot. It isn't a permanent change, but it will allow you to boot to a login prompt and work on things. You have to know the root password.