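On Ubuntu, while installing mmdeploy I wanted to change the NVIDIA driver. After changing it I also changed CUDA, following advice found online. After a reboot the machine would not boot: the screen stayed black, flickered without showing any text, and the boot selection menu never appeared.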

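I removed the coin-cell battery from the motherboard to discharge it and put it back a few minutes later. After that the machine could reach the boot selection menu again.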

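However, once the system started it hung on the initial screen, at the line "Started GNOME Display Manager", and Ctrl+Alt+Fx was unresponsive at first too. After several reboots, Ctrl+Alt+Fx could switch to another tty; the screen flickered for a while, then settled, and I could log in.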

nvidia-smi also responded, but there was no graphical interface, so the problem was probably related to the NVIDIA driver.
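
When nvidia-smi works but no desktop appears, the X server log is a reasonable next place to look. The following is a minimal sketch, not something from the original session: the helper name xorg_errors is made up for illustration, and the log path in the usage comment is an assumption that varies between setups. It keeps only the lines that Xorg marks as errors with (EE).

```shell
# Hypothetical helper: read an Xorg log on stdin and print only error lines.
# Xorg tags each message with a marker; (EE) means error. The optional
# "[ timestamp ] " prefix covers logs written with and without timestamps.
xorg_errors() {
    grep -E '^(\[ *[0-9.]+\] )?\(EE\)'
}

# Usage (the log path is an assumption, adjust it for your system):
#   xorg_errors < /var/log/Xorg.0.log
```

The pattern is anchored to the start of the line so that the marker legend in the log header, which also mentions (EE), is not reported as an error.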
 

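ipmitool lan print shows the BMC network settings; with the BMC address, the server's remote management tool can be used to view the monitor output remotely: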

Set in Progress         : Set Complete
Auth Type Support       : NONE MD2 MD5 PASSWORD
Auth Type Enable        : Callback : MD2 MD5 PASSWORD
                        : User     : MD2 MD5 PASSWORD
                        : Operator : MD2 MD5 PASSWORD
                        : Admin    : MD2 MD5 PASSWORD
                        : OEM      : MD2 MD5 PASSWORD
IP Address Source       : DHCP Address
IP Address              : 192.168.8.11
Subnet Mask             : 255.255.255.0
MAC Address             : ac:1f:6b:e6:ce:bb
SNMP Community String   : public
IP Header               : TTL=0x00 Flags=0x00 Precedence=0x00 TOS=0x00
BMC ARP Control         : ARP Responses Enabled, Gratuitous ARP Disabled
Default Gateway IP      : 192.168.8.1
Default Gateway MAC     : 00:00:00:00:00:00
Backup Gateway IP       : 0.0.0.0
Backup Gateway MAC      : 00:00:00:00:00:00
802.1q VLAN ID          : Disabled
802.1q VLAN Priority    : 0
RMCP+ Cipher Suites     : 1,2,3,6,7,8,11,12
Cipher Suite Priv Max   : XaaaXXaaaXXaaXX
                        :     X=Cipher Suite Unused
                        :     c=CALLBACK
                        :     u=USER
                        :     o=OPERATOR
                        :     a=ADMIN
                        :     O=OEM
Bad Password Threshold  : 3
Invalid password disable: yes
Attempt Count Reset Int.: 300
User Lockout Interval   : 300
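
The only field needed to reach the remote console is the BMC IP address. As a small sketch (the helper name bmc_ip is hypothetical), it can be pulled out of the output above like this:

```shell
# Hypothetical helper: read `ipmitool lan print` output on stdin and print
# the value of the "IP Address" field. The pattern requires only spaces
# between "IP Address" and the colon, so "IP Address Source" is skipped.
bmc_ip() {
    awk -F': ' '/^IP Address +:/ { print $2; exit }'
}

# Usage:
#   ipmitool lan print | bmc_ip
```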

 cd /etc/X11/  This directory holds the display-related (X) configuration.

ls
app-defaults  default-display-manager           fonts    xinit  xorg.conf.backup  Xreset    Xresources  Xsession.d        xsm         Xwrapper.config
cursors       default-display-manager.dpkg-tmp  rgb.txt  xkb    xrdp              Xreset.d  Xsession    Xsession.options  XvMCConfig

 cat xorg.conf.backup
# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig:  version 450.66


Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0" 0 0
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
EndSection

Section "Files"
EndSection

Section "InputDevice"

    # generated from default
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/psaux"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"

    # generated from default
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

cat default-display-manager
/usr/sbin/gdm3

systemctl status lightdm
● lightdm.service - Light Display Manager
   Loaded: loaded (/lib/systemd/system/lightdm.service; static; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:lightdm(1)
root@lthpc:/etc/X11# systemctl start lightdm
Job for lightdm.service failed because the control process exited with error code.
See "systemctl status lightdm.service" and "journalctl -xe" for details.
root@lthpc:/etc/X11# journalctl  -xe
-- Unit lightdm.service has finished shutting down.
Dec 29 17:34:18 lthpc systemd[1]: Starting Light Display Manager...
-- Subject: Unit lightdm.service has begun start-up
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Unit lightdm.service has begun starting up.
Dec 29 17:34:18 lthpc systemd[1]: lightdm.service: Control process exited, code=exited status=1
Dec 29 17:34:18 lthpc systemd[1]: lightdm.service: Failed with result 'exit-code'.
Dec 29 17:34:18 lthpc systemd[1]: Failed to start Light Display Manager.
-- Subject: Unit lightdm.service has failed
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Unit lightdm.service has failed.
--
-- The result is RESULT.
Dec 29 17:34:19 lthpc systemd[1]: lightdm.service: Service hold-off time over, scheduling restart.
Dec 29 17:34:19 lthpc systemd[1]: lightdm.service: Scheduled restart job, restart counter is at 5.
-- Subject: Automatic restarting of a unit has been scheduled
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Automatic restarting of the unit lightdm.service has been scheduled, as the result for
-- the configured Restart= setting for the unit.
Dec 29 17:34:19 lthpc systemd[1]: Stopped Light Display Manager.
-- Subject: Unit lightdm.service has finished shutting down
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Unit lightdm.service has finished shutting down.
Dec 29 17:34:19 lthpc systemd[1]: lightdm.service: Start request repeated too quickly.
Dec 29 17:34:19 lthpc systemd[1]: lightdm.service: Failed with result 'exit-code'.
Dec 29 17:34:19 lthpc systemd[1]: Failed to start Light Display Manager.
-- Subject: Unit lightdm.service has failed
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Unit lightdm.service has failed.
--
-- The result is RESULT.

The graphical interface would not start.
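
Note that /etc/X11/default-display-manager above points to gdm3, while the unit that failed to start is lightdm. Before starting a display manager by hand, it may help to check which one is configured. A minimal sketch, with a hypothetical helper name:

```shell
# Hypothetical helper: print the name of the configured display manager.
# $1 is the file to read; it defaults to the Debian/Ubuntu location.
configured_dm() {
    file=${1:-/etc/X11/default-display-manager}
    # The file holds one path such as /usr/sbin/gdm3; keep the last component.
    basename "$(head -n 1 "$file")"
}

# Usage:
#   systemctl status "$(configured_dm)"
```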

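Run nvidia-uninstall first to remove the existing driver.

sudo apt-get remove --purge 'nvidia*'

--purge removes the packages together with their configuration files; dependencies that are no longer needed can be removed afterwards with sudo apt-get autoremove. The pattern nvidia* selects the nvidia-related packages, and it is quoted so that the shell passes it to apt-get unchanged.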
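
Before purging, it can be useful to see which installed packages are NVIDIA-related. This is a sketch under the assumption that the packages were installed through dpkg/apt (a driver installed from a .run file does not show up here); the helper name is hypothetical.

```shell
# Hypothetical helper: read `dpkg -l` output on stdin and print the names of
# installed packages (status "ii") whose name contains "nvidia".
installed_nvidia_pkgs() {
    awk '$1 == "ii" && $2 ~ /nvidia/ { print $2 }'
}

# Usage:
#   dpkg -l | installed_nvidia_pkgs
```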

When that is done, sudo reboot.

wget https://cn.download.nvidia.cn/XFree86/Linux-x86_64/535.146.02/NVIDIA-Linux-x86_64-535.146.02.run

chmod a+x NVIDIA-Linux-x86_64-535.146.02.run

./NVIDIA-Linux-x86_64-535.146.02.run  --no-opengl-files

--no-opengl-files tells the installer not to install the OpenGL-related driver files.

    Would you like to register the kernel module sources with DKMS? This will allow DKMS to automatically build a new module, if you install a different kernel later.

The installer asks whether to register the kernel module with DKMS, so that a new module is built automatically after a kernel update. Choose No.

    Would you like to run the nvidia-xconfig utility to automatically update your X configuration file so that the NVIDIA X driver will be used when you restart X? Any pre-existing X configuration file will be backed up.

Choose Yes.
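
After the installer finishes, one way to confirm that the new driver is the one in use is to compare the version in the .run file name with what nvidia-smi reports. A minimal sketch; the helper name run_file_version is hypothetical and the check assumes the file keeps its original name.

```shell
# Hypothetical helper: extract the version from an NVIDIA .run file name,
# e.g. NVIDIA-Linux-x86_64-535.146.02.run gives 535.146.02.
run_file_version() {
    name=$(basename "$1" .run)
    # Drop everything up to and including the last "-".
    printf '%s\n' "${name##*-}"
}

# Usage, on a machine where the driver is installed:
#   expected=$(run_file_version NVIDIA-Linux-x86_64-535.146.02.run)
#   actual=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
#   [ "$expected" = "$actual" ] && echo "driver version matches"
```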
 

df -h shows the disk usage:

Filesystem      Size  Used Avail Use% Mounted on
udev             63G     0   63G   0% /dev
tmpfs            13G  2.5M   13G   1% /run
/dev/sda7       182G  153G   20G  89% /
tmpfs            63G     0   63G   0% /dev/shm
tmpfs           5.0M  4.0K  5.0M   1% /run/lock
tmpfs            63G     0   63G   0% /sys/fs/cgroup
/dev/sda5       943M  285M  593M  33% /boot
tmpfs            13G     0   13G   0% /run/user/1000
tmpfs            13G  4.0K   13G   1% /run/user/121
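
In the output above the root filesystem is at 89%, which is worth keeping an eye on when installing drivers and desktop packages. As a sketch (the helper name and the threshold are illustrative choices, not from the original session), filesystems above a given usage can be picked out like this:

```shell
# Hypothetical helper: read df output on stdin and print the mount point and
# usage of every filesystem at or above the threshold given as $1 (percent).
# Assumes the usual column order: Use% in column 5, mount point in column 6.
full_filesystems() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

# Usage:
#   df -h | full_filesystems 85
```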

In the end, reinstalling the desktop fixed it and the graphical session came up again.

sudo apt-get remove --purge 'nvidia-*'  # remove the nvidia-related packages
sudo apt purge gdm gdm3 # remove gdm and gdm3
sudo apt install gdm3 ubuntu-desktop    # reinstall gdm3 and the desktop
systemctl restart gdm       # restart the gdm3 service

Alternatively: apt-get install --reinstall  ubuntu-desktop

To switch between gdm3 and lightdm: sudo dpkg-reconfigure gdm3
