Symptom

Running nvidia-smi prints the following error:

jiang@jiang-ThinkStation-P520:~$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

It had been working fine when I tested a few days earlier, then suddenly stopped.
I checked, and both CUDA and cuDNN are still installed:

jiang@jiang-ThinkStation-P520:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89
jiang@jiang-ThinkStation-P520:~$ cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 5
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

#include "driver_types.h"

Root cause analysis

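Searching online (Baidu), I found reports that this happens when the kernel is upgraded automatically and no longer matches the installed NVIDIA driver; the suggested remedy there is to pin the kernel to a specific version. The dkms log further down is consistent with this: it reports that no NVIDIA module existed for the running kernel, 5.4.0-91-generic.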
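
One way to test this hypothesis before changing anything is to check whether a DKMS-built nvidia.ko exists for the running kernel. The sketch below assumes modules are installed under /lib/modules/<kernel>/updates/dkms/ (the path that appears in the dkms log for this machine) and that module files are not compressed. The name has_dkms_nvidia_module is made up for illustration; it is not part of dkms.

```shell
#!/bin/sh
# Hypothetical helper (not part of dkms): succeeds if nvidia.ko exists for a kernel.
# Assumption: DKMS installs to <modules_root>/<kernel>/updates/dkms/ and the
# module file is uncompressed; adjust the path for other layouts.
has_dkms_nvidia_module() {
    kernel="${1:-$(uname -r)}"          # default: the running kernel
    modules_root="${2:-/lib/modules}"   # default: the system module tree
    [ -e "$modules_root/$kernel/updates/dkms/nvidia.ko" ]
}

# Report the state for the currently running kernel.
if has_dkms_nvidia_module; then
    echo "nvidia.ko found for kernel $(uname -r)"
else
    echo "no DKMS-built nvidia.ko for kernel $(uname -r)"
fi
```

If the second message is printed right after a kernel upgrade, that points to the mismatch described above rather than to a broken CUDA or cuDNN install.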

Solution

The following steps got it working again for me.

(1) First, check which NVIDIA driver version is installed

ls /usr/src | grep nvidia

jiang@jiang-ThinkStation-P520:~$ ls /usr/src | grep nvidia
nvidia-460.56

(2) Then run the following commands in a terminal, using the version found in step (1)

sudo apt install dkms
sudo dkms install -m nvidia -v 460.56
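
Steps (1) and (2) can be combined so the version does not have to be typed by hand. This is only a sketch: nvidia_src_versions is a hypothetical helper, and it assumes every nvidia-<version> directory under /usr/src is a driver source tree. It prints the dkms commands for review instead of running them.

```shell
#!/bin/sh
# Hypothetical helper: print the <version> part of each nvidia-<version>
# directory found in the given source directory (default: /usr/src).
nvidia_src_versions() {
    src_dir="${1:-/usr/src}"
    for path in "$src_dir"/nvidia-*; do
        [ -d "$path" ] || continue       # also skips the unexpanded glob when nothing matches
        name="${path##*/}"               # strip leading directories
        printf '%s\n' "${name#nvidia-}"  # strip the "nvidia-" prefix
    done
}

# Dry run: show the command from step (2) for every version found.
nvidia_src_versions | while read -r version; do
    echo "sudo dkms install -m nvidia -v $version"
done
```

Printing rather than executing is deliberate: if more than one driver source tree is present, it is safer to pick the right version by eye and then run that single command.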

(3) Run nvidia-smi again. It now prints the usual GPU table (the full output is at the end of the session log below).

Full session log

jiang@jiang-ThinkStation-P520:~$ ls /usr/src | grep nvidia
nvidia-460.56
jiang@jiang-ThinkStation-P520:~$ sudo apt install dkms
Reading package lists... Done
Building dependency tree
Reading state information... Done
dkms is already the newest version (2.3-3ubuntu9.7).
dkms set to manually installed.
The following packages were automatically installed and are no longer required:
  libatomic1:i386 libbsd0:i386 libdrm-amdgpu1:i386 libdrm-intel1:i386 libdrm-nouveau2:i386 libdrm-radeon1:i386 libdrm2:i386 libedit2:i386 libelf1:i386 libexpat1:i386
  libffi6:i386 libfwup1 libgl1:i386 libgl1-mesa-dri:i386 libglapi-mesa:i386 libglvnd0:i386 libglx-mesa0:i386 libglx0:i386 libllvm10:i386 libllvm9 libnvidia-cfg1-440-server
  libnvidia-cfg1-450-server libnvidia-common-440 libnvidia-common-450 libnvidia-common-460 libnvidia-compute-440-server libnvidia-compute-450-server
  libnvidia-decode-440-server libnvidia-decode-450-server libnvidia-encode-440-server libnvidia-encode-450-server libnvidia-extra-440-server libnvidia-extra-450-server
  libnvidia-fbc1-440-server libnvidia-fbc1-450-server libpciaccess0:i386 libsensors4:i386 libstdc++6:i386 libx11-6:i386 libx11-xcb1:i386 libxau6:i386 libxcb-dri2-0:i386
  libxcb-dri3-0:i386 libxcb-glx0:i386 libxcb-present0:i386 libxcb-sync1:i386 libxcb1:i386 libxdamage1:i386 libxdmcp6:i386 libxext6:i386 libxfixes3:i386 libxnvctrl0
  libxshmfence1:i386 libxxf86vm1:i386 linux-hwe-5.4-headers-5.4.0-47 linux-hwe-5.4-headers-5.4.0-48 nvidia-compute-utils-440-server nvidia-compute-utils-450-server
  nvidia-prime nvidia-settings nvidia-utils-440-server nvidia-utils-450-server screen-resolution-extra xserver-xorg-video-nvidia-440-server
  xserver-xorg-video-nvidia-450-server
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 73 not upgraded.
jiang@jiang-ThinkStation-P520:~$ sudo dkms install -m nvidia -v 460.56

Creating symlink /var/lib/dkms/nvidia/460.56/source ->
                 /usr/src/nvidia-460.56

DKMS: add completed.

Kernel preparation unnecessary for this kernel.  Skipping...

Building module:
cleaning build area...
'make' -j8 NV_EXCLUDE_BUILD_MODULES='' KERNEL_UNAME=5.4.0-91-generic IGNORE_CC_MISMATCH='' modules..........
Signing module:
 - /var/lib/dkms/nvidia/460.56/5.4.0-91-generic/x86_64/module/nvidia-uvm.ko
 - /var/lib/dkms/nvidia/460.56/5.4.0-91-generic/x86_64/module/nvidia-modeset.ko
 - /var/lib/dkms/nvidia/460.56/5.4.0-91-generic/x86_64/module/nvidia-drm.ko
 - /var/lib/dkms/nvidia/460.56/5.4.0-91-generic/x86_64/module/nvidia.ko
Secure Boot not enabled on this system.
cleaning build area...

DKMS: build completed.

nvidia.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/5.4.0-91-generic/updates/dkms/

nvidia-uvm.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/5.4.0-91-generic/updates/dkms/

nvidia-modeset.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/5.4.0-91-generic/updates/dkms/

nvidia-drm.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/5.4.0-91-generic/updates/dkms/

depmod....

DKMS: install completed.
jiang@jiang-ThinkStation-P520:~$ nvidia-smi
Fri Dec 31 15:52:30 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.156.00   Driver Version: 460.56       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 2080    Off  | 00000000:65:00.0 Off |                  N/A |
| 22%   50C    P0    29W / 225W |      0MiB /  7974MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
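
To confirm the fix in a script rather than by eye, the driver version can be read from the header line of the nvidia-smi table. A minimal sketch, assuming the header keeps the "Driver Version: X" layout seen in the log above; smi_driver_version is a hypothetical helper that reads nvidia-smi output on standard input.

```shell
#!/bin/sh
# Hypothetical helper: extract the value after "Driver Version:" from
# nvidia-smi table output supplied on stdin. Prints nothing if the
# header line is absent (for example, when nvidia-smi printed an error).
smi_driver_version() {
    sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Usage (only meaningful on a machine where nvidia-smi works):
#   nvidia-smi | smi_driver_version
```

The printed value can then be compared with the version directory under /usr/src from step (1). nvidia-smi can also report the version directly with --query-gpu=driver_version --format=csv,noheader, which avoids parsing the table.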

References:

1. https://www.jianshu.com/p/6b998ba2c6a6
2. https://blog.csdn.net/sinat_23619409/article/details/85220561
