A collection of YOLOv5 training errors on Ubuntu

Problem 1:
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
Fix:
Follow the error message to line 621 of
/home/xx/anaconda3/envs/deepshare/lib/python3.7/site-packages/torch/tensor.py
or click the file link in the error output to jump straight to it.
In the method

def __array__(self, dtype=None):
    if dtype is None:
        return self.numpy()
    else:
        return self.numpy().astype(dtype, copy=False)

change the line

return self.numpy()

to:

return self.cpu().detach().numpy()

Run the training command again; training now runs normally and the problem is solved.

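Reference: https://blog.csdn.net/qq_44703886/article/details/117231542
The same fix applies when the traceback looks like this (here the file is torch/_tensor.py, line 643):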

Traceback (most recent call last):
  File "train.py", line 456, in <module>
    train(hyp, opt, device, tb_writer)
  File "train.py", line 314, in train
    results, maps, times = test.test(opt.data,
  File "/media/xx/新加卷/yolov5-master-ubuntu/test.py", line 193, in test
    plot_images(img, output_to_target(output, width, height), paths, str(f), names)  # predictions
  File "/media/xx/新加卷/yolov5-master-ubuntu/utils/general.py", line 942, in output_to_target
    return np.array(targets)
  File "/home/xx/anaconda3/envs/yolov5/lib/python3.8/site-packages/torch/_tensor.py", line 643, in __array__
    return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
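Editing a file inside site-packages changes the behaviour of torch for everything in that environment. A less invasive option is to move the values to the CPU in the calling code, before np.array ever sees them. The traceback above ends in np.array(targets) inside output_to_target, so a rough sketch of that idea could look like the following. The helper name to_host is hypothetical, and the tensors are detected by duck typing (detach, cpu and tolist are real torch.Tensor methods) so the helper itself does not import torch.

```python
def to_host(value):
    """Recursively turn tensor-like objects into plain Python values on the CPU.

    Anything exposing detach() and cpu() is treated as a torch.Tensor.
    Lists and tuples are walked element by element; other values pass through.
    """
    if hasattr(value, "detach") and hasattr(value, "cpu"):
        # detach() drops the autograd graph, cpu() copies to host memory,
        # tolist() yields a Python scalar or nested list.
        return value.detach().cpu().tolist()
    if isinstance(value, (list, tuple)):
        return [to_host(item) for item in value]
    return value


# Hypothetical use at the failing line of output_to_target:
#     return np.array(to_host(targets))
```

With this in place the list handed to np.array contains only Python numbers, so torch's __array__ is never called on a CUDA tensor.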

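Problem 2:
RuntimeError: a view of a leaf Variable that requires grad is being used in an in-place operation.
Fix:
Add with torch.no_grad(): at line 145, inside _initialize_biases, so that the in-place updates of the bias view run without gradient tracking.
After the change the method reads: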

def _initialize_biases(self, cf=None):  # initialize biases into Detect(), cf is class frequency
    # cf = torch.bincount(torch.tensor(np.concatenate(dataset.labels, 0)[:, 0]).long(), minlength=nc) + 1.
    m = self.model[-1]  # Detect() module
    for mi, s in zip(m.m, m.stride):  # from
        b = mi.bias.view(m.na, -1)  # conv.bias(255) to (3,85)
        with torch.no_grad():
            b[:, 4] += math.log(8 / (640 / s) ** 2)  # obj (8 objects per 640 image)
            b[:, 5:] += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum())  # cls
        mi.bias = torch.nn.Parameter(b.view(-1), requires_grad=True)

Reference: https://blog.51cto.com/u_15194128/2795983

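Problem 3: PyTorch reports that the GPU is not compatible with the installed build
Example: an A6000 has a high CUDA compute capability, and the installed torch version was not built for a compute capability that high.
Fix: upgrade torch to a version that supports the card.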
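Before upgrading, it can help to confirm that the mismatch really is between the card and the torch build. The sketch below compares the GPU's compute capability with the architectures the installed build was compiled for. torch.cuda.get_device_capability and torch.cuda.get_arch_list are real PyTorch calls; the helper names parse_sm and build_supports are hypothetical, and the rule used (same major version, compiled minor version not above the GPU's) is a simplification that ignores PTX forward compatibility.

```python
import re


def parse_sm(arch):
    """Turn 'sm_86' into (8, 6); return None for entries such as 'compute_80'."""
    match = re.fullmatch(r"sm_(\d{2,})[a-z]?", arch)
    if match is None:
        return None
    digits = match.group(1)
    return int(digits[:-1]), int(digits[-1])


def build_supports(capability, arch_list):
    """True if some compiled sm_XY has the GPU's major version and a minor version <= the GPU's."""
    major, minor = capability
    for arch in arch_list:
        parsed = parse_sm(arch)
        if parsed is not None and parsed[0] == major and parsed[1] <= minor:
            return True
    return False


def main():
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
        return
    if not torch.cuda.is_available():
        print("No CUDA device is visible to PyTorch")
        return
    capability = torch.cuda.get_device_capability(0)  # e.g. (8, 6)
    arch_list = torch.cuda.get_arch_list()            # e.g. ['sm_70', 'sm_75', ...]
    print("torch", torch.__version__, "| GPU capability", capability, "| built for", arch_list)
    if not build_supports(capability, arch_list):
        print("This build has no kernels for the GPU; install a newer torch/CUDA build")


if __name__ == "__main__":
    main()
```

If the script reports that the build has no kernels for the GPU, installing a torch release built against a newer CUDA toolkit is the likely remedy.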

Problem 4: RuntimeError: Unable to find a valid cuDNN algorithm to run convolution

Traceback (most recent call last):
  File "train.py", line 586, in <module>
    main(opt)
  File "train.py", line 485, in main
    train(opt.hyp, opt, device)
  File "train.py", line 315, in train
    scaler.scale(loss).backward()
  File "/home/image522/anaconda3/envs/yolov5_v5.0/lib/python3.8/site-packages/torch/tensor.py", line 245, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/home/image522/anaconda3/envs/yolov5_v5.0/lib/python3.8/site-packages/torch/autograd/__init__.py", line 145, in backward
    Variable._execution_engine.run_backward(
RuntimeError: Unable to find a valid cuDNN algorithm to run convolution

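Fix: change the batch size. In practice this means lowering --batch-size, since the error usually appears when the GPU does not have enough free memory for the convolution.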
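One way to pick the new value is to halve the batch size until a short trial run gets through. The sketch below shows that search in isolation. The names find_working_batch_size and try_run are hypothetical; try_run stands for whatever launches a brief training trial with the given batch size and lets a RuntimeError propagate on failure. The error strings it matches are the cuDNN message above and PyTorch's usual "out of memory" text.

```python
def find_working_batch_size(try_run, start=64, minimum=1):
    """Halve the batch size until try_run(batch_size) stops failing.

    try_run is a caller-supplied callable (hypothetical) that raises
    RuntimeError when the trial does not fit on the GPU.
    """
    batch_size = start
    while batch_size >= minimum:
        try:
            try_run(batch_size)
            return batch_size
        except RuntimeError as err:
            message = str(err)
            # Only retry on memory-related failures; re-raise anything else.
            if "cuDNN" not in message and "out of memory" not in message:
                raise
            batch_size //= 2
    raise RuntimeError("no batch size >= %d worked" % minimum)
```

The value it returns can then be passed to training, for example python train.py --batch-size 8 if the search settles on 8.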
