山高水长
Shanya
2022-07-21

mmdetection Error Roundup

# mmdetection Training Error Roundup

  • one of the variables needed for gradient computation has been modified by an inplace operation

    • Full error message

      Traceback (most recent call last):
      File "tools/train.py", line 242, in <module>
          main()
      File "tools/train.py", line 231, in main
          train_detector(
      File "/home/lgh/code/mmlab/mmdetection/mmdet/apis/train.py", line 244, in train_detector
          runner.run(data_loaders, cfg.workflow)
      File "/home/lgh/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
          epoch_runner(data_loaders[i], **kwargs)
      File "/home/lgh/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 51, in train
          self.call_hook('after_train_iter')
      File "/home/lgh/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmcv/runner/base_runner.py", line 309, in call_hook
          getattr(hook, fn_name)(self)
      File "/home/lgh/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmcv/runner/hooks/optimizer.py", line 56, in after_train_iter
          runner.outputs['loss'].backward()
      File "/home/lgh/miniconda3/envs/mmlab/lib/python3.8/site-packages/torch/tensor.py", line 245, in backward
          torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
      File "/home/lgh/miniconda3/envs/mmlab/lib/python3.8/site-packages/torch/autograd/__init__.py", line 145, in backward
          Variable._execution_engine.run_backward(
      RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [2, 2048, 25, 25]], which is output 0 of ReluBackward1, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
      
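The hint at the end of the error message suggests enabling anomaly detection. A minimal sketch of what that looks like, using a tiny stand-in computation rather than the actual mmdetection model (the tensors `x` and `y` are made up for illustration; where you place the call in `tools/train.py` is your choice and is an assumption here):

```python
import torch

# Turn on anomaly detection before the forward pass runs,
# e.g. near the top of the training script (placement is an assumption).
torch.autograd.set_detect_anomaly(True)

x = torch.randn(4, requires_grad=True)
y = torch.relu(x)   # ReLU keeps its output around for the backward pass
y += 1              # in-place edit of that saved output

message = None
try:
    y.sum().backward()
except RuntimeError as err:
    # With anomaly detection on, PyTorch also reports the forward-pass
    # traceback of the operation whose gradient could not be computed.
    message = str(err)
finally:
    # Anomaly detection slows training down, so switch it off again.
    torch.autograd.set_detect_anomaly(False)

print(message)
```

The extra traceback points at the forward call that produced the overwritten tensor, which is usually enough to locate the offending layer.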
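    • Analysis:
      An in-place operation modifies a tensor's value directly in its existing memory instead of writing the result to a new tensor.

      First, every torch method whose name ends in _ is in-place, for example x.squeeze_() and x.unsqueeze_(). These modify the tensor they are called on and return that same tensor rather than a new one.

      Second, some functions let you choose. In PyTorch, activation functions such as torch.relu() and torch.sigmoid() are not in-place, but ReLU (for example nn.ReLU) becomes in-place when you set inplace=True.

      Third, some arithmetic: x += res is in-place, while x = x + res is not. Some assignment operations are in-place as well.

      If a value that backward needs is modified by an in-place operation, this error is raised. It is therefore best to avoid in-place operations, and to perform assignments before any operations whose gradients need to be computed; gather should come after the assignment.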
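To make the difference between x += res and x = x + res concrete, here is a small sketch that reproduces the same class of error outside mmdetection. The function name `backward_fails` is made up for this example. It relies on the fact that ReLU's backward pass needs ReLU's own output.

```python
import torch


def backward_fails(modify_in_place: bool) -> bool:
    """Return True if backward() raises after a ReLU output is modified."""
    x = torch.randn(4, requires_grad=True)
    y = torch.relu(x)   # autograd saves y to compute ReLU's gradient
    if modify_in_place:
        y += 1          # in-place: the saved tensor itself is overwritten
    else:
        y = y + 1       # out of place: a new tensor, the saved one is untouched
    try:
        y.sum().backward()
    except RuntimeError:
        return True
    return False


print(backward_fails(modify_in_place=True))
print(backward_fails(modify_in_place=False))
```

Note that the in-place line itself does not fail. The error only appears later, in backward(), which is why the traceback above ends inside the optimizer hook rather than at the offending layer.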

    • Solution:
      Remove inplace=True from the ReLU.
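If the ReLU layers are spread over several config files or modules, one option is to switch them off after the model is built. The sketch below is an assumption-laden illustration, not how mmdetection does it: `disable_inplace_relu` and `Block` are hypothetical names, and the helper only touches `nn.ReLU` modules, not functional calls that pass `inplace=True` directly.

```python
import torch
from torch import nn


def disable_inplace_relu(model: nn.Module) -> int:
    """Set inplace=False on every nn.ReLU in the model; return how many changed."""
    changed = 0
    for module in model.modules():
        if isinstance(module, nn.ReLU) and module.inplace:
            module.inplace = False
            changed += 1
    return changed


class Block(nn.Module):
    """Hypothetical layer that triggers the error when ReLU is in-place."""

    def __init__(self):
        super().__init__()
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        h = torch.sigmoid(x)   # sigmoid saves its output h for backward
        return self.act(h)     # an in-place ReLU would overwrite h


block = Block()
changed = disable_inplace_relu(block)

x = torch.randn(2, 3, requires_grad=True)
block(x).sum().backward()      # succeeds once the ReLU is out of place
print(changed, x.grad.shape)
```

The trade-off is slightly higher memory use, since each ReLU now allocates a new output tensor.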

    • Reference:
      https://blog.csdn.net/qq_38163755/article/details/110957133

#mmdetection
Last updated: 2022/09/30, 04:53:04