train issue #5

Closed
RichardMrLu opened this issue Dec 26, 2017 · 7 comments

@RichardMrLu

/faster-rcnn.pytorch/lib/model/rpn/rpn.py:66: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
rpn_cls_prob_reshape = F.softmax(rpn_cls_score_reshape)
Traceback (most recent call last):
  File "trainval_net.py", line 316, in <module>
    _, cls_prob, bbox_pred, rpn_loss, rcnn_loss = fasterRCNN(im_data, im_info, gt_boxes, num_boxes)
  File "/raid/pyenv/versions/2.7.11/lib/python2.7/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/raid/faster-rcnn.pytorch/lib/model/faster_rcnn/faster_rcnn_cascade.py", line 51, in forward
    rois, rpn_loss_cls, rpn_loss_bbox = self.RCNN_rpn(base_feat, im_info, gt_boxes, num_boxes)
  File "/raid/.pyenv/versions/2.7.11/lib/python2.7/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/raid/faster-rcnn.pytorch/lib/model/rpn/rpn.py", line 76, in forward
    im_info, cfg_key))
  File "/raid/pyenv/versions/2.7.11/lib/python2.7/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/raid/faster-rcnn.pytorch/lib/model/rpn/proposal_layer.py", line 105, in forward
    proposals = bbox_transform_inv(anchors, bbox_deltas, batch_size)
  File "/raid/faster-rcnn.pytorch/lib/model/rpn/bbox_transform.py", line 90, in bbox_transform_inv
    pred_w = np.exp(dw) * widths.unsqueeze(2)
  File "/raid/.pyenv/versions/2.7.11/lib/python2.7/site-packages/torch/tensor.py", line 309, in __mul__
    return self.mul(other)
TypeError: mul received an invalid combination of arguments - got (torch.cuda.FloatTensor), but expected one of:

  • (float value)
    didn't match because some of the arguments have invalid types: (torch.cuda.FloatTensor)
  • (torch.FloatTensor other)
    didn't match because some of the arguments have invalid types: (torch.cuda.FloatTensor)

How can I solve this problem? Thanks for your answer.

@jwyang
Owner

jwyang commented Dec 26, 2017

It seems that the data types do not match. What command are you using to run training?

@RichardMrLu
Author

CUDA_VISIBLE_DEVICES=1 python trainval_net.py --dataset pascal_voc --net vgg16 --cuda --bs 8
The dataset is VOC2007.

@jwyang
Owner

jwyang commented Dec 27, 2017

How many GPUs do you have on your machine? Try removing CUDA_VISIBLE_DEVICES=1 and setting the batch size to 1.

@RichardMrLu
Author

I have 4 GPUs. I ran python trainval_net.py --dataset pascal_voc --net vgg16 --cuda --bs 1
but it crashed with the same error. I think some variables in trainval_net.py are created as torch.cuda.FloatTensor, while the code taken from the original py_faster files computes with a different type, which causes the type error.

@jwyang
Owner

jwyang commented Dec 27, 2017

Can you add a breakpoint before "pred_w = np.exp(dw) * widths.unsqueeze(2)" and show the types of dw and widths? I guess dw in your run is not a number, which caused this error.
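
One way to run this check without an interactive debugger is to print the type of each operand just before the failing line. The helper below is only a sketch: `describe` is a hypothetical name, not part of the repository, and it relies on nothing more than Python's built-in `type` plus a `device` attribute when the object happens to have one.

```python
def describe(name, value):
    """Return a one-line summary of a value's type and, if present, its device."""
    t = type(value)
    summary = "%s: %s.%s" % (name, t.__module__, t.__name__)
    # Only some objects (for example newer tensor types) expose a device.
    device = getattr(value, "device", None)
    if device is not None:
        summary += " on %s" % (device,)
    return summary


# Stand-in values for illustration; inside bbox_transform_inv you would
# pass np.exp(dw) and widths.unsqueeze(2) instead.
print(describe("left", 1.5))
print(describe("right", [1, 2, 3]))
```

If the two summaries name different tensor types, the multiplication is mixing operands that the library cannot combine directly.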

@RichardMrLu
Author

I found that type(np.exp(dw)) is torch.FloatTensor, while type(widths.unsqueeze(2)) is torch.cuda.FloatTensor,
which is why training fails. I changed np.exp(dw) to torch.exp(dw), and it began to train.
Thanks for your work.
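
For anyone hitting the same error, the change can be sketched roughly as follows. This is a minimal illustration rather than the repository's actual code: the function name `decode_widths` and the tensor shapes are assumptions, and only the expression `torch.exp(dw) * widths.unsqueeze(2)` comes from this thread. The point is that `torch.exp` returns a tensor on the same device as its input, so the multiplication no longer mixes CPU and CUDA tensors.

```python
import torch


def decode_widths(dw, widths):
    """Scale anchor widths by the exponentiated width deltas.

    Hypothetical helper for illustration: dw is assumed to have shape
    (batch, anchors, 1) and widths shape (batch, anchors).
    """
    # torch.exp keeps the result on dw's device, so both operands of the
    # multiplication live in the same place.
    return torch.exp(dw) * widths.unsqueeze(2)


# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
dw = torch.zeros(2, 3, 1, device=device)
widths = torch.full((2, 3), 16.0, device=device)

pred_w = decode_widths(dw, widths)
print(pred_w.shape, pred_w.device)
```

The sketch uses `torch.device`, which is newer than the PyTorch version in the traceback; on that older version the relevant part is simply replacing `np.exp` with `torch.exp`.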

@jwyang
Owner

jwyang commented Dec 27, 2017 via email
