Error when running hm.py #4

Open
darthgera123 opened this issue May 10, 2021 · 2 comments

darthgera123 commented May 10, 2021

Hi, I wanted to run vilio for my experiment. I made a copy of fts_tsv/hm_data_tsv.py and updated the HMTorchDataset class to read from a single file (instead of splits). The pastebin is here. After passing the data through the model (I'm using U), I get this error:

Traceback (most recent call last):
  File "hm_uniter.py", line 392, in <module>
    main()
  File "hm_uniter.py", line 361, in main
    hm.train(hm.train_tuple, hm.valid_tuple)
  File "hm_uniter.py", line 187, in train
    logit = self.model(sent, (feats, boxes))
  File "/home/darthgera123/anaconda3/envs/vilio/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/darthgera123/vilio/entryU.py", line 200, in forward
    seq_out, pooled_output = self.model(input_ids.cuda(), None, img_feats.cuda(), img_pos_feats.cuda(), attn_masks.cuda(), gather_index=gather_index.cuda())
  File "/home/darthgera123/anaconda3/envs/vilio/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/darthgera123/vilio/src/vilio/modeling_bertU.py", line 418, in forward
    encoded_layers = self.encoder(
  File "/home/darthgera123/anaconda3/envs/vilio/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/darthgera123/vilio/src/vilio/modeling_bertU.py", line 304, in forward
    hidden_states = layer_module(hidden_states, attention_mask)
  File "/home/darthgera123/anaconda3/envs/vilio/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/darthgera123/vilio/src/vilio/modeling_bertU.py", line 185, in forward
    intermediate_output = self.intermediate(attention_output)
  File "/home/darthgera123/anaconda3/envs/vilio/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/darthgera123/vilio/src/vilio/modeling_bertU.py", line 158, in forward
    hidden_states = self.intermediate_act_fn(hidden_states)
  File "/home/darthgera123/anaconda3/envs/vilio/lib/python3.8/site-packages/torch/nn/functional.py", line 1369, in gelu
    return torch._C._nn.gelu(input)
RuntimeError: CUDA error: device-side assert triggered

I haven't changed any other file. Online, the suggested fix is to correct the label numbering, which I don't think is the issue here. The error occurs in entryU.py when running seq_out, pooled_output = self.model(input_ids.cuda(), None, img_feats.cuda(), img_pos_feats.cuda(), attn_masks.cuda(), gather_index=gather_index.cuda()).
Also, I'm using UNITER with bert-base.
Please help, @Muennighoff
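For readers hitting the same message: a device-side assert is often caused by an index that falls outside the table it indexes (for example a token id beyond the embedding size, or a gather index beyond the sequence length), not only by bad labels. The sketch below is one way to check that on the CPU before the .cuda() calls. The helper name check_index_range and the bound vocab_size are hypothetical and not part of vilio; the actual bounds would have to come from the model config.

```python
def check_index_range(name, values, upper):
    """Raise ValueError if any index in `values` falls outside [0, upper)."""
    values = list(values)
    if not values:
        return  # nothing to check
    lo, hi = min(values), max(values)
    if lo < 0 or hi >= upper:
        raise ValueError(
            f"{name}: indices span [{lo}, {hi}] but must lie in [0, {upper})"
        )


# Hypothetical usage in entryU.py, just before the self.model(...) call.
# `vocab_size` is an assumed name for the embedding table size:
#   check_index_range("input_ids", input_ids.flatten().tolist(), vocab_size)
```

The same check could be applied to gather_index, using the combined text plus image sequence length as the upper bound, if that is what it indexes into.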

@darthgera123
Author

Error trace after I added CUDA_LAUNCH_BLOCKING=1 in front of python hm.py, since the error I was getting was deterministic.
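Some context on why this variable helps: CUDA kernels launch asynchronously, so the frame where a device-side assert is reported (gelu in the trace above) is not necessarily the operation that triggered it. With blocking launches, the Python traceback should point at the failing call. Besides prefixing the command on the shell, the variable can be set from Python, as in this minimal sketch. It assumes the assignment runs before anything initializes CUDA, so placing it at the very top of the entry script is the safest choice.

```python
import os

# Assumption: this runs before the first CUDA call (safest: before
# `import torch`), otherwise kernel launches may stay asynchronous.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
```

Running a single batch on the CPU instead is another common option, since an out-of-range index there usually raises an ordinary Python exception with a readable message.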

@Muennighoff
Owner

@darthgera123 sorry for the late reply! This is most likely a shape mismatch. What are the dimensions of your images, and are you using the HM dataset?
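To follow up on the shape-mismatch suggestion, the shapes of the visual inputs can be compared before the forward pass. This is only a sketch: check_visual_inputs is a hypothetical helper, it assumes both tensors are laid out as (batch, num_boxes, dim), and feat_dim and box_dim are assumed parameters that should be read from the model config, not hard-coded.

```python
def check_visual_inputs(feats_shape, boxes_shape, feat_dim, box_dim):
    """Return a list of human-readable shape problems (empty if none)."""
    problems = []
    if len(feats_shape) != 3:
        problems.append(f"feats should be 3-D, got shape {tuple(feats_shape)}")
    if len(boxes_shape) != 3:
        problems.append(f"boxes should be 3-D, got shape {tuple(boxes_shape)}")
    if problems:
        return problems  # the remaining checks need 3-D shapes
    if feats_shape[:2] != boxes_shape[:2]:
        problems.append(
            f"batch/box counts differ: feats {tuple(feats_shape[:2])} "
            f"vs boxes {tuple(boxes_shape[:2])}"
        )
    if feats_shape[2] != feat_dim:
        problems.append(f"feature dim is {feats_shape[2]}, expected {feat_dim}")
    if boxes_shape[2] != box_dim:
        problems.append(f"box dim is {boxes_shape[2]}, expected {box_dim}")
    return problems


# Hypothetical usage in the training loop, before self.model(sent, (feats, boxes)):
#   for p in check_visual_inputs(tuple(feats.shape), tuple(boxes.shape),
#                                feat_dim, box_dim):
#       print("shape problem:", p)
```

Printing the shapes of one batch from the modified single-file dataset next to one from the original split-based dataset would also show quickly whether the change altered them.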
