masking makes data have different shape, leading to stack problem #2

Open
starrlee356 opened this issue Sep 19, 2023 · 0 comments

starrlee356 commented Sep 19, 2023

Hi, thanks for sharing the code!
There is a problem I couldn't solve in word2box-dev-shib/src/language_modeling_with_boxes/datasets/word2vecgpu.py.

In the method __getitem__(self, idx), the line idx = idx.unsqueeze(1) + window_range.unsqueeze(0) raised AttributeError: 'int' object has no attribute 'unsqueeze'. I found that idx is originally an int, so I changed idx += self.pad_size to idx = torch.full(size=tuple([self.pad_size]), fill_value=idx), which resolved this error.
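For reference, a minimal sketch of another way to make the window arithmetic accept a plain int: wrap the index in a one-element tensor instead of filling a tensor of length pad_size. The helper name window_indices and the toy values are hypothetical, and this is not necessarily how the repository intends __getitem__ to be called.

```python
import torch

def window_indices(idx, window_range, pad_size):
    """Hypothetical helper: build context-window indices for an int or a batch of ints."""
    # An int becomes a tensor of shape [1]; a LongTensor of shape [B] stays [B].
    idx = torch.as_tensor(idx, dtype=torch.long).reshape(-1)
    # Shift into the padded corpus, mirroring `idx += self.pad_size`.
    idx = idx + pad_size
    # [B, 1] + [1, W] broadcasts to [B, W].
    return idx.unsqueeze(1) + window_range.unsqueeze(0)

# Toy values (assumptions, not taken from the repository).
window_range = torch.arange(-2, 3)
single = window_indices(7, window_range, pad_size=2)
batch = window_indices(torch.tensor([3, 7, 11]), window_range, pad_size=2)
print(single.shape, batch.shape)
```

With an int input the result has one row per index, so a single index gives shape [1, W] rather than [pad_size, W].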

However, after getting context (tensor[10, 10]) and center (tensor[10, 1]) from the corpus, it raises RuntimeError: stack expects each tensor to be equal size. I assumed this was due to the shape difference between center and context, so I added center = center.unsqueeze(len(context.shape)-1) and center = center.expand_as(context) to give center the same shape as context.
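The shapes described here can be reproduced in isolation. The sketch below uses zero-filled stand-ins for center and context (assumed shapes [10, 1] and [10, 10], as reported) to show when torch.stack fails and what expand_as produces; it does not use the repository's data.

```python
import torch

# Stand-in tensors with the shapes reported in this issue.
context = torch.zeros(10, 10, dtype=torch.long)
center = torch.zeros(10, 1, dtype=torch.long)

# Stacking tensors of different shapes raises RuntimeError.
try:
    torch.stack([center, context])
    stack_failed = False
except RuntimeError:
    stack_failed = True

# Stacking same-shaped tensors (e.g. the same field from two samples) works.
centers = torch.stack([center, center])      # shape [2, 10, 1]
contexts = torch.stack([context, context])   # shape [2, 10, 10]

# A [10, 1] tensor can be broadcast to [10, 10] directly with expand_as.
center_expanded = center.expand_as(context)
print(stack_failed, centers.shape, contexts.shape, center_expanded.shape)
```

If the samples are stacked field by field, center and context do not need matching shapes; only the same field across different samples does.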

Then a new problem appears: after keep is used to drop some of the data, the samples have different shapes: tensor[x, 10], where x is an int between 1 and 10. This raises the RuntimeError again, because stack expects all tensors to have the same shape. I don't know how to handle this. Could you please offer some advice? Thank you so much!
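One possible workaround, sketched under assumptions: pass a custom collate_fn to the DataLoader that concatenates samples along the first dimension instead of stacking them, so each sample may keep a different number of rows. ToyMaskedDataset, cat_collate, and the dict keys are hypothetical stand-ins; the real dataset's return type may differ.

```python
import torch
from torch.utils.data import DataLoader, Dataset

class ToyMaskedDataset(Dataset):
    """Hypothetical stand-in: item i keeps row_counts[i] rows after masking."""

    def __init__(self, row_counts, width=10):
        self.row_counts = row_counts
        self.width = width

    def __len__(self):
        return len(self.row_counts)

    def __getitem__(self, i):
        rows = self.row_counts[i]
        return {
            "center": torch.full((rows, 1), i, dtype=torch.long),
            "context": torch.full((rows, self.width), i, dtype=torch.long),
        }

def cat_collate(samples):
    """Concatenate each field along dim 0, so row counts may differ per sample."""
    return {key: torch.cat([s[key] for s in samples], dim=0) for key in samples[0]}

loader = DataLoader(ToyMaskedDataset([3, 10, 1, 7]), batch_size=2, collate_fn=cat_collate)
batches = list(loader)
for b in batches:
    print(b["center"].shape, b["context"].shape)
```

If __getitem__ is meant to receive a whole batch of indices and return an already-batched result, another option may be to build the DataLoader with batch_size=None, which disables automatic collation so nothing is stacked.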

starrlee356 changed the title from "__getitem__(self, idx: LongTensor), but idx is an int instead of a tensor" to "masking makes data have different shape, leading to stack problem" on Sep 23, 2023