masking makes data have different shape, leading to stack problem #2

starrlee356 · 2023-09-19T03:50:52Z

Hi, thanks for sharing the code!
There is a problem I couldn't solve in word2box-dev-shib/src/language_modeling_with_boxes/datasets/word2vecgpu.py.

In the method `__getitem__(self, idx)`, the line `idx = idx.unsqueeze(1) + window_range.unsqueeze(0)` raised `AttributeError: 'int' object has no attribute 'unsqueeze'`. I found that `idx` is originally an int, so I changed `idx += self.pad_size` to `idx = torch.full(size=tuple([self.pad_size]), fill_value=idx)`, which resolved that error.
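For reference, here is a minimal sketch of another way to handle an int `idx`, assuming the goal is one window of indices per center position. The function name `window_indices` and the toy `window_range` values are made up for illustration and are not from the repository. Note that `torch.full(size=(pad_size,), fill_value=idx)` produces `pad_size` copies of the same index and no longer adds the `pad_size` offset, which may or may not be what the original code intended.

```python
import torch

def window_indices(idx, pad_size, window_range):
    """Hypothetical helper: build context-window indices for int or tensor idx."""
    # as_tensor accepts a plain int as well as an existing tensor;
    # reshape(-1) turns a 0-dim tensor into shape [1].
    idx = torch.as_tensor(idx, dtype=torch.long).reshape(-1)  # shape [B]
    idx = idx + pad_size  # keep the offset that `idx += self.pad_size` applied
    # Broadcast [B, 1] + [1, W] -> [B, W]
    return idx.unsqueeze(1) + window_range.unsqueeze(0)

# Toy window of two positions on each side (an assumption for this example).
window_range = torch.tensor([-2, -1, 1, 2])

single = window_indices(7, pad_size=2, window_range=window_range)
batch = window_indices(torch.tensor([3, 5]), pad_size=2, window_range=window_range)
print(single.shape, batch.shape)
```

With this approach a single int yields one row of window indices rather than `pad_size` identical rows.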

However, after getting `context` (a tensor of shape [10, 10]) and `center` (a tensor of shape [10, 1]) from the corpus, it raises `RuntimeError: stack expects each tensor to be equal size`. I assumed this was due to the shape difference between `center` and `context`, so I added `center = center.unsqueeze(len(context.shape)-1)` and `center = center.expand_as(context)`, which gives `center` the same shape as `context`.
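The shape mismatch described here can be reproduced in isolation. The sketch below uses toy tensors with the reported shapes (the values are arbitrary) and shows both the failure and the `expand_as` workaround; it does not establish where in the repository the `stack` call actually happens.

```python
import torch

# Toy tensors with the shapes reported above; the contents are arbitrary.
center = torch.arange(10).unsqueeze(1)        # shape [10, 1]
context = torch.arange(100).reshape(10, 10)   # shape [10, 10]

try:
    torch.stack([center, context])            # shapes differ, so this raises
    stack_failed = False
except RuntimeError as err:
    stack_failed = True
    print("stack failed:", err)

# expand_as broadcasts the size-1 dimension without copying memory.
center_expanded = center.expand_as(context)   # shape [10, 10]
stacked = torch.stack([center_expanded, context])
print(stacked.shape)
```

Since `center` is already [10, 1] in this toy case, `expand_as` alone is enough; the extra `unsqueeze` is only needed when `center` is 1-dimensional.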

Then a new problem appears: after using `keep` to mask out some of the data, the tensors end up with different shapes: [x, 10], where x is an int between 1 and 10. This raises the `RuntimeError` again, because stack expects every tensor to have the same shape. I don't know how to handle this. Could you please offer some advice? Thank you so much!
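One common way to batch samples with a varying number of rows is a custom `collate_fn` that concatenates along the first dimension instead of stacking. The following is only a sketch under that assumption: `ToyMaskedDataset` and `cat_collate` are hypothetical names, and the dataset merely imitates items whose row count changes after masking. Whether concatenation suits the training loop in this repository is not something the issue confirms.

```python
import torch
from torch.utils.data import DataLoader, Dataset

class ToyMaskedDataset(Dataset):
    """Hypothetical stand-in: item i keeps (i % 3 + 1) rows after masking."""

    def __len__(self):
        return 6

    def __getitem__(self, i):
        rows = i % 3 + 1
        center = torch.full((rows, 1), i, dtype=torch.long)
        context = torch.full((rows, 10), i, dtype=torch.long)
        return center, context

def cat_collate(samples):
    """Concatenate variable-length samples along dim 0 instead of stacking."""
    centers, contexts = zip(*samples)
    return torch.cat(centers, dim=0), torch.cat(contexts, dim=0)

loader = DataLoader(ToyMaskedDataset(), batch_size=3, collate_fn=cat_collate)
batches = list(loader)
for center, context in batches:
    print(center.shape, context.shape)
```

Each batch then has shape [total_rows, 1] and [total_rows, 10], where total_rows is the sum of rows kept across the samples in that batch. An alternative design is to return the unmasked tensors together with the `keep` mask and apply the mask after batching.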

starrlee356 changed the title ~~__getitem__(self, idx: LongTensor), but idx is an int instead of a tensor~~ masking makes data have different shape, leading to stack problem Sep 23, 2023
