Adding an extra layer in the BEiT architecture #157
-
Hi Leondgarse, I am trying to add a Patch Merger layer from https://arxiv.org/pdf/2202.12015.pdf in the middle of the BEiT model, after layer 5. However, the output shape of the Patch Merger layer wasn't compatible with the input expected by the attention block... Any ideas or suggestions? Here is the code for the Patch Merger layer I used: `class PatchMerger(Layer): ...`
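For reference, a minimal Keras sketch of such a Patch Merger layer, following the paper's description (learned queries attend over the incoming tokens and reduce them to a fixed count). This is a reconstruction with assumed defaults, not the exact code from the question; note that it changes the token count, so anything shape-dependent downstream (e.g. BEiT's positional embedding) has to be adjusted as well:

```py
import tensorflow as tf
from tensorflow.keras.layers import Layer, LayerNormalization

class PatchMerger(Layer):
    """Patch Merger sketch: learned queries attend over incoming tokens
    and reduce them to `num_output_tokens` tokens."""
    def __init__(self, num_output_tokens=8, **kwargs):  # default is an arbitrary choice
        super().__init__(**kwargs)
        self.num_output_tokens = num_output_tokens

    def build(self, input_shape):
        dim = input_shape[-1]
        self.scale = dim ** -0.5
        self.norm = LayerNormalization()
        # One learned query vector per output token, shape [num_output_tokens, dim]
        self.queries = self.add_weight(
            name="queries", shape=(self.num_output_tokens, dim),
            initializer="glorot_uniform", trainable=True)
        super().build(input_shape)

    def call(self, inputs):
        # inputs: [batch, num_tokens, dim] -> outputs: [batch, num_output_tokens, dim]
        normed = self.norm(inputs)
        sim = tf.einsum("md,bnd->bmn", self.queries, normed) * self.scale
        attn = tf.nn.softmax(sim, axis=-1)
        return tf.einsum("bmn,bnd->bmd", attn, normed)

    def get_config(self):
        config = super().get_config()
        config.update({"num_output_tokens": self.num_output_tokens})
        return config
```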
-
Updated: added `patch_merging` for beit. This should also consider the cls token:

```py
from keras_cv_attention_models import beit

mm = beit.ViTBasePatch16(patch_merging_block_id=5)
# >>>> Before patch merging: blocks with cls token: 145, attn_height: 12
# >>>> After patch merging: blocks with cls token: 9, attn_height: 3

mm = beit.BeitV2BasePatch16(patch_merging_block_id=5, patch_merging_num_tokens=16, pretrained=None)
# >>>> Before patch merging: blocks with cls token: 197, attn_height: 14
# >>>> After patch merging: blocks with cls token: 17, attn_height: 4
```
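For context on where those printed counts come from, a rough sketch of the arithmetic (the input sizes 192 and 224 are inferred from the printed counts above, not checked against the model defaults):

```py
# Token counts before merging: (input_size // patch_size) ** 2 patch tokens + 1 cls token.
for name, input_size in [("ViTBasePatch16", 192), ("BeitV2BasePatch16", 224)]:
    attn_height = input_size // 16              # patches per side, patch size 16
    blocks = attn_height * attn_height + 1      # patch tokens + cls token
    print(f"{name}: blocks with cls token: {blocks}, attn_height: {attn_height}")
# ViTBasePatch16: blocks with cls token: 145, attn_height: 12
# BeitV2BasePatch16: blocks with cls token: 197, attn_height: 14

# After merging: patch_merging_num_tokens + 1 cls token, attn_height = sqrt(num_tokens).
print(16 + 1, int(16 ** 0.5))  # 17 blocks with cls token, attn_height 4
```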
-
Ya, for models with `positional_embedding` like `BeitV2BasePatch16`, it can't be set `patch_merging_num_tokens=8` with a square `input_shape`. As you can see in the printed info `After patch merging: blocks with cls token: 9, attn_height: 3`, that means `attention_blocks = 9 - 1 = 8`, which is not divisible by `attn_height=3`, so `attn_width = int(8 / 3) = 2`. The `positional_embedding` will then have `attn_height * attn_width + 1 = 7` tokens, not matching the `9`. It works with other values like:
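A minimal sketch of the divisibility check described above (the `ceil(sqrt(...))` choice for `attn_height` is an assumption here, not taken from the library source), showing which `patch_merging_num_tokens` values form a full grid:

```py
import math

def merged_grid_ok(num_tokens):
    # Assumed layout: attn_height = ceil(sqrt(num_tokens)), attn_width = num_tokens // attn_height.
    attn_height = math.ceil(math.sqrt(num_tokens))
    attn_width = num_tokens // attn_height
    ok = attn_height * attn_width == num_tokens
    print(f"num_tokens={num_tokens}: attn_height={attn_height}, attn_width={attn_width}, ok={ok}")
    return ok

for num_tokens in [8, 9, 12, 16]:
    merged_grid_ok(num_tokens)
# num_tokens=8: attn_height=3, attn_width=2, ok=False  -> 3 * 2 + 1 = 7 != 9
# num_tokens=9: attn_height=3, attn_width=3, ok=True
# num_tokens=12: attn_height=4, attn_width=3, ok=True
# num_tokens=16: attn_height=4, attn_width=4, ok=True
```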