[BUG] Bottleneck adapters do not work with the ViT model when original_ln_after = False
#764
Labels: question (further information is requested)
It seems like the ViT model does not train well with the bottleneck adapter configs when the parameter `original_ln_after` is set to `False`.
To reproduce