I'm rather unfamiliar with semantic segmentation models, but the default model is built for classification, so its output shape is `(batch, num_classes)`. Just make sure the model output shape matches your y labels. An example could be:
import kecam
from kecam.backend import layers, models
from kecam.attention_layers import conv2d_no_bias, layer_norm, activation_by_name
""" EVA backbone """
patch_size = 4 # input_shape should preferably be divisible by patch_size
backbone = kecam.models.EvaLargePatch14(input_shape=(196, 196, 3), num_classes=0, patch_size=patch_size)
print(f"{backbone.layers[-3].output_shape = }") # layer before `reduce_mean`
# backbone.layers[-3].output_shape = (None, 2401, 1024)
inputs = backbone.inputs
nn = backbone.layers[-3].output
nn = layers.Reshape([-1, int(nn.shape[1] ** 0.5), nn.shape[-1]])(nn)
print(f"{nn.shape = }")
# nn.shape = TensorShape([None, 49, 49, 1024])
""" Neck """
embed_dims = 256
nn = conv2d_no_bias(nn, embed_dims, kernel_size=1, use_bias=False, name="neck_1_")
nn = layer_norm(nn, name="neck_1_")
nn = conv2d_no_bias(nn, embed_dims, kernel_size=3, padding="SAME", use_bias=False, name="neck_2_")
nn = layer_norm(nn, name="neck_2_")
print(f"{nn.shape = }")
# nn.shape = TensorShape([None, 49, 49, 256])
""" Upsample 4x """
activation = "gelu"
nn = layers.Conv2DTranspose(embed_dims // 4, kernel_size=2, strides=2, name="up_1_conv_transpose")(nn)
nn = layer_norm(nn, epsilon=1e-6, name="up_1_") # epsilon is fixed at 1e-6
nn = activation_by_name(nn, activation=activation, name="up_1_")
nn = layers.Conv2DTranspose(embed_dims // 8, kernel_size=2, strides=2, name="up_2_conv_transpose")(nn)
nn = activation_by_name(nn, activation=activation, name="up_2_")
print(f"{nn.shape = }")
# nn.shape = TensorShape([None, 196, 196, 32])
""" Output head """
num_classes = 5
nn = conv2d_no_bias(nn, num_classes, kernel_size=3, padding="SAME", use_bias=True, name="output_")
output = activation_by_name(nn, activation="softmax", name="output_")
model = models.Model(inputs, output, name=backbone.name + "_segment")
print(f"{model.output_shape = }")
# model.output_shape = (None, 196, 196, 5)
Then this model should be usable with your y labels.
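The shapes printed in the example follow from simple arithmetic, which can be checked without building the model. The sketch below is only an illustration: `trace_spatial_size` is a hypothetical helper, not part of kecam or Keras, and it assumes each `Conv2DTranspose` uses `kernel_size` equal to `strides` with the default "valid" padding, as in the example above.

```python
def trace_spatial_size(input_size, patch_size, upsample_strides):
    """Follow one spatial side length through patch embedding and upsampling."""
    if input_size % patch_size != 0:
        raise ValueError("input_size should be divisible by patch_size")
    grid = input_size // patch_size  # side length of the token grid
    num_tokens = grid * grid         # sequence length of the backbone output
    out_size = grid
    for stride in upsample_strides:
        # With kernel_size == strides and "valid" padding,
        # a transposed convolution multiplies the side length by the stride.
        out_size *= stride
    return grid, num_tokens, out_size


print(trace_spatial_size(196, 4, (2, 2)))  # (49, 2401, 196)
```

The output only returns to the input resolution when the product of the upsample strides equals `patch_size`, which is why two stride-2 layers pair with `patch_size = 4` here.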
-
For example EVA (keras_cv_attention_models/keras_cv_attention_models/beit)
Can models like EVA be used for semantic segmentation? Is it possible to adapt them in a simple way, or do I need to build a model from scratch?
The input shapes for semantic segmentation look like this:
train = (number of images, size x, size y, 3 (RGB))
mask = (number of masks, size x, size y, 5 (5 classes, to_categorical))
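A minimal sketch of how masks in that layout could be produced, assuming the raw masks hold one integer class ID (0 to 4) per pixel. The array names and the random data are hypothetical, and NumPy indexing stands in here for `to_categorical`:

```python
import numpy as np

num_classes = 5

# Hypothetical integer masks: 4 images of 196x196, one class ID per pixel.
rng = np.random.default_rng(0)
mask_ids = rng.integers(0, num_classes, size=(4, 196, 196))

# One-hot encode along a new last axis by indexing rows of an identity matrix.
mask_onehot = np.eye(num_classes, dtype="float32")[mask_ids]

print(mask_onehot.shape)  # (4, 196, 196, 5)
```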
When I created an EVA model and fit it with training and validation data in that format, the following error occurred:
'ValueError: Shapes (4, 196, 196, 5) and (4, 5) are incompatible'
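The message can be read as a mismatch between the label shape and the model output shape: the masks are `(4, 196, 196, 5)`, while the other shape in the error, `(4, 5)`, is one class vector per image. A small sketch of that comparison, using a hypothetical helper `shapes_match` that is not part of Keras or kecam:

```python
def shapes_match(y_true_shape, y_pred_shape):
    """Return True when label and prediction shapes agree on every axis.

    None stands for an unknown (batch) dimension and matches anything.
    """
    if len(y_true_shape) != len(y_pred_shape):
        return False
    return all(
        t is None or p is None or t == p
        for t, p in zip(y_true_shape, y_pred_shape)
    )


print(shapes_match((4, 196, 196, 5), (4, 5)))               # False
print(shapes_match((4, 196, 196, 5), (None, 196, 196, 5)))  # True
```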
This is the code part:
eva1 = eva.EvaGiantPatch14(input_shape=(196, 196, 3), num_classes=5, activation="gelu", classifier_activation="softmax", pretrained="imagenet21k-ft1k")
eva1.compile(
    loss=keras.losses.CategoricalCrossentropy(),
    optimizer=Adam(learning_rate=0.001),
    metrics=[CategoricalAccuracy()],
)
eva1_history = eva1.fit(
    X_train.astype("float32"),
    y_train_cat,
    verbose=1,
    batch_size=adjusted_batch_size,
    validation_data=(X_test.astype("float32"), y_test_cat),
    shuffle=True,
    epochs=10,
)
Is it not possible to utilize it this way?
Thank you so much for your help!