
Transformation matrix is shared among the set of capsule types #21

Open · cheneeheng opened this issue Feb 25, 2019 · 2 comments

cheneeheng commented Feb 25, 2019

I have a question regarding this part:

SegCaps/capsule_layers.py

Lines 128 to 139 in c6b3f9e

```python
input_tensor_reshaped = K.reshape(input_transposed, [
    input_shape[0] * input_shape[1], self.input_height, self.input_width, self.input_num_atoms])
input_tensor_reshaped.set_shape((None, self.input_height, self.input_width, self.input_num_atoms))

conv = K.conv2d(input_tensor_reshaped, self.W, (self.strides, self.strides),
                padding=self.padding, data_format='channels_last')

votes_shape = K.shape(conv)
_, conv_height, conv_width, _ = conv.get_shape()

votes = K.reshape(conv, [input_shape[1], input_shape[0], votes_shape[1], votes_shape[2],
                         self.num_capsule, self.num_atoms])
```

So you reshaped the 5D input tensor into 4D and used a normal conv2d to perform the dimension transformation.

Here, during the reshape, you merged the "batch_size" and "input_num_capsule" dimensions. But doing so means that the same conv weights are used for each input capsule type.
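To make this concrete, here is a minimal shape walkthrough of the fold-into-batch pattern (TF2; the sizes and variable names like `in_caps` are illustrative, not from the repo):

```python
import tensorflow as tf

batch, in_caps, H, W, in_atoms = 2, 3, 8, 8, 4   # illustrative sizes
out_caps, out_atoms, k = 5, 16, 3

# After the transpose in the repo the capsule axis leads; merge it into batch.
x = tf.random.normal([in_caps * batch, H, W, in_atoms])

# One kernel with no input-capsule axis: (k, k, in_atoms, out_caps * out_atoms).
W_shared = tf.random.normal([k, k, in_atoms, out_caps * out_atoms])

# Every one of the in_caps * batch "images" is convolved with the same W_shared,
# i.e. each input capsule type receives an identical transformation.
conv = tf.nn.conv2d(x, W_shared, strides=1, padding='SAME')

votes = tf.reshape(conv, [in_caps, batch, H, W, out_caps, out_atoms])
print(votes.shape)  # (3, 2, 8, 8, 5, 16)
```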

However, in Section 3.1, contribution 2(ii) of the paper, you mention that the transformation matrices are shared within a capsule type. Do correct me if I am wrong.

Thanks and good work btw.

Chen.

lalonderodney (Owner) commented

Hello @cheneeheng,

The transformation matrices are shared across child capsule types but are unique for parent capsule types. Therefore, each incoming capsule type will have the same set (one per parent capsule type) of transformations applied to it, whereas in the original paper no such sharing was described. From the original Sabour et al. paper: "In convolutional capsule layers, each capsule outputs a local grid of vectors to each type of capsule in the layer above using different transformation matrices for each member of the grid as well as for each type of capsule".
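Read against the code above, this sharing falls out of the weight shape (a hedged sketch with illustrative sizes; the names are mine): the kernel carries no input-capsule axis, so all child types share it, while its output channels decompose into one transformation per parent type.

```python
import tensorflow as tf

k, in_atoms, out_caps, out_atoms = 3, 4, 5, 16   # illustrative sizes

# The shared kernel has no input-capsule axis, so all child types share it...
W_shared = tf.random.normal([k, k, in_atoms, out_caps * out_atoms])

# ...but its output channels decompose into one (k, k, in_atoms, out_atoms)
# transformation per parent capsule type.
per_parent = tf.reshape(W_shared, [k, k, in_atoms, out_caps, out_atoms])
W_parent_0 = per_parent[..., 0, :]               # transformation for parent type 0
print(W_parent_0.shape)                          # (3, 3, 4, 16)
```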

I hope this clears things up and thank you for the kind words.

msseibel commented Apr 9, 2019

When I compare your implementation of the transformation matrices:

```python
self.W = self.add_weight(shape=[self.kernel_size, self.kernel_size, self.input_num_atoms,
                                self.num_capsule * self.num_atoms],
                         initializer=self.kernel_initializer, name='W')
```

with Sabour's implementation:

```python
kernel = variables.weight_variable(shape=[kernel_size, kernel_size, input_atoms,
                                          output_dim * output_atoms])
```

I cannot see any difference in the transformation matrices.
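Indeed, under illustrative sizes (mapping SegCaps' `input_num_atoms`/`num_capsule`/`num_atoms` onto Sabour's `input_atoms`/`output_dim`/`output_atoms`), the two weight tensors have the same 4-D shape, and neither has an input-capsule axis:

```python
# Hypothetical sizes, just to compare the two weight shapes symbolically.
kernel_size, input_atoms, output_dim, output_atoms = 5, 8, 10, 16

segcaps_shape = (kernel_size, kernel_size, input_atoms, output_dim * output_atoms)
sabour_shape  = (kernel_size, kernel_size, input_atoms, output_dim * output_atoms)
assert segcaps_shape == sabour_shape   # identical up to variable naming
```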

Sabour has also folded the input capsules into the batch dimension: https://github.com/Sarasra/models/blob/984fbc754943c849c55a57923f4223099a1ff88c/research/capsules/models/layers/layers.py#L244
So I guess that Sabour has adapted her code and is sharing the matrices now?
