Bert4rec #197
base: experimental/sasrec
Conversation
self.shuffle_train = shuffle_train
# TODO: add SequenceDatasetType for fit and recommend

def process_dataset_train(self, dataset: Dataset) -> Dataset:
If this method is the same for SASRec and BERT4Rec, just move it to SessionEncoderDataPreparatorBase. Do this for all of the methods that don't have any differences.
Ideally, only the 2 collate fn methods will differ.
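Rough sketch of that split (class and method names follow this PR, signatures are placeholders): the base class owns everything shared, and only the collate fns stay model-specific.

```python
import typing as tp

from rectools.dataset import Dataset


class SessionEncoderDataPreparatorBase:
    """All preprocessing shared by SASRec and BERT4Rec lives here, implemented once."""

    def __init__(self, session_max_len: int, shuffle_train: bool = True) -> None:
        self.session_max_len = session_max_len
        self.shuffle_train = shuffle_train

    def process_dataset_train(self, dataset: Dataset) -> Dataset:
        ...  # identical filtering / id-mapping logic, body omitted in this sketch

    def _collate_fn_train(self, batch: tp.List[tp.Any]) -> tp.Any:
        raise NotImplementedError  # model-specific

    def _collate_fn_recommend(self, batch: tp.List[tp.Any]) -> tp.Any:
        raise NotImplementedError  # model-specific


class SASRecDataPreparator(SessionEncoderDataPreparatorBase):
    def _collate_fn_train(self, batch: tp.List[tp.Any]) -> tp.Any:
        ...  # shift-by-one targets


class BERT4RecDataPreparator(SessionEncoderDataPreparatorBase):
    def _collate_fn_train(self, batch: tp.List[tp.Any]) -> tp.Any:
        ...  # masked-item targets drawn with mask_prob
```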
    torch.ones((session_max_len, session_max_len), dtype=torch.bool, device=sessions.device)
)
timeline_mask = sessions != 0
attn_mask = ~timeline_mask.unsqueeze(1).repeat(self.n_heads, timeline_mask.squeeze(-1).shape[1], 1)
What exactly is timeline_mask.squeeze(-1).shape[1] when timeline_mask has shape [batch_size, session_max_len]?
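For reference, a quick shape check of that expression (shapes taken from the diff context; the sizes here are just illustrative):

```python
import torch

batch_size, session_max_len, n_heads = 4, 32, 2
sessions = torch.randint(0, 10, (batch_size, session_max_len))

timeline_mask = sessions != 0              # [batch_size, session_max_len]
# squeeze(-1) is a no-op on a 2-D tensor, so shape[1] is simply session_max_len
print(timeline_mask.squeeze(-1).shape[1])  # 32

attn_mask = ~timeline_mask.unsqueeze(1).repeat(n_heads, session_max_len, 1)
print(attn_mask.shape)                     # [batch_size * n_heads, session_max_len, session_max_len]
```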
# #### -------------- Session Encoder -------------- #### #


class TransformerBasedSessionEncoder(torch.nn.Module):
We need one common class for all models. If there is some custom logic, just create flags and pass them from the models during TransformerBasedSessionEncoder initialization.
Since we haven't decided on the masks yet, let's make that a flag too, and pass it to model initialization as well. This will simplify these experiments for us.
TransformerBasedSessionEncoder should be imported from sasrec.py (until we move everything to the correct modules).
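A minimal sketch of the flag-based version (the flag name use_causal_attn_mask is an assumption, not the final API):

```python
import typing as tp

import torch


class TransformerBasedSessionEncoder(torch.nn.Module):
    """One encoder class shared by SASRec and BERT4Rec; custom logic sits behind flags."""

    def __init__(self, n_heads: int, use_causal_attn_mask: bool = True) -> None:
        super().__init__()
        self.n_heads = n_heads
        self.use_causal_attn_mask = use_causal_attn_mask  # True for SASRec, False for BERT4Rec

    def _build_attn_mask(self, sessions: torch.Tensor) -> tp.Optional[torch.Tensor]:
        if not self.use_causal_attn_mask:
            return None
        session_max_len = sessions.shape[1]
        return ~torch.tril(
            torch.ones((session_max_len, session_max_len), dtype=torch.bool, device=sessions.device)
        )

    def forward(self, sessions: torch.Tensor) -> torch.Tensor:
        attn_mask = self._build_attn_mask(sessions)
        # embeddings + transformer layers go here and receive attn_mask (possibly None)
        return sessions
```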
def on_train_start(self) -> None:
    """TODO"""
    self._truncated_normal_init()
If this is the only difference, then let's just use the one that is used in SASRec without overriding the class.
Just import SessionEncoderLightningModule from sasrec.py until we move everything to the correct modules.
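In code that's just a re-use, no new subclass (the module path here is assumed from the current experimental layout):

```python
# reuse the SASRec lightning module as-is until modules are reorganized
from rectools.models.sasrec import SessionEncoderLightningModule
```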
lr: float = 0.01,
dataloader_num_workers: int = 0,
train_min_user_interaction: int = 2,
mask_prob: float = 0.15,
As far as I can see, the only difference between this model and SASRec is receiving mask_prob and passing it to data_preparator_type.
What you need to do (see the sketch below):
- Create a TransformerModelBase class with all of the common methods already implemented. Look at https://github.com/MobileTeleSystems/RecTools/blob/main/rectools/models/base.py for an example. Carefully select which arguments must be passed to init: mask_prob shouldn't be there at all, transformer_layers_type and data_preparator_type should not have default values, and self.data_preparator should not be initialized but only declared, like self.data_preparator: SessionEncoderDataPreparatorBase.
- In SASRecModel init, create default values for transformer_layers_type and data_preparator_type. Initialize self.data_preparator.
- In BERT4RecModel init, create default values for transformer_layers_type and data_preparator_type. Initialize self.data_preparator.
If something fails for the linters, let's discuss. But at first glance this should work.
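A condensed sketch of that layout (signatures are illustrative; the point is where mask_prob lives and where defaults are set):

```python
import typing as tp


class SessionEncoderDataPreparatorBase:  # stub, see the data preparator discussion above
    ...


class SASRecDataPreparator(SessionEncoderDataPreparatorBase):
    ...


class BERT4RecDataPreparator(SessionEncoderDataPreparatorBase):
    def __init__(self, mask_prob: float = 0.15) -> None:
        self.mask_prob = mask_prob


class TransformerModelBase:
    """Common fit/recommend machinery, analogous to rectools/models/base.py."""

    def __init__(
        self,
        transformer_layers_type: tp.Type,  # no default in the base class
        data_preparator_type: tp.Type,     # no default in the base class
        lr: float = 0.01,
        dataloader_num_workers: int = 0,
    ) -> None:
        self.transformer_layers_type = transformer_layers_type
        self.data_preparator_type = data_preparator_type
        self.lr = lr
        self.dataloader_num_workers = dataloader_num_workers
        # declared but not initialized - each subclass builds its own preparator
        self.data_preparator: SessionEncoderDataPreparatorBase


class SASRecModel(TransformerModelBase):
    def __init__(self, lr: float = 0.01) -> None:
        super().__init__(
            transformer_layers_type=object,  # placeholder for SASRecTransformerLayers
            data_preparator_type=SASRecDataPreparator,
            lr=lr,
        )
        self.data_preparator = SASRecDataPreparator()


class BERT4RecModel(TransformerModelBase):
    def __init__(self, mask_prob: float = 0.15, lr: float = 0.01) -> None:
        super().__init__(
            transformer_layers_type=object,  # placeholder for BERT4RecTransformerLayers
            data_preparator_type=BERT4RecDataPreparator,
            lr=lr,
        )
        # mask_prob never reaches TransformerModelBase; it only configures the preparator
        self.data_preparator = BERT4RecDataPreparator(mask_prob=mask_prob)
```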
return recommend_dataloader


class PointWiseFeedForward(nn.Module):
Now about the feed-forward differences. There are a few:
- The order of dropout and activation. For ReLU it makes no difference in the result, so let's change the order in the SASRec variant and it will match BERT4Rec.
- The activation function. Let's just add an argument to the PointWiseFeedForward init specifying which activation to use.
- Placing the second dropout inside PointWiseFeedForward.forward vs. moving it to TransformerLayers.forward. We can move it to TransformerLayers in SASRecTransformerLayers so that it looks more like the classic picture of the transformer architecture.
As far as I can see, after all of this we will have one PointWiseFeedForward class for both models, which is good.
Please make this change as a separate commit and check the SASRec metrics.
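A possible unified block after those changes (hidden sizes in the usage lines are only examples; the second dropout is assumed to move into TransformerLayers.forward):

```python
import torch
from torch import nn


class PointWiseFeedForward(nn.Module):
    """Shared feed-forward block; the activation is injected instead of hard-coded."""

    def __init__(self, n_factors: int, n_factors_ff: int, dropout_rate: float, activation: nn.Module) -> None:
        super().__init__()
        self.ff_linear1 = nn.Linear(n_factors, n_factors_ff)
        self.ff_activation = activation
        self.ff_dropout1 = nn.Dropout(dropout_rate)
        self.ff_linear2 = nn.Linear(n_factors_ff, n_factors)

    def forward(self, seqs: torch.Tensor) -> torch.Tensor:
        # activation before dropout: for ReLU the result is the same either way,
        # and this order matches the BERT4Rec variant
        output = self.ff_dropout1(self.ff_activation(self.ff_linear1(seqs)))
        return self.ff_linear2(output)


# usage (illustrative): ReLU for SASRec, GELU for BERT4Rec
sasrec_ff = PointWiseFeedForward(64, 64, 0.2, nn.ReLU())
bert4rec_ff = PointWiseFeedForward(64, 256, 0.2, nn.GELU())
```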
for i in range(self.n_blocks):
    mha_input = self.layer_norm1[i](seqs)
    # mha_output, _ =
    # self.multi_head_attn[i](mha_input, mha_input, mha_input, attn_mask=attn_mask, need_weights=False)
This is not handy at all :)
Here, just always pass the attn_mask you received in forward. And create a flag in the model controlling whether to create this mask or not. If not, pass it as None.
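Sketched out (names are illustrative): the layers always consume whatever mask forward received, and the model decides whether to build one.

```python
import typing as tp

import torch
from torch import nn


class TransformerLayers(nn.Module):
    def __init__(self, n_blocks: int, n_factors: int, n_heads: int, dropout_rate: float) -> None:
        super().__init__()
        self.n_blocks = n_blocks
        self.layer_norm1 = nn.ModuleList([nn.LayerNorm(n_factors) for _ in range(n_blocks)])
        self.multi_head_attn = nn.ModuleList(
            [nn.MultiheadAttention(n_factors, n_heads, dropout_rate, batch_first=True) for _ in range(n_blocks)]
        )

    def forward(self, seqs: torch.Tensor, attn_mask: tp.Optional[torch.Tensor]) -> torch.Tensor:
        # attn_mask comes straight from the model: a causal mask when its flag is set
        # (SASRec), None otherwise (BERT4Rec). No mask construction in the layers.
        for i in range(self.n_blocks):
            mha_input = self.layer_norm1[i](seqs)
            mha_output, _ = self.multi_head_attn[i](
                mha_input, mha_input, mha_input, attn_mask=attn_mask, need_weights=False
            )
            seqs = seqs + mha_output
        return seqs
```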
ff_output = self.feed_forward[i](ff_input)
seqs = seqs + self.dropout2[i](ff_output)
seqs = seqs * timeline_mask
# seqs = self.dropout3[i](seqs)
We don't need it, do we? Is there a reference for this?
Added BERT4Rec model