Hello,
I have a question about training cap_model with the end-to-end masked transformer.
In the code, cap_model is trained with `window_mask = gate_scores * pred_bin_window_mask.view(B, T, 1)`.
As I understand it, pred_bin_window_mask is extracted from the model's own predictions.
Therefore, is the caption model (cap_model) trained on the learned proposals rather than on the ground-truth labeled segments?
Is my understanding correct?
Also, if cap_model is trained on the learned proposals, it could be strongly affected by the initial performance of the proposal module, which seems like it would make learning unstable.
If I have misunderstood anything, please point it out.
Thank you.
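For reference, here is a minimal sketch of the masking step I am asking about, with hypothetical shapes and numpy in place of torch. The key point (as I understand it) is that the hard 0/1 mask comes from the proposal predictions, while the soft gate scores keep the operation differentiable:

```python
import numpy as np

# Hypothetical shapes: B = batch size, T = number of temporal positions.
B, T = 2, 5

# pred_bin_window_mask: hard 0/1 mask derived from the proposal module's
# predictions (not from ground-truth labels) -- this is the point in question.
pred_bin_window_mask = np.array([[1, 1, 0, 0, 0],
                                 [0, 1, 1, 1, 0]], dtype=np.float32)

# gate_scores: soft per-position confidence, shape (B, T, 1); gradients can
# flow through these even though the binary mask itself is non-differentiable.
gate_scores = np.random.rand(B, T, 1).astype(np.float32)

# The combined mask used to train cap_model: the predicted binary window
# zeroes out positions outside the learned proposal.
window_mask = gate_scores * pred_bin_window_mask.reshape(B, T, 1)

assert window_mask.shape == (B, T, 1)
# Positions outside the predicted window are exactly zero.
assert np.all(window_mask[0, 2:, 0] == 0)
```

So if this sketch matches the actual code, the caption model only ever sees features inside windows the proposal module predicted, which is why I worry about sensitivity to the proposal module's early-training quality.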