My learning rate wasn't tuned properly #164
Comments
The last encoder layer's output is exactly what gets fed into every decoder layer as K and V.

The author's implementation is correct. Go read the paper: the final encoder layer's output is passed to all decoder layers.

The author is not wrong; it's your understanding that's off.

Right, I know. I later tried both approaches: the author's version gives better results, while my version runs faster.
In model.py, inside `def train()` (around lines 140–141):

```python
memory, sents1, src_masks = self.encode(xs)
logits, preds, y, sents2 = self.decode(ys, memory, src_masks)
```

I had assumed that each encoder block's output is fed into the corresponding decoder block. But in this code, `memory` is the output of only the last encoder block, and the author passes it directly into every decoder block. Isn't that a mistake? I think the encoder should store each layer's output in a list, pass the list to the decoder, and feed each decoder block its corresponding entry.

What does everyone think?
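As the comments note, the standard Transformer feeds only the final encoder layer's output (the `memory`) into the cross-attention of every decoder layer. Below is a minimal NumPy sketch of that wiring; it is a toy illustration, not the repo's actual code, and the layer functions are simplified stand-ins for real self-attention + feed-forward sublayers:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # toy model dimension

def cross_attention(q, memory):
    # Scaled dot-product attention: queries come from the decoder,
    # keys and values are both taken from the encoder memory.
    scores = q @ memory.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ memory

def encoder_layer(x, w):
    # Stand-in for self-attention + FFN: a linear map with ReLU.
    return np.maximum(x @ w, 0.0)

def decoder_layer(y, memory, w_self):
    y = np.maximum(y @ w_self, 0.0)        # stand-in for masked self-attention
    return y + cross_attention(y, memory)  # residual over cross-attention

num_layers = 3
x = rng.normal(size=(5, d))  # source sequence, length 5
y = rng.normal(size=(6, d))  # target sequence, length 6

# Encode: stack the layers; only the LAST layer's output becomes the memory.
h = x
for w in [rng.normal(size=(d, d)) for _ in range(num_layers)]:
    h = encoder_layer(h, w)
memory = h

# Decode: the SAME memory tensor is handed to every decoder layer.
out = y
for w in [rng.normal(size=(d, d)) for _ in range(num_layers)]:
    out = decoder_layer(out, memory, w)
```

The per-layer variant proposed in the question (a list of every encoder layer's output, routed layer-to-layer) would also type-check, but it is a different architecture from the one in the paper, which the repo follows.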