
fix: zh-CN chapter1-4 typo #832


Open · wants to merge 1 commit into main
2 changes: 1 addition & 1 deletion chapters/zh-CN/chapter1/4.mdx
@@ -161,7 +161,7 @@ Transformer 架构最初是为翻译而设计的。在训练期间,编码器
<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface-course/documentation-images/resolve/main/en/chapter1/transformers-dark.svg" alt="Architecture of a Transformers models"/>
</div>

- 注意,解码器块中的第一个注意力层关联到解码器的所有(过去的)输入,但是第二个注意力层只使用编码器的输出。因此,它在预测当前单词时,可以使用整个句子的信息。这是非常有用的,因因为不同的语言可以有把词放在不同顺序的语法规则,或者句子后面提供的一些上下文可能有助于确定给定单词的最佳翻译。
+ 注意,解码器块中的第一个注意力层关联到解码器的所有(过去的)输入,但是第二个注意力层只使用编码器的输出。因此,它在预测当前单词时,可以使用整个句子的信息。这是非常有用的,因为不同的语言可以有把词放在不同顺序的语法规则,或者句子后面提供的一些上下文可能有助于确定给定单词的最佳翻译。

也可以在编码器/解码器中使用*attention mask(注意力掩码层)*,以防止模型关注到某些特殊单词。例如,用于在批量处理句子时使所有输入长度一致的特殊填充词。

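For context, the last paragraph in the diff describes the *attention mask* used to keep a model from attending to padding tokens in a batch. A minimal sketch of that behaviour with the 🤗 Transformers tokenizer follows; it is not part of this PR, and the `bert-base-uncased` checkpoint is chosen purely for illustration:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Two sentences of different lengths; padding=True pads the shorter
# sequence so the batch forms a rectangular tensor.
batch = tokenizer(
    ["I've been waiting for a course like this.", "Me too!"],
    padding=True,
    return_tensors="pt",
)

# attention_mask contains 1 for real tokens and 0 for padding tokens,
# so the attention layers ignore the padding positions.
print(batch["input_ids"])
print(batch["attention_mask"])
```

Passing `batch["attention_mask"]` to the model alongside `input_ids` is what prevents the special padding tokens from influencing the representations of the real words.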