📝 Mixed up quiz correct answers in Chapter12/2.mdx #817

Status: Open · wants to merge 1 commit into base: main
26 changes: 13 additions & 13 deletions chapters/en/chapter12/2.mdx
@@ -166,15 +166,15 @@ In the next module, we'll get our hands dirty and dive into the DeepSeek R1 paper
 
 <Question
   choices={[
+    {
+      text: "It makes models generate text faster",
+      explain: "RLHF isn't primarily about improving generation speed."
+    },
     {
       text: "It helps align models with human preferences and values",
       explain: "Correct! RLHF uses human feedback to guide models toward more helpful, harmless, and aligned behavior.",
       correct: true
     },
-    {
-      text: "It makes models generate text faster",
-      explain: "RLHF isn't primarily about improving generation speed."
-    },
     {
       text: "It reduces the model's memory usage",
       explain: "RLHF doesn't focus on model efficiency or memory optimization."
@@ -186,18 +186,18 @@ In the next module, we'll get our hands dirty and dive into the DeepSeek R1 paper
 
 <Question
   choices={[
     {
-      text: "Generating words or choosing responses in a conversation",
-      explain: "Correct! For LLMs, actions typically involve text generation decisions.",
-      correct: true
-    },
-    {
       text: "Updating model weights",
       explain: "This is part of the training process, not an action in the RL context."
     },
     {
       text: "Processing input tokens",
       explain: "This is part of the model's operation, not an action in the RL context."
+    },
+    {
+      text: "Generating words or choosing responses in a conversation",
+      explain: "Correct! For LLMs, actions typically involve text generation decisions.",
+      correct: true
     }
   ]}
 />
@@ -226,15 +226,15 @@ In the next module, we'll get our hands dirty and dive into the DeepSeek R1 paper
 
 <Question
   choices={[
+    {
+      text: "A function that generates responses",
+      explain: "Rewards are feedback on response quality, not the generation process itself."
+    },
     {
       text: "A numerical score that measures the quality of a response",
       explain: "Correct! Rewards provide feedback on response quality, guiding the model toward desired behavior.",
       correct: true
     },
-    {
-      text: "A function that generates responses",
-      explain: "Rewards are feedback on response quality, not the generation process itself."
-    },
     {
       text: "A model that evaluates the quality of responses",
       explain: "Rewards are feedback on response quality, not an evaluation model."
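The PR reorders each question's choices by hand in the `.mdx` source so the correct answer no longer sits in the same position in every quiz. As an illustration only — this helper is hypothetical and is not part of the PR or the course tooling — the same "mix up the positions" step could be sketched in Python, treating each choice as a dict with the same `text` / `explain` / `correct` fields used by the `<Question choices={[...]} />` component:

```python
import random

def shuffle_choices(choices, seed=None):
    """Return a shuffled copy of a question's choices.

    Each choice is a dict shaped like the objects passed to the
    course's <Question choices={[...]} /> component: a "text" field,
    an "explain" field, and optionally "correct": True.
    """
    rng = random.Random(seed)  # seedable for reproducible ordering
    shuffled = list(choices)   # copy so the original list is untouched
    rng.shuffle(shuffled)
    return shuffled

# Choices from the first quiz in this diff, as plain dicts.
choices = [
    {"text": "It helps align models with human preferences and values",
     "explain": "Correct! RLHF uses human feedback to guide models.",
     "correct": True},
    {"text": "It makes models generate text faster",
     "explain": "RLHF isn't primarily about improving generation speed."},
    {"text": "It reduces the model's memory usage",
     "explain": "RLHF doesn't focus on model efficiency or memory optimization."},
]

mixed = shuffle_choices(choices, seed=42)

# Whatever the new order, exactly one choice stays marked correct.
assert sum(c.get("correct", False) for c in mixed) == 1
```

Shuffling a copy rather than editing in place mirrors what the diff preserves: the set of choices and the single `correct: true` flag are unchanged; only their order differs.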