Why do we use '<|endoftext|>' at the beginning of the sentence? #51

Answered by karpathy
XINZHANG-ops asked this question in Q&A

endoftext is a bit of a misnomer; it is really a document-delimiting token. It's especially useful if you want to start sampling a new document "from scratch", where you'd pass endoftext into the model at the very first time step.

As for 2, you're right: it could very well be cleaner to do "<|endoftext|>Hello, I'm a language model,". This would tell the LLM explicitly that this is the start of a new document.
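To make the two uses concrete, here is a minimal sketch of building prompts with and without the leading delimiter. It uses a toy byte-level tokenizer so it runs with no dependencies; `EOT_ID`, `encode`, and `build_prompt` are hypothetical names for illustration, not anything from the repo. With a real GPT-2 tokenizer the delimiter id would be 50256 rather than the toy value used here.

```python
# Toy stand-in for a real tokenizer: one id per UTF-8 byte, plus one
# reserved delimiter id. All names here are hypothetical.
EOT_ID = 256  # toy value; in GPT-2's vocabulary <|endoftext|> is 50256


def encode(text: str) -> list[int]:
    """Encode text as UTF-8 byte values (ids 0..255)."""
    return list(text.encode("utf-8"))


def build_prompt(text: str, new_document: bool = True) -> list[int]:
    """Return prompt ids, optionally led by the document delimiter."""
    ids = encode(text)
    return [EOT_ID] + ids if new_document else ids


# Sampling "from scratch": the delimiter is the only context.
from_scratch = build_prompt("")

# Prompted sampling that marks the prompt as the start of a new document.
prompted = build_prompt("Hello, I'm a language model,")

# Without the delimiter, the model just sees a bare window of text.
bare = build_prompt("Hello, I'm a language model,", new_document=False)
```

The only difference between `prompted` and `bare` is the single leading id, which is all the extra information the model gets about being at a document boundary.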

Do note that during training we sample random windows of the text and train on those, so the model is perfectly "used to" seeing text with no context: it just assumes it's probably somewhere in the middle of a larger document and does its best. That's basically what ends up happening witho…
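A rough illustration of the random-window point, as a sketch under assumptions: documents are concatenated into one token stream with a delimiter placed before each document, and training windows are drawn at uniformly random offsets. The toy ids, document lengths, and the names `build_stream` and `sample_window` are made up for the example and are not the actual data loader.

```python
import random

EOT_ID = 0  # hypothetical delimiter id for this toy stream


def build_stream(docs: list[list[int]]) -> list[int]:
    """Concatenate documents, placing the delimiter before each one."""
    stream: list[int] = []
    for doc in docs:
        stream.append(EOT_ID)
        stream.extend(doc)
    return stream


def sample_window(stream: list[int], block_size: int, rng: random.Random) -> list[int]:
    """Pick a random contiguous window of block_size tokens."""
    start = rng.randrange(len(stream) - block_size + 1)
    return stream[start:start + block_size]


rng = random.Random(0)
# 50 toy documents of 99 tokens each (ids 1..99, never equal to EOT_ID).
docs = [[rng.randint(1, 99) for _ in range(99)] for _ in range(50)]
stream = build_stream(docs)

windows = [sample_window(stream, 32, rng) for _ in range(1000)]
starts_at_boundary = sum(w[0] == EOT_ID for w in windows)
print(f"{starts_at_boundary} of {len(windows)} windows begin at a document boundary")
```

In this setup only about one offset in a hundred lands on a delimiter, so the vast majority of windows begin mid-document, which matches the situation the model faces when a prompt is given without the leading delimiter.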

Answer selected by karpathy