Recommended papers for understanding LLMs

Tokenization
- Neural Machine Translation of Rare Words with Subword Units (BPE)

30 papers Ilya recommended John Carmack to read
- Attention Is All You Need
- The Annotated Transformer