SLM-BigramLanguageModel-

Colab notebook showcasing the implementation of an SLM (small language model): a character-level bigram language model.

1. Environment Setup: I set up the environment by importing the necessary libraries and checking for GPU availability. The block_size, batch_size, and training hyperparameters were defined, and the device is set to 'cuda' when a GPU is available and to 'cpu' otherwise (a sketch of each step follows this list).

2. Data Preprocessing: I loaded the text data from "wizard_of_oz.txt". The unique characters in the text were sorted and assigned numerical indices, and I created encode and decode functions for converting between characters and indices. The encoded data was stored in a PyTorch tensor (see the preprocessing sketch after this list).

3. Model Definition: I defined a bigram language model using PyTorch. The model consists of a single embedding layer (token_embedding_table). The forward method computes logits from the input indices and, when targets are provided, the cross-entropy loss; the generate method is used for text generation (see the model sketch after this list).

4. Training Setup: I instantiated the model, set up the AdamW optimizer, and defined a training loop: get a batch of data (get_batch), compute the logits and loss, backpropagate, and update the model parameters. The training and validation losses are evaluated and printed periodically (see the training sketch after this list).

5. Text Generation: I initialized a context tensor and used the trained model to generate text. At each step, the generate method predicts the next character from the current context (see the generation sketch below).
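
Below is a minimal sketch of the environment setup from step 1, assuming PyTorch as the only required library; the specific hyperparameter values are illustrative assumptions, not necessarily the ones used in the notebook.

```python
import torch

# Use the GPU when one is available, otherwise fall back to the CPU.
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Training hyperparameters (illustrative values).
block_size = 8        # maximum context length fed to the model
batch_size = 32       # number of sequences processed in parallel
max_iters = 10000     # total optimization steps
eval_interval = 500   # how often to report train/val loss
learning_rate = 3e-4  # AdamW step size
```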
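
A sketch of the preprocessing in step 2. The file name comes from the notebook; the mapping-dictionary names and the 90/10 train/validation split are assumptions.

```python
import torch

# Read the raw text and build a character-level vocabulary.
with open('wizard_of_oz.txt', 'r', encoding='utf-8') as f:
    text = f.read()

chars = sorted(set(text))
vocab_size = len(chars)

# Map characters to integer indices and back (dictionary names are assumptions).
string_to_int = {ch: i for i, ch in enumerate(chars)}
int_to_string = {i: ch for i, ch in enumerate(chars)}

def encode(s):
    return [string_to_int[c] for c in s]

def decode(indices):
    return ''.join(int_to_string[i] for i in indices)

# Encode the whole corpus as a tensor and split it (assumed 90/10) into train/val.
data = torch.tensor(encode(text), dtype=torch.long)
n = int(0.9 * len(data))
train_data, val_data = data[:n], data[n:]
```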
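
A sketch of the model from step 3. The token_embedding_table, forward, and generate names follow the description above; the class name and the remaining details are a plausible reconstruction rather than the exact notebook code.

```python
import torch
import torch.nn as nn
from torch.nn import functional as F

class BigramLanguageModel(nn.Module):
    def __init__(self, vocab_size):
        super().__init__()
        # Row i of the table holds the logits for the token that follows token i.
        self.token_embedding_table = nn.Embedding(vocab_size, vocab_size)

    def forward(self, index, targets=None):
        # index and targets are (B, T) tensors of token indices.
        logits = self.token_embedding_table(index)  # (B, T, vocab_size)
        if targets is None:
            loss = None
        else:
            B, T, C = logits.shape
            loss = F.cross_entropy(logits.view(B * T, C), targets.view(B * T))
        return logits, loss

    def generate(self, index, max_new_tokens):
        # Autoregressively sample one character at a time.
        for _ in range(max_new_tokens):
            logits, _ = self(index)
            logits = logits[:, -1, :]                      # logits for the last position
            probs = F.softmax(logits, dim=-1)
            next_index = torch.multinomial(probs, num_samples=1)
            index = torch.cat((index, next_index), dim=1)  # append to the context
        return index
```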
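
A sketch of the training setup in step 4, continuing from the previous sketches (it assumes train_data, val_data, the hyperparameters, and the model class defined above); the estimate_loss helper and its eval_iters value are assumptions standing in for the periodic loss evaluation described.

```python
import torch

def get_batch(split):
    # Sample batch_size random context/target windows of length block_size.
    data = train_data if split == 'train' else val_data
    ix = torch.randint(len(data) - block_size, (batch_size,))
    x = torch.stack([data[i:i + block_size] for i in ix])
    y = torch.stack([data[i + 1:i + block_size + 1] for i in ix])
    return x.to(device), y.to(device)

@torch.no_grad()
def estimate_loss(model, eval_iters=200):
    # Average the loss over a few batches for each split (helper name is an assumption).
    out = {}
    model.eval()
    for split in ('train', 'val'):
        losses = torch.zeros(eval_iters)
        for k in range(eval_iters):
            xb, yb = get_batch(split)
            _, loss = model(xb, yb)
            losses[k] = loss.item()
        out[split] = losses.mean().item()
    model.train()
    return out

model = BigramLanguageModel(vocab_size).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate)

for step in range(max_iters):
    if step % eval_interval == 0:
        losses = estimate_loss(model)
        print(f"step {step}: train loss {losses['train']:.4f}, val loss {losses['val']:.4f}")

    xb, yb = get_batch('train')
    _, loss = model(xb, yb)
    optimizer.zero_grad(set_to_none=True)
    loss.backward()
    optimizer.step()
```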
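
Finally, a sketch of the text generation in step 5; the single-zero-token starting context and the 500-character generation length are assumptions.

```python
import torch

# Start from a minimal context (a single token with index 0) and let the trained
# model extend it; decode() maps the sampled indices back to characters.
context = torch.zeros((1, 1), dtype=torch.long, device=device)
generated = model.generate(context, max_new_tokens=500)
print(decode(generated[0].tolist()))
```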

Additional Notes

The model is trained to predict the next character in a sequence given its context, by minimizing the cross-entropy loss between the predicted logits and the actual next characters in the training data. The bigram model only captures dependencies between consecutive characters, which is the simplest form of language modeling. The generated text reflects the patterns and structure learned from the training data.

Final Thoughts

This notebook provides a concise example of building and training a bigram language model with PyTorch for text generation. The training loop and text generation steps showcase the practical application of such models.
