A project exploring LSTM and Transformer-like models to generate text with implementations in Python & Pytorch.
Welcome to TolkienFormer, a personal project that dives into the task of text generation. This project explores LSTMs and Transformer-like models to generate text reminiscent of J.R.R. Tolkien's The Lord of the Rings. While the models aim to produce reasonable output, the primary goal is not state-of-the-art performance but to build proficiency with Transformers, LSTMs, and PyTorch.
Text generation poses significant challenges in terms of data and computational resources. Thus, TolkienFormer employs the technique of Teacher Forcing to stabilize and expedite training and testing.
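The idea behind teacher forcing can be sketched in a few lines of PyTorch: instead of feeding the model its own previous prediction at each step, the ground-truth token is used as the next input, so the loss can be computed over the whole sequence in one pass. This is a minimal illustrative sketch, not the repository's actual code; all names and dimensions here are made up.

```python
import torch
import torch.nn as nn

# Toy dimensions, purely illustrative
vocab_size, embed_dim, hidden_dim = 50, 16, 32
embedding = nn.Embedding(vocab_size, embed_dim)
lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
head = nn.Linear(hidden_dim, vocab_size)
criterion = nn.CrossEntropyLoss()

tokens = torch.randint(0, vocab_size, (1, 10))   # one dummy token sequence
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # teacher forcing: ground-truth
                                                 # inputs, shifted-by-one targets

hidden_states, _ = lstm(embedding(inputs))
logits = head(hidden_states)                     # (1, 9, vocab_size)
loss = criterion(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()                                  # gradients for one training step
```

Because every step is conditioned on the true previous token rather than a possibly wrong prediction, gradients are less noisy and training converges faster.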
An example of the output generated by TolkienFormer after training and fine-tuning a model can be seen below:
First, clone the repo, create a conda environment, and add the project root to your PYTHONPATH to enable local imports:
# 1. Clone this repository
git clone https://github.com/LuisWinckelmann/TolkienFormer.git
cd TolkienFormer
# 2. Setup conda env
conda create --name tolkienformer
conda activate tolkienformer
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
# 3. Enable local imports by adding the root to your pythonpath:
# 3a) Linux:
export PYTHONPATH=$PYTHONPATH:$PWD
# 3b) Windows:
set PYTHONPATH=%PYTHONPATH%;%cd%
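To verify the environment was set up correctly, a quick sanity check (not part of the repo, just a suggestion) is to confirm that PyTorch imports and report whether CUDA is visible:

```python
# Sanity check: PyTorch is installed and CUDA visibility is reported
import torch

print(torch.__version__)          # e.g. "2.x.x"
print(torch.cuda.is_available())  # True if the GPU setup worked
```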
Afterwards, you need to prepare the data to be able to train the models. For the specifics, please follow the instructions below.
For the data, you can use any *.txt file that you want. In the current setup, the file will be parsed row-wise.
The example dataset chapter1, provided in src/data/chapter1,
includes chapter 1 of Tolkien's The Fellowship of the Ring, obtained from here.
To use your own dataset simply copy the text file(s) into src/data
and run:
cd src/data
python data_preparation.py
If your data is stored somewhere other than src/data,
you can use --path_to_folder_with_txt_filess
to point to the root folder containing the .txt files.
If your data has another format you'll need to adjust your custom dataset in src/utils/datasets.py
accordingly.
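A custom dataset for another format would follow the usual PyTorch Dataset pattern. The sketch below shows the row-wise parsing described above; the class name and constructor argument are illustrative, not the repo's actual API in src/utils/datasets.py.

```python
from torch.utils.data import Dataset

class LineDataset(Dataset):
    """Illustrative row-wise text dataset: one sample per non-empty line."""

    def __init__(self, path):
        with open(path, encoding="utf-8") as f:
            # Keep each non-empty line of the .txt file as one sample
            self.lines = [line.strip() for line in f if line.strip()]

    def __len__(self):
        return len(self.lines)

    def __getitem__(self, idx):
        return self.lines[idx]
```

Adapting to a different format mostly means changing how `__init__` splits the raw text into samples; `__len__` and `__getitem__` stay the same.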
To run training of the LSTM run:
cd src/models/lstm
python train.py
To run training of the transformer-like model run:
cd src/models/transformer
python train.py
All currently available hyperparameters can be changed in the corresponding config.json files located in src/models/lstm
or src/models/transformer,
respectively.
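Such a config.json typically holds the training hyperparameters as plain key-value pairs. The fragment below is only a hypothetical illustration; the actual keys and values are defined in the config.json files in the repo.

```json
{
  "hidden_size": 256,
  "num_layers": 2,
  "learning_rate": 0.001,
  "batch_size": 32,
  "epochs": 150
}
```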
After training, you can generate model output, as shown in the description, by running:
# LSTM model
cd src/models/lstm
python test.py
# Transformer-like model
cd src/models/transformer
python test.py
# Optional Parameters to edit when running test.py:
# --num_sentences 5
# --model_epoch 150
To specify the number of predicted sentences, use the --num_sentences
flag; to select one of the saved checkpoints, use the --model_epoch
flag.
Other parameters for the evaluation can be changed in the model's config.json.
- Switch from printing to logging
- Write description with a showcase
- Publish some additional results
- Confirm setup and functionality works and README is clearly written
- Get rid of code duplication by merging the LSTM & Transformer folders, specifically train.py & test.py
- Easier setup via shell script(s)
Distributed under the MIT License. See LICENSE.txt
for more information.
Luis Winckelmann - [email protected]
Project Link: https://github.com/LuisWinckelmann/TolkienFormer