nanoGPT

TensorFlow GPT-2 Model

This repository contains a TensorFlow implementation of nanoGPT, a conversion of the original PyTorch code from Andrej Karpathy's nanoGPT library. Special thanks to him and to all the contributors who made the original possible.

Table of Contents

  • Description
  • Installation
  • Running the code
  • Docker Deployment
  • GPT Config Explanation
  • Blog Post

Description

The code trains a GPT model on text data. The text is downloaded from a given URL, which by default points to a small portion of the Shakespeare dataset, and is then encoded into a numerical format used to train the model.
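As a rough illustration, a minimal sketch of this data pipeline might look like the following, assuming the tiny Shakespeare file used by the original nanoGPT and a simple character-level encoding (the repository's actual URL, tokenizer, and helper names may differ):

import numpy as np
import tensorflow as tf

# Assumed default: the tiny Shakespeare text used by the original nanoGPT.
DATA_URL = "https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt"

path = tf.keras.utils.get_file("input.txt", DATA_URL)  # download and cache the file
text = open(path, "r", encoding="utf-8").read()

# Character-level encoding: map each unique character to an integer id.
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}
data = np.array([stoi[ch] for ch in text], dtype=np.int32)

# Slice the encoded stream into (input, target) pairs for next-token prediction.
block_size = 64  # illustrative value; in the repository this comes from GPTConfig
ds = tf.data.Dataset.from_tensor_slices(data)
ds = ds.batch(block_size + 1, drop_remainder=True)
ds = ds.map(lambda chunk: (chunk[:-1], chunk[1:]))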

Installation

To run this code, you need to have TensorFlow 2.4.0 or later installed on your system. You can install it via pip:

pip install tensorflow
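
To confirm that the installed version meets the 2.4.0 requirement, a quick check such as the following can be run (a convenience snippet, not part of the repository):

import tensorflow as tf

major, minor = (int(x) for x in tf.__version__.split(".")[:2])
assert (major, minor) >= (2, 4), f"TensorFlow 2.4.0+ required, found {tf.__version__}"
print("TensorFlow", tf.__version__)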

Running the code

You can run the code using the following command:

python main.py

Docker Deployment

The code can be deployed using Docker; a Dockerfile and a docker-compose.yml file are provided for this purpose. To build and start the container, navigate to the directory containing the docker-compose.yml file and run:

docker-compose up

GPT Config Explanation

The GPTConfig class contains the following fields (a minimal configuration sketch follows the list):

  • block_size: The size of the input block (the maximum context length, in tokens).
  • vocab_size: The size of the vocabulary. For GPT-2, this is 50257.
  • n_layer: The number of transformer blocks.
  • n_head: The number of attention heads.
  • n_embd: The size of the embeddings.
  • dropout: The dropout rate for regularization.
  • bias: If True, bias is included in linear and layer normalization operations. This is set to True to match GPT-2; setting it to False can make training slightly faster and slightly better.
  • epsilon: The epsilon value for layer normalization.
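
Putting these fields together, a minimal sketch of such a configuration and how its values typically flow into the layers might look like this (illustrative defaults; the repository's actual class and values may differ):

from dataclasses import dataclass

import tensorflow as tf

@dataclass
class GPTConfig:
    block_size: int = 1024   # context length in tokens
    vocab_size: int = 50257  # GPT-2 vocabulary size
    n_layer: int = 12        # number of transformer blocks
    n_head: int = 12         # number of attention heads
    n_embd: int = 768        # embedding size
    dropout: float = 0.1     # dropout rate
    bias: bool = True        # include bias in Dense and LayerNormalization layers
    epsilon: float = 1e-5    # layer-normalization epsilon

config = GPTConfig()

# Example of how the bias and epsilon fields would typically be applied:
ln = tf.keras.layers.LayerNormalization(epsilon=config.epsilon, center=config.bias)
fc = tf.keras.layers.Dense(4 * config.n_embd, use_bias=config.bias)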

Blog Post

For more information about the code and how it works, check out the accompanying LinkedIn post.
