
feat: add gpt-neo model handling #42

Open · wants to merge 2 commits into master

Conversation

@onlurking commented Apr 18, 2021

This PR enables GPT-Neo model loading using the same API from Hugging Face transformers.

Related: #40

@onlurking (Author)

Google Colab notebook with GPT-Neo model support:

https://colab.research.google.com/drive/1xqEZeZY3aYl4w859Ej4sCsX-2LxBGU1l?usp=sharing

@paulbricman (Owner)

Thanks for the PR! As mentioned on Discord (https://discord.com/channels/817119487999606794/825717174257319974/833584636533407745), I think using the AutoModel class from transformers would make the implementation somewhat simpler: you can simply give it the local path to the model and it figures out what's in there. What do you think? I'm not sure about the tokenizer, though, but I think both GPT-2 and GPT-Neo use similar tokenizers?
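
A minimal sketch of what this suggestion could look like; the local directory name is hypothetical, and AutoModelForCausalLM is used here rather than bare AutoModel since text generation needs the language-modeling head:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical local checkpoint directory (must contain config.json,
# the weights, and the tokenizer files).
model_path = "models/gpt-neo-125M"

# The Auto classes inspect config.json and instantiate the matching
# architecture (GPT2LMHeadModel, GPTNeoForCausalLM, ...) automatically.
model = AutoModelForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)
```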

@onlurking (Author)

Hi @paulbricman!

I've followed the transformers docs and both models use the same GPT2Tokenizer class, but it's entirely possible to replace the model-specific code with AutoTokenizer, AutoConfig and AutoModel instead (see the sketch below).

I'll take a look at this after work.
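
A rough sketch of that swap, assuming the existing code instantiates the model-specific classes directly; the checkpoint name is only an example and the project's actual loading code may be organized differently:

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

# Model-specific loading (both GPT-2 and GPT-Neo ship a GPT2Tokenizer):
#   from transformers import GPT2Tokenizer, GPTNeoForCausalLM
#   tokenizer = GPT2Tokenizer.from_pretrained("EleutherAI/gpt-neo-125M")
#   model = GPTNeoForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M")

# Auto-class loading: the right implementation is picked from the config,
# so the same code path covers "gpt2", "EleutherAI/gpt-neo-125M",
# or a local directory.
checkpoint = "EleutherAI/gpt-neo-125M"
config = AutoConfig.from_pretrained(checkpoint)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, config=config)
```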

@onlurking (Author)

@paulbricman done!
