
Incorrect DSTC medium model link: pkl file is the small model #40

Open

alvinchangw opened this issue May 7, 2020 · 1 comment

@alvinchangw

Hi there, thanks for sharing your amazing work on GitHub. I just wanted to point out that the link (https://convaisharables.blob.core.windows.net/lsp/DSTC/medium_ft.pkl) shared in the README for the DSTC medium model actually points to the small GPT-2 version.

Its size is 351.3 MB rather than the 863 MB expected for medium GPT-2. When loading the pkl file, there are only 12 transformer blocks with a hidden state size of 768, rather than the medium model's 1024.
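For reference, here is a minimal sketch of how the checkpoint can be inspected locally. It assumes PyTorch is installed, the file has been downloaded as `medium_ft.pkl`, and the checkpoint loads as a flat state dict with GPT-2-style key names (e.g. `wte.weight`, `h.0.attn.c_attn.weight`); the exact key names are an assumption.

```python
import re
import torch

# Load the checkpoint on CPU; the DialoGPT .pkl files are PyTorch state dicts.
state_dict = torch.load("medium_ft.pkl", map_location="cpu")

# Count distinct transformer blocks from GPT-2-style per-layer key names
# such as "h.<i>.attn.c_attn.weight".
layer_ids = set()
for key in state_dict:
    match = re.search(r"\bh\.(\d+)\.", key)
    if match:
        layer_ids.add(int(match.group(1)))
print("transformer blocks:", len(layer_ids))

# The token-embedding matrix has shape (vocab_size, hidden_size), so its
# second dimension gives the hidden state size (768 = small, 1024 = medium).
wte_key = next(k for k in state_dict if k.endswith("wte.weight"))
print("hidden state size:", state_dict[wte_key].shape[1])
```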

It would be great if you could share the corrected DSTC medium model link. Thanks!

@dreasysnail (Contributor)

Hi, thanks for the feedback. We use three different model sizes: 117M (small), 345M (medium), and 762M (large), which might not correspond to the GPT-2 sizes. We don't have the 863 MB model available for DSTC.
