Hi there, thanks for sharing your amazing work on GitHub. Just wanted to point out that the link (https://convaisharables.blob.core.windows.net/lsp/DSTC/medium_ft.pkl) shared in the README for the DSTC medium model is actually the small GPT-2 version. Its size is 351.3 MB rather than the 863 MB expected for medium, and when loading the pkl file there are only 12 transformer blocks with a hidden state size of 768 rather than medium's 1024.

It would be great if you could share the corrected DSTC medium model link. Thanks!

Hi, thanks for the feedback. We use three different model sizes: 117M (small), 345M (medium), and 762M (large), which might not correspond to the GPT-2 checkpoint sizes you're expecting. We don't have an 863M model available for DSTC.
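For anyone who wants to verify which GPT-2 size a downloaded pkl actually contains, here is a minimal sketch. It assumes the file is a PyTorch state dict with GPT-2-style parameter names (e.g. `transformer.h.11.attn.c_attn.weight`); the filename and key pattern below are illustrative assumptions, so adjust them if your checkpoint uses a different prefix:

```python
import re
import torch

# Load the checkpoint on CPU; the DialoGPT .pkl files are PyTorch state
# dicts saved with torch.save, so torch.load reads them directly.
state_dict = torch.load("medium_ft.pkl", map_location="cpu")

# Count distinct transformer block indices from the parameter names.
# GPT-2-style naming ("h.<layer>.") is assumed here.
layer_ids = set()
for name in state_dict:
    match = re.search(r"\bh\.(\d+)\.", name)
    if match:
        layer_ids.add(int(match.group(1)))
print(f"transformer blocks: {len(layer_ids)}")  # 12 => small, 24 => medium

# The width of the token embedding matrix is the hidden state size.
for name, tensor in state_dict.items():
    if name.endswith("wte.weight"):
        print(f"hidden size: {tensor.shape[1]}")  # 768 => small, 1024 => medium
        break
```

A small checkpoint should report 12 blocks with hidden size 768, while a genuine medium checkpoint should report 24 blocks with hidden size 1024.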