Confused about the Model outputs? #43
Comments
Okay, so I think in Microsoft's Hugging Face release with your decoders, you nerfed the outputs to prevent anything being said that would be controversial |
@ArEnSc , were you able to get the expected results? |
@chiranshu14 you have to use a custom decoder; I think they filter out everything or nerfed the model |
@ArEnSc Thanks for your response. |
@chiranshu14 |
@ArEnSc sure Michael, could you please tell me more on what exactly needs to be done here? I'm kinda new to this, by using a custom decoder do you mean fine tuning with a different dataset? |
You just need to load the model. Hopefully they didn't nerf it, but here is the script; explore it a bit. It works similarly to the original paper, but let me know if the results are better: https://colab.research.google.com/drive/1PslHE4Rl4RqSa20s7HEp0ZKITBir6ezE |
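The "custom decoder" idea above means replacing the default decoding with your own sampling over the model's output logits. As a minimal, self-contained sketch (NumPy only, no model download; the function name and default parameters are my own, not taken from the linked notebook), nucleus/top-p sampling of the next token looks like this:

```python
import numpy as np

def top_p_sample(logits, p=0.9, temperature=0.8, rng=None):
    """Nucleus (top-p) sampling over a single next-token logits vector.

    Keeps the smallest set of tokens whose cumulative probability
    exceeds p, then samples from the renormalized distribution.
    """
    rng = rng if rng is not None else np.random.default_rng()
    # Temperature-scaled softmax (shifted for numerical stability)
    z = logits / temperature
    z = z - z.max()
    probs = np.exp(z) / np.exp(z).sum()
    # Sort token indices by probability, descending
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    # Keep tokens up to and including the one that crosses p
    cutoff = int(np.searchsorted(cum, p)) + 1
    keep = order[:cutoff]
    kept = probs[keep] / probs[keep].sum()
    return int(rng.choice(keep, p=kept))
```

In a real generation loop you would call this once per step on the logits the model returns for the last position, append the sampled token id, and repeat until an end-of-sequence token.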
@ArEnSc yes, I have been trying this and other sample decoding scripts on the readme page. None of them have worked so far; they all have issues while loading the weights, and some weights seem to be missing. |
@ArEnSc @chiranshu14 maybe you can try this script with |
@golsun is this the same technique that was used to train Tay on twitter? it seems like it Thanks. |
No, it's not related to Tay. |
@chiranshu14 what @golsun suggested doesn't work; I have an earlier model of this lying around somewhere, I just have to find it |
Oh thanks, yeah I watched the video. Essentially it helps surface better responses in the data by predicting which ones are more preferred as a response, smart! Is there any way to get Tay-like learning behaviour from any project Microsoft has open sourced? It seems like it's doing online learning based on ranking of specific things said to it, based on feedback from the community? |
@golsun Does a higher rank mean a better response? For these simple examples with your demo notebook, the results are not as expected. Am I using the wrong model? |
@chiranshu14 the results are better than "I don't know", haha, which I get a lot. |
python src/generation.py play -pg=restore/medium_ft.pkl -pr=restore/updown.pth --sampling This command worked perfectly, got awesome results. I'm going to play around with this some more, but I think this looks perfect so far. Thank you @golsun for suggesting (and building) DialogRPT. Really liked the idea of ranking the responses. And @ArEnSc, thanks for your guidance! |
Great! Thanks @chiranshu14 for trying our DialogRPT! |
Thank you @ArEnSc for trying DialogRPT! Sorry I'm not aware of open-sourced repo similar to Tay. |
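To answer the ranking question above: yes, a higher DialogRPT score means the model predicts the response is more likely to be preferred (e.g. upvoted). Mechanically, usage is simple: score each (context, candidate) pair and sort descending. A sketch with a stand-in scorer (`toy_score` is entirely hypothetical, just to make the example runnable; in practice the score comes from the trained ranker, e.g. the updown.pth checkpoint):

```python
from typing import Callable, List, Tuple

def rank_responses(context: str,
                   candidates: List[str],
                   score_fn: Callable[[str, str], float]) -> List[Tuple[float, str]]:
    """Rank candidate responses for a context, highest score first."""
    scored = [(score_fn(context, c), c) for c in candidates]
    scored.sort(key=lambda sc: sc[0], reverse=True)
    return scored

# Hypothetical stand-in scorer: rewards word overlap with the context
# plus a small bonus for length. The real scorer is a learned model.
def toy_score(context: str, response: str) -> float:
    overlap = len(set(context.lower().split()) & set(response.lower().split()))
    return overlap + 0.1 * len(response.split())

ranked = rank_responses(
    "what is the meaning of life?",
    ["I don't know.", "the meaning of life is to find your gift"],
    toy_score,
)
```

A generator (e.g. DialoGPT with sampling) produces the candidate list; the ranker only reorders it, which is why the thread found results much worse when updown.pth was left out.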
This model is huge and requires a GPU to compute inferences. How can I deploy it? |
@chiranshu14 I think if you want to deploy this there are two ways: distill the model (requires a lot of work) and run it with JS on the client side, or create a FastAPI server with a REST or socket-based API, host the model, send the input across the net, and wait for the response. Otherwise you really have no way of using this, unless you host it offline. Is there no CPU inference mode? I think there is, and it's slow. |
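The second option (host the model behind a REST API) can be sketched with nothing but the Python standard library; the comment above suggests FastAPI, which would look similar but less verbose. Here `generate_reply` is a placeholder where the real model call would go, and the route and payload shape are my own assumptions:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def generate_reply(prompt: str) -> str:
    """Placeholder for real model inference (e.g. DialoGPT generate())."""
    return f"echo: {prompt}"

class ChatHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body, e.g. {"prompt": "hello"}
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        reply = generate_reply(payload.get("prompt", ""))
        body = json.dumps({"reply": reply}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging

# To serve: HTTPServer(("127.0.0.1", 8000), ChatHandler).serve_forever()
```

The client then POSTs a prompt and blocks until the JSON reply arrives, which matches the "send the input across the net and wait" flow described above.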
I'll look into distilling of a model. I guess that's my only option as I cannot use a server with a GPU. Thanks!! |
@chiranshu14 do you have the approximate inference time ? |
@ArEnSc with GPU it took around 1-2 seconds, while on CPU it took approximately 8 seconds. Also, I noticed that if we do not use the updown.pth the results were not as good, so that's important. |
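Latency numbers like those above are easy to reproduce with a small wall-clock harness (names and defaults are my own; warmup runs are excluded so one-time costs like weight loading or CUDA kernel compilation don't skew the average):

```python
import time
from statistics import mean

def time_inference(fn, *args, warmup=1, runs=5):
    """Average wall-clock latency of fn(*args) over several timed runs."""
    for _ in range(warmup):
        fn(*args)  # untimed warmup
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args)
        samples.append(time.perf_counter() - start)
    return mean(samples)
```

For a GPU model you would pass a function wrapping the model's generate call; averaging several runs matters because the first request is usually much slower than steady state.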
@chiranshu14 thanks and good luck with what you are working on |
@chiranshu14 After analyzing, were the results you got satisfying or not? |
@alan-ai-learner I've decided to go with rasa. It's deterministic. |
I think the issue is that most people have high expectations of these models. You likely switch over to this when your rules-based model fails, and then things go off script; even GPT-3 isn't that great. It really depends on your audience as well and their expectations. |
Hey, I got the Hugging Face GPT-2 Large model of DialoGPT, pre-trained I guess? I have tried to ask it questions and it seems like it's not really returning anything interesting. I asked it "what is the meaning of life?" and it said "to be a good boy"? I am confused about why it didn't pick up anything from Reddit. Am I missing something? Am I supposed to train it myself with the Reddit dataset to get outputs similar to what was described?