Why is Train_Accuracy pretty low (about 0.2)? #7

Open
shoveller86 opened this issue Sep 4, 2020 · 1 comment


shoveller86 commented Sep 4, 2020

I followed the "README.md" and trained the model on the sample data.

First: python pre_process.py
Second: python train_gpt2.py --num-layers=8 --embedding-size=768 --batch-size=32

Then the training begins. Here are the Loss and Accuracy during training:

[...] deprecated and will be removed after 2020-07-01.
Instructions for updating:
`tf.python.eager.profiler` has deprecated, use `tf.profiler` instead.
Saving checkpoint for step 0 at xxx/GPT_2/GPT_tf/TF2/gpt-2-tensorflow2.0/model/ckpt-1
Step 10 Train_Loss 7.2324 Train_Accuracy 0.0832
Step 20 Train_Loss 6.5299 Train_Accuracy 0.1730
Step 30 Train_Loss 6.4850 Train_Accuracy 0.1768
Step 40 Train_Loss 6.1244 Train_Accuracy 0.1932
Step 50 Train_Loss 6.3007 Train_Accuracy 0.1790
Step 60 Train_Loss 6.3144 Train_Accuracy 0.1865
Step 70 Train_Loss 6.1924 Train_Accuracy 0.1648
Step 80 Train_Loss 6.2282 Train_Accuracy 0.1759
Step 90 Train_Loss 6.2466 Train_Accuracy 0.1744
Step 100 Train_Loss 6.1871 Train_Accuracy 0.1795
Step 110 Train_Loss 6.0732 Train_Accuracy 0.2064
Step 120 Train_Loss 5.7407 Train_Accuracy 0.2119
Step 130 Train_Loss 5.8436 Train_Accuracy 0.2077
Step 140 Train_Loss 5.7919 Train_Accuracy 0.1898
Step 150 Train_Loss 5.9080 Train_Accuracy 0.1661
Step 160 Train_Loss 5.8630 Train_Accuracy 0.1994
Step 170 Train_Loss 5.7625 Train_Accuracy 0.2076
---

Step 2740 Train_Loss 5.3913 Train_Accuracy 0.1958
Step 2750 Train_Loss 5.3359 Train_Accuracy 0.2195
Step 2760 Train_Loss 5.3394 Train_Accuracy 0.1973
Step 2770 Train_Loss 5.0865 Train_Accuracy 0.2501
Step 2780 Train_Loss 5.4709 Train_Accuracy 0.1929
Step 2790 Train_Loss 5.4672 Train_Accuracy 0.1946
Step 2800 Train_Loss 5.5116 Train_Accuracy 0.1962
Step 2810 Train_Loss 5.2981 Train_Accuracy 0.2346
Step 2820 Train_Loss 5.4803 Train_Accuracy 0.2078
Step 2830 Train_Loss 5.5752 Train_Accuracy 0.1869
Step 2840 Train_Loss 5.4528 Train_Accuracy 0.2158
Step 2850 Train_Loss 5.2158 Train_Accuracy 0.2377
Step 2860 Train_Loss 5.3771 Train_Accuracy 0.2202
Step 2870 Train_Loss 5.3635 Train_Accuracy 0.1965
Step 2880 Train_Loss 5.4944 Train_Accuracy 0.2296
Step 2890 Train_Loss 5.4714 Train_Accuracy 0.2068
Step 2900 Train_Loss 5.2218 Train_Accuracy 0.2330
Step 2910 Train_Loss 5.4696 Train_Accuracy 0.2070
Step 2920 Train_Loss 5.5928 Train_Accuracy 0.1947
Step 2930 Train_Loss 5.4761 Train_Accuracy 0.2173
Step 2940 Train_Loss 5.5963 Train_Accuracy 0.2022
Step 2950 Train_Loss 5.3133 Train_Accuracy 0.2197
Training Done................
akanyaani (Owner) commented

Hi @alphaRGB,
GPT-2 is an autoregressive language model, so accuracy is not a good metric. I have removed accuracy and added perplexity as a metric.
https://thegradient.pub/understanding-evaluation-metrics-for-language-models/
https://towardsdatascience.com/perplexity-in-language-models-87a196019a94
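
Since perplexity is just the exponential of the mean per-token cross-entropy, it can also be read directly off the Train_Loss values already being logged. A minimal sketch, assuming the logged Train_Loss is the mean per-token cross-entropy in nats (the usual convention, not necessarily this repository's exact implementation):

```python
import math

def perplexity_from_loss(mean_cross_entropy: float) -> float:
    """Perplexity = exp(mean per-token cross-entropy in nats)."""
    return math.exp(mean_cross_entropy)

# Using losses from the training log above:
print(perplexity_from_loss(7.2324))  # step 10   -> ~1384
print(perplexity_from_loss(5.3133))  # step 2950 -> ~203
```

Read that way, the run above improves from a perplexity of roughly 1400 at step 10 to roughly 200 at step 2950, which gives a more informative picture of progress than the ~0.2 next-token accuracy.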
