[Question] Text classification models whose names start with CNN stay at about 0.2 accuracy regardless of dataset or hyperparameters #488

Open
hwq458362228 opened this issue Apr 14, 2022 · 1 comment
Labels: question (Further information is requested), wontfix (This will not be worked on)

hwq458362228 commented Apr 14, 2022

You must follow the issue template and provide as much information as possible. Otherwise, this issue will be closed.

Check List

Thanks for considering opening an issue. Before you submit it, please confirm that these boxes are checked.

You can post pictures, but if specific text or code is required to reproduce the issue, please provide the text in a plain text format for easy copy/paste.

Environment

  • OS [e.g. Mac OS, Linux]: Win10
  • Python Version: 3.7
  • requirements.txt: TensorFlow 2.3, kashgari 2.0.1

Question

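Whether I use SMP2018ECDTCorpus or my own dataset, the text classification models whose names start with CNN all give poor accuracy. I have also tried changing the learning rate, the number of epochs, and other parameters, but it made little difference. I am not sure whether the models themselves have a problem.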

from kashgari.corpus import SMP2018ECDTCorpus
from kashgari.tasks.classification import CNN_Model
from kashgari.callbacks import EvalCallBack  # imported but not used below

import logging
logging.basicConfig(level='DEBUG')

# Load the train/valid/test splits of the built-in SMP2018ECDTCorpus
train_x, train_y = SMP2018ECDTCorpus.load_data('train')
valid_x, valid_y = SMP2018ECDTCorpus.load_data('valid')
test_x, test_y = SMP2018ECDTCorpus.load_data('test')

# Train CNN_Model with its default hyperparameters, then evaluate on the test split
model = CNN_Model()
model.fit(train_x, train_y, valid_x, valid_y, batch_size=64, epochs=14)
model.evaluate(test_x, test_y, batch_size=64)

Output:
2022-04-14 18:08:55,276 [DEBUG] kashgari - loaded 1881 samples from C:\Users\hwq45.kashgari\datasets\SMP2018ECDTCorpus\train.csv. Sample:
x[0]: ['打', '开', '河', '南', '英', '东', '网', '站']
y[0]: website
2022-04-14 18:08:55,280 [DEBUG] kashgari - loaded 418 samples from C:\Users\hwq45.kashgari\datasets\SMP2018ECDTCorpus\valid.csv. Sample:
x[0]: ['来', '一', '首', ',', '灵', '岩', '。']
y[0]: poetry
2022-04-14 18:08:55,284 [DEBUG] kashgari - loaded 770 samples from C:\Users\hwq45.kashgari\datasets\SMP2018ECDTCorpus\test.csv. Sample:
x[0]: ['给', '曹', '广', '义', '打', '电', '话']
y[0]: telephone
Preparing text vocab dict: 100%|██████████| 1881/1881 [00:00<00:00, 943831.30it/s]
Preparing text vocab dict: 100%|██████████| 418/418 [00:00<00:00, 416936.76it/s]
2022-04-14 18:08:55,291 [DEBUG] kashgari - --- Build vocab dict finished, Total: 875 ---
2022-04-14 18:08:55,291 [DEBUG] kashgari - Top-10: ['[PAD]', '[UNK]', '[CLS]', '[SEP]', '的', '么', '我', '。', '怎', '你']
Preparing classification label vocab dict: 100%|██████████| 1881/1881 [00:00<?, ?it/s]
Preparing classification label vocab dict: 100%|██████████| 418/418 [00:00<?, ?it/s]
Calculating sequence length: 100%|██████████| 1881/1881 [00:00<00:00, 1894234.29it/s]
Calculating sequence length: 100%|██████████| 418/418 [00:00<00:00, 419430.40it/s]
2022-04-14 18:08:55,309 [DEBUG] kashgari - Calculated sequence length = 15
2022-04-14 18:08:55,337 [DEBUG] kashgari - Model: "functional_43"


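_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input (InputLayer)           [(None, None)]            0
_________________________________________________________________
layer_embedding (Embedding)  (None, None, 100)         87500
_________________________________________________________________
conv1d_6 (Conv1D)            (None, None, 128)         64128
_________________________________________________________________
global_max_pooling1d_4 (Glob (None, 128)               0
_________________________________________________________________
dense_14 (Dense)             (None, 64)                8256
_________________________________________________________________
dense_15 (Dense)             (None, 31)                2015
_________________________________________________________________
activation_10 (Activation)   (None, 31)                0
=================================================================
Total params: 161,899
Trainable params: 161,899
Non-trainable params: 0
_________________________________________________________________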
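
The parameter counts in the summary above can be cross-checked by hand. The following is a minimal sketch, not kashgari's code: it assumes the standard Keras counting rules for Embedding, Conv1D, and Dense layers, and the kernel size of 5 is an inference from the reported 64,128 Conv1D parameters, not a value stated in the log.

```python
# Cross-check of the parameter counts reported in the model summary.
# Assumptions (not stated in the log): standard Keras parameter formulas,
# and a Conv1D kernel size of 5 inferred from the reported 64,128 parameters.

vocab_size = 875      # from "Build vocab dict finished, Total: 875"
embedding_dim = 100   # width of the layer_embedding output
conv_filters = 128    # width of the Conv1D output
kernel_size = 5       # assumed, see the note above
dense_units = 64
num_labels = 31       # width of the final Dense output


def embedding_params(vocab, dim):
    """One weight vector of length `dim` per vocabulary entry."""
    return vocab * dim


def conv1d_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel * in_channels * filters + filters


def dense_params(in_units, out_units):
    """Weight matrix plus one bias per output unit."""
    return in_units * out_units + out_units


counts = {
    "layer_embedding": embedding_params(vocab_size, embedding_dim),
    "conv1d": conv1d_params(kernel_size, embedding_dim, conv_filters),
    "dense_hidden": dense_params(conv_filters, dense_units),
    "dense_output": dense_params(dense_units, num_labels),
}
total = sum(counts.values())

for name, value in counts.items():
    print(f"{name}: {value}")
print(f"total: {total}")
```

Under these assumptions the per-layer figures and the total agree with the summary, which suggests the network was built with the expected shapes for a vocabulary of 875 tokens and 31 labels.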


Epoch 1/14
29/29 [==============================] - 0s 8ms/step - loss: 3.3098 - accuracy: 0.1735 - val_loss: 3.1836 - val_accuracy: 0.1901
Epoch 2/14
29/29 [==============================] - 0s 5ms/step - loss: 3.0778 - accuracy: 0.1992 - val_loss: 3.0883 - val_accuracy: 0.1953
Epoch 3/14
29/29 [==============================] - 0s 4ms/step - loss: 3.0232 - accuracy: 0.1992 - val_loss: 3.0700 - val_accuracy: 0.2005
Epoch 4/14
29/29 [==============================] - 0s 4ms/step - loss: 3.0164 - accuracy: 0.1987 - val_loss: 3.0591 - val_accuracy: 0.1901
Epoch 5/14
29/29 [==============================] - 0s 4ms/step - loss: 3.0395 - accuracy: 0.1943 - val_loss: 3.0622 - val_accuracy: 0.1979
Epoch 6/14
29/29 [==============================] - 0s 4ms/step - loss: 3.0327 - accuracy: 0.2003 - val_loss: 3.0659 - val_accuracy: 0.1875
Epoch 7/14
29/29 [==============================] - 0s 4ms/step - loss: 3.0361 - accuracy: 0.1948 - val_loss: 3.0711 - val_accuracy: 0.1953
Epoch 8/14
29/29 [==============================] - 0s 4ms/step - loss: 3.0347 - accuracy: 0.1987 - val_loss: 3.0581 - val_accuracy: 0.1901
Epoch 9/14
29/29 [==============================] - 0s 4ms/step - loss: 3.0155 - accuracy: 0.1981 - val_loss: 3.0576 - val_accuracy: 0.2005
Epoch 10/14
29/29 [==============================] - 0s 4ms/step - loss: 3.0415 - accuracy: 0.2036 - val_loss: 3.0651 - val_accuracy: 0.1953
Epoch 11/14
29/29 [==============================] - 0s 4ms/step - loss: 3.0296 - accuracy: 0.1992 - val_loss: 3.0850 - val_accuracy: 0.1849
Epoch 12/14
29/29 [==============================] - 0s 4ms/step - loss: 3.0132 - accuracy: 0.2053 - val_loss: 3.0643 - val_accuracy: 0.1953
Epoch 13/14
29/29 [==============================] - 0s 4ms/step - loss: 3.0523 - accuracy: 0.1899 - val_loss: 3.0639 - val_accuracy: 0.2005
Epoch 14/14
29/29 [==============================] - 0s 4ms/step - loss: 3.7734 - accuracy: 0.2075 - val_loss: 3.0653 - val_accuracy: 0.2031
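
Two reference numbers help when reading a log like this one. With 31 labels, a model that outputs a uniform distribution has a cross-entropy of ln(31), roughly 3.43, and a model that always predicts the most frequent label reaches an accuracy equal to that label's share of the data. The sketch below computes both. The `labels` list is a made-up example, not the SMP2018ECDTCorpus label distribution, and both helper functions are hypothetical names introduced here for illustration.

```python
import math
from collections import Counter


def uniform_cross_entropy(num_classes):
    """Cross-entropy (in nats) of a uniform prediction over `num_classes` labels."""
    return math.log(num_classes)


def majority_baseline(labels):
    """Return the most frequent label and the accuracy of always predicting it."""
    counts = Counter(labels)
    label, count = counts.most_common(1)[0]
    return label, count / len(labels)


# Hypothetical labels for illustration only; pass train_y here to get the real baseline.
labels = ["website", "poetry", "website", "telephone", "poetry", "website"]

label, accuracy = majority_baseline(labels)
print(f"uniform cross-entropy for 31 labels: {uniform_cross_entropy(31):.4f}")
print(f"majority label: {label}, baseline accuracy: {accuracy:.2f}")
```

If the real majority share of `train_y` turns out to be close to 0.2, the flat accuracy in the log would be consistent with the model predicting a single label for every input, although the log alone does not confirm that.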

hwq458362228 added the question label on Apr 14, 2022

stale bot commented Nov 2, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label on Nov 2, 2022