
Different results for different batch sizes when evaluating trained models #176

Open
AxelMueller opened this issue Feb 17, 2019 · 2 comments

Hi,
First of all, thanks for making your great code and models available.
I am currently trying out two of your models (MP-CNN and VDPWI) and noticed that when evaluating trained models (via --skip-training), different batch sizes give different results.
For example,

python -m mp_cnn ../Castor-models/mp_cnn/mpcnn.sick.model --dataset sick --batch-size 16 --skip-training

returns a different result than

python -m mp_cnn ../Castor-models/mp_cnn/mpcnn.sick.model --dataset sick --batch-size 64 --skip-training

Have you encountered this behavior before, and do you know what the reason might be? Which result would be the correct one?

daemon (Member) commented Feb 17, 2019

Hi,

Thanks for your interest; I've confirmed this issue. My guess is that the amount of padding depends on the batch size (sentences are padded to the length of the longest one in the batch), and that padding is not implemented as a no-op, so it affects the output. Using a batch size of 1 should be the correct thing to do during inference (for now).
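
To illustrate the suspicion, here is a minimal, self-contained PyTorch sketch (a hypothetical example, not the repository's actual MP-CNN/VDPWI code): if a pooling step treats the zero padding as real timesteps, the same sentence produces different values depending on how long the longest sentence in its batch happens to be.

```python
# Hypothetical illustration of the suspected cause (not the repo's code):
# mean-pooling over the full padded length is not a no-op, so the output for a
# given sentence depends on how much padding its batch forces onto it.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
sentence = torch.randn(3, 4)  # one sentence: length 3, embedding dim 4

def pooled(batch_max_len):
    # Zero-pad the time dimension up to the batch's longest sentence,
    # then mean-pool over all (real + padded) timesteps.
    padded = F.pad(sentence, (0, 0, 0, batch_max_len - sentence.size(0)))
    return padded.mean(dim=0)

print(pooled(3))   # batch whose longest sentence has length 3
print(pooled(10))  # batch whose longest sentence has length 10 -> different values
```

Until the padding is masked out, evaluating with --batch-size 1 (e.g. python -m mp_cnn ../Castor-models/mp_cnn/mpcnn.sick.model --dataset sick --batch-size 1 --skip-training) avoids the problem, since each sentence is then only padded to its own length.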

daemon added the bug label Feb 17, 2019
AxelMueller (Author) commented

Ok, thanks for your quick reply!
