Beta-7 different outputs on GPU and CPU regression #124

Open
chooneung opened this issue Oct 14, 2020 · 4 comments
Labels: bug, critical, invalid

Comments

@chooneung
Contributor

OS: Windows
DL4J: deeplearning4j 1.0.0-beta7
CUDA: 10.2
cuDNN: 7.6
Issue: The results of regression (e.g. BostonHousePricePrediction.java) running on GPU are not the same as the results running on CPU. Attached are screenshots of the results; a minimal reproduction sketch is included after them.

CPU:
[Screenshot: CPU backend in use]
[Screenshot: results on CPU backend]

GPU:
[Screenshot: GPU backend in use]
[Screenshot: results on GPU backend]
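
Not the actual example code from the repo, but a minimal, self-contained sketch under assumed hyperparameters and a synthetic target (class name, data and learning rate are assumptions): it trains the same style of network with a fixed seed, so the final score can be compared between the nd4j-native and nd4j-cuda backends.

// Minimal reproduction sketch (not BostonHousePricePrediction.java itself):
// train a small dense regression network with a fixed seed on synthetic data,
// then compare the printed score between CPU and GPU backends.
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.nn.weights.WeightInit;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class CpuGpuRegressionRepro {

    public static void main(String[] args) {
        long seed = 123;
        Nd4j.getRandom().setSeed(seed); // fix the ND4J RNG in addition to the net seed

        // Synthetic regression data: 13 features (as in the Boston housing data), 1 target
        INDArray features = Nd4j.rand(256, 13);
        INDArray labels = features.sum(1).reshape(256, 1); // deterministic target
        DataSet data = new DataSet(features, labels);

        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .seed(seed)
                .updater(new Adam(1e-3))
                .weightInit(WeightInit.XAVIER)
                .l2(0.001)
                .list()
                .layer(new DenseLayer.Builder().nIn(13).nOut(128).activation(Activation.RELU).build())
                .layer(new DenseLayer.Builder().nIn(128).nOut(64).activation(Activation.RELU).build())
                .layer(new OutputLayer.Builder().nIn(64).nOut(1)
                        .activation(Activation.IDENTITY)
                        .lossFunction(LossFunctions.LossFunction.MSE)
                        .build())
                .build();

        MultiLayerNetwork net = new MultiLayerNetwork(conf);
        net.init();
        for (int epoch = 0; epoch < 50; epoch++) {
            net.fit(data);
        }

        // Run once with nd4j-native-platform and once with nd4j-cuda-10.2-platform on the
        // classpath; the printed scores should agree closely if the backends are consistent.
        System.out.println("Backend: " + Nd4j.getBackend().getClass().getSimpleName());
        System.out.println("Final MSE score: " + net.score(data));
    }
}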

@chooneung added the bug, invalid, and critical labels on Oct 14, 2020
@jtkhair

jtkhair commented Mar 1, 2021

Any update on this?

I have the same issue. My setup is as below:

OS: Ubuntu 18.04.5
DL4J: deeplearning4j 1.0.0-beta7
CUDA: 10.1
cuDNN: 7.6

@kenghooi-teoh
Contributor

Issue Description
Using the same model and training config, we saw a huge difference in loss scores when running the example on this dataset on CPU vs. GPU, but only in certain examples; in other examples where a CNN is used, we did not come across the same issue. DL4J version: beta7.

Version Information
OS: Windows, CUDA: 10.2, cuDNN: 7.6
OS: Ubuntu 18.04.5, CUDA: 10.1, cuDNN: 7.6
OS: Windows, CUDA: 10.0.130, cuDNN: 7.5

Additional Information
Model config (contains only Dense and Output layers):

import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.weights.WeightInit;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.lossfunctions.LossFunctions;

// Two ReLU dense layers feeding a single-output regression head (identity activation + MSE loss)
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .seed(seed)
        .updater(new Adam(learningRate))
        .weightInit(WeightInit.XAVIER)
        .l2(0.001)
        .list()
        .layer(new DenseLayer.Builder()
                .nIn(13)
                .nOut(128)
                .activation(Activation.RELU)
                .build())
        .layer(new DenseLayer.Builder()
                .nIn(128)
                .nOut(64)
                .activation(Activation.RELU)
                .build())
        .layer(new OutputLayer.Builder()
                .nIn(64)
                .nOut(1)
                .activation(Activation.IDENTITY)
                .lossFunction(LossFunctions.LossFunction.MSE)
                .build())
        .build();

Screenshots of different loss:
[Screenshot: loss values differ between CPU and GPU runs]
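
One thing that might help narrow this down (an assumption on my part, not something verified against beta7 behaviour): force double precision globally before building the network and re-run on both backends. If the CPU and GPU losses then agree, the gap is likely a float-precision / reduction-order effect rather than a functional bug in a layer. A minimal sketch (the class name is hypothetical):

import org.nd4j.linalg.api.buffer.DataType;
import org.nd4j.linalg.factory.Nd4j;

public class ForceDoublePrecision {
    public static void main(String[] args) {
        // Use FP64 for both array storage and math ops; call this before the network is built.
        Nd4j.setDefaultDataTypes(DataType.DOUBLE, DataType.DOUBLE);
        // ... build, train and score the network here, then compare CPU vs GPU results ...
    }
}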

@agibsonccc

@chooneung could you give me the full training loop so I can reproduce this out of the box for testing? I need to confirm whether this is still the case on the latest version.
