What is the input data format for vectors instead of images? #129

axu-git · 2024-02-28T23:01:29Z

axu-git
Feb 28, 2024

I am using RHVAE and would like to input custom data consisting of N vectors of length L. So:

train_dataset.shape == (N1, L)
eval_dataset.shape == (N2, L)

Q1) In the colab tutorial, the dataset consists of MNIST images, so the input has shape (-1, 1, 28, 28). I'm using vectors instead of images (L instead of 28 x 28), so I would need to change the input dimensions.
i) What is the 1 for (i.e., why aren't the MNIST images in shape (-1, 28, 28) instead)? Is it for the number of channels?
ii) If I have N independent vectors, then would I want to set number of channels to 1?
iii) Does this mean I need to change my train_dataset/eval_dataset to have shape (N1, 1, 1, L) (and (N2, 1, 1, L))? Or (N1, 1, L) (and (N2, 1, L))?
Q2) Does the answer change if I use batching?

    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,

I ask because I saw an error comment when I ran the code about 4d for batched and 3d for unbatched, but I'm not sure if that is specific to using Encoder_ResNet_VAE_MNIST/Decoder_ResNet_AE_MNIST.

Thank you!

clementchadebec · 2024-02-29T17:11:56Z

clementchadebec
Feb 29, 2024
Maintainer

Hey @axu-git,

Actually, Encoder_ResNet_VAE_MNIST and Decoder_ResNet_AE_MNIST are specific to image data. That is why you will encounter issues with other data types.

As to your questions.

Q1) i) The 1 is indeed for the number of channels, which needs to be specified when using convolutions.

ii) The number of channels represents the encoding of your image. In the MNIST case, the image is grayscale, meaning that a pixel can be represented by a single value in [0, 255] giving the pixel intensity. In the case of coloured images, you may want to use the RGB encoding, meaning that each pixel will be represented by 3 values (one for red, one for green and one for blue). Hence, you will need 3 channels. This has nothing to do with the number of vectors/data points you are dealing with.

iii) No, this depends on the type of data you are using and the network you want to use for your encoder/decoder. For images, in that very case, we need them to be shaped as (B, C, H, W) since the networks you use rely on 2-dimensional convolutions. If you only wanted to use an MLP to encode those images, then you would have needed to shape them as (B, C×H×W).

Q2) Using batching does not change the previous answers.
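To make the shapes in Q1 iii) concrete, here is a minimal NumPy sketch. The arrays are hypothetical placeholders (zeros standing in for MNIST images and for vectors with L = 12), and the snippet only illustrates the reshaping, not anything the library does internally.

```python
import numpy as np

# Hypothetical stand-in for MNIST: 5 grayscale images of 28 x 28 pixels.
images = np.zeros((5, 28, 28), dtype=np.float32)

# 2D-convolutional networks expect (B, C, H, W), so add a channel axis of size 1.
conv_input = images.reshape(-1, 1, 28, 28)

# An MLP expects one flat vector per sample: (B, C*H*W).
mlp_input = conv_input.reshape(conv_input.shape[0], -1)

# Vector data of shape (N, L) is already in the MLP layout; no channel axis is needed.
vectors = np.zeros((5, 12), dtype=np.float32)

print(conv_input.shape)  # (5, 1, 28, 28)
print(mlp_input.shape)   # (5, 784)
print(vectors.shape)     # (5, 12)
```

In every case the leading axis is the number of samples, which is why batching does not change the layout of a single data point.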

I hope this helps,

Best,

Clément

axu-git · 2024-02-29T22:29:28Z

axu-git
Feb 29, 2024
Author

Thank you!!

Following up on Q1 iii), if I do not specify an encoder and decoder (which from the docs seems to mean I am using an MLP), is setting input_dim in model_config to (12,) equivalent to setting it to (1, 12)? I tried both, and both run with no errors.

I'm leaning towards (12,) because in the MNIST colab example, the input_dim = (1, 28, 28) which did not include the batching dimension, but I wanted to double check.

1 reply

clementchadebec Mar 1, 2024
Maintainer

Yes, both settings are equivalent: to build the MLP with the correct shape, the code computes the size of a flattened input (hence 1×12, or 1×28×28 for MNIST). And indeed the batch dimension should not be added there; it is really the size of a single data point.
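As a rough illustration of why the two settings coincide, the sketch below computes the flattened size of one data point from its shape. The helper `flattened_size` is a hypothetical name used for illustration only; it is not the library's own code, just the same arithmetic.

```python
import math

def flattened_size(input_dim):
    """Number of values in one data point with the given shape (no batch axis)."""
    return math.prod(input_dim)

print(flattened_size((12,)))        # 12
print(flattened_size((1, 12)))      # 12
print(flattened_size((1, 28, 28)))  # 784
```

Both (12,) and (1, 12) flatten to 12 values, so an MLP sized from the flattened input gets the same first-layer width either way.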

axu-git · 2024-03-01T19:25:15Z

axu-git
Mar 1, 2024
Author

Awesome thanks!
