I am attempting to train a VQ-VAE model, but I couldn't find an embedding_dim argument in either the VQVAEConfig or VQVAE classes to assign a value to it.
From what I have found, the only place where the embedding_dim is assigned is inside the _set_quantizer function of the VQVAE class, which is hard-coded to one.
```python
def _set_quantizer(self, model_config):
    if model_config.input_dim is None:
        raise AttributeError(
            "No input dimension provided !"
            "'input_dim' parameter of VQVAEConfig instance must be set to 'data_shape' where "
            "the shape of the data is (C, H, W ..). Unable to set quantizer."
        )

    x = torch.randn((2,) + self.model_config.input_dim)
    z = self.encoder(x).embedding
    if len(z.shape) == 2:
        z = z.reshape(z.shape[0], 1, 1, -1)

    z = z.permute(0, 2, 3, 1)

    self.model_config.embedding_dim = z.shape[-1]
```
z.shape[-1] always holds the value 1.
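The observation can be checked without running the model by tracing only the tensor shapes through the reshape and permute steps of the quoted function. The sketch below is a pure-Python shape trace, not library code: the helper names (`reshape`, `permute`, `traced_embedding_dim`) and the example encoder output shapes are hypothetical, and it assumes the `permute` call runs for every input, not only inside the `if` branch.

```python
from math import prod


def reshape(shape, new_shape):
    """Shape after tensor.reshape(*new_shape), resolving a single -1 entry."""
    known = prod(d for d in new_shape if d != -1)
    return tuple(prod(shape) // known if d == -1 else d for d in new_shape)


def permute(shape, order):
    """Shape after tensor.permute(*order)."""
    return tuple(shape[i] for i in order)


def traced_embedding_dim(encoder_output_shape):
    """Follow the shape logic of the quoted _set_quantizer snippet."""
    z = tuple(encoder_output_shape)
    if len(z) == 2:
        # Flat latent (batch, latent_dim) becomes (batch, 1, 1, latent_dim).
        z = reshape(z, (z[0], 1, 1, -1))
    # Assumption: the permute is applied to both flat and 4D outputs.
    z = permute(z, (0, 2, 3, 1))
    return z[-1]


# Hypothetical flat encoder output with latent_dim = 16:
# (2, 16) -> (2, 1, 1, 16) -> (2, 1, 16, 1)
flat = traced_embedding_dim((2, 16))

# Hypothetical convolutional encoder output with 64 channels:
# (2, 64, 7, 7) -> (2, 7, 7, 64)
conv = traced_embedding_dim((2, 64, 7, 7))

print(flat, conv)  # 1 64
```

Under these assumptions, a flat encoder output yields a last dimension of 1 whatever the latent size, while a 4D output yields its channel count.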
Actually, this value is set automatically to either the number of channels of your encoded sample or the size of your latent space in the case of flattened encoded input. This is needed to be able to quantize the encoded sample within the codebook. To change this value, you should either adapt your encoder architecture to output a sample with the required embedding dimension if you created your own, or change the latent_dim in your config if you use the default nets. In the latter case, also do not forget to provide the input dimension of your data.
Hi @clementchadebec.
Thank you for creating this repository.