Hello, and thanks for the code! I want to replicate the audio results from the paper, but the DeepMind repo does not have a VQ-VAE example for audio (see google-deepmind/sonnet#141), and the audio setup seems quite different from the CIFAR one:

> We train a VQ-VAE where the encoder has 6 strided convolutions with stride 2 and window-size 4. This yields a latent space 64x smaller than the original waveform. The latents consist of one feature map and the discrete space is 512-dimensional.
Could you please include an example of using your code for audio?
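
For reference, here is a minimal sketch of an encoder matching that description, to help frame what I'm asking for. It assumes a PyTorch-style implementation; the channel widths (`hidden_channels`, `embedding_dim`), the padding, and the final 1x1 projection are not specified in the quote and are hypothetical. The 512-entry codebook would live in the vector-quantization layer applied to this encoder's output.

```python
import torch
import torch.nn as nn


class AudioEncoder(nn.Module):
    # Six strided 1-D convolutions with stride 2 and kernel (window) size 4,
    # as described in the paper: 2**6 = 64x temporal downsampling.
    def __init__(self, hidden_channels=128, embedding_dim=64):
        super().__init__()
        layers = []
        in_channels = 1  # raw waveform, single channel
        for _ in range(6):
            layers += [
                nn.Conv1d(in_channels, hidden_channels,
                          kernel_size=4, stride=2, padding=1),
                nn.ReLU(),
            ]
            in_channels = hidden_channels
        # Project to the codebook embedding dimension; the paper only says the
        # latents form "one feature map", so this 1x1 projection and both
        # channel widths are assumptions.
        layers.append(nn.Conv1d(hidden_channels, embedding_dim, kernel_size=1))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        # x: (batch, 1, num_samples) -> (batch, embedding_dim, num_samples // 64)
        return self.net(x)


if __name__ == "__main__":
    wav = torch.randn(1, 1, 16000)  # one second of audio at 16 kHz
    z = AudioEncoder()(wav)
    print(z.shape)  # torch.Size([1, 64, 250]) -- 16000 / 64 = 250 latents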
```