Example for raw audio #21

Open
mm3509 opened this issue Dec 3, 2023 · 0 comments
Comments

mm3509 commented Dec 3, 2023

Hello, and thanks for the code! I want to replicate the audio results from the paper, but the DeepMind repo does not have a VQ-VAE example for audio (see google-deepmind/sonnet#141), and the audio setup described in the paper seems quite different from the CIFAR one:

We train a VQ-VAE where the encoder has 6 strided convolutions with stride 2 and window-size 4. This yields a latent space 64x smaller than the original waveform. The latents consist of one feature map and the discrete space is 512-dimensional.

Could you please include an example of using your code for audio?
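For reference, here is how I read that description, as a rough PyTorch sketch (I am assuming PyTorch here; the channel width, embedding dimension, and batch/sample sizes below are my own placeholders, not values from the paper or this repo):

```python
import torch
import torch.nn as nn

class AudioEncoder(nn.Module):
    """Encoder as I understand the quote: 6 strided 1-D convolutions,
    stride 2, window (kernel) size 4, so the waveform length is reduced
    by a factor of 2**6 = 64."""
    def __init__(self, hidden_channels=128, embedding_dim=64):
        super().__init__()
        layers = []
        in_ch = 1  # raw mono waveform
        for _ in range(6):
            layers += [
                nn.Conv1d(in_ch, hidden_channels, kernel_size=4, stride=2, padding=1),
                nn.ReLU(),
            ]
            in_ch = hidden_channels
        # a single feature map of latents, projected to the codebook dimension
        layers.append(nn.Conv1d(hidden_channels, embedding_dim, kernel_size=1))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        # x: (batch, 1, samples) -> (batch, embedding_dim, samples // 64)
        return self.net(x)

# Nearest-neighbour quantisation against a 512-entry codebook
# (my reading of "the discrete space is 512-dimensional").
codebook = nn.Embedding(512, 64)
encoder = AudioEncoder(embedding_dim=64)

wave = torch.randn(8, 1, 16384)                               # dummy waveform batch
z_e = encoder(wave)                                           # (8, 64, 256)
flat = z_e.permute(0, 2, 1).reshape(-1, 64)                   # (8 * 256, 64)
indices = torch.cdist(flat, codebook.weight).argmin(dim=1)    # discrete codes
z_q = codebook(indices).view(8, 256, 64).permute(0, 2, 1)     # quantised latents
```

Is this roughly what the audio example should look like, or does your code expect something different?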
