Fixes wrong output dimensions in ConvTranspose1d #78
Solves #42, #58 and #68, which are all related to an incorrect computation of the output shape in the ConvTranspose1d of the DecoderBlock (as also pointed out in PR #44).
When using stride > 1 in a conv operation, the output dimensions are underdetermined, and ConvTranspose1d needs extra information (the output_padding argument) to compute the expected output (see the note in the PyTorch docs).
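To make the ambiguity concrete, here is a small sketch using the shape formulas from the PyTorch Conv1d/ConvTranspose1d docs (the kernel/stride/padding values are illustrative, not the ones used in DecoderBlock):

```python
# Shape formulas from the PyTorch docs (single spatial dim, dilation=1):
def conv1d_len(t, k, s, p):
    return (t + 2 * p - k) // s + 1

def convtranspose1d_len(t, k, s, p, output_padding=0):
    return (t - 1) * s - 2 * p + k + output_padding

# With stride > 1, different input lengths collapse to the same
# conv output length...
k, s, p = 8, 4, 2
assert conv1d_len(16, k, s, p) == conv1d_len(17, k, s, p) == 4

# ...so output_padding is what tells the transpose which of those
# lengths to reproduce:
assert convtranspose1d_len(4, k, s, p, output_padding=0) == 16
assert convtranspose1d_len(4, k, s, p, output_padding=1) == 17
```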
Given the construction constraints of the conv/deconv operations (namely, kernel_size=stride/2, padding=ceil(stride/2)), I worked out that the right output_padding (so that we always recover the same input dimensions) is:

where input_timesteps is the timestep dimension of the input to the original Conv1d.
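As a sanity check, the two documented shape formulas also give a generic closed form for an output_padding that makes the round trip exact; this is my own restatement from those formulas, not necessarily the exact expression derived here:

```python
def matching_output_padding(input_timesteps, kernel_size, stride, padding):
    # From the PyTorch shape formulas:
    #   conv:      out = (t + 2p - k) // s + 1
    #   transpose: out = (t - 1) * s - 2p + k + output_padding
    # the transpose recovers input_timesteps exactly when
    # output_padding equals this remainder (always < stride):
    return (input_timesteps + 2 * padding - kernel_size) % stride

# Round-trip check with arbitrary illustrative values:
t, k, s, p = 50, 7, 3, 2
conv_out = (t + 2 * p - k) // s + 1
op = matching_output_padding(t, k, s, p)
assert (conv_out - 1) * s - 2 * p + k + op == t
```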
This PR sets output_padding=0 for even strides and 1 for odd strides. This works in the vast majority of cases (including for all pretrained models), except when:

1. stride is even and input_timesteps is not divisible by stride;
2. stride is odd and input_timesteps+1 is divisible by stride.

Both cases are unlikely (and case 1 would fail anyway even without this PR). At the very least, I believe the proposed setting is a more sensible default.
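The default described above can be sketched as a one-liner (the function name is illustrative, not the repo's actual code):

```python
def default_output_padding(stride: int) -> int:
    # As proposed in this PR: 0 for even strides, 1 for odd strides.
    return stride % 2

assert default_output_padding(2) == 0
assert default_output_padding(3) == 1
```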