Implement the PSNR vision metric #4379
Open
+619
−0
Pull Request Template
Checklist
cargo run-checks command has been executed.
Related Issues/PRs
Image quality metrics #4312
Changes
Implemented the PSNR vision metric in the crates/burn-train/src/metric/vision/psnr.rs file. It computes the per-image PSNR values and then averages them across all the images in the batch. Since the PSNR (in dB) is defined as follows:
$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{\mathrm{MAX}^2}{\mathrm{MSE}}\right)$$
where MAX is the maximum possible pixel value, the MSE value is clamped to a minimum value of epsilon to avoid division by 0. The epsilon field of the PsnrMetricConfig struct is set to 1e-10 by default. Users can set a custom epsilon value as well.
Testing
Added 18 test cases to the psnr.rs file. Tests cover different input shapes, different PSNR values, different image representations (where the maximum possible pixel values differ), multichannel tensors, the running average, setting a custom name and/or epsilon, panics, etc. The test cases verify actual expected values rather than just ranges (which is the approach taken by the other open PR).
Note
max_pixel_val set, but this value depends on the specific image format used by the user. It changes depending on whether the images are normalized to the [0, 1] range or stored with 8, 16, 32, etc. bits per value. Hence, in my code, the user, who knows their specific image format, must always set this value.
update() method always treats the first dimension (dim 0) as the batch dimension. This can easily go wrong in many cases, such as when the input tensors have shapes like [H, W] or [C, H, W], which are still valid shapes in their code.
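To make the computation described under Changes concrete, here is a minimal sketch in plain Rust. It is not the code from psnr.rs, which works on Burn tensors. The function names `psnr` and `batch_psnr` are hypothetical, images are assumed to be flattened `f32` slices, and `max_pixel_val` and `epsilon` only mirror the names used in this description.

```rust
/// Per-image PSNR in dB: 10 * log10(max_pixel_val^2 / max(MSE, epsilon)).
/// Hypothetical helper for illustration; images are flattened slices.
fn psnr(pred: &[f32], target: &[f32], max_pixel_val: f64, epsilon: f64) -> f64 {
    assert_eq!(pred.len(), target.len(), "images must have the same size");
    assert!(!pred.is_empty(), "images must not be empty");

    // Mean squared error over all elements of one image.
    let mse = pred
        .iter()
        .zip(target)
        .map(|(p, t)| {
            let d = (*p as f64) - (*t as f64);
            d * d
        })
        .sum::<f64>()
        / pred.len() as f64;

    // Clamp the MSE from below so identical images do not divide by zero.
    let mse = mse.max(epsilon);
    10.0 * (max_pixel_val * max_pixel_val / mse).log10()
}

/// Average of the per-image PSNR values over a batch (hypothetical helper).
fn batch_psnr(preds: &[Vec<f32>], targets: &[Vec<f32>], max_pixel_val: f64, epsilon: f64) -> f64 {
    assert_eq!(preds.len(), targets.len(), "batch sizes must match");
    assert!(!preds.is_empty(), "batch must not be empty");

    let total: f64 = preds
        .iter()
        .zip(targets)
        .map(|(p, t)| psnr(p, t, max_pixel_val, epsilon))
        .sum();
    total / preds.len() as f64
}

fn main() {
    let eps = 1e-10;

    // Images normalized to [0, 1]: MSE is about 0.01, so PSNR is about 20 dB.
    println!("{:.3}", psnr(&[0.0, 0.0], &[0.1, 0.1], 1.0, eps));

    // The same relative error on 8-bit data with max_pixel_val = 255: about 20 dB.
    println!("{:.3}", psnr(&[0.0], &[25.5], 255.0, eps));

    // Wrong max_pixel_val for 8-bit data gives a misleading (negative) value.
    println!("{:.3}", psnr(&[0.0], &[25.5], 1.0, eps));

    // Identical images hit the epsilon clamp: about 100 dB for max_pixel_val = 1.
    println!("{:.3}", psnr(&[0.5], &[0.5], 1.0, eps));
}
```

The third call illustrates why this sketch takes `max_pixel_val` as a required argument rather than giving it a default: the same pixel data yields a very different PSNR when the assumed range does not match the image format.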