Feat: Frame-level Extraction and PyTorch API Updates #41

TioSisai · 2025-07-15T14:04:07Z

This pull request introduces two main sets of changes: a new feature for frame-level embedding extraction and several updates to ensure compatibility with modern PyTorch versions by replacing deprecated APIs.

New Features:

Frame-level Feature Extraction:

Added a frame: bool parameter to the forward methods in both MobileNet (MN) and DyMN models.
When frame=True, the model preserves the temporal dimension during the final pooling stage, allowing for the extraction of frame-wise embeddings.
This enables more fine-grained temporal analysis, while maintaining backward compatibility with the default clip-level feature extraction.
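
As a rough illustration of the pooling behaviour described above, here is a minimal sketch. It is not the code from this PR: `TinyHead`, its channel and class counts, and the assumed `(batch, channels, freq, time)` layout are hypothetical stand-ins. Only the `frame: bool` flag and the idea of keeping the temporal dimension during adaptive pooling come from the description.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyHead(nn.Module):
    """Hypothetical pooling + classifier head (not the MN/DyMN implementation)."""

    def __init__(self, channels: int = 8, n_classes: int = 3):
        super().__init__()
        self.fc = nn.Linear(channels, n_classes)

    def forward(self, x: torch.Tensor, frame: bool = False):
        # Assumed input layout: (batch, channels, freq, time)
        if frame:
            # Pool over frequency only; the time axis keeps its length.
            pooled = F.adaptive_avg_pool2d(x, (1, x.shape[-1]))  # (B, C, 1, T)
            emb = pooled.squeeze(2).transpose(1, 2)              # (B, T, C)
        else:
            # Default clip-level path: pool over frequency and time.
            emb = F.adaptive_avg_pool2d(x, 1).flatten(1)         # (B, C)
        logits = self.fc(emb)  # nn.Linear acts on the last dimension
        return logits, emb


head = TinyHead()
x = torch.randn(2, 8, 4, 10)
clip_logits, clip_emb = head(x)                # clip-level (default)
frame_logits, frame_emb = head(x, frame=True)  # frame-level
```

In this sketch, averaging the frame-wise embeddings over time reproduces the clip-level embedding, which is one way to see that the default path is unchanged.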

Fixes & Maintenance:

PyTorch API Modernization:

Replaced the deprecated ConvNormActivation with the current Conv2dNormActivation.
Updated torch.stft to use return_complex=True and calculated the power magnitude with torch.square(torch.abs(x)) to align with modern complex tensor handling.
Replaced torch.cuda.amp.autocast with the more general torch.amp.autocast.
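
The STFT and autocast changes can be sketched as follows. This is a hedged example rather than the repository's preprocessing code: the waveform, `n_fft`, `hop_length`, `win_length`, and the small `nn.Linear` layer are made-up values for illustration, and the `legacy` line is only an assumption about how power could be computed from the older real-valued `(..., 2)` layout.

```python
import torch
import torch.nn as nn

# Illustrative input and STFT settings (assumptions, not the PR's values).
waveform = torch.randn(1, 16000)
window = torch.hann_window(400)

# return_complex=True yields a complex tensor of shape (batch, freq, frames).
spec = torch.stft(
    waveform,
    n_fft=512,
    hop_length=160,
    win_length=400,
    window=window,
    center=True,
    return_complex=True,
)

# Power spectrogram from the complex STFT, as described in the PR.
power = torch.square(torch.abs(spec))

# Equivalent computation on the real view (re^2 + im^2), for comparison only.
legacy = torch.view_as_real(spec).pow(2).sum(dim=-1)

# Device-agnostic autocast: torch.amp.autocast takes the device type explicitly.
device = "cuda" if torch.cuda.is_available() else "cpu"
layer = nn.Linear(power.shape[1], 4).to(device)
features = power.mean(dim=-1).to(device)  # (batch, freq)
with torch.amp.autocast(device_type=device):
    out = layer(features)
```

Passing `device_type` explicitly is what lets the same block run on CPU-only machines, which `torch.cuda.amp.autocast` was not designed for.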

- Replace the deprecated ConvNormActivation with Conv2dNormActivation
- Update torch.stft to use return_complex=True for complex tensor handling, and compute the power magnitude from the complex-valued spectrogram with torch.square(torch.abs(x))
- Replace torch.cuda.amp.autocast with torch.amp.autocast for better device compatibility

These changes ensure compatibility with newer PyTorch versions, maintain backward compatibility, and fix deprecation warnings.

…odels

- Add 'frame' parameter to forward methods in MN and DyMN classes
- Modify _clf_forward and _forward_impl methods to support frame-level feature extraction
- Update adaptive pooling logic to preserve temporal dimension when frame=True
- Maintain backward compatibility with existing clip-level feature extraction
- Enable frame-wise embeddings output alongside classification results

This enhancement allows models to extract features at frame level (preserving temporal dimension) in addition to the existing clip-level aggregation, enabling more fine-grained temporal analysis.

TioSisai added 2 commits July 15, 2025 16:14
