# SODAWideNetPlusPlus [Link]

## Combining Attention and Convolutions for Salient Object Detection
Salient Object Detection (SOD) has traditionally relied on feature refinement modules that utilize the features of an ImageNet pre-trained backbone. However, this approach limits the possibility of pre-training the entire network because of the distinct natures of SOD and image classification. Additionally, the architecture of these backbones, originally built for image classification, is sub-optimal for a dense prediction task like SOD. To address these issues, we propose a novel encoder-decoder-style neural network called SODAWideNet++ that is designed explicitly for SOD. Inspired by vision transformers' ability to attain a global receptive field from the initial stages, we introduce the Attention Guided Long Range Feature Extraction (AGLRFE) module, which combines large dilated convolutions and self-attention. Specifically, we use attention features to guide the long-range information extracted by multiple dilated convolutions, thus taking advantage of the inductive biases of the convolution operation and the input dependency brought by self-attention. In contrast to the current paradigm of ImageNet pre-training, we modify 118K annotated images from the COCO semantic segmentation dataset by binarizing the annotations and pre-train the proposed model end-to-end. Further, we supervise the background predictions along with the foreground to push our model to generate accurate saliency predictions. SODAWideNet++ performs competitively on five different datasets while containing only 35% of the trainable parameters of state-of-the-art models.
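To make the AGLRFE idea concrete, here is a minimal PyTorch sketch of attention-gated dilated convolutions. The dilation rates, head count, and sigmoid gating below are illustrative assumptions, not the exact module from the paper.

```python
import torch
import torch.nn as nn


class AGLRFESketch(nn.Module):
    """Illustrative sketch: attention features gate the outputs of
    parallel dilated convolutions. The dilation rates, head count, and
    gating formulation are assumptions, not the paper's exact design."""

    def __init__(self, channels, dilations=(2, 4, 8), num_heads=4):
        super().__init__()
        # Parallel large dilated convolutions provide long-range context.
        self.dilated = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in dilations
        )
        # Self-attention supplies input-dependent guidance
        # (channels must be divisible by num_heads).
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.fuse = nn.Conv2d(channels * len(dilations), channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)         # (B, H*W, C)
        guide, _ = self.attn(tokens, tokens, tokens)  # (B, H*W, C)
        gate = torch.sigmoid(guide.transpose(1, 2).reshape(b, c, h, w))
        # The attention gate modulates each dilated-convolution branch.
        feats = [gate * conv(x) for conv in self.dilated]
        return self.fuse(torch.cat(feats, dim=1))


# Example: a 64-channel feature map at 24x24 spatial resolution.
out = AGLRFESketch(64)(torch.randn(1, 64, 24, 24))  # -> (1, 64, 24, 24)
```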
| Model | Number of Parameters | Pre-computed Saliency Maps | Model Weights | Pre-trained Weights |
|---|---|---|---|---|
| SODAWideNet++ | 26.58M | Saliency Maps | Weights | Pre-trained Weights |
| SODAWideNet++-M | 6.66M | Saliency Maps | Weights | Pre-trained Weights |
| SODAWideNet++-S | 1.67M | Saliency Maps | Weights | Pre-trained Weights |
Download the modified COCO dataset and unzip the file. Then use the following command to pre-train the model. The model size can be L, M, or S.
```bash
python training.py \
    --lr 0.001 \
    --epochs 21 \
    --f_name "COCOSODAWideNet++L" \
    --n 4 \
    --b 20 \
    --sched 1 \
    --training_scheme "COCO" \
    --salient_loss_weight 1.0 \
    --use_pretrained 0 \
    --im_size 384 \
    --model_size 'L'
```
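The binary pre-training labels are derived from COCO annotations, as described above. If you need to regenerate them yourself, a rough sketch using pycocotools follows; the annotation file, output layout, and the use of instance masks are assumptions, not the repo's exact preparation script.

```python
import numpy as np
from PIL import Image
from pycocotools.coco import COCO

# Placeholder paths; point them at your COCO layout. We assume the
# instance annotations here; the paper binarizes COCO segmentation labels.
coco = COCO("annotations/instances_train2017.json")

for img_id in coco.getImgIds():
    info = coco.loadImgs(img_id)[0]
    anns = coco.loadAnns(coco.getAnnIds(imgIds=img_id))
    if not anns:
        continue
    # Union of all object masks -> a single binary foreground map.
    mask = np.zeros((info["height"], info["width"]), dtype=np.uint8)
    for ann in anns:
        mask |= coco.annToMask(ann)
    out_name = info["file_name"].rsplit(".", 1)[0] + ".png"
    Image.fromarray(mask * 255).save(f"masks/{out_name}")
```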
Download the DUTS dataset from the link and unzip the file. Then use the following command to fine-tune the model. Also, download the DUTS-TE dataset for evaluation. Create a folder named `checkpoints` and save the COCO pre-trained checkpoint in it.
```bash
python training.py \
    --lr 0.001 \
    --epochs 11 \
    --f_name "DUTSSODAWideNet++L" \
    --n 4 \
    --b 20 \
    --sched 1 \
    --training_scheme "DUTS" \
    --salient_loss_weight 0.5 \
    --use_pretrained 1 \
    --checkpoint_name "COCOSODAWideNet++L" \
    --im_size 384 \
    --model_size 'L'
```
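The `--salient_loss_weight` flag balances the saliency loss terms, and the paper supervises background predictions alongside the foreground. A minimal sketch of what such joint supervision could look like follows, assuming the model emits separate foreground and background logits; the exact formulation lives in training.py and may differ.

```python
import torch.nn.functional as F


def joint_saliency_loss(fg_logits, bg_logits, gt, salient_loss_weight=0.5):
    """Joint foreground/background supervision (illustrative only).

    Assumes separate foreground/background logits and that the background
    target is 1 - gt; which term --salient_loss_weight actually scales in
    training.py is our assumption.
    """
    fg_loss = F.binary_cross_entropy_with_logits(fg_logits, gt)
    bg_loss = F.binary_cross_entropy_with_logits(bg_logits, 1.0 - gt)
    return salient_loss_weight * fg_loss + bg_loss
```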
We provide an option to generate the saliency map for a single image or for multiple images in a folder. The script below displays the generated saliency map. `model_size` can be L, M, or S.
```bash
python inference.py \
    --mode single \
    --input_path /path/to/image.jpg \
    --display \
    --model_size L
```
The script below generates a saliency map and saves the result.
```bash
python inference.py \
    --mode single \
    --input_path /path/to/image.jpg \
    --model_size L
```
The script below generates saliency maps for a folder of images and saves them in the user-specified output directory.
```bash
python inference.py \
    --mode folder \
    --input_path /path/to/input/folder \
    --output_dir /path/to/output/folder \
    --model_size L
```
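If you would rather call the model from Python than through inference.py, a rough sketch follows. The import path, class name, checkpoint file, and preprocessing are all assumptions; check inference.py for the actual pipeline.

```python
import torch
import torchvision.transforms as T
from PIL import Image

# Hypothetical import path and class name; inference.py has the real ones.
from model import SODAWideNetPlusPlus

model = SODAWideNetPlusPlus(model_size="L")
state = torch.load("checkpoints/DUTSSODAWideNet++L.pth", map_location="cpu")
model.load_state_dict(state)
model.eval()

# 384x384 matches the --im_size used for training; any normalization
# should mirror inference.py.
preprocess = T.Compose([T.Resize((384, 384)), T.ToTensor()])
img = preprocess(Image.open("image.jpg").convert("RGB")).unsqueeze(0)

with torch.no_grad():
    saliency = torch.sigmoid(model(img))  # assumed single-channel logits

T.ToPILImage()(saliency.squeeze(0)).save("saliency.png")
```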
If you find our research useful, please cite our paper:
```bibtex
@inproceedings{dulam2025sodawidenet++,
  title={SODAWideNet++: Combining Attention and Convolutions for Salient Object Detection},
  author={Dulam, Rohit Venkata Sai and Kambhamettu, Chandra},
  booktitle={International Conference on Pattern Recognition},
  pages={210--226},
  year={2025},
  organization={Springer}
}
```