Contour-enhanced-Visual-State-Space-Model

G-VMamba: Contour-enhanced Visual State-Space Model for Remote Sensing Image Classification

Introduction

The current branch has been tested on Linux with PyTorch 1.13.x and CUDA 11.6, and supports Python 3.8+.

Overview

  • Class activation map (CAM) visualization of the final normalization layer for the VMamba and G-VMamba models' classification of UC Merced dataset images.

When classifying the scenes in an image, the G-VMamba model focuses on areas where the color (or brightness) changes more sharply (red areas), such as the lane intersections in the Overpass scene, the court edges in the Baseball diamond scene, and the airplane shadow and lawn border in the Airplane scene. (The model size is ‘Small’.)


  • The overall architecture: (a) Overview of G-VMamba model; (b) Feature grouping in the G-VSS block.


Preparing the dataset

Remote Sensing Image Classification Dataset

We describe how to prepare the remote sensing image classification datasets used in the paper.

UC Merced Dataset

AID Dataset

NWPU RESISC45 Dataset
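All three datasets above are scene classification datasets that are commonly arranged in an ImageFolder-style layout (one sub-folder per class), which Swin/VMamba-style training code expects. The sketch below builds a tiny toy dataset in that layout and splits it into train/val folders; the directory names, toy data, and the 80/20 ratio are illustrative assumptions, not the paper's exact protocol.

```shell
# Toy stand-in for a dataset such as UC Merced (21 classes, 100 images each);
# here: 2 classes x 10 placeholder images.
SRC=demo_dataset
DST=demo_split
rm -rf "$SRC" "$DST"
for cls in airplane baseballdiamond; do
  mkdir -p "$SRC/$cls"
  for i in 0 1 2 3 4 5 6 7 8 9; do
    : > "$SRC/$cls/img_$i.tif"   # empty placeholder image file
  done
done

# 80/20 train/val split: every 5th image of each class goes to val/.
for class_dir in "$SRC"/*/; do
  cls=$(basename "$class_dir")
  mkdir -p "$DST/train/$cls" "$DST/val/$cls"
  i=0
  for img in "$class_dir"*; do
    if [ $((i % 5)) -eq 4 ]; then
      cp "$img" "$DST/val/$cls/"
    else
      cp "$img" "$DST/train/$cls/"
    fi
    i=$((i + 1))
  done
done
echo "train: $(find "$DST/train" -type f | wc -l), val: $(find "$DST/val" -type f | wc -l)"
```

Pass the resulting split root (here demo_split) as --data-path when training or evaluating.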

Installation

Step 1. Create a conda environment and activate it.

conda create -n Gvmamba python=3.9
conda activate Gvmamba

Step 2. Install the requirements.

  • PyTorch 1.13.1 + cu116
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple torch==1.13.1 --extra-index-url https://download.pytorch.org/whl/cu116
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple torchvision==0.14.1 --extra-index-url https://download.pytorch.org/whl/cu116
pip install torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu116

Step 3. Install VMamba.

Please refer to the VMamba code for installation support.

Step 4. Configure the G-VMamba core components.

Replace the contents of the models folder under VMamba's classification folder with the model files provided in this repository.
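Step 4 amounts to overwriting the upstream models folder with the one shipped here. The sketch below uses placeholder directories standing in for the two repositories, so the commands are runnable as-is; the actual paths depend on where you cloned VMamba and this repository.

```shell
GVMAMBA=Contour-enhanced-Visual-State-Space-Model   # this repository (assumed clone name)
VMAMBA=VMamba                                       # upstream VMamba clone (assumed name)

# Demo stand-ins so the commands below run without the real clones:
mkdir -p "$GVMAMBA/models" "$VMAMBA/classification/models"
: > "$GVMAMBA/models/vmamba.py"                     # placeholder for a G-VSS model file

# Keep a backup of the original models folder, then replace its contents:
cp -r "$VMAMBA/classification/models" "$VMAMBA/classification/models.orig"
cp -r "$GVMAMBA/models/." "$VMAMBA/classification/models/"
```

The backup step is optional but makes it easy to diff G-VMamba's changes against upstream VMamba later.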

Model Training and Inference

If you only want to test the performance of a pretrained checkpoint:

python -m torch.distributed.launch --nnodes=1 --node_rank=0 --nproc_per_node=1 --master_port=29500 main.py --cfg </path/to/config> --batch-size 128 --data-path </path/of/dataset> --output /tmp --pretrained </path/of/checkpoint>

Train:

Training with a single GPU:

torchrun --nnodes=1 --node_rank=0 --nproc_per_node=1 main.py --cfg </path/to/config> --batch-size 16 --data-path </path/of/dataset> --output </path/of/output>

Training with multiple GPUs:

export CUDA_VISIBLE_DEVICES=0,1,2,3,4

torchrun --nnodes=1 --node_rank=0 --nproc_per_node=5 --master_port=29500 --rdzv_id=12345 --rdzv_backend=c10d --rdzv_endpoint=localhost:29500 main.py --cfg </path/to/config> --batch-size 8 --data-path </path/of/dataset> --output </path/of/output>
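A common pitfall with the multi-GPU command above is letting --nproc_per_node drift out of sync with CUDA_VISIBLE_DEVICES. A small sketch that derives the worker count from the visible GPUs (variable names are illustrative):

```shell
# Launch one torchrun worker per visible GPU (here: 5 GPUs -> 5 workers).
export CUDA_VISIBLE_DEVICES=0,1,2,3,4
NGPUS=$(echo "$CUDA_VISIBLE_DEVICES" | tr ',' '\n' | wc -l)
echo "launching $NGPUS workers"
# torchrun --nnodes=1 --nproc_per_node="$NGPUS" ... (same arguments as above)
```

Note that --batch-size is per process, so with 5 workers and --batch-size 8 the effective global batch size is 40.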

Citation

@ARTICLE{10810482,
  author={Yan, Liyue and Zhang, Xing and Wang, Kafeng and Zhang, Dejin},
  journal={IEEE Transactions on Geoscience and Remote Sensing},
  title={Contour-enhanced Visual State-Space Model for Remote Sensing Image Classification},
  year={2024}
}

Acknowledgment

This project is mainly based on VMamba (paper, code), Swin-Transformer (paper, code), pytorch-grad-cam (code), etc. Thanks for their excellent work.
