The public Aerial Fluvial Image Dataset (AFID) is available on the Purdue University Research Repository. AFID contains manually annotated images captured by a drone over the Wabash River and Wildcat Creek in Indiana. Unzip the downloaded zip file and put the AerialFluvialDataset folder into the Aerial-Fluvial-Semantic-Segmentation folder. Also remember to unzip WildcatCreekDataset and WabashRiverDataset inside AerialFluvialDataset.
The FluvialDataset class reads pairs of absolute image and mask paths from a csv file. The csv files in this folder are examples, covering the whole dataset, the train set, and the test set. They can be generated with the function build_csv_from_datasets in build_dataset.py. Because the csv files store absolute paths, rebuild them after cloning the repository.
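To illustrate what such a csv of absolute path pairs can look like, here is a minimal sketch. It is not the repo's build_csv_from_datasets; the function name build_pairs_csv, the two-column layout without a header, and the rule that a mask shares its image's file stem are all assumptions made for illustration.

```python
import csv
from pathlib import Path


def build_pairs_csv(image_dir, mask_dir, csv_path, exts=(".jpg", ".jpeg", ".png")):
    """Write one row per image: absolute image path, absolute mask path.

    Assumption (hypothetical layout): a mask shares its image's file stem,
    e.g. images/0001.jpg pairs with masks/0001.png.
    Returns the number of pairs written.
    """
    image_dir = Path(image_dir).resolve()
    mask_dir = Path(mask_dir).resolve()
    # Index masks by stem so images and masks may use different extensions.
    masks = {p.stem: p for p in mask_dir.iterdir() if p.suffix.lower() in exts}
    rows = []
    for image in sorted(image_dir.iterdir()):
        if image.suffix.lower() not in exts:
            continue
        mask = masks.get(image.stem)
        if mask is None:
            continue  # skip images that have no annotation
        rows.append((str(image), str(mask)))
    with open(csv_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return len(rows)
```

Because the paths are resolved on the machine that runs the script, a csv produced this way is only valid on that machine, which is why regenerating it after cloning is needed.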
Training depends mainly on PyTorch Lightning and segmentation_models_pytorch. A virtual Python environment (e.g., one created with miniconda) is recommended so that this repo's dependencies stay separate from your system Python libraries.
conda create -n afid python==3.10
Here, afid is the name of the virtual Python environment. After activating this environment with conda activate afid, install all dependencies with
pip install -r requirements.txt
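As a quick sanity check after installation, the sketch below reports which libraries cannot be found in the active environment. The import names in REQUIRED are assumptions based on the libraries this README names (for example, newer Lightning releases may be imported as lightning instead of pytorch_lightning); requirements.txt remains the authoritative list.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names; adjust them to match requirements.txt.
REQUIRED = ["torch", "pytorch_lightning", "segmentation_models_pytorch", "wandb"]

print("Missing modules:", missing_modules(REQUIRED))
```

An empty list means every listed module can be located. It does not confirm that the installed versions match those pinned in requirements.txt.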
The training code is in train.py. An example usage is
python -m networks.train '../dataset/afid/train.csv' '../dataset/afid/test.csv'
If you get a No such file or directory error, make sure the paths in train.csv and test.csv point to the correct location of the AFID data.
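One way to track down such an error is to scan a csv for entries that do not exist on disk. The following is a minimal sketch, assuming every non-empty cell in the csv is a file path; find_missing_paths is a hypothetical helper, not part of this repo.

```python
import csv
import os


def find_missing_paths(csv_path):
    """Return (row_number, path) for every csv cell that is not an existing file."""
    missing = []
    with open(csv_path, newline="") as f:
        for row_number, row in enumerate(csv.reader(f), start=1):
            for cell in row:
                path = cell.strip()
                if path and not os.path.isfile(path):
                    missing.append((row_number, path))
    return missing
```

Calling find_missing_paths on train.csv and test.csv should return empty lists when the paths are valid. If the csv files contain a header row, its cells will be reported as missing and can be ignored.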
We use wandb to log intermediate checkpoints. You can visualize training progress and inspect the trained models on the wandb website. Models are also stored locally, but the website makes it easier to see which model performs best.
The inference code is in inference.py. An example usage is
python -m networks.inference '../dataset/afid/test.csv' '../models/unet-resnet34-128x128.ckpt'
To run prediction on your own videos, use the inference_video.py script. An example usage is
python -m networks.inference_video -i video-path/video_path.csv -o ./output.mp4 -m ../models/unet-resnet34-128x128.ckpt --height 128 --width 128 -r 1
where -i is the csv file path that lists all input videos, -o is the output video path with the desired suffix, -m is the model checkpoint path, --height and --width are the height and width of the input frames, and -r is the frame rate of the output video.
If you use the AFID dataset or this repo in your work, please cite our paper. Thanks.
@article{wang2023aerial,
title={Aerial fluvial image dataset for deep semantic segmentation neural networks and its benchmarks},
author={Wang, Zihan and Mahmoudian, Nina},
journal={IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing},
volume={16},
pages={4755--4766},
year={2023},
publisher={IEEE}
}