My goal was to get a better understanding of semantic segmentation and to learn how to build a U-Net model from scratch. There is room for improvement, so feel free to reach me at [email protected]
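U-Net is essentially an encoder-decoder architecture with skip connections: the encoder downsamples the image into feature maps, the decoder upsamples them back to the input resolution, and each decoder stage also receives the encoder feature map of the same resolution.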
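The encoder-decoder idea with skip connections can be sketched as follows. This is a minimal illustration, not the model used in this project: it assumes PyTorch, and the class name `TinyUNet`, the helper `double_conv`, the two-level depth, the channel width `base`, and the class count in the usage note are all hypothetical choices made for the example.

```python
import torch
import torch.nn as nn


def double_conv(in_ch: int, out_ch: int) -> nn.Sequential:
    """Two 3x3 convolutions, each followed by batch norm and ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class TinyUNet(nn.Module):
    """Hypothetical two-level U-Net; depth and widths are illustrative only."""

    def __init__(self, in_channels: int, num_classes: int, base: int = 16):
        super().__init__()
        # Encoder: convolve, then halve the spatial size with max pooling.
        self.enc1 = double_conv(in_channels, base)
        self.enc2 = double_conv(base, base * 2)
        self.pool = nn.MaxPool2d(2)
        # Bottleneck: the lowest-resolution stage.
        self.bottleneck = double_conv(base * 2, base * 4)
        # Decoder: upsample, concatenate the matching encoder output, convolve.
        self.up2 = nn.ConvTranspose2d(base * 4, base * 2, kernel_size=2, stride=2)
        self.dec2 = double_conv(base * 4, base * 2)  # input = upsampled + skip
        self.up1 = nn.ConvTranspose2d(base * 2, base, kernel_size=2, stride=2)
        self.dec1 = double_conv(base * 2, base)  # input = upsampled + skip
        # 1x1 convolution producing one score map per class.
        self.head = nn.Conv2d(base, num_classes, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        e1 = self.enc1(x)                    # full resolution
        e2 = self.enc2(self.pool(e1))        # 1/2 resolution
        b = self.bottleneck(self.pool(e2))   # 1/4 resolution
        # Skip connections: concatenate along the channel dimension.
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)                 # per-pixel class logits
```

As a usage example, `TinyUNet(in_channels=3, num_classes=5)` applied to a tensor of shape `(1, 3, 64, 64)` returns logits of shape `(1, 5, 64, 64)`. With two pooling steps, the input height and width need to be divisible by 4 so that the upsampled maps line up with the skip tensors.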
The dataset covers aerial semantic segmentation of urban scenes, with the goal of increasing the safety of autonomous drone flight and landing procedures. The imagery depicts more than 20 houses from a nadir (bird's-eye) view, acquired at an altitude of 5 to 30 meters above ground. A high-resolution camera was used to acquire images at a size of 6000x4000 px (24 Mpx). The training set contains 400 publicly available images, and the test set is made up of 200 private images.
My custom model reached 80% accuracy. More insights will be added soon.
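How the accuracy figure was computed is not stated here. For segmentation, two common choices are pixel accuracy (the fraction of pixels labeled correctly) and mean intersection over union (IoU). The sketch below shows both in plain Python on small label grids; the function names and the toy masks are made up for illustration and are not taken from this project.

```python
def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted class equals the ground-truth class."""
    correct = total = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            correct += int(p == t)
            total += 1
    return correct / total


def mean_iou(pred, target, num_classes):
    """Average IoU over the classes that appear in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = union = 0
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                in_pred, in_target = p == c, t == c
                inter += int(in_pred and in_target)
                union += int(in_pred or in_target)
        if union > 0:  # skip classes absent from both masks
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Toy 2x3 masks with three classes (hypothetical data).
pred = [[0, 0, 1],
        [1, 1, 2]]
target = [[0, 1, 1],
          [1, 1, 2]]

print(pixel_accuracy(pred, target))  # 5 of 6 pixels match
print(mean_iou(pred, target, 3))     # per-class IoUs: 0.5, 0.75, 1.0
```

Pixel accuracy can look high when a few large classes dominate the image, which is why mean IoU is often reported next to it.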
Link to the dataset: https://www.kaggle.com/bulentsiyah/semantic-drone-dataset