Deep Attention Neural Network for multi-label classification in UAV imagery

Journal Article
Alajlan, A. Alshehri, Y. Bazi, N. Ammour, H. Almubarak, N. . 2019
المجلة \ الصحيفة: 
IEEE Access
رقم الإصدار السنوي: 
مستخلص المنشور: 

The multi-label classification problem in Unmanned Aerial Vehicle (UAV) images is particularly challenging compared to single-label classification due to its combinatorial nature. To tackle this issue, we propose in this paper a deep learning approach based on encoder-decoder neural network architecture with channel and spatial attention mechanisms. Specifically, the encoder module which is based on a pre-trained convolutional neural network (CNN) has the task to transform the input image to a set of feature maps using an opportune feature combination. To improve the feature representation further, this module incorporates a squeeze excitation (SE) layer for modelling the interdependencies between the channels of the feature maps. The decoder module which is based on a long short terms memory (LSTM) network has the task of generating, in a sequential way, the classes present in the image. At each time step, it predicts the next class-label by aligning its hidden state to the corresponding region in the image by means of an adaptive spatial attention mechanism. The experiments carried out on two UAV datasets with a spatial resolution of 2-cm show that our method is promising in predicting the labels present in the image while attending the relevant objects in the image. Additionally, it is able to provide better classification results compared to state-of-the-art methods.