M2FNet: Multi-modal fusion network for object detection from visible and thermal infrared images
- Impact factor: 0.0
- DOI: 10.1016/j.jag.2024.103918
- Published in: International Journal of Applied Earth Observation and Geoinformation
- Abstract: Fusing multi-modal information from visible (VIS) and thermal infrared (TIR) images is crucial for object detection in fully adapting to varied lighting conditions. However, the existing models usually treat VIS and TIR images as independent information and extract corresponding features from separate networks due to the scarcity of training data with labeled instances from both VIS and TIR registration images. To fill this gap, a novel Multi-Modal Fusion NETwork (M2FNet) based on the Transformer architecture is proposed in this paper, which contains two effective modules: the Union-Modal Attention (UMA) and the Cross-Modal Attention (CMA). The UMA module aggregates multi-spectral features from VIS and TIR images and then extracts multi-modal features via a convolutional neural network (CNN) backbone. The CMA module is designed to learn cross-attention features from VIS and TIR pairwise features by Transformer architecture. Evaluation results by the mean average precision (mAP) metric show that the M2FNet method significantly advances the baseline methods trained using only VIS or TIR images by 10.71 % and 2.97 %, respectively. The increments in mAP are observed in the M2FNet method compared with the existing multi-modal methods on two public datasets. Sensitivity analysis of eight illumination thresholds shows that the M2FNet method presents robustness performance on varied illumination conditions and achieves the maximum increase in accuracy of 25.6 %. Moreover, this method is subsequently applied to a new testing dataset, VI2DA (Visible-Infrared paired Video and Image DAtaset), observed by diverse sensors and platforms for testing the generalization ability of object detectors, which will be publicly available at https://github.com/TIR-OD/Datasets.
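- Paper type: Journal article
- Article number: 103918
- Discipline category: Science
- First-level discipline: Geography
- Document type: J
- Volume: 130
- Page range: 103918
- Translated work: No
- Indexed in: SCI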
- Journal link: https://www.sciencedirect.com/science/article/pii/S1569843224002723
- First author: Chenchen Jiang
- Corresponding author: Huazhong Ren
- All authors: Hong Yang, Hongtao Huo, Pengfei Zhu, Zhaoyuan Yao, Jing Li, Min Sun, Shihao Yang
- Publication date: 2024-05-23
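
The abstract describes the CMA module as learning cross-attention features from paired VIS and TIR features. As a rough illustration of that general idea only, the sketch below runs plain scaled dot-product cross-attention in both directions over toy token lists. It is not the paper's implementation: the function names, the toy feature values, and the omission of learned query/key/value projections, multiple heads, and the CNN backbone are all simplifying assumptions made here.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query token attends over all key tokens.

    Assumption: no learned projections are applied, so the raw features
    act directly as queries, keys, and values.
    """
    dim = len(queries[0])
    attended = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output vector per query.
        attended.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return attended


# Hypothetical toy features: 3 VIS tokens and 2 TIR tokens, 2 channels each.
vis_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
tir_tokens = [[0.5, 0.5], [2.0, 0.0]]

# VIS queries attend to TIR keys/values, and the reverse direction.
vis_attends_tir = cross_attention(vis_tokens, tir_tokens, tir_tokens)
tir_attends_vis = cross_attention(tir_tokens, vis_tokens, vis_tokens)
```

Each output row is a convex combination of the other modality's tokens, so a VIS token ends up described in terms of the TIR features it matches most strongly, and vice versa. A trainable version would add projection matrices and typically rely on a framework's attention layer rather than hand-written loops.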