DMTCANet: Dual-Branch Multiscale CNN and Token Cross Attention Fusion for Hyperspectral and LiDAR Data Classification
Abstract
Joint classification combines hyperspectral image (HSI) and Light Detection and Ranging (LiDAR) data, leveraging the rich spectral information of HSI and the precise 3D structural details of LiDAR. While this combination improves classification accuracy, it presents challenges because the two sources differ in data dimensions and semantic levels. Existing deep learning approaches often struggle to extract features effectively and to capture interactions between these heterogeneous sources, and traditional CNNs suffer from limited receptive fields and detail loss in complex multi-scale scenarios. To address these issues, we propose DMTCANet, a joint classification network that combines a dual-branch multi-scale CNN with token cross-attention (TCA) fusion. The network uses a multi-scale hybrid convolution module to process HSI and LiDAR data, expanding the receptive field and capturing both local and global information. A TCA fusion encoder then deepens the interaction between the two modalities, addressing insufficient feature integration. Experiments on the Trento, Houston2013, and MUUFL datasets show that DMTCANet outperforms existing methods.
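The abstract does not give the internals of the TCA fusion encoder, so the following is only a minimal sketch of generic single-head scaled dot-product cross-attention, in which tokens from one modality query tokens from the other. It is not the authors' implementation. The names `cross_attention`, `hsi_tokens`, and `lidar_tokens`, the toy token values, and the identity projection matrices (standing in for learned weights) are all assumptions made for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns of a matrix."""
    return [list(col) for col in zip(*m)]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context, w_q, w_k, w_v):
    """Hypothetical helper: each query token attends over all context tokens.

    queries: tokens from one modality (assumed here to be HSI).
    context: tokens from the other modality (assumed here to be LiDAR).
    w_q, w_k, w_v: projection matrices; learned in a real network.
    """
    q = matmul(queries, w_q)
    k = matmul(context, w_k)
    v = matmul(context, w_v)
    scale = math.sqrt(len(q[0]))  # scale scores by sqrt of the key dimension
    scores = matmul(q, transpose(k))
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


# Toy inputs (assumed values): three HSI tokens, two LiDAR tokens, width 2.
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for learned projections
hsi_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
lidar_tokens = [[0.5, 0.5], [1.0, -1.0]]

fused, weights = cross_attention(hsi_tokens, lidar_tokens, identity, identity, identity)
```

Here `fused` holds one LiDAR-informed vector per HSI token, and each row of `weights` sums to 1. A symmetric fusion would repeat the call with the roles of the two token sets swapped; whether DMTCANet does so is not stated in the abstract.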
DOI: https://doi.org/10.18686/utc.v10i3.243
Copyright (c) 2024 Yanfen Sun, Bian Bawangdui*