A novel ResNet50-based attention mechanism for image classification

Jingsi Zhang; Xiaosheng Yu; Xiaoliang Lei; Chengdong Wu

doi:10.6180/jase.202408_27(8).0004

Computer Science and Information Engineering

Jingsi Zhang, Xiaosheng Yu, Xiaoliang Lei, and Chengdong Wu

Faculty of Robot Science and Engineering, Northeastern University, Shenyang 110819, China

Received: October 9, 2023
Accepted: October 30, 2023
Publication Date: November 16, 2023

Copyright The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.

In image classification, neural network models are often compressed to reduce the number of parameters, which tends to lower classification accuracy. To address this, we propose a novel ResNet50-based attention mechanism for image classification. A ResNet50 network extracts image features, which are fed into a graph neural network as node features. Group convolution (referred to here as packet convolution) and depthwise separable convolution (depth-separable convolution) are then used to compress the residual network. An attention mechanism is introduced into the network backbone so that it focuses on the important parts of each node's neighborhood and helps the branch network extract key information. On the 5-way 1-shot classification task, the method reaches accuracies of 86.32%, 92.21% and 92.19% on three publicly available datasets, respectively. These results indicate that the proposed method is effective for image classification.
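
The parameter savings behind the two compression techniques named in the abstract can be illustrated with a short calculation. This is a minimal sketch, not the authors' implementation: the layer size (256 input and output channels, 3x3 kernel) and the group count of 4 are hypothetical values chosen for illustration, the helper names are made up for this example, and bias terms are ignored.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution layer (bias omitted).

    With groups > 1, each output channel only sees c_in / groups
    input channels, which divides the weight count by `groups`.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = conv_params(c_in, c_in, k, groups=c_in)  # one k x k filter per channel
    pointwise = conv_params(c_in, c_out, 1)              # 1 x 1 conv mixes channels
    return depthwise + pointwise


# Hypothetical layer size, chosen only for illustration.
c_in, c_out, k = 256, 256, 3

standard = conv_params(c_in, c_out, k)
grouped = conv_params(c_in, c_out, k, groups=4)
separable = depthwise_separable_params(c_in, c_out, k)

print(f"standard convolution:   {standard} weights")
print(f"group convolution (g=4): {grouped} weights")
print(f"depthwise separable:    {separable} weights")
```

For this hypothetical layer, group convolution with four groups uses a quarter of the weights of the standard convolution, and the depthwise separable version uses roughly an eighth to a ninth. The actual savings in the compressed network depend on the layer shapes and group counts the authors chose, which the abstract does not specify.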

Keywords: Image classification; ResNet50; attention mechanism; depth-separable convolution; packet convolution
