- [1] T. Alpay, S. Magg, P. Broze, and D. Speck, (2023) “Multimodal video retrieval with CLIP: a user study” Information Retrieval Journal 26(1): 6. DOI: 10.1007/s10791-023-09425-2.
- [2] R. Zuo, X. Deng, K. Chen, Z. Zhang, Y.-K. Lai, F. Liu, C. Ma, H. Wang, Y.-J. Liu, and H. Wang, (2023) “Fine-grained video retrieval with scene sketches” IEEE Transactions on Image Processing 32: 3136–3149. DOI: 10.1109/TIP.2023.3278474.
- [3] L. Teng, (2023) “Brief Review of Medical Image Segmentation Based on Deep Learning [J]” IJLAI Transactions on Science and Engineering 1(02): 01–08.
- [4] S. Sharma and K. Guleria. “Deep learning models for image classification: comparison and applications”. In: 2022 2nd International Conference on Advance Computing and Innovative Technologies in Engineering (ICACITE). IEEE. 2022, 1733–1738. DOI: 10.1109/ICACITE53722.2022.9823516.
- [5] J. Yang, R. Shi, D. Wei, Z. Liu, L. Zhao, B. Ke, H. Pfister, and B. Ni, (2023) “Medmnist v2-a large-scale lightweight benchmark for 2d and 3d biomedical image classification” Scientific Data 10(1): 41. DOI: 10.1038/s41597-022-01721-8.
- [6] T. Liu, Y. Ma, W. Yang, W. Ji, R. Wang, and P. Jiang, (2022) “Spatial-temporal interaction learning based two stream network for action recognition” Information Sciences 606: 864–876. DOI: 10.1016/j.ins.2022.05.092.
- [7] S. Yin, H. Li, A. A. Laghari, T. R. Gadekallu, G. A. Sampedro, and A. Almadhor, (2024) “An Anomaly Detection Model Based on Deep Auto-Encoder and Capsule Graph Convolution via Sparrow Search Algorithm in 6G Internet of Everything” IEEE Internet of Things Journal 11(18): 29402–29411. DOI: 10.1109/JIOT.2024.3353337.
- [8] I. U. Khan and J. W. Lee, (2024) “PAR-Net: An Enhanced Dual-Stream CNN–ESN Architecture for Human Physical Activity Recognition” Sensors 24(6): 1908. DOI: 10.3390/s24061908.
- [9] Y. Mou, X. Jiang, K. Xu, T. Sun, and Z. Wang, (2023) “Compressed video action recognition with dual-stream and dual-modal transformer” IEEE Transactions on Circuits and Systems for Video Technology 34(5): 3299–3312. DOI: 10.1109/TCSVT.2023.3319140.
- [10] D. Chen, M. Wu, T. Zhang, and C. Li, (2023) “Feature fusion for dual-stream cooperative action recognition” IEEE Access 11: 116732–116740. DOI: 10.1109/ACCESS.2023.3325401.
- [11] Q. Ren, Z. Lu, H. Wu, J. Zhang, and Z. Dong, (2023) “HR-Net: A landmark based high realistic face reenactment network” IEEE Transactions on Circuits and Systems for Video Technology 33(11): 6347–6359. DOI: 10.1109/TCSVT.2023.3268062.
- [12] P. S. Yee, K. M. Lim, and C. P. Lee, (2022) “DeepScene: Scene classification via convolutional neural network with spatial pyramid pooling” Expert Systems with Applications 193: 116382. DOI: 10.1016/j.eswa.2021.116382.
- [13] T. Zhou, Q. Li, H. Lu, Q. Cheng, and X. Zhang, (2023) “GAN review: Models and medical image fusion applications” Information Fusion 91: 134–148. DOI: 10.1016/j.inffus.2022.10.017.
- [14] Z. Liu, J. Ning, Y. Cao, Y. Wei, Z. Zhang, S. Lin, and H. Hu. “Video swin transformer”. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2022, 3202–3211. DOI: 10.1109/CVPR52688.2022.00320.
- [15] G. Mai, K. Janowicz, Y. Hu, S. Gao, B. Yan, R. Zhu, L. Cai, and N. Lao, (2022) “A review of location encoding for GeoAI: methods and applications” International Journal of Geographical Information Science 36(4): 639–673. DOI: 10.1080/13658816.2021.2004602.
- [16] M. Jiang and S. Yin, (2023) “Facial expression recognition based on convolutional block attention module and multi-feature fusion” International journal of computational vision and robotics 13(1): 21–37. DOI: 10.1504/IJCVR.2023.127298.
- [17] M. Ramesh and K. Mahesh, (2022) “Sports Video Classification Framework Using Enhanced Threshold Based Keyframe Selection Algorithm and Customized CNN on UCF101 and Sports1-M Dataset” Computational Intelligence and Neuroscience 2022(1): 3218431. DOI: 10.1155/2022/3218431.
- [18] S. Alamuru and S. Jain, (2024) “Effective Video Event Detection Using Optimized Bidirectional Long Short Term Memory Network” International Journal of Information Technology & Decision Making 23(05): 1911–1933. DOI: 10.1142/S0219622023500621.
- [19] Q. Tian, K. Wang, B. Liu, and Y. Wang. “Multi-kernel excitation network for video action recognition”. In: 2022 16th IEEE international conference on signal processing (ICSP). 1. IEEE. 2022, 155–159. DOI: 10.1109/ICSP56322.2022.9965286.
- [20] X. Wang, J. Ding, Z. Zhang, J. Xu, and J. Gao, (2024) “Ipnet: Polarization-based camouflaged object detection via dual-flow network” Engineering Applications of Artificial Intelligence 127: 107303. DOI: 10.1016/j.engappai.2023.107303.
- [21] H. Li, X. Li, L. Su, D. Jin, J. Huang, and D. Huang, (2022) “Deep spatio-temporal adaptive 3d convolutional neural networks for traffic flow prediction” ACM Transactions on Intelligent Systems and Technology (TIST) 13(2): 1–21. DOI: 10.1145/3510829.
- [22] J. Lee and S. B. Kim, (2022) “Uncertainty-aware hierarchical segment-channel attention mechanism for reliable and interpretable multichannel signal classification” Neural Networks 150: 68–86. DOI: 10.1016/j.neunet.2022.02.019.
- [23] S. Yin, L. Wang, M. Shafiq, L. Teng, A. A. Laghari, and M. F. Khan, (2023) “G2Grad-CAMRL: An object detection and interpretation model based on gradient-weighted class activation mapping and reinforcement learning in remote sensing images” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 16: 3583–3598. DOI: 10.1109/JSTARS.2023.3241405.
- [24] Z. Bingyu, L. Zhen, and Z. Jingxiang, (2022) “COVID-19 Detection Algorithm Combining Grad-CAM and Convolutional Neural Network” Journal of Frontiers of Computer Science & Technology 16(9): 2108. DOI: 10.3778/j.issn.1673-9418.2105117.