Journal of Applied Science and Engineering

Published by Tamkang University Press


Lu Zhao and Jing Yu

Lu Xun Academy of Fine Arts, Shenyang 110816, China.


 

 

Received: April 16, 2025
Accepted: May 12, 2025
Publication Date: June 8, 2025

Copyright © The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.


DOI: https://doi.org/10.6180/jase.202602_29(2).0019


Gesture is one of the most natural ways for humans to communicate. Gesture recognition technology lets users interact with digital media content directly through simple gestures, without resorting to traditional devices such as a mouse, keyboard, or touch screen, making the interaction more natural, intuitive, and convenient. In digital exhibition halls, for example, visitors can manipulate exhibits with mid-air gestures, which heightens their sense of participation and enjoyment. Traditional real-time gesture recognition models adapt poorly to interference such as illumination changes and complex backgrounds, and the classification datasets they rely on cover only a limited set of gestures, which falls short in practical applications. To address these problems, this paper proposes a novel gesture recognition method for digital media design based on YOLOv5 and dynamic time warping. In the detection stage, YOLOv5 serves as the detection network, and its localization ability is exploited to find the hand position quickly. In the recognition stage, the classification dataset is first augmented with varied backgrounds and sensor thermal noise, and a background-optimization preprocessing algorithm is designed to improve the model's adaptability to complex backgrounds. VGG-16 is then adopted as the prototype of the recognition network, with a normalization layer added and the activation function replaced to accelerate convergence and prevent over-fitting. The dynamic time warping (DTW) algorithm fuses the different surface EMG (sEMG) signals and computes the similarity between samples and templates to realize gesture recognition. Comparative experiments show that the proposed method improves recognition accuracy by 1.9% and 0.4% and inference speed by 47.7% and 53.9% over the compared methods.
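To make the DTW template-matching step in the abstract concrete, the short Python sketch below computes a DTW alignment cost between a one-dimensional signal and a set of stored gesture templates and assigns the closest one. This is a minimal illustration under simplifying assumptions, not the authors' implementation: the function name dtw_distance, the absolute-difference local cost, and the toy sEMG sequences and template labels are all hypothetical.

import numpy as np

def dtw_distance(sample: np.ndarray, template: np.ndarray) -> float:
    """Cumulative dynamic-time-warping cost between two 1-D sequences."""
    n, m = len(sample), len(template)
    cost = np.full((n + 1, m + 1), np.inf)   # accumulated-cost matrix
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(sample[i - 1] - template[j - 1])          # local distance
            cost[i, j] = d + min(cost[i - 1, j],              # insertion
                                 cost[i, j - 1],              # deletion
                                 cost[i - 1, j - 1])          # match
    return float(cost[n, m])

# Usage: assign the gesture whose template has the smallest DTW cost.
semg_sample = np.array([0.1, 0.5, 0.9, 0.4, 0.1])
templates = {"fist": np.array([0.0, 0.6, 1.0, 0.5, 0.0]),
             "open_palm": np.array([0.9, 0.7, 0.2, 0.1, 0.0])}
scores = {name: dtw_distance(semg_sample, t) for name, t in templates.items()}
print(min(scores, key=scores.get), scores)

In the method described above, such a similarity score would be computed on fused sEMG signals after YOLOv5 localizes the hand and the modified VGG-16 network produces its representation; the toy 1-D arrays here merely stand in for those signals.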


Keywords: Digital media design, gesture recognition, YOLOv5, dynamic time warping, background optimization




    



 
