Abstract
This research introduces PEBISI, an artificial-intelligence-based sign language to voice converter that uses the BlazePose method to translate sign language into voice and text. In PEBISI, BlazePose detects and tracks the movements of the user's hands and body, and an AI model is trained to interpret these movements and generate the corresponding voice and text output. In an evaluation on sign language datasets, PEBISI performed well in recognizing hand movements and produced accurate voice and text translations. The test results also indicate that the system is reliable and that there is room for further development to improve the quality of sign language translation. This research thus contributes to more inclusive technology by facilitating communication between sign language users and people who do not understand sign language.
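The pipeline described in the abstract (pose landmarks in, interpreted sign out, then text and voice) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the abstract does not specify the model, so a nearest-template classifier stands in for the trained AI model; the function names, the templates, and the `speak` placeholder are hypothetical; and the frames are synthetic rather than real detector output. The landmark indices follow the 33-point BlazePose topology published for MediaPipe Pose.

```python
import math
from typing import Dict, List, Sequence, Tuple

Landmark = Tuple[float, float]   # (x, y) in normalized image coordinates
Frame = Sequence[Landmark]       # one pose: 33 landmarks in BlazePose order

# Indices in the 33-point BlazePose topology.
LEFT_SHOULDER, RIGHT_SHOULDER = 11, 12
LEFT_WRIST, RIGHT_WRIST = 15, 16


def normalize(frame: Frame) -> List[Landmark]:
    """Center a pose on the shoulder midpoint and scale by shoulder width."""
    lx, ly = frame[LEFT_SHOULDER]
    rx, ry = frame[RIGHT_SHOULDER]
    cx, cy = (lx + rx) / 2, (ly + ry) / 2
    scale = math.hypot(lx - rx, ly - ry) or 1.0  # avoid division by zero
    return [((x - cx) / scale, (y - cy) / scale) for x, y in frame]


def features(frames: Sequence[Frame]) -> List[float]:
    """Average the normalized wrist positions over all frames of a clip."""
    sums = [0.0, 0.0, 0.0, 0.0]
    for frame in frames:
        norm = normalize(frame)
        values = (*norm[LEFT_WRIST], *norm[RIGHT_WRIST])
        for i, value in enumerate(values):
            sums[i] += value
    return [s / len(frames) for s in sums]


def classify(frames: Sequence[Frame], templates: Dict[str, List[float]]) -> str:
    """Return the label of the template closest to the clip's features.

    Stand-in for the trained model; the source does not describe its design.
    """
    feat = features(frames)
    return min(templates, key=lambda label: math.dist(feat, templates[label]))


def speak(text: str) -> str:
    """Hypothetical placeholder for a text-to-speech call."""
    return f"[voice] {text}"


def make_frame(left_wrist: Landmark, right_wrist: Landmark) -> List[Landmark]:
    """Build a synthetic 33-landmark frame with fixed shoulders (demo only)."""
    frame: List[Landmark] = [(0.5, 0.5)] * 33
    frame[LEFT_SHOULDER] = (0.6, 0.4)
    frame[RIGHT_SHOULDER] = (0.4, 0.4)
    frame[LEFT_WRIST] = left_wrist
    frame[RIGHT_WRIST] = right_wrist
    return frame


# Made-up templates: mean normalized (left wrist x, y, right wrist x, y).
TEMPLATES = {
    "hello": [0.5, -1.0, -0.5, 2.0],      # left hand raised, right hand down
    "thank you": [0.0, 0.5, 0.0, 0.5],    # both hands near the chest
}

clip = [make_frame((0.61, 0.21), (0.4, 0.8)), make_frame((0.6, 0.2), (0.4, 0.8))]
text = classify(clip, TEMPLATES)
voice = speak(text)
```

In a real system the synthetic frames would be replaced by landmarks from a pose detector, and the template lookup by a model trained on labeled sign language data. Normalizing by shoulder position and width is one common way to make the features less sensitive to where the signer stands and how far they are from the camera.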