Improved Kurdish Dialect Classification Using Data Augmentation and ANOVA-Based Feature Selection
Abstract
Classifying dialects of the Kurdish language is challenging because the phonetic distinctions among them are subtle. In this research, we applied advanced methods to improve the accuracy of Kurdish dialect classification. We enhanced the stability and variation of the dataset by augmenting it with time-stretching and noise-addition techniques. An analysis of variance (ANOVA) filter approach was then applied to make feature selection (FS) more efficient and to highlight the features most relevant to dialect classification; the ANOVA filter ranks features by comparing their means across the different dialect groups, which improved FS. A one-dimensional convolutional neural network (1D CNN) model was trained on the augmented dataset after ANOVA-based FS. The model performed very strongly, reaching an accuracy of 99.42%, a notable improvement over the 95.5% reported in previous work. The findings demonstrate how combining time-stretch augmentation with FS can improve the accuracy of Kurdish dialect classification. This work advances the understanding and application of machine learning in linguistic diversity and dialectology.
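For illustration, the sketch below shows how such a pipeline might look in Python. It is a minimal, hypothetical example rather than the authors' implementation: the MFCC feature representation, the stretch rate of 0.9, the noise level, and the choice of k = 20 are all assumptions, and the synthetic clips stand in for the actual Kurdish dialect recordings. Audio clips are augmented with time stretching and additive noise, summarized as fixed-length feature vectors, and filtered with scikit-learn's ANOVA F-test selector.

```python
import numpy as np
import librosa
from sklearn.feature_selection import SelectKBest, f_classif

def time_stretch_and_noise(signal, rate=0.9, noise_level=0.005):
    """Return time-stretched and noise-added copies of an audio signal."""
    stretched = librosa.effects.time_stretch(y=signal, rate=rate)
    noisy = signal + noise_level * np.random.randn(len(signal))
    return stretched, noisy

def mfcc_vector(signal, sr, n_mfcc=40):
    """Mean MFCC vector as a simple fixed-length feature representation (an assumption)."""
    return librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc).mean(axis=1)

# Synthetic stand-in for a labelled dialect corpus: random 1-second clips, two classes.
rng = np.random.default_rng(0)
sr = 16000
clips = [rng.standard_normal(sr).astype(np.float32) for _ in range(20)]
labels = [i % 2 for i in range(20)]

features, y = [], []
for clip, label in zip(clips, labels):
    for variant in (clip, *time_stretch_and_noise(clip)):
        features.append(mfcc_vector(variant, sr))
        y.append(label)

X = np.vstack(features)

# ANOVA filter: rank features by the F-statistic of their means across the dialect
# groups and keep the k highest-scoring ones before passing them to a classifier.
selector = SelectKBest(score_func=f_classif, k=20)
X_selected = selector.fit_transform(X, np.array(y))
print(X.shape, "->", X_selected.shape)  # (60, 40) -> (60, 20)
```

In a full pipeline of this kind, the selected features would then be fed to the 1D CNN classifier described in the abstract.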
Copyright (c) 2025 Karzan J. Ghafoor, Sarkhel H. Karim, Karwan M. Hama Rawf, Ayub O. Abdulrahman

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Authors who choose to publish their work with ARO agree to the following terms:
- Authors retain the copyright to their work and grant the journal the right of first publication. The work is simultaneously licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License (CC BY-NC-SA 4.0). This license allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors have the freedom to enter into separate agreements for the non-exclusive distribution of the journal's published version of the work. This includes options such as posting it to an institutional repository or publishing it in a book, as long as proper acknowledgement is given to its initial publication in this journal.
- Authors are encouraged to share and post their work online, including in institutional repositories or on their personal websites, both prior to and during the submission process. This practice can lead to productive exchanges and increase the visibility and citation of the published work.
By agreeing to these terms, authors acknowledge the importance of open access and the benefits it brings to the scholarly community.