ML-based Smart Voice Analysis for Healthy and Pathological Voice Detection

Kamran Saeed, Muhammad Fatih Adak, Khalid Javeed, Rizwan Saeed, Samiah Ijaz


Voice pathology is becoming markedly more common, driven by unhealthy social habits, excessive talking, ageing, and disorders of the throat. People who speak a great deal as part of their work, such as teachers and announcers, often develop voice pathology in later life. Diagnosing these conditions is normally the work of a specialist ENT doctor. This research-oriented simulation project is intended to help a general physician identify voice pathology and refer the patient to a specialist. The work is implemented as a hybrid system built on two models. The first model detects whether a voice is normal or pathological. The second model classifies the type of disease present in the patient. The main objective of this research is to investigate the features commonly used for voice pathology detection and classification, and then to train the system with several machine learning techniques in search of better detection and classification rates. This paper focuses on developing an accurate Mel-frequency cepstral coefficient (MFCC) feature extraction technique for detecting and classifying voice pathologies using the German Saarbrücken Voice Database. The system is trained with four machine learning algorithms, Naïve Bayes, Random Forest, Support Vector Machine, and K-nearest neighbor, to determine which algorithm performs well in both models. The results are validated and compared with previous research work. An analysis of several different diseases and techniques is also provided.
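The two-model hybrid described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The MFCC routine is a simplified textbook recipe, and the frame length, hop size, filter count, the value of k, and the labels `disorder_a` and `disorder_b` are all assumptions made for illustration. Only K-nearest neighbor, one of the four classifiers named above, is shown, and hand-made toy vectors stand in for features extracted from real recordings.

```python
import numpy as np

def mfcc_mean(signal, sr, n_mfcc=13, n_fft=512, hop=256, n_mels=26):
    """Mean MFCC vector of a mono signal (simplified recipe, assumed parameters)."""
    # Split the signal into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2

    # Build a triangular mel filterbank between 0 Hz and sr / 2.
    hz2mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel2hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz2mel(0.0), hz2mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel2hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        for k in range(lo, mid):
            fbank[m - 1, k] = (k - lo) / max(mid - lo, 1)
        for k in range(mid, hi):
            fbank[m - 1, k] = (hi - k) / max(hi - mid, 1)

    # Log mel energies followed by a DCT-II give the cepstral coefficients.
    log_mel = np.log(power @ fbank.T + 1e-10)
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.arange(n_mfcc)[:, None] * (2 * n[None, :] + 1) / (2 * n_mels))
    return (log_mel @ dct.T).mean(axis=0)

def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k training vectors closest to x."""
    dist = np.linalg.norm(X_train - x, axis=1)
    nearest = y_train[np.argsort(dist)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]

def two_stage_predict(X, y_detect, y_type, x, k=3):
    """Model 1 detects pathology; model 2 names the disorder only if one is found."""
    if knn_predict(X, y_detect, x, k) == "healthy":
        return "healthy"
    mask = y_detect == "pathological"
    return knn_predict(X[mask], y_type[mask], x, k)

# Toy 2-D feature vectors (hypothetical; real inputs would be MFCC vectors).
X = np.array([[0.0, 0.0], [0.5, 0.2], [0.1, 0.6],
              [10.0, 0.0], [10.4, 0.3], [9.7, 0.5],
              [0.0, 10.0], [0.3, 10.2], [0.4, 9.6]])
y_detect = np.array(["healthy"] * 3 + ["pathological"] * 6)
y_type = np.array(["none"] * 3 + ["disorder_a"] * 3 + ["disorder_b"] * 3)

print(two_stage_predict(X, y_detect, y_type, np.array([9.5, 0.2])))

# Feature extraction on one second of synthetic noise at an assumed 16 kHz rate.
features = mfcc_mean(np.random.default_rng(0).standard_normal(16000), 16000)
print(features.shape)
```

Running the second model only on samples that the first model flags as pathological keeps the disease classifier from ever having to represent healthy voices, which is one plausible reading of the two-model design; the abstract itself does not specify how the two models are connected.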


Voice pathology; MFCC; KNN; RF; Naïve Bayes; SVM; SVD








EXPERT: Jurnal Manajemen Sistem Informasi dan Teknologi

Published by Pusat Studi Teknologi Informasi, Fakultas Ilmu Komputer, Universitas Bandar Lampung
Gedung M Lt.2 Pascasarjana Universitas Bandar Lampung
Jln Zainal Abidin Pagaralam No.89 Gedong Meneng, Rajabasa, Bandar Lampung,


This work is licensed under a Creative Commons Attribution 4.0 International License.