Multidirectional Regression (MDR)-Based Features for Automatic Voice Disorder Detection
Ghulam M, Mesallam T, Malki K, Farahat M, Mahmood A, Alsulaiman M. 2012
BACKGROUND AND OBJECTIVE:
Interest in the objective assessment of voice pathology is growing. Automatic speech/speaker recognition (ASR) systems are commonly deployed in voice pathology detection. The aim of this work was to develop a novel feature extraction method for ASR that incorporates the distributions of voiced and unvoiced parts, together with voice onset and offset characteristics, in the time-frequency domain to detect voice pathology.
MATERIALS AND METHODS:
Speech samples from 70 dysphonic patients with six different types of voice disorders and from 50 normal subjects were analyzed. The Arabic spoken digits (1-10) served as the input. The proposed feature extraction method was embedded into an ASR system with a Gaussian mixture model (GMM) classifier to detect voice disorders.
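To illustrate the classification stage only, the following is a minimal sketch of two-class GMM detection: one mixture is fitted to frames from normal speakers, one to frames from disordered speakers, and a test utterance is assigned to the model with the higher average log-likelihood. It does not reproduce the MDR features or the authors' system. The synthetic 2-D frames, the number of mixture components, the variance floor, and all function names (`fit_gmm`, `avg_loglik`, `detect`) are assumptions made for this example.

```python
import math
import random

def log_gauss_diag(x, mean, var):
    """Log density of x under a diagonal-covariance Gaussian."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def logsumexp(vals):
    top = max(vals)
    return top + math.log(sum(math.exp(v - top) for v in vals))

def component_logs(x, gmm):
    """Log of (weight * density) for each mixture component."""
    weights, means, variances = gmm
    return [math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]

def fit_gmm(data, k, iters=30, seed=0, var_floor=1e-3):
    """Fit a diagonal-covariance GMM with plain EM (illustrative, not optimized)."""
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    means = [list(x) for x in rng.sample(data, k)]  # random frames as initial means
    gmean = [sum(x[d] for x in data) / n for d in range(dim)]
    gvar = [max(sum((x[d] - gmean[d]) ** 2 for x in data) / n, var_floor)
            for d in range(dim)]
    variances = [list(gvar) for _ in range(k)]      # global variance as initial spread
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each frame
        resp = []
        for x in data:
            logs = component_logs(x, (weights, means, variances))
            norm = logsumexp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weights, means, and floored variances
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-10
            weights[j] = nj / n
            means[j] = [sum(r[j] * x[d] for r, x in zip(resp, data)) / nj
                        for d in range(dim)]
            variances[j] = [max(sum(r[j] * (x[d] - means[j][d]) ** 2
                                    for r, x in zip(resp, data)) / nj, var_floor)
                            for d in range(dim)]
    return weights, means, variances

def avg_loglik(frames, gmm):
    """Average per-frame log-likelihood of an utterance under one GMM."""
    return sum(logsumexp(component_logs(x, gmm)) for x in frames) / len(frames)

def detect(frames, gmm_normal, gmm_disordered):
    """Pick the class whose model explains the utterance better."""
    if avg_loglik(frames, gmm_disordered) > avg_loglik(frames, gmm_normal):
        return "disordered"
    return "normal"

def make_frames(rng, center, n):
    """Synthetic 2-D feature frames (stand-in for real features; an assumption)."""
    return [[rng.gauss(center, 1.0), rng.gauss(center, 1.0)] for _ in range(n)]

rng = random.Random(42)
gmm_normal = fit_gmm(make_frames(rng, 0.0, 200), k=2)
gmm_disordered = fit_gmm(make_frames(rng, 3.0, 200), k=2)
test_normal = make_frames(rng, 0.0, 50)
test_disordered = make_frames(rng, 3.0, 50)
print(detect(test_normal, gmm_normal, gmm_disordered))
print(detect(test_disordered, gmm_normal, gmm_disordered))
```

In a real system the synthetic frames would be replaced by per-frame feature vectors extracted from the recorded digits, and the number of mixture components would be tuned on held-out data.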
RESULTS:
An accuracy of 97.48% was obtained in the text-independent case (training on all digits), and an accuracy of over 99% was obtained in the text-dependent case (training on each digit separately). The proposed method outperformed conventional Mel frequency cepstral coefficient (MFCC) features.
CONCLUSION:
The results of this study show that incorporating voice onset and offset information leads to efficient automatic voice disorder detection.
Copyright © 2012 The Voice Foundation. Published by Mosby, Inc. All rights reserved.