Research Grants

A) Grant # 1
Supported by: King Abdulaziz City for Science and Technology (KACST)
Title: Arabic Alphadigits Recognition System Using HMM and SAAVB Corpus
Period of Project: 12 months.
Grant Code: 28-157
Summary:
Spoken alphadigit recognition is needed in many applications that take numbers and letters as input, such as voice telephone dialing, address entry, airline reservation, and automatic directory services for retrieving or sending information. This research investigated Arabic alphadigits from the speech recognition point of view. The Hidden Markov Model Toolkit (HTK) was used to implement an isolated-word recognizer with phoneme-based HMM models; the system operated as an isolated whole-word speech recognizer. For the training and testing phases, isolated digit data sets were taken from the Arabic telephony speech corpus SAAVB. This standard corpus was developed by KACST and is classified as a noisy speech database collected over fixed and mobile telephones. SAAVB consists of many subsets covering all digits, alphabet letters, read scripts, spontaneous speech, etc.
The recognition system achieved 92.89% overall correct digit recognition using the mixed training and testing subsets collectively. By contrast, it achieved 64.06% overall correct recognition of alphabet letters using the mixed training and testing subsets collectively. Arabic letters and digits were also considered together in one experiment covering all six SAAVB partitions; with 29 letters and ten digits, the system achieved a correct recognition rate of 76.06%. An evaluation of SAAVB was also carried out in this research and included in the final report, along with some recommendations for improving this important corpus.
 
B) Grant # 2
Supported by: King Abdulaziz City for Science and Technology (KACST)
Title: Rhythm in Arabic Speech and Language (RASL): A New Research Initiative
Period of Project: 24 months.
Grant Code: 10INF1325-02
Summary:
This project aimed to establish a new research paradigm in Arabic speech and language processing by giving the temporal organization of utterances and rhythm-based features a new role in various study domains and applications. Although timing and rhythm play a crucial role in the syntax and semantics of Arabic, current acoustic analyzers of speech do not take these aspects into account when speech processing systems are built. Recent work has developed a number of metrics that quantify rhythm in languages. These rhythm metrics are based on acoustic measures and can be calculated in both raw and rate-normalized forms. Acoustic correlates of rhythm class in the speech signal have been proposed, and these correlates have provided new insights into how speech timing functions both across and within languages. The main goal was to examine whether the rhythm metrics are sufficiently relevant to quantify and capture duration variability, in order to better characterize the durational differences observed among Arabic speakers with respect to social factors such as age, native origin, and regional accent. These social factors play a role in structuring linguistic variation in Arabic. This permits a fuller account of potential differences between speakers that might be related to between-speaker variability in rhythm scores. Accordingly, a new and effective speech-enabled interface incorporating Arabic speech recognition systems adapted with rhythm metrics was developed and reported.
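As an illustration of what "raw" and "rate-normalized" rhythm metrics can look like, the following is a minimal Python sketch. The summary does not name the specific metrics used in the project; the ones shown here (interval standard deviation, Varco, and raw and normalized Pairwise Variability Index) are commonly used in the rhythm literature and are assumed for illustration only. The function names and the sample durations are hypothetical.

```python
from statistics import mean, pstdev


def delta(durations):
    """Raw metric: standard deviation of interval durations (in seconds)."""
    return pstdev(durations)


def varco(durations):
    """Rate-normalized metric: standard deviation as a percentage of the mean."""
    return 100.0 * pstdev(durations) / mean(durations)


def rpvi(durations):
    """Raw PVI: mean absolute difference between successive intervals."""
    pairs = zip(durations, durations[1:])
    return mean(abs(a - b) for a, b in pairs)


def npvi(durations):
    """Normalized PVI: each difference is divided by the local mean duration."""
    pairs = zip(durations, durations[1:])
    return 100.0 * mean(abs(a - b) / ((a + b) / 2.0) for a, b in pairs)


# Hypothetical vocalic interval durations (seconds) for one utterance.
vocalic = [0.10, 0.20, 0.10, 0.20]
# The same utterance spoken at half the rate (all durations doubled).
vocalic_slow = [2.0 * d for d in vocalic]

print(rpvi(vocalic), rpvi(vocalic_slow))  # raw metric changes with speech rate
print(npvi(vocalic), npvi(vocalic_slow))  # normalized metric does not
```

Under these definitions, the raw metrics scale with speaking rate while the normalized ones are unchanged when every interval is stretched by the same factor, which is the reason both forms are usually reported.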
 
C) Grant # 3
Supported by: King Abdulaziz City for Science and Technology (KACST)
Title: Arabic Distinctive Phonetic Features (ADPF) and their Correlates: A Quantitative Approach
Period of Project: 24 months.
Grant Code: 11INF1968_02
Summary:
The purpose of this project is to establish a new approach to Modern Standard Arabic (MSA) speech and language processing by specifying, analyzing, and verifying Distinctive Phonetic Features (DPFs). Compared with other languages, there has been limited work and few developments regarding Arabic DPFs and their realization in modern digital speech processing. The project examines how effectively DPF-based speech representations capture the variable combinations of Arabic distinctive articulatory features observed in MSA. Although phonetic features play a crucial role in discriminating phonetic units, current acoustic analyzers of speech do not take these aspects into account when speech processing systems are built.
A new method based on genetic algorithms is investigated to find the optimal weight for each type of feature used in the front-end. This evolutionary approach is compared with empirical and discriminative techniques. To extend the project outcomes to other Arabic linguistic resources, a novel automatic labeling tool built on a parallel computation structure is proposed to meet the need for labeled Arabic corpora, which are critically lacking. This work is the first attempt at Arabic language and speech processing based on a multidisciplinary approach that uses knowledge acquired from both quantitative and qualitative analyses of DPFs, APs and AFs. The research will contribute to advancing knowledge while delivering high added-value systems in the field of Arabic spoken interactive systems. It also helps meet the needs of real-life applications for reliable Arabic spoken interfaces, which are currently severely lacking.
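The genetic-algorithm search for front-end feature weights described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual method: the summary gives no details of the encoding or operators, so the truncation selection, uniform crossover, Gaussian mutation, and elitism used here are all assumptions. The fitness function is a hypothetical stand-in; in a real system it would be something like recognition accuracy on a development set.

```python
import random


def normalize(weights):
    """Scale weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def evolve(fitness, n_weights, pop_size=30, generations=40,
           mutation_sd=0.1, seed=0):
    """Search for a weight vector that maximizes `fitness` (assumed design)."""
    rng = random.Random(seed)
    population = [
        normalize([rng.random() + 1e-9 for _ in range(n_weights)])
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]  # truncation selection
        children = [ranked[0]]             # elitism: keep the best unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: take each weight from one parent at random.
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            # Gaussian mutation, clipped so weights stay positive.
            child = [max(1e-9, w + rng.gauss(0.0, mutation_sd)) for w in child]
            children.append(normalize(child))
        population = children
    return max(population, key=fitness)


# Hypothetical stand-in fitness: closeness to an arbitrary target weighting
# of three feature streams. A real fitness would score a recognizer.
TARGET = [0.5, 0.3, 0.2]


def toy_fitness(weights):
    return -sum((w - t) ** 2 for w, t in zip(weights, TARGET))


best = evolve(toy_fitness, n_weights=3)
print(best)
```

Because the best individual is carried over unchanged in every generation, the best fitness never decreases from one generation to the next. The empirical and discriminative techniques mentioned above would be evaluated with the same fitness measure for comparison.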