Khamis A. Al-Karawi*
Speaker recognition systems developed in laboratories with clean speech samples can reach high performance when tested under the same controlled conditions, making them a potentially useful tool for critical person-identification applications. However, non-stationary environmental noise and reverberation, which are inevitably present in real-world speech samples, often compromise the reliability of recognition. The robustness of these systems is therefore crucial for applications such as security and forensics. To improve the performance of speaker verification systems, an effective and robust feature extraction technique is proposed that can operate in both clean and noisy conditions. This paper investigates the performance of gammatone frequency cepstral coefficient (GFCC) features and conventional mel-frequency cepstral coefficient (MFCC) features in noisy and clean conditions. The effects of the signal-to-noise ratio (SNR) and of language mismatch on system performance are also taken into account. Experimental results show a significant improvement in system performance in terms of reduced equal error rate (EER) and detection error trade-off (DET) curves. Performance in terms of recognition rates under various types of noise and various SNRs is quantified via simulation. Results from the study are presented and discussed.
MFCC, GFCC, speaker recognition, noise, robustness.
Acoustic Department, Salford University