Enhancement of CASA-Based Source Separation System in Farsi
Abstract:
In this paper, new systems are proposed to enhance the performance of a binaural source separation system called MESSL. In this separation system, Gaussian models of the interaural phase difference (IPD) and interaural level difference (ILD) parameters are first estimated with the EM algorithm. Then, using the model generated for each source, a soft mask is extracted and multiplied with the short-time Fourier transform (STFT) of the mixture signal to separate the target signal. Because the separation is imperfect, two post-processing systems are proposed to remove the unwanted signals from the target signal. The first proposed method is adaptive noise cancellation using learning-based particle swarm optimization (LPSO). The second proposed post-processing system has two stages. In the first stage, wavelet denoising is employed to remove the main part of the distractor signal. In the second stage, the minimum mean-square-error (MMSE) approach is used to further enhance the quality of the separated target signal. Evaluation and comparison of the proposed systems on a Farsi database show that the second proposed system performs well in enhancing the separated target speech and is also computationally efficient.
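The soft-masking step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the Gaussian IPD parameters are already known (the EM fitting is omitted), it ignores the ILD model and both post-processing stages, and the function names, parameter values, and random test data are all hypothetical.

```python
import numpy as np


def wrap_phase(x):
    """Wrap phase values to the interval (-pi, pi]."""
    return np.angle(np.exp(1j * x))


def soft_masks(ipd, means, stds, priors):
    """Posterior probability of each source at every time-frequency point.

    Assumption: one Gaussian per source over the IPD only, with given
    (not EM-estimated) means, standard deviations, and priors.
    """
    # Log of prior times Gaussian likelihood; the shared sqrt(2*pi) term cancels.
    log_post = np.stack([
        np.log(p) - np.log(s) - 0.5 * (wrap_phase(ipd - m) / s) ** 2
        for m, s, p in zip(means, stds, priors)
    ])
    # Normalize in the log domain for numerical stability.
    log_post -= log_post.max(axis=0, keepdims=True)
    post = np.exp(log_post)
    return post / post.sum(axis=0, keepdims=True)


def apply_mask(mixture_stft, mask):
    """Multiply the soft mask with the mixture STFT, point by point."""
    return mask * mixture_stft


# Hypothetical data: random complex STFTs for the left and right channels
# (4 frequency bins, 5 frames) standing in for a real binaural mixture.
rng = np.random.default_rng(0)
left = rng.standard_normal((4, 5)) + 1j * rng.standard_normal((4, 5))
right = rng.standard_normal((4, 5)) + 1j * rng.standard_normal((4, 5))

ipd = np.angle(left * np.conj(right))  # interaural phase difference
masks = soft_masks(ipd, means=[-0.5, 0.5], stds=[0.3, 0.3], priors=[0.5, 0.5])
target_stft = apply_mask(left, masks[0])  # estimate of the first source
```

Because each mask value lies between 0 and 1 and the masks sum to 1 across sources at every time-frequency point, the masked STFT never exceeds the mixture in magnitude. Residual interference left by such a mask is what the post-processing systems in the abstract are meant to reduce.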
Language:
Persian
Published:
Journal of Electrical Engineering, Volume 46, Issue 4, 2016
Pages:
273 to 283
https://www.magiran.com/p1598842