Performance of time- and frequency-domain binaural beamformers based on recorded signals from real rooms
Extraction of a target sound source amidst multiple interfering sound sources is difficult when there are fewer sensors than sources, as is the case for human listeners in the classic cocktail-party situation. This study compares the signal extraction performance of five algorithms using recordings of speech sources made with three different two-microphone arrays in three rooms with different reverberation times. Test signals, consisting of two to five speech sources, were constructed for each room and array. The signals were processed with each algorithm, and the signal extraction performance was quantified by calculating the signal-to-noise ratio of the output. A frequency-domain minimum-variance distortionless-response beamformer outperformed the time-domain Frost beamformer and generalized sidelobe canceler for all tests with two or more interfering sound sources, and performed comparably to or better than the time-domain algorithms for tests with one interfering sound source. The frequency-domain minimum-variance algorithm offered performance comparable to that of the Peissig-Kollmeier binaural frequency-domain algorithm, but with much less distortion of the target signal. Comparisons were also made with a simple beamformer. In addition, computer simulations illustrate that, when processing speech signals, the chosen implementation of the frequency-domain minimum-variance technique adapts more quickly and accurately than the time-domain techniques.
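As a rough illustration of the minimum-variance distortionless-response idea named above, the following sketch computes weights for a single frequency bin of a two-microphone array that minimize output power while keeping unit gain toward the target. It is not the implementation evaluated in the study. The function names, the diagonal loading term, the free-field steering vectors, and the power values in the usage lines are all assumptions made for illustration.

```python
import cmath
import math


def mvdr_weights_2mic(R, d, loading=1e-6):
    """MVDR weights w = R^-1 d / (d^H R^-1 d) for one frequency bin.

    R: 2x2 Hermitian covariance matrix as nested lists (hypothetical input).
    d: length-2 steering vector toward the target.
    loading: small diagonal loading for numerical stability (an assumption,
             not taken from the study).
    """
    a = R[0][0] + loading
    b = R[0][1]
    c = R[1][0]
    e = R[1][1] + loading
    det = a * e - b * c
    # u = R^-1 d, using the closed-form inverse of a 2x2 matrix.
    u0 = (e * d[0] - b * d[1]) / det
    u1 = (-c * d[0] + a * d[1]) / det
    # Normalize so that the response toward the target is exactly 1.
    denom = d[0].conjugate() * u0 + d[1].conjugate() * u1
    return [u0 / denom, u1 / denom]


def response(w, v):
    """Beamformer response w^H v to a source with steering vector v."""
    return w[0].conjugate() * v[0] + w[1].conjugate() * v[1]


def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels."""
    return 10.0 * math.log10(signal_power / noise_power)


# Illustrative scenario (all values assumed): target straight ahead,
# one interferer arriving with a 90-degree inter-microphone phase lag.
d = [1 + 0j, 1 + 0j]
v = [1 + 0j, cmath.exp(-1j * math.pi / 2)]
interferer_power, sensor_noise_power = 1.0, 0.01

# Covariance of interferer plus uncorrelated sensor noise in this bin.
R = [[interferer_power * v[i] * v[j].conjugate()
      + (sensor_noise_power if i == j else 0.0)
      for j in range(2)] for i in range(2)]

w = mvdr_weights_2mic(R, d)
target_gain = abs(response(w, d))      # constrained to 1
interferer_gain = abs(response(w, v))  # strongly attenuated
```

With two microphones only one spatial null is available per frequency bin, which is one reason a per-bin (frequency-domain) formulation can help when several speech interferers dominate different bins at different times.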