Frequency-Sliding Generalized Cross-Correlation: A Sub-band Time Delay Estimation Approach
The generalized cross correlation (GCC) is regarded as the most popular approach for estimating the time difference of arrival (TDOA) between the signals received at two sensors. Time delay estimates are obtained by maximizing the GCC output, where the direct-path delay is usually observed as a prominent peak. Moreover, GCCs play also an important role in steered response power (SRP) localization algorithms, where the SRP functional can be written as an accumulation of the GCCs computed from multiple sensor pairs. Unfortunately, the accuracy of TDOA estimates is affected by multiple factors, including noise, reverberation and signal bandwidth. In this paper, a sub-band approach for time delay estimation aimed at improving the performance of the conventional GCC is presented. The proposed method is based on the extraction of multiple GCCs corresponding to different frequency bands of the cross-power spectrum phase in a sliding-window fashion. The major contributions of this paper include: 1) a sub-band GCC representation of the cross-power spectrum phase that, despite having a reduced temporal resolution, provides a more suitable representation for estimating the true TDOA; 2) such matrix representation is shown to be rank one in the ideal noiseless case, a property that is exploited in more adverse scenarios to obtain a more robust and accurate GCC; 3) we propose a set of low-rank approximation alternatives for processing the sub-band GCC matrix, leading to better TDOA estimates and source localization performance. An extensive set of experiments is presented to demonstrate the validity of the proposed approach.
- Pub Date:
- October 2019
- Electrical Engineering and Systems Science - Audio and Speech Processing
- Article accepted in IEEE/ACM Transactions on Audio, Speech, and Language Processing