|Field of activity:||Audio and Acoustic Signal Processing|
|Research topic:||Acoustic Scene Analysis|
|Staff:||Prof. Dr.-Ing. Walter Kellermann
PhD Shmulik Markovich-Golan
Acoustic source localization aims at extracting the position of one or several sound sources from signals captured by a number of spatially distinct microphones. By exploiting the spatial diversity offered by an array of several microphones, acoustic source localization techniques make it possible to estimate the position of one or several sound sources in a two-dimensional plane or in a three-dimensional space without any prior knowledge about the observed acoustical scene. Accurate localization of one or several sound sources can serve in many applications as a preliminary step to other processes such as steering a beamformer or pointing a camera in the direction of a sound source. A wide variety of algorithms exist, each addressing different acoustical scenarios depending on the nature of the source (broadband or narrowband, stationary or non-stationary, etc.), the room reverberation, or the amount of background noise. Figure 1 provides an overview of existing approaches. We can identify two different strategies: the direct approach, and approaches based on Time Differences Of Arrival (TDOA).
At the LMS, the influence of temperature on acoustic source localization has also been investigated, taking into account changes in the speed of sound caused by varying room temperatures.
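To give a feeling for the size of this effect, the following minimal sketch uses the common ideal-gas approximation for the speed of sound in dry air. The formula, the function names, and the example values are illustrative assumptions and are not taken from the LMS publications.

```python
import math

def speed_of_sound(temp_celsius: float) -> float:
    """Approximate speed of sound in dry air (m/s), ideal-gas approximation."""
    return 331.3 * math.sqrt(1.0 + temp_celsius / 273.15)

def range_difference(tdoa_s: float, temp_celsius: float) -> float:
    """Path-length difference (m) implied by a time delay at a given temperature."""
    return speed_of_sound(temp_celsius) * tdoa_s

if __name__ == "__main__":
    tdoa = 0.5e-3  # hypothetical measured delay of 0.5 ms
    for temp in (15.0, 30.0):
        print(f"{temp:4.1f} degC: c = {speed_of_sound(temp):6.2f} m/s, "
              f"range difference = {range_difference(tdoa, temp) * 100:5.2f} cm")
```

The same measured delay therefore maps to a slightly different path-length difference when the room warms up, which is why an assumed, fixed speed of sound can bias the position estimate.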
In the direct approach, the position of the active sound source(s) is characterized by an acoustical energy map of the search space. Depending on the localization task, the search space can be a discrete set of grid points in a plane or in a 3D space. It can also be a discrete set of directions, when the range from the sources to the sensors is disregarded. The latter case is usually referred to as the far-field search (i.e., for sources located far away from the sensors), in contrast to the near-field search. Figure 2 depicts two exemplary search grids.
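The two kinds of search space can be sketched as follows, assuming a planar (2D) scenario. The function names, grid extent, and resolution are hypothetical choices made for illustration only.

```python
import numpy as np

def near_field_grid(x_range, y_range, step):
    """Discrete set of candidate positions (x, y) in a plane, in metres."""
    xs = np.arange(x_range[0], x_range[1] + step / 2, step)
    ys = np.arange(y_range[0], y_range[1] + step / 2, step)
    gx, gy = np.meshgrid(xs, ys, indexing="xy")
    return np.column_stack([gx.ravel(), gy.ravel()])

def far_field_directions(num_directions):
    """Discrete set of candidate directions as unit vectors; range is disregarded."""
    azimuth = np.linspace(0.0, 2.0 * np.pi, num_directions, endpoint=False)
    return np.column_stack([np.cos(azimuth), np.sin(azimuth)])

if __name__ == "__main__":
    grid = near_field_grid((0.0, 4.0), (0.0, 3.0), step=0.5)
    directions = far_field_directions(72)  # 5 degree azimuth resolution
    print(grid.shape, directions.shape)
```

A far-field search only scans directions, so its cost grows with the angular resolution, whereas a near-field search grows with the number of grid points in the area or volume.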
Computed directly from the observed sensor signals, the energy map reflects the activity of the source(s) in the search space. The position of the active source(s) can then be estimated by identifying the local extrema in the energy map, as depicted in Fig. 3.
Fig. 3: Energy map reflecting the source position in the near field (left) and in the far field (right).
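As one possible way to compute such a map, the sketch below uses a plain delay-and-sum steered response: for every candidate position, the microphone signals are time-aligned according to the assumed propagation delays, summed, and the output energy is recorded. The choice of delay-and-sum, the function names, and the free-field model without attenuation are assumptions made for illustration; this is not necessarily the map used in the work described here.

```python
import numpy as np

def near_field_delays(grid, mic_positions, c=343.0):
    """Propagation delay (s) from every candidate point to every microphone."""
    # diff has shape (num_points, num_mics, num_dims)
    diff = grid[:, None, :] - mic_positions[None, :, :]
    return np.linalg.norm(diff, axis=2) / c

def energy_map(signals, delays, fs):
    """Delay-and-sum output energy for each candidate position.

    signals: (num_mics, num_samples) microphone signals
    delays:  (num_points, num_mics) candidate propagation delays in seconds
    """
    num_samples = signals.shape[1]
    spectra = np.fft.rfft(signals, axis=1)            # (num_mics, num_bins)
    freqs = np.fft.rfftfreq(num_samples, d=1.0 / fs)  # (num_bins,)
    # Phase factors that undo the assumed delay at each microphone.
    steering = np.exp(2j * np.pi * freqs[None, None, :] * delays[:, :, None])
    summed = (steering * spectra[None, :, :]).sum(axis=1)  # (num_points, num_bins)
    return (np.abs(summed) ** 2).sum(axis=1)

def locate(signals, grid, mic_positions, fs, c=343.0):
    """Return the grid point with maximum energy, together with the full map."""
    emap = energy_map(signals, near_field_delays(grid, mic_positions, c), fs)
    return grid[np.argmax(emap)], emap
```

For a far-field search, the same `energy_map` function could be reused with delays derived from candidate directions instead of candidate points. Only the rule that turns the sensor signals into a map value changes from one direct method to another.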
Therefore, localization strategies following the direct approach differ only in the way the acoustical map is computed. Among the existing methods, we can identify three categories of algorithms:
Approaches based on Time Differences Of Arrival (TDOA) rely on a two-step procedure. In a first step, one or several time delays between different pairs of microphones (i.e., the TDOAs) are estimated. Figure 4 shows the locus of potential positions corresponding to a given TDOA. The microphone pair is depicted by the two black balls. In general, the locus of potential positions corresponds to one half of a hyperboloid of two sheets (see the left surface in Fig. 4), with the sensor positions as foci. The asymptotes of the hyperboloid are shown by red dashed lines in the figure. Using a set of TDOA estimates computed from different sensor pairs, the position of the sources can be calculated in a second step as the intersection of the different hyperboloids. Assuming a source located far away from the sensors (i.e., in the far field), the hyperboloid can be approximated by a cone (see the right surface in Fig. 4). This reduces the dimensionality of the problem, since only the Direction-Of-Arrival (DOA) needs to be taken into account and the range coordinate is disregarded.
Fig. 4: Hyperboloid of potential positions for a given TDOA (left) and its far-field approximation by a cone (right).
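The geometry can be illustrated with a small sketch for a single microphone pair. The TDOA is estimated here from the peak of a plain cross-correlation, which is only the simplest possible estimator and is chosen as an assumption for illustration. The function names and the sign convention (TDOA = arrival time at microphone 2 minus arrival time at microphone 1) are likewise hypothetical.

```python
import numpy as np

def estimate_tdoa(x1, x2, fs):
    """TDOA in seconds from the cross-correlation peak (positive if x2 lags x1)."""
    corr = np.correlate(x2, x1, mode="full")
    lags = np.arange(-(len(x1) - 1), len(x2))  # lag in samples for each corr entry
    return lags[np.argmax(corr)] / fs

def tdoa_of_point(point, mic1, mic2, c=343.0):
    """TDOA a source at `point` would produce; constant on one hyperboloid sheet."""
    return (np.linalg.norm(point - mic2) - np.linalg.norm(point - mic1)) / c

def far_field_doa_deg(tdoa, mic_distance, c=343.0):
    """Far-field cone angle (degrees) measured from the axis mic2 -> mic1."""
    cos_theta = np.clip(c * tdoa / mic_distance, -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_theta)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fs = 16000.0
    x1 = rng.standard_normal(2000)
    x2 = np.concatenate([np.zeros(5), x1[:-5]])  # x2 is x1 delayed by 5 samples
    tdoa = estimate_tdoa(x1, x2, fs)
    print(tdoa, far_field_doa_deg(tdoa, mic_distance=0.2))
```

All points for which `tdoa_of_point` returns the same value lie on one sheet of the hyperboloid. For distant sources this value approaches `mic_distance * cos(theta) / c`, which is the cone described by `far_field_doa_deg`.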
Most of the direct methods listed above can be reformulated for the extraction of TDOAs by considering only a single pair of sensors. Other TDOA estimation techniques can be classified into two categories:
||P. Annibale, R. Rabenstein
Closed-Form Estimation of the Speed of Propagating Waves from Time Measurements
Springer Journal on Multidimensional Systems and Signal Processing (MDSSP) Vol. 25, Num. 2, Pages: 361-378, 2014
||K. Kowalczyk, E.A.P. Habets, W. Kellermann, P.A. Naylor
Blind system identification using sparse learning for TDOA estimation of room reflections
IEEE Signal Processing Letters (IEEE SPL) Vol. 20, Online Publication, Num. 7, Pages: 653-656, 2013
||P. Annibale, J. Filos, P. A. Naylor, R. Rabenstein
TDOA-based Speed of Sound Estimation for Air Temperature and Room Geometry Inference
IEEE Transactions on Audio, Speech and Language Processing (IEEE TASLP) Vol. 21, Num. 2, Pages: 234-246, Feb. 2013
||H. Sun, W. Kellermann, E. Mabande, K. Kowalczyk
Localization of distinct reflections in rooms using spherical microphone array eigenbeam processing
J. Acoust. Soc. Am. (JASA) Vol. 131, Num. 4, Pages: 2828-2840, Apr. 2012