Chair of
Multimedia Communications and Signal Processing
Prof. Dr.-Ing. André Kaup

Nonlinear Acoustic Echo Cancellation

Field of activity: Audio and Acoustic Signal Processing
Research topic: Signal Improvement and Detection
Staff: Prof. Dr.-Ing. Walter Kellermann
Dipl.-Ing. Christian Hofmann

Acoustic echo cancellation (AEC) is a typical application of system identification. The problem of fed-back echoes usually occurs in hands-free communication scenarios, when there is strong coupling between the emitted loudspeaker sound (y) and the recorded microphone signal (d) (see Fig. 1). Owing to the slim and relatively exposed design of modern mobile devices, however, this kind of feedback is increasingly problematic even in their "normal" use. Since the typical round-trip delay of packet-switched communication networks is about 200 ms, the far-end speaker hears the echo as a clearly delayed copy of their own voice, which is very disturbing. To alleviate this problem and thus improve full-duplex communication, a digital echo canceller is used in parallel to the echo path. The task of this canceller is to approximate the acoustic transmission properties of the local room so as to generate an appropriate echo estimate (y') at the filter output, which can then be subtracted from the microphone signal (d).

[AEC]

Fig. 1: Nonlinear Acoustic Echo Cancellation (NLAEC) due to nonideal hardware components
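
The linear part of this scheme can be sketched in a few lines. The following is a minimal illustration, assuming a normalised LMS (NLMS) update, a short hypothetical echo path `echo_path`, white-noise input and no local speech or noise; the function name and all parameter values are chosen for illustration only and are not taken from the Chair's implementations.

```python
import random

def nlms_echo_canceller(x, d, num_taps=8, mu=0.5, eps=1e-8):
    """Adapt an FIR echo-path estimate; return the residual e = d - y' and the taps."""
    w = [0.0] * num_taps      # adaptive estimate of the echo path
    buf = [0.0] * num_taps    # most recent input samples, newest first
    e = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y_hat = sum(wi * bi for wi, bi in zip(w, buf))   # echo estimate y'
        err = dn - y_hat                                 # echo-compensated signal
        norm = sum(b * b for b in buf) + eps             # input power normalisation
        w = [wi + mu * err * bi / norm for wi, bi in zip(w, buf)]
        e.append(err)
    return e, w

# Hypothetical 3-tap echo path and white-noise far-end signal (assumptions).
random.seed(0)
echo_path = [0.6, -0.3, 0.1]
x = [random.gauss(0.0, 1.0) for _ in range(2000)]
d = [sum(echo_path[k] * x[n - k] for k in range(len(echo_path)) if n - k >= 0)
     for n in range(len(x))]

e, w = nlms_echo_canceller(x, d)
```

After adaptation, the first three entries of `w` approximate `echo_path` and the residual `e` decays towards zero, because the simulated echo path is purely linear and noise-free.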

Handling Nonlinearities in the Echo Path

The conventional signal model assumes a purely linear transmission path from the input (x) to the microphone signal (d). However, as communication devices become ever smaller and cheaper, AEC research has recently also focused on the influence of nonlinear hardware components. For instance, amplifiers operating close to saturation or miniaturised loudspeakers in mobile devices have to be considered as potential sources of nonlinearity. Thus, the overall input/output relation of the echo path from (x) to (d) also exhibits nonlinear distortions (see Fig. 1).

In such a case of nonlinear acoustic echo cancellation (NLAEC), the adaptive filter itself also needs to be designed as a corresponding nonlinear system in order to cancel the echo adequately. This is especially important because otherwise any nonlinear distortions present in the microphone signal can hardly be removed and, in addition, act as additional noise that hampers the performance of the conventional linear compensation. Appropriate filter structures for compensating the echoes are therefore given by cascades of saturation curves and linear filters (e.g. Wiener-Hammerstein models or power filters). For the sake of robustness, the blockwise linear and nonlinear components depicted in Fig. 1 are usually not realised in a cascade structure, but are modelled as an equivalent overall system in the form of a finite-order Volterra filter. Regardless of the filter structure used, additional components for detecting speech activity, estimating the present noise power, controlling the step size and detecting critical double-talk situations (i.e. if and when the local speaker is active) are required in order to obtain a fully functional, integrated system with robust behaviour.
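
To make the cascade idea concrete, the following sketch simulates a Hammerstein-type echo path: a memoryless saturation curve followed by a linear FIR filter. The `tanh` curve, the `gain` parameter and the function name are assumptions chosen for illustration; real amplifier and loudspeaker nonlinearities may look different.

```python
import math

def hammerstein_echo(x, fir, gain=2.0):
    """Memoryless saturation followed by a linear FIR filter (Hammerstein model)."""
    # Saturation stage: nearly linear for small inputs, limited to +/- 1/gain.
    s = [math.tanh(gain * v) / gain for v in x]
    # Linear stage: convolution with the (hypothetical) room impulse response.
    return [sum(fir[k] * s[n - k] for k in range(len(fir)) if n - k >= 0)
            for n in range(len(s))]

quiet = hammerstein_echo([0.001], [1.0])   # small input: passes almost unchanged
loud = hammerstein_echo([10.0], [1.0])     # large input: clipped near 1/gain
```

A purely linear adaptive filter can only track the second stage of such a cascade, which is why the distortion introduced by the first stage remains in the residual signal.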

Volterra Filters

Volterra filters are nonlinear models with memory that are well suited to characterising a broad class of such nonlinear distortions and are therefore a main focus of research. These models represent a very general and promising approach to the description of nonlinear systems, since they can be interpreted both as an extension of linear transversal filters towards higher-order convolutional products and as an incorporation of memory into the Taylor series approximation. As can be seen from Fig. 2, these filters consist of several Volterra kernels of different order operating in parallel, where each kernel of second or higher order filters products of input samples. Since this description always comprises a linear kernel (impulse response) as well, this filter type represents a generalisation of the conventional FIR filter. Hence, each linear filter can also be understood as a first-order Volterra filter. Fig. 3 visualises the linear (left) and the quadratic kernel (right) of a second-order Volterra filter (VF) obtained from measurements of a small loudspeaker.

[VF]

Fig. 2: Volterra filter of order P
 
[h1] [h2]

Fig. 3: Linear (left) and quadratic kernel (right) of a measured second-order VF
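
The input/output relation of a second-order VF can be written out directly. The sketch below assumes a linear kernel `h1` of length N1 and a square quadratic kernel `h2` of size N2 x N2, and computes y[n] = sum_k h1[k] x[n-k] + sum_k1 sum_k2 h2[k1][k2] x[n-k1] x[n-k2]. The function name and the example kernels are illustrative and do not correspond to the measured kernels in Fig. 3.

```python
def volterra2_output(x, h1, h2):
    """Output of a second-order Volterra filter with kernels h1 (1-D) and h2 (2-D)."""
    n1, n2 = len(h1), len(h2)
    y = []
    for n in range(len(x)):
        def past(k):
            # Input sample delayed by k; zero before the start of the signal.
            return x[n - k] if n - k >= 0 else 0.0
        linear = sum(h1[k] * past(k) for k in range(n1))
        quadratic = sum(h2[k1][k2] * past(k1) * past(k2)
                        for k1 in range(n2) for k2 in range(n2))
        y.append(linear + quadratic)
    return y

# Illustrative kernels: a 2-tap FIR part plus a single quadratic coefficient.
y = volterra2_output([1.0, 2.0], h1=[1.0, 0.5], h2=[[0.1, 0.0], [0.0, 0.0]])
```

Setting all entries of `h2` to zero reduces the function to an ordinary FIR filter, which mirrors the statement that a linear filter is a first-order Volterra filter.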


Since nonlinear system identification requires time-variant nonlinear filters, adaptive implementations of VFs have been thoroughly investigated and improved. However, up to now the application of these models has been restricted to low nonlinear orders, because the number of kernel coefficients increases exponentially with the order of the kernel. In addition, this high number of degrees of freedom leads to relatively slow convergence, which is slowed further by the mostly weak excitation of the nonlinear parts of the underlying system.
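
The growth in coefficients is easy to quantify. A kernel of order p with memory length N has N^p entries; if the kernel is stored in symmetric (triangular) form, the number of distinct coefficients drops to the binomial coefficient C(N + p - 1, p). The memory length of 256 taps used below is an assumed example value, not a figure from the measurements.

```python
from math import comb

def kernel_coefficients(memory, order):
    """Return (full, unique) coefficient counts of one Volterra kernel."""
    full = memory ** order                      # all index combinations
    unique = comb(memory + order - 1, order)    # symmetric kernel, stored once
    return full, unique

for p in (1, 2, 3):
    full, unique = kernel_coefficients(256, p)
    print(f"order {p}: {full} coefficients, {unique} unique")
```

For this assumed memory length, the quadratic kernel already has 65536 entries (32896 unique) and the cubic kernel 16777216 (2829056 unique), which illustrates why practical systems stay at low orders.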

In order to tackle these challenges, improved and efficient system descriptions in the time and frequency (DFT) domain have been developed over recent years, with adaptive algorithms based on extensions of LMS-type updates from the linear case. Moreover, several approaches for increasing the speed of convergence as well as for realising memory-efficient, self-configuring filter models have been proposed. Thus, for each application at hand, the specific advantages and shortcomings as well as the computational complexity can be traded off separately. The developed structures and algorithms are of general interest for arbitrary nonlinear systems. In the acoustic context, however, the obtained performance is evaluated with respect to the NLAEC scenario, as this can be seen as a prime example of nonlinear modelling due to its challenging constraints (strongly nonstationary inputs, a quickly time-variant system and relatively long system memory).
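
Because a Volterra filter is linear in its coefficients, an LMS-type update carries over from the linear case by stacking the input samples and their products into one regressor. The sketch below is one simple time-domain variant under stated assumptions: a jointly normalised (NLMS) update, a triangular quadratic kernel, a hypothetical noise-free second-order system and white-noise input. It is not the specific algorithm developed at the Chair, and all names and values are illustrative.

```python
import random

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def adapt_volterra2(x, d, mem=3, mu=0.5, eps=1e-8):
    """NLMS adaptation of a second-order Volterra filter (triangular quadratic kernel)."""
    pairs = [(k1, k2) for k1 in range(mem) for k2 in range(k1, mem)]
    h = [0.0] * (mem + len(pairs))   # stacked linear and quadratic coefficients
    buf = [0.0] * mem                # most recent input samples, newest first
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        # Regressor: delayed samples followed by their pairwise products.
        u = buf + [buf[k1] * buf[k2] for k1, k2 in pairs]
        err = dn - dot(h, u)
        norm = dot(u, u) + eps
        h = [hi + mu * err * ui / norm for hi, ui in zip(h, u)]
        errors.append(err)
    return errors, h[:mem], h[mem:]

# Hypothetical "unknown" system: 3-tap linear part plus two quadratic terms.
random.seed(1)
x = [random.gauss(0.0, 1.0) for _ in range(5000)]
h1_true = [0.5, -0.2, 0.1]

def past(n, k):
    return x[n - k] if n - k >= 0 else 0.0

d = [sum(h1_true[k] * past(n, k) for k in range(3))
     + 0.1 * past(n, 0) ** 2 + 0.05 * past(n, 0) * past(n, 1)
     for n in range(len(x))]

errors, h1, h2 = adapt_volterra2(x, d)
```

With the pair ordering (0,0), (0,1), (0,2), (1,1), (1,2), (2,2), the adapted `h2` approaches [0.1, 0.05, 0, 0, 0, 0] in this noise-free setting. With speech input and long kernels, convergence is considerably slower, as described above.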

Performance Comparison of Various Echo Cancellers

In order to briefly illustrate the difference in performance between adaptive linear and adaptive nonlinear filters, Fig. 4 shows the achievable echo cancellation in terms of the ERLE measure ("Echo Return Loss Enhancement") for various adaptive filters, with white noise and music/speech at the input (x). The "unknown" echo path has been modelled by a second-order Volterra model and exhibits a signal-to-noise ratio of 40 dB. These results demonstrate in particular the lack of modelling power of the linear adaptive filter, whose ERLE hardly exceeds the power ratio of linear to nonlinear signal components (approx. 10 dB here).

[ERLE]

Fig. 4: Echo cancellation performance for different adaptive filters and white noise (left) and music/speech input (right)
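
For reference, ERLE is commonly computed as the ratio of microphone signal power to residual power after cancellation, expressed in dB. The sketch below assumes a single long-term average over the whole signal, whereas curves over time such as those in Fig. 4 would typically use short-term power estimates; the function name and the toy signals are illustrative.

```python
import math

def erle_db(d, e, eps=1e-12):
    """Echo Return Loss Enhancement: 10*log10(power of d / power of e) in dB."""
    p_d = sum(v * v for v in d) / len(d)   # power of the microphone signal
    p_e = sum(v * v for v in e) / len(e)   # power of the residual echo
    return 10.0 * math.log10((p_d + eps) / (p_e + eps))

# Toy signals: the residual has one tenth of the echo amplitude, i.e. about 20 dB.
d = [1.0, -1.0, 1.0, -1.0]
e = [0.1, -0.1, 0.1, -0.1]
value = erle_db(d, e)
```

If a linear canceller removes the linear echo component perfectly but leaves the nonlinear component untouched, the residual power equals the nonlinear power, so the ERLE saturates near the linear-to-nonlinear power ratio.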

Funding

The LMS Chair gratefully acknowledges the funding by the Deutsche Forschungsgemeinschaft (DFG) related to nonlinear acoustic echo cancellation under project numbers KE 890/1-1 to KE 890/1-3. Current activities are still funded by means of the project KE 890/5-1.

Publications

2015-8 A. Schwarz, C. Hofmann, W. Kellermann
Nonlinear acoustic echo cancellation apparatus and method thereof (비선형 음향 에코 소거 장치 및 그 방법)
1020150012752, Feb. 2015

2014-51 A. Schwarz, W. Kellermann, Jae-Hoon Jeong
An audio signal processing system and method for removing an echo signal
CN103546839, Jan. 2014

2014-30 A. Schwarz, C. Hofmann, W. Kellermann
Combined Nonlinear Echo Cancellation and Residual Echo Suppression
ITG Conference on Speech Communication, Pages: 1-4, Erlangen, Germany, Sep. 2014

2014-22 R. Maas, C. Hümmer, A. Schwarz, C. Hofmann, W. Kellermann
A Bayesian Network View on Linear and Nonlinear Acoustic Echo Cancellation
IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP), Pages: 495-499, Xi'an, China, Jul. 2014

2014-14 C. Hofmann, C. Hümmer, W. Kellermann
Significance-Aware Hammerstein Group Models for Nonlinear Acoustic Echo Cancellation
IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), Pages: 5934-5938, Florence, Italy, May 2014

2014-11 C. Hümmer, C. Hofmann, R. Maas, A. Schwarz, W. Kellermann
The elitist particle filter based on evolutionary strategies as novel approach for nonlinear acoustic echo cancellation
IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), Pages: 1315-1319, Florence, Italy, May 2014

2014-8 Jae-Hoon Jeong, A. Schwarz, W. Kellermann
Audio signal processing system and Method for removing echo signal thereof (오디오 신호 처리 시스템 및 이의 에코 신호 제거 방법)
1020120074629, Jan. 2014

2014-7 Jae-Hoon Jeong, A. Schwarz, W. Kellermann
Audio signal processing system and echo signal removing method thereof
US20140010382 A1, Jan. 2014

2013-58 M. Zeller
Generalized Nonlinear System Identification using Adaptive Volterra Filters with Evolutionary Kernels
Dr. Hut Verlag, München, 2013

2013-50 A. Schwarz, C. Hofmann, W. Kellermann
Spectral Feature-based Nonlinear Residual Echo Suppression
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), Pages: 1-4, New Paltz, NY, USA, Oct. 2013

2007-70 F. Küch, W. Kellermann
Nonlinear residual echo suppression using a power filter model of the acoustic echo path
IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), Vol. I, Editor(s): IEEE, Pages: I73-I76, Honolulu, Hawaii, Apr. 2007