Chair of Multimedia Communications and Signal Processing
Prof. Dr.-Ing. André Kaup

Embodied Audition for RobotS (EARS)

Field of Activity: Audio Signal Processing
Staff: Prof. Dr.-Ing. Walter Kellermann
Dr.-Ing. Heinrich Löllmann
M.Sc. Hendrik Barfuss


EARS will explore new algorithms for enhancing the auditory capabilities of humanoid robots.

A main focus is to develop the fundamentals for a natural spoken dialogue between humans and robots in adverse acoustical environments.

The success of future natural intuitive human-robot interaction (HRI) will critically depend on how responsive the robot will be to all forms of human expressions and how well it will be aware of its environment. With acoustic signals distinctively characterizing physical environments and speech being the most effective means of communication among humans, truly humanoid robots must be able to fully extract the rich auditory information from their environment and to use voice communication as much as humans do. While vision-based HRI is well developed, current limitations in robot audition do not allow for such an effective, natural acoustic human-robot communication in real-world environments, mainly because of the severe degradation of the desired acoustic signals due to noise, interference and reverberation when captured by the robot’s microphones.

To overcome these limitations, the project Embodied Audition for RobotS (EARS) will provide intelligent ‘ears’ with close-to-human auditory capabilities and use them for HRI in complex real-world environments. Novel microphone arrays and powerful signal processing algorithms shall be able to localize and track multiple sound sources of interest and to extract and recognize the desired signals.
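As a minimal illustration of the kind of array processing involved (not EARS project code), the sketch below implements a basic delay-and-sum beamformer: each microphone channel is time-aligned toward an assumed source direction via a linear phase shift in the frequency domain, and the aligned channels are averaged. All geometry and signals here are synthetic assumptions for demonstration only.

```python
import numpy as np

def delay_and_sum(signals, mic_positions, doa, fs, c=343.0):
    """Steer a microphone array toward direction `doa` (unit vector
    pointing from the array toward the source) by compensating each
    channel's plane-wave delay in the frequency domain, then averaging.

    signals: (num_mics, num_samples) array of microphone signals
    mic_positions: (num_mics, 3) mic coordinates in meters
    fs: sampling rate in Hz; c: speed of sound in m/s
    """
    num_mics, num_samples = signals.shape
    # Time advance of each mic relative to the array origin for a
    # plane wave arriving from direction `doa` (seconds).
    delays = mic_positions @ doa / c
    freqs = np.fft.rfftfreq(num_samples, d=1.0 / fs)
    spectra = np.fft.rfft(signals, axis=1)
    # Undo each channel's advance with a linear phase term, so all
    # channels line up with the signal at the array origin.
    aligned = spectra * np.exp(-2j * np.pi * freqs[None, :] * delays[:, None])
    return np.fft.irfft(aligned.mean(axis=0), n=num_samples)

# Synthetic example: 4-mic linear array (3 cm spacing), 500 Hz tone
# arriving as a plane wave from 45 degrees in the x-y plane.
fs, c = 16000, 343.0
mics = np.stack([np.array([0.03 * m, 0.0, 0.0]) for m in range(4)])
doa = np.array([np.cos(np.pi / 4), np.sin(np.pi / 4), 0.0])
t = np.arange(1024) / fs
source = np.sin(2 * np.pi * 500 * t)
# Each mic observes the source advanced according to its position.
obs = np.stack([np.interp(t + mics[m] @ doa / c, t, source) for m in range(4)])
out = delay_and_sum(obs, mics, doa, fs)
```

After steering, the averaged output closely reproduces the source waveform for the assumed direction, while signals from other directions would add incoherently and be attenuated; the project's HRTF-based robust least-squares designs (see the publications below) go far beyond this simple scheme.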

After fusion with robot vision, embodied robot cognition will then derive HRI actions and knowledge of the entire scenario, and feed this back to the acoustic interface for further auditory scene analysis. As a prototypical application, EARS will consider a welcoming robot in a hotel lobby, a scenario offering all of the above challenges. Representing a large class of generic applications, this scenario is of key interest to industry; thus, a leading European robot manufacturer will integrate EARS’s results into a robot platform for the consumer market and validate it.

In addition, the provision of open-source software and an advisory board with key players from the robotics industry should help to make EARS a turnkey project for promoting audition in the robotics world.


Publications:

H. Barfuss, M. Buerger, J. Podschus, W. Kellermann, "HRTF-based two-dimensional robust least-squares frequency-invariant beamformer design for robot audition," Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA), pp. 56-60, San Francisco, CA, USA, Mar. 2017.

H. W. Löllmann, A. H. Moore, P. A. Naylor, B. Rafaely, R. Horaud, A. Mazel, W. Kellermann, "Microphone Array Signal Processing for Robot Audition," Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA), pp. 1-5, Mar. 2017.

H. Barfuss, C. Hümmer, A. Schwarz, W. Kellermann, "Robust coherence-based spectral enhancement for speech recognition in adverse real-world environments," Computer Speech and Language, Special Issue: Multi-Microphone Automatic Speech Recognition, vol. 46, pp. 388-400, 2017.

H. Barfuss, M. Müglich, W. Kellermann, "HRTF-based Robust Least-Squares Frequency-Invariant Polynomial Beamforming," Int. Workshop on Acoustic Echo and Noise Control (IWAENC), pp. 1-5, Sep. 2016.

A. El-Rayyes, H. W. Löllmann, C. Hofmann, W. Kellermann, "Acoustic Echo Control for Humanoid Robots," accepted for DAGA 2016, Aachen, Germany, Mar. 2016.

H. Barfuss, W. Kellermann, "On the Impact of Localization Errors on HRTF-based Robust Least-Squares Beamforming," Jahrestagung für Akustik (DAGA), pp. 1072-1075, Aachen, Germany, Mar. 2016.

H. Barfuss, C. Hümmer, A. Schwarz, W. Kellermann, "Robust coherence-based spectral enhancement for distant speech recognition," available on arXiv.org, Dec. 2015.

A. Deleforge, S. Gannot, W. Kellermann, "Towards a generalization of relative transfer functions to more than one source," European Signal Processing Conf. (EUSIPCO), pp. 419-423, Nice, France, Sep. 2015.

H. Barfuss, C. Hümmer, G. Lamani, A. Schwarz, W. Kellermann, "HRTF-based Robust Least-Squares Frequency-Invariant Beamforming," IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 1-5, New Paltz, NY, USA, Oct. 2015.

A. Deleforge, W. Kellermann, "Phase-Optimized K-SVD for Signal Extraction from Underdetermined Multichannel Sparse Mixtures," IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), Apr. 2015.

V. Tourbabin, H. Barfuss, B. Rafaely, W. Kellermann, "Enhanced robot audition by dynamic acoustic sensing in moving humanoids," IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp. 5625-5629, Brisbane, Australia, Apr. 2015.

H. Barfuss, W. Kellermann, "An Adaptive Microphone Array Topology for Target Signal Extraction with Humanoid Robots," Int. Workshop on Acoustic Echo and Noise Control (IWAENC), pp. 16-20, Antibes - Juan les Pins, France, Sep. 2014.

H. W. Löllmann, H. Barfuss, A. Deleforge, W. Kellermann, "Challenges in Acoustic Signal Enhancement for Human-Robot Communication," ITG Fachtagung Sprachkommunikation, pp. 1-4, Erlangen, Germany, Sep. 2014.

Y. Zheng, K. Reindl, W. Kellermann, "Analysis of dual-channel ICA-based blocking matrix for improved noise estimation," EURASIP Journal on Advances in Signal Processing, vol. 2014, article no. 26, pp. 1-24, Mar. 2014.