Acoustic localization of people in reverberant environments using deep learning techniques
- VERA DÍAZ, JUAN MANUEL
- Daniel Pizarro Pérez Director
- Javier Macías Guarasa Co-director
Defence university: Universidad de Alcalá
Defence date: 03 November 2023
- Manuel Rosa Zurera Chair
- Juana María Gutiérrez Arriola Secretary
- Máximo Cobos Serrano Committee member
Type: Thesis
Abstract
Locating people from acoustic information is becoming increasingly important in real-world applications such as security, surveillance, and human-robot interaction. In many cases, people or objects must be located accurately from the sound they produce, especially in noisy and reverberant environments where traditional localization methods may fail, or in scenarios where video analytics-based methods are not feasible because video sensors are unavailable or the scene is significantly occluded. For instance, in security and surveillance, the ability to accurately locate a sound source can help identify potential threats or intruders. In healthcare, acoustic localization can be used to monitor the movements and activities of patients, especially those with mobility issues. In human-robot interaction, robots equipped with acoustic localization capabilities can better sense and respond to their environment, enabling more natural and intuitive interactions with humans. Hence, the development of accurate and robust acoustic localization systems using advanced techniques such as deep learning is of great practical importance. This thesis addresses the problem along three fundamental research lines: (i) the design of an end-to-end system based on neural networks that improves on the localization accuracy of existing state-of-the-art systems; (ii) the design of a system capable of simultaneously localizing one or more speakers in environments with different characteristics and sensor array geometries without the need for retraining; (iii) the design of systems capable of refining the acoustic power maps required for acoustic source localization, so that subsequent localization is more accurate. To evaluate the achievement of these objectives, several realistic databases with different characteristics have been used, in which the people involved in the scenes can act without any constraints.
All the proposed systems have been evaluated under the same conditions and have outperformed the current state-of-the-art systems in terms of localization error.