Search Results for author: Antonio Miguel

Found 11 papers, 3 papers with code

Permutation Invariant Recurrent Neural Networks for Sound Source Tracking Applications

no code implementations • 14 Jun 2023 • David Diaz-Guerra, Archontis Politis, Antonio Miguel, Jose R. Beltran, Tuomas Virtanen

Conventional recurrent neural networks (RNNs), such as long short-term memory networks (LSTMs) or gated recurrent units (GRUs), take a vector as their input and use another vector to store their state.

Direction of Arrival Estimation of Sound Sources Using Icosahedral CNNs

2 code implementations • 31 Mar 2022 • David Diaz-Guerra, Antonio Miguel, Jose R. Beltran

In this paper, we present a new model for Direction of Arrival (DOA) estimation of sound sources based on an Icosahedral Convolutional Neural Network (CNN) applied over SRP-PHAT power maps computed from the signals received by a microphone array.

Direction of Arrival Estimation

Class Token and Knowledge Distillation for Multi-head Self-Attention Speaker Verification Systems

no code implementations • 6 Nov 2021 • Victoria Mingote, Antonio Miguel, Alfonso Ortega, Eduardo Lleida

This paper explores three novel approaches to improve the performance of speaker verification (SV) systems based on deep neural networks (DNN) using Multi-head Self-Attention (MSA) mechanisms and memory layers.

Knowledge Distillation, Philosophy

Generalizing AUC Optimization to Multiclass Classification for Audio Segmentation With Limited Training Data

no code implementations • 27 Oct 2021 • Pablo Gimeno, Victoria Mingote, Alfonso Ortega, Antonio Miguel, Eduardo Lleida

Area under the ROC curve (AUC) optimisation techniques developed for neural networks have recently demonstrated their capabilities in different audio and speech related tasks.

Segmentation

Robust Sound Source Tracking Using SRP-PHAT and 3D Convolutional Neural Networks

2 code implementations • 16 Jun 2020 • David Diaz-Guerra, Antonio Miguel, Jose R. Beltran

In this paper, we present a new single sound source DOA estimation and tracking system based on the well-known SRP-PHAT algorithm and a three-dimensional Convolutional Neural Network.

Optimization of the Area Under the ROC Curve using Neural Network Supervectors for Text-Dependent Speaker Verification

no code implementations • 31 Jan 2019 • Victoria Mingote, Antonio Miguel, Alfonso Ortega, Eduardo Lleida

This paper explores two techniques to improve the performance of text-dependent speaker verification systems based on deep neural networks.

Text-Dependent Speaker Verification

Disentangling and Learning Robust Representations with Natural Clustering

no code implementations • 27 Jan 2019 • Javier Antoran, Antonio Miguel

Learning representations that disentangle the underlying factors of variability in data is an intuitive way to achieve generalization in deep models.

Clustering

Deep Speech Enhancement for Reverberated and Noisy Signals using Wide Residual Networks

no code implementations • 3 Jan 2019 • Dayana Ribas, Jorge Llombart, Antonio Miguel, Luis Vicente

The DNN model, trained with artificial synthesized reverberation data, was able to deal with far-field reverberated speech from real scenarios.

Speech Enhancement, Speech Recognition

Tied Hidden Factors in Neural Networks for End-to-End Speaker Recognition

no code implementations • 27 Dec 2018 • Antonio Miguel, Jorge Llombart, Alfonso Ortega, Eduardo Lleida

As in Joint Factor Analysis, the model uses tied hidden variables to model speaker and session variability and a MAP adaptation of some of the parameters of the model.

Speaker Recognition, Speaker Verification

Differentiable Supervector Extraction for Encoding Speaker and Phrase Information in Text Dependent Speaker Verification

no code implementations • 22 Dec 2018 • Victoria Mingote, Antonio Miguel, Alfonso Ortega, Eduardo Lleida

Moreover, we can apply a convolutional neural network as front-end, and thanks to the alignment process being differentiable, we can train the whole network to produce a supervector for each utterance which will be discriminative with respect to the speaker and the phrase simultaneously.

Text-Dependent Speaker Verification

gpuRIR: A python library for Room Impulse Response simulation with GPU acceleration

3 code implementations • 26 Oct 2018 • David Diaz-Guerra, Antonio Miguel, Jose R. Beltran

The Image Source Method (ISM) is one of the most widely used techniques for calculating acoustic Room Impulse Responses (RIRs). However, its computational complexity grows quickly with the reverberation time of the room, and its computation time can be prohibitive for applications that need a huge number of RIRs.

Room Impulse Response (RIR)
