1 code implementation • NeurIPS 2023 • Miriam Barrabes, Daniel Mas Montserrat, Margarita Geleta, Xavier Giro-i-Nieto, Alexander G. Ioannidis
Data shift is a phenomenon present in many real-world applications, and while there are multiple methods attempting to detect shifts, the task of localizing and correcting the features that originate such shifts has not been studied in depth.
1 code implementation • 9 Mar 2023 • Jaume Ros, Margarita Geleta, Jordi Pons, Xavier Giro-i-Nieto
The field of steganography has experienced a surge of interest due to the recent advancements in AI-powered techniques, particularly in the context of multimodal setups that enable the concealment of signals within signals of a different nature.
Ranked #1 on Image Reconstruction on Audio Set
no code implementations • 7 Sep 2022 • Pol Caselles, Eduard Ramon, Jaime Garcia, Xavier Giro-i-Nieto, Francesc Moreno-Noguer, Gil Triginer
Our key ingredients are two data-driven statistical models based on neural fields that resolve the ambiguities of single-view 3D surface reconstruction and appearance factorization.
1 code implementation • 1 Sep 2022 • Alvaro Budria, Laia Tarres, Gerard I. Gallego, Francesc Moreno-Noguer, Jordi Torres, Xavier Giro-i-Nieto
Significant progress has been made recently on challenging tasks in automatic sign language understanding, such as sign language recognition, translation and production.
1 code implementation • ICCV 2021 • Eduard Ramon, Gil Triginer, Janna Escur, Albert Pumarola, Jaime Garcia, Xavier Giro-i-Nieto, Francesc Moreno-Noguer
In this paper, we tackle these limitations for the specific problem of few-shot full 3D head reconstruction, by endowing coordinate-based representations with a probabilistic shape prior that enables faster convergence and better generalization when using few input images (down to three).
no code implementations • ICML Workshop URL 2021 • Juan José Nieto, Roger Creus, Xavier Giro-i-Nieto
Pre-training Reinforcement Learning agents in a task-agnostic manner has shown promising results.
2 code implementations • 8 Jun 2021 • Ioannis Kazakos, Carles Ventura, Miriam Bellver, Carina Silberer, Xavier Giro-i-Nieto
Recent advances in deep learning have brought significant progress in visual grounding tasks such as language-guided video object segmentation.
3 code implementations • ICCV 2021 • Oscar Mañas, Alexandre Lacoste, Xavier Giro-i-Nieto, David Vazquez, Pau Rodriguez
Transfer learning approaches can reduce the data requirements of deep learning algorithms.
Ranked #4 on Change Detection on OSCD - 13ch (using extra training data)
no code implementations • 20 Dec 2020 • Lucas Ventura, Amanda Duarte, Xavier Giro-i-Nieto
Recent work has addressed the generation of human poses represented by 2D/3D coordinates of human joints for sign language.
2 code implementations • 1 Oct 2020 • Miriam Bellver, Carles Ventura, Carina Silberer, Ioannis Kazakos, Jordi Torres, Xavier Giro-i-Nieto
The task of video object segmentation with referring expressions (language-guided VOS) is to, given a linguistic phrase and a video, generate binary masks for the object to which the phrase refers.
Ranked #1 on Referring Expression Segmentation on A2Dre test
no code implementations • 25 Aug 2020 • Miriam Bellver, Amaia Salvador, Jordi Torres, Xavier Giro-i-Nieto
Our method consists of first predicting pseudo-masks for the unlabeled pool of samples, together with a score predicting the quality of the mask.
1 code implementation • CVPR 2021 • Amanda Duarte, Shruti Palaskar, Lucas Ventura, Deepti Ghadiyaram, Kenneth DeHaan, Florian Metze, Jordi Torres, Xavier Giro-i-Nieto
Towards this end, we introduce How2Sign, a multimodal and multiview continuous American Sign Language (ASL) dataset, consisting of a parallel corpus of more than 80 hours of sign language videos and a set of corresponding modalities including speech, English transcripts, and depth.
no code implementations • 1 Jun 2020 • Benet Oriol, Jordi Luque, Ferran Diego, Xavier Giro-i-Nieto
In this work, we propose an effective approach for training unique embedding representations by combining three simultaneous modalities: images, spoken narratives, and textual narratives.
1 code implementation • ICML 2020 • Víctor Campos, Alexander Trott, Caiming Xiong, Richard Socher, Xavier Giro-i-Nieto, Jordi Torres
We perform an extensive evaluation of skill discovery methods on controlled environments and show that EDL offers significant advantages, such as overcoming the coverage problem, reducing the dependence of learned skills on the initial state, and allowing the user to define a prior over which behaviors should be learned.
no code implementations • 5 Nov 2019 • Alba Herrera-Palacio, Carles Ventura, Carina Silberer, Ionut-Teodor Sorodoc, Gemma Boleda, Xavier Giro-i-Nieto
The goal of this work is to segment the objects in an image that are referred to by a sequence of linguistic descriptions (referring expressions).
1 code implementation • 25 Oct 2019 • Mariona Caros, Maite Garolera, Petia Radeva, Xavier Giro-i-Nieto
With people living longer than ever, the number of cases of dementia, such as Alzheimer's disease, is increasing steadily.
1 code implementation • 5 Oct 2019 • Benet Oriol Sabat, Cristian Canton Ferrer, Xavier Giro-i-Nieto
This work addresses the challenge of hate speech detection in Internet memes and attempts to use visual information to automatically detect hate speech, which, to our knowledge, no previous work has done.
2 code implementations • 3 Jul 2019 • Panagiotis Linardos, Eva Mohedano, Juan Jose Nieto, Noel E. O'Connor, Xavier Giro-i-Nieto, Kevin McGuinness
This paper investigates modifying an existing neural network architecture for static saliency prediction using two types of recurrences that integrate information from the temporal domain.
no code implementations • 14 May 2019 • Miriam Bellver, Amaia Salvador, Jordi Torres, Xavier Giro-i-Nieto
Methods that move towards less supervised scenarios are key for image segmentation, as dense labels demand significant human intervention.
3 code implementations • 25 Mar 2019 • Amanda Duarte, Francisco Roldan, Miquel Tubau, Janna Escur, Santiago Pascual, Amaia Salvador, Eva Mohedano, Kevin McGuinness, Jordi Torres, Xavier Giro-i-Nieto
Speech is a rich biometric signal that contains information about the identity, gender and emotional state of the speaker.
1 code implementation • CVPR 2019 • Carles Ventura, Miriam Bellver, Andreu Girbau, Amaia Salvador, Ferran Marques, Xavier Giro-i-Nieto
Multiple object video object segmentation is a challenging task, especially for the zero-shot case, when no object mask is given at the initial frame and the model has to find the objects to be segmented along the sequence.
Ranked #1 on One-shot visual object segmentation on YouTube-VOS
4 code implementations • CVPR 2019 • Amaia Salvador, Michal Drozdzal, Xavier Giro-i-Nieto, Adriana Romero
Our system predicts ingredients as sets by means of a novel architecture, modeling their dependencies without imposing any order, and then generates cooking instructions by attending to both the image and its inferred ingredients simultaneously.
Ranked #1 on Recipe Generation on Recipe1M
no code implementations • 12 Nov 2018 • Víctor Campos, Xavier Giro-i-Nieto, Jordi Torres
Evolution Strategies (ES) emerged as a scalable alternative to popular Reinforcement Learning (RL) techniques, providing an almost perfect speedup when distributed across hundreds of CPU cores thanks to a reduced communication overhead.
1 code implementation • 3 Sep 2018 • Marc Assens, Xavier Giro-i-Nieto, Kevin McGuinness, Noel E. O'Connor
We introduce PathGAN, a deep neural network for visual scanpath prediction trained on adversarial examples.
2 code implementations • 28 Aug 2018 • Panagiotis Linardos, Eva Mohedano, Monica Cherto, Cathal Gurrin, Xavier Giro-i-Nieto
This work adapts a deep neural model for image saliency prediction to the temporal domain of egocentric video.
no code implementations • 21 Mar 2018 • Daniel Fojo, Víctor Campos, Xavier Giro-i-Nieto
Adaptive Computation Time for Recurrent Neural Networks (ACT) is one of the most promising architectures for variable computation.
no code implementations • ECCV 2018 • Zheng Shou, Junting Pan, Jonathan Chan, Kazuyuki Miyazawa, Hassan Mansour, Anthony Vetro, Xavier Giro-i-Nieto, Shih-Fu Chang
We aim to tackle a novel task in action detection: Online Detection of Action Start (ODAS) in untrimmed, streaming videos.
1 code implementation • 2 Dec 2017 • Amaia Salvador, Miriam Bellver, Victor Campos, Manel Baradad, Ferran Marques, Jordi Torres, Xavier Giro-i-Nieto
We present a recurrent model for semantic instance segmentation that sequentially generates binary masks and their associated class probabilities for every object in an image.
2 code implementations • 29 Nov 2017 • Miriam Bellver, Kevis-Kokitsi Maninis, Jordi Pont-Tuset, Xavier Giro-i-Nieto, Jordi Torres, Luc van Gool
A fully automatic technique for segmenting the liver and localizing its unhealthy tissues is a convenient tool for diagnosing hepatic diseases and assessing the response to the corresponding treatments.
1 code implementation • 29 Nov 2017 • Eva Mohedano, Kevin McGuinness, Xavier Giro-i-Nieto, Noel E. O'Connor
This work explores attention models to weight the contribution of local convolutional representations for the instance search task.
2 code implementations • 24 Nov 2017 • Marc Gorriz, Axel Carlier, Emmanuel Faure, Xavier Giro-i-Nieto
We propose a novel Active Learning framework capable of effectively training a convolutional neural network for semantic segmentation of medical imaging with a limited amount of labeled training data.
3 code implementations • ICLR 2018 • Victor Campos, Brendan Jou, Xavier Giro-i-Nieto, Jordi Torres, Shih-Fu Chang
We introduce the Skip RNN model which extends existing RNN models by learning to skip state updates and shortens the effective size of the computational graph.
1 code implementation • 21 Aug 2017 • Delia Fernandez, Alejandro Woodward, Victor Campos, Xavier Giro-i-Nieto, Brendan Jou, Shih-Fu Chang
This work aims at disentangling the contributions of the 'adjectives' and 'nouns' in the visual prediction of ANPs.
1 code implementation • 13 Jul 2017 • Xunyu Lin, Victor Campos, Xavier Giro-i-Nieto, Jordi Torres, Cristian Canton Ferrer
This paper introduces an unsupervised framework to extract semantically rich features for video representation.
1 code implementation • 11 Jul 2017 • Marc Assens, Kevin McGuinness, Xavier Giro-i-Nieto, Noel E. O'Connor
The first part of the network consists of a model trained to generate saliency volumes, whose parameters are fit by back-propagation computed from a binary cross entropy (BCE) loss over downsampled versions of the saliency volumes.
2 code implementations • 9 Jul 2017 • Albert Jimenez, Jose M. Alvarez, Xavier Giro-i-Nieto
In this paper, we go beyond this spatial information and propose a local-aware encoding of convolutional features based on semantic information predicted in the target image.
3 code implementations • 4 Jan 2017 • Junting Pan, Cristian Canton Ferrer, Kevin McGuinness, Noel E. O'Connor, Jordi Torres, Elisa Sayrol, Xavier Giro-i-Nieto
We introduce SalGAN, a deep convolutional neural network for visual saliency prediction trained with adversarial examples.
1 code implementation • 11 Nov 2016 • Miriam Bellver, Xavier Giro-i-Nieto, Ferran Marques, Jordi Torres
We argue that, while this loss seems unavoidable when working with large numbers of object candidates, the much smaller number of region proposals generated by our reinforcement learning agent makes it feasible to extract features for each location without sharing convolutional computation among regions.
1 code implementation • 9 Oct 2016 • Issey Masuda, Santiago Pascual de la Puente, Xavier Giro-i-Nieto
This thesis report studies methods to solve Visual Question-Answering (VQA) tasks with a Deep Learning framework.
no code implementations • 29 Aug 2016 • Cristian Reyes, Eva Mohedano, Kevin McGuinness, Noel E. O'Connor, Xavier Giro-i-Nieto
This work presents a retrieval pipeline and evaluation scheme for the problem of finding the last appearance of personal objects in a large dataset of images captured from a wearable camera.
3 code implementations • 29 Aug 2016 • Alberto Montes, Amaia Salvador, Santiago Pascual, Xavier Giro-i-Nieto
This thesis explores different approaches using Convolutional and Recurrent Neural Networks to classify and temporally localize activities in videos, and proposes an implementation to achieve it.
3 code implementations • 29 Apr 2016 • Amaia Salvador, Xavier Giro-i-Nieto, Ferran Marques, Shin'ichi Satoh
This work explores the suitability for instance retrieval of image- and region-wise representations pooled from an object detection CNN such as Faster R-CNN.
2 code implementations • 15 Apr 2016 • Eva Mohedano, Amaia Salvador, Kevin McGuinness, Ferran Marques, Noel E. O'Connor, Xavier Giro-i-Nieto
This work proposes a simple instance retrieval pipeline based on encoding the convolutional features of a CNN using the bag-of-words (BoW) aggregation scheme.
2 code implementations • 12 Apr 2016 • Victor Campos, Brendan Jou, Xavier Giro-i-Nieto
Visual multimedia have become an inseparable part of our digital social lives, and they often capture moments tied with deep affections.
1 code implementation • CVPR 2016 • Junting Pan, Kevin McGuinness, Elisa Sayrol, Noel O'Connor, Xavier Giro-i-Nieto
The prediction of salient areas in images has been traditionally addressed with hand-crafted features based on neuroscience principles.
no code implementations • 24 Apr 2015 • Amaia Salvador, Matthias Zeppelzauer, Daniel Manchon-Vizuete, Andrea Calafell, Xavier Giro-i-Nieto
Our solution is based on the combination of visual features extracted from convolutional neural networks with temporal information using a hierarchical classifier scheme.
no code implementations • 19 Aug 2014 • Eva Mohedano, Graham Healy, Kevin McGuinness, Xavier Giro-i-Nieto, Noel E. O'Connor, Alan F. Smeaton
This paper explores the potential of brain-computer interfaces in segmenting objects from images.