Search Results for author: Xavier Giro-i-Nieto

Found 47 papers, 34 papers with code

Adversarial Learning for Feature Shift Detection and Correction

1 code implementation • NeurIPS 2023 • Miriam Barrabes, Daniel Mas Montserrat, Margarita Geleta, Xavier Giro-i-Nieto, Alexander G. Ioannidis

Data shift is a phenomenon present in many real-world applications, and while there are multiple methods attempting to detect shifts, the task of localizing and correcting the features originating such shifts has not been studied in depth.

Towards Robust Image-in-Audio Deep Steganography

1 code implementation • 9 Mar 2023 • Jaume Ros, Margarita Geleta, Jordi Pons, Xavier Giro-i-Nieto

The field of steganography has experienced a surge of interest due to the recent advancements in AI-powered techniques, particularly in the context of multimodal setups that enable the concealment of signals within signals of a different nature.

Ranked #1 on Image Reconstruction on Audio Set

Image Reconstruction

SIRA: Relightable Avatars from a Single Image

no code implementations • 7 Sep 2022 • Pol Caselles, Eduard Ramon, Jaime Garcia, Xavier Giro-i-Nieto, Francesc Moreno-Noguer, Gil Triginer

Our key ingredients are two data-driven statistical models based on neural fields that resolve the ambiguities of single-view 3D surface reconstruction and appearance factorization.

Surface Reconstruction

Topic Detection in Continuous Sign Language Videos

1 code implementation • 1 Sep 2022 • Alvaro Budria, Laia Tarres, Gerard I. Gallego, Francesc Moreno-Noguer, Jordi Torres, Xavier Giro-i-Nieto

Significant progress has been made recently on challenging tasks in automatic sign language understanding, such as sign language recognition, translation and production.

Sign Language Recognition Translation

H3D-Net: Few-Shot High-Fidelity 3D Head Reconstruction

1 code implementation • ICCV 2021 • Eduard Ramon, Gil Triginer, Janna Escur, Albert Pumarola, Jaime Garcia, Xavier Giro-i-Nieto, Francesc Moreno-Noguer

In this paper, we tackle these limitations for the specific problem of few-shot full 3D head reconstruction, by endowing coordinate-based representations with a probabilistic shape prior that enables faster convergence and better generalization when using few input images (down to three).

3D Reconstruction Multi-View 3D Reconstruction +1

119

Unsupervised Skill-Discovery and Skill-Learning in Minecraft

no code implementations • ICML Workshop URL 2021 • Juan José Nieto, Roger Creus, Xavier Giro-i-Nieto

Pre-training Reinforcement Learning agents in a task-agnostic manner has shown promising results.

Self-Supervised Learning

SynthRef: Generation of Synthetic Referring Expressions for Object Segmentation

2 code implementations • 8 Jun 2021 • Ioannis Kazakos, Carles Ventura, Miriam Bellver, Carina Silberer, Xavier Giro-i-Nieto

Recent advances in deep learning have brought significant progress in visual grounding tasks such as language-guided video object segmentation.

Ranked #1 on Referring Expression Segmentation on Refer-YouTube-VOS

Object object-detection +3

Seasonal Contrast: Unsupervised Pre-Training from Uncurated Remote Sensing Data

3 code implementations • ICCV 2021 • Oscar Mañas, Alexandre Lacoste, Xavier Giro-i-Nieto, David Vazquez, Pau Rodriguez

Transfer learning approaches can reduce the data requirements of deep learning algorithms.

Ranked #4 on Change Detection on OSCD - 13ch (using extra training data)

Change Detection Self-Supervised Learning +2

153

Can Everybody Sign Now? Exploring Sign Language Video Generation from 2D Poses

no code implementations • 20 Dec 2020 • Lucas Ventura, Amanda Duarte, Xavier Giro-i-Nieto

Recent work has addressed the generation of human poses represented by 2D/3D coordinates of human joints for sign language.

Sign Language Production Video Generation

RefVOS: A Closer Look at Referring Expressions for Video Object Segmentation

2 code implementations • 1 Oct 2020 • Miriam Bellver, Carles Ventura, Carina Silberer, Ioannis Kazakos, Jordi Torres, Xavier Giro-i-Nieto

The task of video object segmentation with referring expressions (language-guided VOS) is to, given a linguistic phrase and a video, generate binary masks for the object to which the phrase refers.

Ranked #1 on Referring Expression Segmentation on A2Dre test

Image Segmentation Referring Expression Segmentation +2

Mask-guided sample selection for Semi-Supervised Instance Segmentation

no code implementations • 25 Aug 2020 • Miriam Bellver, Amaia Salvador, Jordi Torres, Xavier Giro-i-Nieto

Our method consists in first predicting pseudo-masks for the unlabeled pool of samples, together with a score predicting the quality of the mask.

Active Learning Image Segmentation +4

How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language

1 code implementation • CVPR 2021 • Amanda Duarte, Shruti Palaskar, Lucas Ventura, Deepti Ghadiyaram, Kenneth DeHaan, Florian Metze, Jordi Torres, Xavier Giro-i-Nieto

Towards this end, we introduce How2Sign, a multimodal and multiview continuous American Sign Language (ASL) dataset, consisting of a parallel corpus of more than 80 hours of sign language videos and a set of corresponding modalities including speech, English transcripts, and depth.

Sign Language Production Sign Language Translation +1

Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and Videos

no code implementations • 1 Jun 2020 • Benet Oriol, Jordi Luque, Ferran Diego, Xavier Giro-i-Nieto

In this work, we propose an effective approach for training unique embedding representations by combining three simultaneous modalities: images, spoken narratives, and textual narratives.

Retrieval

Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills

1 code implementation • ICML 2020 • Víctor Campos, Alexander Trott, Caiming Xiong, Richard Socher, Xavier Giro-i-Nieto, Jordi Torres

We perform an extensive evaluation of skill discovery methods on controlled environments and show that EDL offers significant advantages, such as overcoming the coverage problem, reducing the dependence of learned skills on the initial state, and allowing the user to define a prior over which behaviors should be learned.

Recurrent Instance Segmentation using Sequences of Referring Expressions

no code implementations • 5 Nov 2019 • Alba Herrera-Palacio, Carles Ventura, Carina Silberer, Ionut-Teodor Sorodoc, Gemma Boleda, Xavier Giro-i-Nieto

The goal of this work is to segment the objects in an image that are referred to by a sequence of linguistic descriptions (referring expressions).

Referring Expression Referring Expression Segmentation +1

Automatic Reminiscence Therapy for Dementia

1 code implementation • 25 Oct 2019 • Mariona Caros, Maite Garolera, Petia Radeva, Xavier Giro-i-Nieto

With people living longer than ever, the number of cases of dementia, such as Alzheimer's disease, increases steadily.

Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation

1 code implementation • 5 Oct 2019 • Benet Oriol Sabat, Cristian Canton Ferrer, Xavier Giro-i-Nieto

This work addresses the challenge of hate speech detection in Internet memes, and attempts to use visual information to automatically detect hate speech, unlike any previous work to our knowledge.

Hate Speech Detection

Simple vs complex temporal recurrences for video saliency prediction

2 code implementations • 3 Jul 2019 • Panagiotis Linardos, Eva Mohedano, Juan Jose Nieto, Noel E. O'Connor, Xavier Giro-i-Nieto, Kevin McGuinness

This paper investigates modifying an existing neural network architecture for static saliency prediction using two types of recurrences that integrate information from the temporal domain.

Ranked #9 on Video Saliency Detection on MSU Video Saliency Prediction

Saliency Prediction Video Saliency Detection +1

Budget-aware Semi-Supervised Semantic and Instance Segmentation

no code implementations • 14 May 2019 • Miriam Bellver, Amaia Salvador, Jordi Torres, Xavier Giro-i-Nieto

Methods that move towards less supervised scenarios are key for image segmentation, as dense labels demand significant human intervention.

Image Segmentation Instance Segmentation +2

Wav2Pix: Speech-conditioned Face Generation using Generative Adversarial Networks

3 code implementations • 25 Mar 2019 • Amanda Duarte, Francisco Roldan, Miquel Tubau, Janna Escur, Santiago Pascual, Amaia Salvador, Eva Mohedano, Kevin McGuinness, Jordi Torres, Xavier Giro-i-Nieto

Speech is a rich biometric signal that contains information about the identity, gender and emotional state of the speaker.

Face Generation Generative Adversarial Network

160

RVOS: End-to-End Recurrent Network for Video Object Segmentation

1 code implementation • CVPR 2019 • Carles Ventura, Miriam Bellver, Andreu Girbau, Amaia Salvador, Ferran Marques, Xavier Giro-i-Nieto

Multiple object video object segmentation is a challenging task, especially in the zero-shot case, where no object mask is given at the initial frame and the model has to find the objects to be segmented along the sequence.

Ranked #1 on One-shot visual object segmentation on YouTube-VOS

Object One-shot visual object segmentation +3

278

Inverse Cooking: Recipe Generation from Food Images

4 code implementations • CVPR 2019 • Amaia Salvador, Michal Drozdzal, Xavier Giro-i-Nieto, Adriana Romero

Our system predicts ingredients as sets by means of a novel architecture, modeling their dependencies without imposing any order, and then generates cooking instructions by attending to both the image and its inferred ingredients simultaneously.

Ranked #1 on Recipe Generation on Recipe1M

Recipe Generation Retrieval

613

Importance Weighted Evolution Strategies

no code implementations • 12 Nov 2018 • Víctor Campos, Xavier Giro-i-Nieto, Jordi Torres

Evolution Strategies (ES) emerged as a scalable alternative to popular Reinforcement Learning (RL) techniques, providing an almost perfect speedup when distributed across hundreds of CPU cores thanks to a reduced communication overhead.

reinforcement-learning Reinforcement Learning (RL)

PathGAN: Visual Scanpath Prediction with Generative Adversarial Networks

1 code implementation • 3 Sep 2018 • Marc Assens, Xavier Giro-i-Nieto, Kevin McGuinness, Noel E. O'Connor

We introduce PathGAN, a deep neural network for visual scanpath prediction trained on adversarial examples.

Scanpath prediction

Temporal Saliency Adaptation in Egocentric Videos

2 code implementations • 28 Aug 2018 • Panagiotis Linardos, Eva Mohedano, Monica Cherto, Cathal Gurrin, Xavier Giro-i-Nieto

This work adapts a deep neural model for image saliency prediction to the temporal domain of egocentric video.

Saliency Prediction Video Saliency Prediction

Comparing Fixed and Adaptive Computation Time for Recurrent Neural Networks

no code implementations • 21 Mar 2018 • Daniel Fojo, Víctor Campos, Xavier Giro-i-Nieto

Adaptive Computation Time for Recurrent Neural Networks (ACT) is one of the most promising architectures for variable computation.

Online Detection of Action Start in Untrimmed, Streaming Videos

no code implementations • ECCV 2018 • Zheng Shou, Junting Pan, Jonathan Chan, Kazuyuki Miyazawa, Hassan Mansour, Anthony Vetro, Xavier Giro-i-Nieto, Shih-Fu Chang

We aim to tackle a novel task in action detection - Online Detection of Action Start (ODAS) in untrimmed, streaming videos.

Action Detection Generative Adversarial Network

Recurrent Neural Networks for Semantic Instance Segmentation

1 code implementation • 2 Dec 2017 • Amaia Salvador, Miriam Bellver, Victor Campos, Manel Baradad, Ferran Marques, Jordi Torres, Xavier Giro-i-Nieto

We present a recurrent model for semantic instance segmentation that sequentially generates binary masks and their associated class probabilities for every object in an image.

Instance Segmentation Object +2

132

Detection-aided liver lesion segmentation using deep learning

2 code implementations • 29 Nov 2017 • Miriam Bellver, Kevis-Kokitsi Maninis, Jordi Pont-Tuset, Xavier Giro-i-Nieto, Jordi Torres, Luc van Gool

A fully automatic technique for segmenting the liver and localizing its unhealthy tissues is a convenient tool for diagnosing hepatic diseases and assessing the response to the corresponding treatments.

Computed Tomography (CT) Lesion Segmentation +1

Saliency Weighted Convolutional Features for Instance Search

1 code implementation • 29 Nov 2017 • Eva Mohedano, Kevin McGuinness, Xavier Giro-i-Nieto, Noel E. O'Connor

This work explores attention models to weight the contribution of local convolutional representations for the instance search task.

Instance Search Retrieval

Cost-Effective Active Learning for Melanoma Segmentation

2 code implementations • 24 Nov 2017 • Marc Gorriz, Axel Carlier, Emmanuel Faure, Xavier Giro-i-Nieto

We propose a novel Active Learning framework capable of effectively training a convolutional neural network for semantic segmentation of medical imaging with a limited amount of labeled training data.

Active Learning Image Segmentation +3

276

Skip RNN: Learning to Skip State Updates in Recurrent Neural Networks

3 code implementations • ICLR 2018 • Victor Campos, Brendan Jou, Xavier Giro-i-Nieto, Jordi Torres, Shih-Fu Chang

We introduce the Skip RNN model which extends existing RNN models by learning to skip state updates and shortens the effective size of the computational graph.

124

More cat than cute? Interpretable Prediction of Adjective-Noun Pairs

1 code implementation • 21 Aug 2017 • Delia Fernandez, Alejandro Woodward, Victor Campos, Xavier Giro-i-Nieto, Brendan Jou, Shih-Fu Chang

This work aims at disentangling the contributions of the 'adjectives' and 'nouns' in the visual prediction of ANPs.

Disentangling Motion, Foreground and Background Features in Videos

1 code implementation • 13 Jul 2017 • Xunyu Lin, Victor Campos, Xavier Giro-i-Nieto, Jordi Torres, Cristian Canton Ferrer

This paper introduces an unsupervised framework to extract semantically rich features for video representation.

SaltiNet: Scan-path Prediction on 360 Degree Images using Saliency Volumes

1 code implementation • 11 Jul 2017 • Marc Assens, Kevin McGuinness, Xavier Giro-i-Nieto, Noel E. O'Connor

The first part of the network consists of a model trained to generate saliency volumes, whose parameters are fit by back-propagation computed from a binary cross entropy (BCE) loss over downsampled versions of the saliency volumes.

Scanpath prediction

Class-Weighted Convolutional Features for Visual Instance Search

2 code implementations • 9 Jul 2017 • Albert Jimenez, Jose M. Alvarez, Xavier Giro-i-Nieto

In this paper, we go beyond this spatial information and propose a local-aware encoding of convolutional features based on semantic information predicted in the target image.

Image Retrieval Instance Search +2

223

SalGAN: Visual Saliency Prediction with Generative Adversarial Networks

3 code implementations • 4 Jan 2017 • Junting Pan, Cristian Canton Ferrer, Kevin McGuinness, Noel E. O'Connor, Jordi Torres, Elisa Sayrol, Xavier Giro-i-Nieto

We introduce SalGAN, a deep convolutional neural network for visual saliency prediction trained with adversarial examples.

Binary Classification General Classification +1

368

Hierarchical Object Detection with Deep Reinforcement Learning

1 code implementation • 11 Nov 2016 • Miriam Bellver, Xavier Giro-i-Nieto, Ferran Marques, Jordi Torres

We argue that, while this loss seems unavoidable when working with large numbers of object candidates, the much smaller number of region proposals generated by our reinforcement learning agent makes it feasible to extract features for each location without sharing convolutional computation among regions.

Object object-detection +4

423

Open-Ended Visual Question-Answering

1 code implementation • 9 Oct 2016 • Issey Masuda, Santiago Pascual de la Puente, Xavier Giro-i-Nieto

This thesis report studies methods to solve Visual Question-Answering (VQA) tasks with a Deep Learning framework.

Question Answering Sentence +3

Where is my Phone ? Personal Object Retrieval from Egocentric Images

no code implementations • 29 Aug 2016 • Cristian Reyes, Eva Mohedano, Kevin McGuinness, Noel E. O'Connor, Xavier Giro-i-Nieto

This work presents a retrieval pipeline and evaluation scheme for the problem of finding the last appearance of personal objects in a large dataset of images captured from a wearable camera.

Retrieval

Temporal Activity Detection in Untrimmed Videos with Recurrent Neural Networks

3 code implementations • 29 Aug 2016 • Alberto Montes, Amaia Salvador, Santiago Pascual, Xavier Giro-i-Nieto

This thesis explores different approaches using Convolutional and Recurrent Neural Networks to classify and temporally localize activities in videos; furthermore, an implementation to achieve this is proposed.

Action Detection Activity Detection

195

Faster R-CNN Features for Instance Search

3 code implementations • 29 Apr 2016 • Amaia Salvador, Xavier Giro-i-Nieto, Ferran Marques, Shin'ichi Satoh

This work explores the suitability for instance retrieval of image- and region-wise representations pooled from an object detection CNN such as Faster R-CNN.

Instance Search object-detection +3

217

Bags of Local Convolutional Features for Scalable Instance Search

2 code implementations • 15 Apr 2016 • Eva Mohedano, Amaia Salvador, Kevin McGuinness, Ferran Marques, Noel E. O'Connor, Xavier Giro-i-Nieto

This work proposes a simple instance retrieval pipeline based on encoding the convolutional features of a CNN using the bag-of-words (BoW) aggregation scheme.

Instance Search Retrieval

111

From Pixels to Sentiment: Fine-tuning CNNs for Visual Sentiment Prediction

2 code implementations • 12 Apr 2016 • Victor Campos, Brendan Jou, Xavier Giro-i-Nieto

Visual multimedia have become an inseparable part of our digital social lives, and they often capture moments tied with deep affections.

Sentiment Analysis Visual Sentiment Prediction

Shallow and Deep Convolutional Networks for Saliency Prediction

1 code implementation • CVPR 2016 • Junting Pan, Kevin McGuinness, Elisa Sayrol, Noel O'Connor, Xavier Giro-i-Nieto

The prediction of salient areas in images has been traditionally addressed with hand-crafted features based on neuroscience principles.

Saliency Prediction

185

Cultural Event Recognition with Visual ConvNets and Temporal Models

no code implementations • 24 Apr 2015 • Amaia Salvador, Matthias Zeppelzauer, Daniel Manchon-Vizuete, Andrea Calafell, Xavier Giro-i-Nieto

Our solution is based on the combination of visual features extracted from convolutional neural networks with temporal information using a hierarchical classifier scheme.

Classification General Classification

Object Segmentation in Images using EEG Signals

no code implementations • 19 Aug 2014 • Eva Mohedano, Graham Healy, Kevin McGuinness, Xavier Giro-i-Nieto, Noel E. O'Connor, Alan F. Smeaton

This paper explores the potential of brain-computer interfaces in segmenting objects from images.

EEG Object +1
