1 code implementation • 27 Apr 2022 • Alex Falcon, Swathikiran Sudhakaran, Giuseppe Serra, Sergio Escalera, Oswald Lanz
We show that even if the fixed margin were carefully tuned, our technique (which does not have the margin as a hyper-parameter) would still achieve better performance.
Ranked #7 on Multi-Instance Retrieval on EPIC-KITCHENS-100
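For context on the entry above, here is a minimal sketch of the fixed-margin triplet ranking loss commonly used in cross-modal retrieval, alongside an illustrative variant where the margin is derived from a per-triplet relevance score instead of being a tuned hyper-parameter. The relevance-based variant is an assumption meant only to convey the idea of removing the margin hyper-parameter, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def fixed_margin_triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet ranking loss with a hand-tuned fixed margin."""
    pos_sim = F.cosine_similarity(anchor, positive)
    neg_sim = F.cosine_similarity(anchor, negative)
    return torch.clamp(margin - pos_sim + neg_sim, min=0).mean()

def relevance_margin_triplet_loss(anchor, positive, negative, relevance):
    """Illustrative variant (assumption): the margin is not a hyper-parameter
    but comes from a per-triplet relevance score in [0, 1]."""
    pos_sim = F.cosine_similarity(anchor, positive)
    neg_sim = F.cosine_similarity(anchor, negative)
    return torch.clamp(relevance - pos_sim + neg_sim, min=0).mean()

# Toy usage with random embeddings.
a, p, n = (torch.randn(8, 256) for _ in range(3))
rel = torch.rand(8)  # hypothetical relevance of each positive pair
print(fixed_margin_triplet_loss(a, p, n).item(),
      relevance_margin_triplet_loss(a, p, n, rel).item())
```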
1 code implementation • 16 Mar 2022 • Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz
3D kernel factorization approaches have been proposed to reduce the complexity of 3D CNNs.
Ranked #17 on Action Recognition on EPIC-KITCHENS-100 (using extra training data)
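As a reference for what kernel factorization means in the entry above, a minimal sketch assuming a generic (2+1)D-style decomposition in PyTorch: one k×k×k 3D convolution is replaced by a spatial 1×k×k convolution followed by a temporal k×1×1 convolution. This illustrates the general complexity reduction, not the specific module proposed in the paper.

```python
import torch
import torch.nn as nn

class FactorizedConv3d(nn.Module):
    """Replace a full k x k x k 3D convolution with a spatial (1 x k x k)
    convolution followed by a temporal (k x 1 x 1) convolution."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.spatial = nn.Conv3d(in_ch, out_ch, (1, k, k), padding=(0, k // 2, k // 2))
        self.temporal = nn.Conv3d(out_ch, out_ch, (k, 1, 1), padding=(k // 2, 0, 0))

    def forward(self, x):  # x: (batch, channels, time, height, width)
        return self.temporal(self.spatial(x))

full = nn.Conv3d(64, 64, 3, padding=1)
factored = FactorizedConv3d(64, 64)
x = torch.randn(1, 64, 8, 56, 56)
print(full(x).shape, factored(x).shape)  # same output shape
print(sum(p.numel() for p in full.parameters()),
      sum(p.numel() for p in factored.parameters()))  # factorized version has fewer parameters
```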
no code implementations • 6 Oct 2021 • Swathikiran Sudhakaran, Adrian Bulat, Juan-Manuel Perez-Rua, Alex Falcon, Sergio Escalera, Oswald Lanz, Brais Martinez, Georgios Tzimiropoulos
This report presents the technical details of our submission to the EPIC-Kitchens-100 Action Recognition Challenge 2021.
1 code implementation • NeurIPS 2021 • Adrian Bulat, Juan-Manuel Perez-Rua, Swathikiran Sudhakaran, Brais Martinez, Georgios Tzimiropoulos
In this work, we propose a Video Transformer model whose complexity scales linearly with the number of frames in the video sequence and hence incurs no overhead compared to an image-based Transformer model.
Ranked #32 on Action Classification on Kinetics-600
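To illustrate the scaling argument only: one simple way to make attention cost grow linearly with the number of frames is to restrict self-attention to tokens within each frame and aggregate across time with a cheap pooling step. The sketch below is an assumption for illustration and is not the space-time mixing mechanism proposed in the paper.

```python
import torch
import torch.nn as nn

class PerFrameAttention(nn.Module):
    """Self-attention restricted to the tokens of a single frame. Total cost is
    (frames) x (tokens_per_frame)^2, i.e. linear in the number of frames,
    unlike joint space-time attention, which is quadratic in it."""
    def __init__(self, dim=192, heads=3):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):            # x: (batch, frames, tokens, dim)
        b, t, n, d = x.shape
        x = x.reshape(b * t, n, d)   # attend within each frame only
        out, _ = self.attn(x, x, x)
        return out.reshape(b, t, n, d)

tokens = torch.randn(2, 8, 196, 192)                      # 8 frames of 14x14 patch tokens
frame_feats = PerFrameAttention()(tokens).mean(dim=2)     # (batch, frames, dim)
clip_feat = frame_feats.mean(dim=1)                       # cheap temporal aggregation
print(clip_feat.shape)                                    # torch.Size([2, 192])
```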
no code implementations • 16 Feb 2021 • Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz
We present EgoACO, a deep neural architecture for video action recognition that learns to pool action-context-object descriptors from frame-level features by leveraging the verb-noun structure of action labels in egocentric video datasets.
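A minimal sketch of the verb-noun label decomposition the entry refers to, assuming a shared clip descriptor feeding separate verb and noun classification heads; vocabulary sizes are hypothetical. EgoACO's actual action-context-object pooling is more involved, so this only illustrates how the verb-noun structure of egocentric labels can be exploited.

```python
import torch
import torch.nn as nn

class VerbNounHeads(nn.Module):
    """Predict verb and noun separately from a shared clip descriptor;
    an action such as 'take plate' is the (verb, noun) pair."""
    def __init__(self, feat_dim=512, n_verbs=97, n_nouns=300):  # hypothetical sizes
        super().__init__()
        self.verb_head = nn.Linear(feat_dim, n_verbs)
        self.noun_head = nn.Linear(feat_dim, n_nouns)

    def forward(self, clip_feat):    # clip_feat: (batch, feat_dim)
        return self.verb_head(clip_feat), self.noun_head(clip_feat)

heads = VerbNounHeads()
verb_logits, noun_logits = heads(torch.randn(4, 512))
loss = nn.functional.cross_entropy(verb_logits, torch.randint(0, 97, (4,))) \
     + nn.functional.cross_entropy(noun_logits, torch.randint(0, 300, (4,)))
print(verb_logits.shape, noun_logits.shape, loss.item())
```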
no code implementations • 24 Jun 2020 • Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz
In this report we describe the technical details of our submission to the EPIC-Kitchens Action Recognition 2020 Challenge.
2 code implementations • CVPR 2020 • Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz
Deep 3D CNNs for video action recognition are designed to learn powerful representations in the joint spatio-temporal feature space.
Ranked #26 on Action Recognition on Something-Something V1 (using extra training data)
no code implementations • 2 Jul 2019 • Swathikiran Sudhakaran, Oswald Lanz
We review three recent deep-learning-based methods for action recognition and present a brief comparative analysis of the methods from a neurophysiological point of view.
no code implementations • 21 Jun 2019 • Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz
In this report we describe the technical details of our submission to the EPIC-Kitchens 2019 action recognition challenge.
no code implementations • 29 May 2019 • Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz
Most action recognition methods are based on either a) late aggregation of frame-level CNN features using average pooling, max pooling, or an RNN, among others, or b) spatio-temporal aggregation via 3D convolutions.
Ranked #51 on Action Recognition on HMDB-51 (using extra training data)
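A minimal sketch of option a) from the entry above: a 2D CNN encodes each frame independently, and the frame-level features are aggregated late by average pooling before classification. The backbone choice, feature dimension, and class count are assumptions for illustration.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class AvgPoolAggregation(nn.Module):
    """Late aggregation: a 2D CNN encodes each frame independently and the
    per-frame features are averaged before the classifier."""
    def __init__(self, num_classes=174):  # e.g. Something-Something classes (assumption)
        super().__init__()
        backbone = resnet18(weights=None)
        self.encoder = nn.Sequential(*list(backbone.children())[:-1])  # drop the fc layer
        self.classifier = nn.Linear(512, num_classes)

    def forward(self, clip):             # clip: (batch, frames, 3, H, W)
        b, t = clip.shape[:2]
        feats = self.encoder(clip.flatten(0, 1)).flatten(1)  # (b*t, 512)
        clip_feat = feats.view(b, t, -1).mean(dim=1)         # average pooling over frames
        return self.classifier(clip_feat)

logits = AvgPoolAggregation()(torch.randn(2, 8, 3, 224, 224))
print(logits.shape)                      # torch.Size([2, 174])
```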
1 code implementation • CVPR 2019 • Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz
Egocentric activity recognition is one of the most challenging tasks in video analysis.
Ranked #5 on Egocentric Activity Recognition on EGTEA
no code implementations • 29 Aug 2018 • Swathikiran Sudhakaran, Oswald Lanz
Most recent approaches for action recognition from video leverage deep architectures to encode the video clip into a fixed-length representation vector that is then used for classification.
1 code implementation • 31 Jul 2018 • Swathikiran Sudhakaran, Oswald Lanz
Our model is built on the observation that egocentric activities are highly characterized by the objects and their locations in the video.
Ranked #6 on Egocentric Activity Recognition on EGTEA
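A minimal sketch of the intuition in the entry above, assuming a simple soft spatial attention over frame feature maps so that features at likely object locations are weighted more heavily; this is only an illustration of attending to object locations, not the paper's recurrent attention mechanism.

```python
import torch
import torch.nn as nn

class SpatialAttentionPool(nn.Module):
    """Weight each spatial location of a frame feature map by a learned
    attention score, so regions containing the manipulated object dominate
    the pooled descriptor."""
    def __init__(self, channels=512):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, fmap):             # fmap: (batch, C, H, W)
        attn = torch.softmax(self.score(fmap).flatten(2), dim=-1)  # (B, 1, H*W)
        feats = fmap.flatten(2)                                    # (B, C, H*W)
        return (feats * attn).sum(dim=-1)                          # (B, C)

pooled = SpatialAttentionPool()(torch.randn(4, 512, 7, 7))
print(pooled.shape)                      # torch.Size([4, 512])
```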
no code implementations • 19 Sep 2017 • Swathikiran Sudhakaran, Oswald Lanz
A convolutional neural network is used to extract frame-level features from a video.
no code implementations • 19 Sep 2017 • Swathikiran Sudhakaran, Oswald Lanz
The proposed approach uses a pair of convolutional neural networks, whose parameters are shared, for extracting frame-level features from successive frames of the video.
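A minimal sketch of the shared-parameter (Siamese) setup described above: the same CNN, applied to two successive frames, with the feature difference used as a simple change-sensitive descriptor. The difference operation and backbone are assumptions for illustration rather than the paper's exact aggregation.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class SiameseFramePair(nn.Module):
    """One CNN with shared weights encodes both frames of a successive pair."""
    def __init__(self):
        super().__init__()
        backbone = resnet18(weights=None)
        self.encoder = nn.Sequential(*list(backbone.children())[:-1])  # drop the fc layer

    def forward(self, frame_t, frame_t1):          # each: (batch, 3, H, W)
        f_t = self.encoder(frame_t).flatten(1)     # same weights used for both frames
        f_t1 = self.encoder(frame_t1).flatten(1)
        return f_t1 - f_t                          # change between frames (illustrative choice)

diff = SiameseFramePair()(torch.randn(2, 3, 224, 224), torch.randn(2, 3, 224, 224))
print(diff.shape)                                  # torch.Size([2, 512])
```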