Search Results for author: Trisha Mittal

Found 13 papers, 4 papers with code

Naturalistic Head Motion Generation from Speech

no code implementations • 26 Oct 2022 • Trisha Mittal, Zakaria Aldeneh, Masha Fedzechkina, Anurag Ranjan, Barry-John Theobald

Synthesizing natural head motion to accompany speech for an embodied conversational agent is necessary for providing a rich interactive experience.

3MASSIV: Multilingual, Multimodal and Multi-Aspect dataset of Social Media Short Videos

no code implementations • CVPR 2022 • Vikram Gupta, Trisha Mittal, Puneet Mathur, Vaibhav Mishra, Mayank Maheshwari, Aniket Bera, Debdoot Mukherjee, Dinesh Manocha

We present 3MASSIV, a multilingual, multimodal, multi-aspect, and expertly annotated dataset of diverse short videos extracted from the short-video social media platform Moj.

Dynamic Graph Modeling of Simultaneous EEG and Eye-tracking Data for Reading Task Identification

no code implementations • 21 Feb 2021 • Puneet Mathur, Trisha Mittal, Dinesh Manocha

We present a new approach, which we call AdaGTCN, for identifying human reader intent from electroencephalogram (EEG) and eye-movement (EM) data in order to help differentiate between normal reading and task-oriented reading (see the sketch below).

EEG, Graph Learning
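The entry above only names the approach, so here is a minimal, hedged sketch of what a dynamic-graph temporal model over stacked EEG and eye-movement channels could look like. The learned-adjacency layer, all layer sizes, and the names (DynamicGraphConv, ReadingTaskClassifier) are illustrative assumptions, not the authors' AdaGTCN implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicGraphConv(nn.Module):
    """Graph convolution whose adjacency is inferred from the input
    (a stand-in for 'dynamic graph modeling'); hypothetical layer."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.key = nn.Linear(in_dim, out_dim)
        self.query = nn.Linear(in_dim, out_dim)
        self.value = nn.Linear(in_dim, out_dim)

    def forward(self, x):                        # x: (batch, nodes, features)
        scores = self.query(x) @ self.key(x).transpose(1, 2)
        adj = torch.softmax(scores / x.size(-1) ** 0.5, dim=-1)
        return F.relu(adj @ self.value(x))       # message passing over learned edges

class ReadingTaskClassifier(nn.Module):
    """Treats EEG and eye-movement channels as graph nodes: a temporal
    convolution runs along time, a dynamic graph convolution runs across
    channels, and a linear head separates normal vs. task-oriented reading."""
    def __init__(self, n_channels=36, t_len=256, feat=64, n_classes=2):
        super().__init__()
        self.tconv = nn.Conv1d(n_channels, n_channels, kernel_size=5, padding=2)
        self.proj = nn.Linear(t_len, feat)       # per-node temporal summary
        self.gconv = DynamicGraphConv(feat, feat)
        self.head = nn.Linear(feat, n_classes)

    def forward(self, x):                        # x: (batch, channels, time)
        h = F.relu(self.tconv(x))                # temporal filtering per channel
        h = self.gconv(self.proj(h))             # (batch, nodes, feat)
        return self.head(h.mean(dim=1))          # pool nodes, classify

logits = ReadingTaskClassifier()(torch.randn(8, 36, 256))  # 32 EEG + 4 EM channels
```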

Emotions Don't Lie: An Audio-Visual Deepfake Detection Method Using Affective Cues

no code implementations • 14 Mar 2020 • Trisha Mittal, Uttaran Bhattacharya, Rohan Chandra, Aniket Bera, Dinesh Manocha

We extract and compare affective cues corresponding to perceived emotion from the two modalities within a video to infer whether the input video is "real" or "fake" (see the sketch below).

DeepFake Detection, Face Swapping
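The snippet above describes the core check; as a hedged illustration, one way to operationalize it is to embed perceived emotion from the audio and visual streams into a shared space and flag large cross-modal disagreement. The encoders, the assumed input features (MFCC statistics, facial landmarks), and the threshold below are placeholders, not the paper's models.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmotionEncoder(nn.Module):
    """Maps one modality's features to a shared perceived-emotion space."""
    def __init__(self, in_dim, emo_dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                                 nn.Linear(128, emo_dim))

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)  # unit-norm emotion embedding

audio_enc = EmotionEncoder(in_dim=40)            # e.g. MFCC statistics (assumed)
video_enc = EmotionEncoder(in_dim=136)           # e.g. facial landmarks (assumed)

def deepfake_score(audio_feat, video_feat, threshold=0.5):
    """Cosine disagreement between the modalities' affective cues; a real
    video should show consistent emotion, so a large gap suggests 'fake'."""
    sim = (audio_enc(audio_feat) * video_enc(video_feat)).sum(-1)
    return (1 - sim) / 2 > threshold             # True = predicted fake

flags = deepfake_score(torch.randn(4, 40), torch.randn(4, 136))
```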

Forecasting Trajectory and Behavior of Road-Agents Using Spectral Clustering in Graph-LSTMs

no code implementations • arXiv 2019 • Rohan Chandra, Tianrui Guan, Srujan Panuganti, Trisha Mittal, Uttaran Bhattacharya, Aniket Bera, Dinesh Manocha

In practice, our approach reduces the average prediction error by more than 54% over prior algorithms and achieves a weighted average accuracy of 91.2% for behavior prediction (see the sketch below).

Robotics
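Since the title names spectral clustering inside Graph-LSTMs, here is a hedged sketch of how the two pieces could be wired together: a Gaussian proximity graph over road-agents yields a spectral embedding, which is concatenated with past trajectories and fed to an LSTM forecaster. All sizes and the one-step decoder are assumptions, not the authors' pipeline.

```python
import torch
import torch.nn as nn

def spectral_embedding(positions, k=4, sigma=10.0):
    """positions: (agents, 2). Returns the k smallest nontrivial eigenvectors
    of the normalized Laplacian of a Gaussian proximity graph, i.e. the usual
    spectral-clustering coordinates for the agents."""
    d = torch.cdist(positions, positions)             # pairwise distances
    w = torch.exp(-d ** 2 / (2 * sigma ** 2))         # Gaussian affinities
    deg = w.sum(-1)
    lap = torch.eye(len(w)) - w / torch.sqrt(deg[:, None] * deg[None, :])
    _, eigvecs = torch.linalg.eigh(lap)               # ascending eigenvalues
    return eigvecs[:, 1:k + 1]                        # skip the trivial eigenvector

class TrajectoryLSTM(nn.Module):
    """Forecasts the next (x, y) position from past positions plus each
    agent's spectral-embedding coordinates."""
    def __init__(self, spec_dim=4, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(2 + spec_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 2)

    def forward(self, traj, spec):                    # traj: (agents, T, 2)
        spec = spec.unsqueeze(1).expand(-1, traj.size(1), -1)
        h, _ = self.lstm(torch.cat([traj, spec], -1))
        return self.out(h[:, -1])                     # one-step-ahead prediction

pos = torch.randn(6, 2) * 20                          # 6 agents at the current frame
pred = TrajectoryLSTM()(torch.randn(6, 12, 2), spectral_embedding(pos))
```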

M3ER: Multiplicative Multimodal Emotion Recognition Using Facial, Textual, and Speech Cues

no code implementations • 9 Nov 2019 • Trisha Mittal, Uttaran Bhattacharya, Rohan Chandra, Aniket Bera, Dinesh Manocha

Our approach combines cues from multiple co-occurring modalities (such as face, text, and speech) and is also more robust than other methods to sensor noise in any of the individual modalities (see the sketch below).

Multimodal Emotion Recognition
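"Multiplicative" fusion can be read, as a sketch, as a product-of-experts combination of per-modality predictions: multiplying class probabilities rather than averaging them lets a noisy, near-uniform modality leave the fused distribution almost untouched. The head sizes and feature dimensions below are assumptions, and this is one plausible reading of the fusion step, not the paper's exact layer.

```python
import torch
import torch.nn as nn

class M3ERSketch(nn.Module):
    """Per-modality emotion heads fused by a product of class probabilities
    (product-of-experts); hypothetical sizes for face/text/speech features."""
    def __init__(self, dims=None, n_emotions=6):
        super().__init__()
        dims = dims or {"face": 128, "text": 300, "speech": 40}
        self.heads = nn.ModuleDict(
            {m: nn.Linear(d, n_emotions) for m, d in dims.items()})

    def forward(self, inputs):                   # inputs: dict of (batch, dim)
        probs = [self.heads[m](x).softmax(-1) for m, x in inputs.items()]
        # Multiplying (not averaging) means a corrupted modality whose output
        # is near-uniform barely shifts the fused distribution.
        fused = torch.stack(probs).prod(dim=0)
        return fused / fused.sum(-1, keepdim=True)

model = M3ERSketch()
out = model({"face": torch.randn(4, 128),
             "text": torch.randn(4, 300),
             "speech": torch.randn(4, 40)})      # (4, 6) fused probabilities
```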

STEP: Spatial Temporal Graph Convolutional Networks for Emotion Perception from Gaits

1 code implementation • 28 Oct 2019 • Uttaran Bhattacharya, Trisha Mittal, Rohan Chandra, Tanmay Randhavane, Aniket Bera, Dinesh Manocha

We use hundreds of annotated real-world gait videos and augment them with thousands of annotated synthetic gaits generated using a novel generative network called STEP-Gen, built on an ST-GCN-based Conditional Variational Autoencoder (CVAE) (see the sketch below).

General Classification
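The augmentation step above relies on a conditional generative model; as a compact, hedged sketch, a CVAE conditioned on an emotion label can sample synthetic gait sequences. A plain MLP stands in for the paper's ST-GCN backbone, and the joint count, sequence length, and names here are assumptions.

```python
import torch
import torch.nn as nn

class GaitCVAE(nn.Module):
    """Conditional VAE over flattened gait sequences; the emotion label is
    concatenated to both the encoder input and the latent code."""
    def __init__(self, joints=16, t_len=48, n_emotions=4, z_dim=32):
        super().__init__()
        gait_dim = joints * 3 * t_len            # flattened (x, y, z) joint tracks
        self.enc = nn.Sequential(nn.Linear(gait_dim + n_emotions, 256), nn.ReLU())
        self.mu = nn.Linear(256, z_dim)
        self.logvar = nn.Linear(256, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim + n_emotions, 256), nn.ReLU(),
                                 nn.Linear(256, gait_dim))

    def forward(self, gait, emotion):            # gait: (B, J*3*T), emotion: (B, E)
        h = self.enc(torch.cat([gait, emotion], -1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        return self.dec(torch.cat([z, emotion], -1)), mu, logvar

    def generate(self, emotion):                 # sample new gaits for augmentation
        z = torch.randn(emotion.size(0), self.mu.out_features)
        return self.dec(torch.cat([z, emotion], -1))

cvae = GaitCVAE()
label = nn.functional.one_hot(torch.zeros(5, dtype=torch.long), 4).float()
synthetic_gaits = cvae.generate(label)           # (5, 16*3*48) flattened sequences
```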

Game of Sketches: Deep Recurrent Models of Pictionary-style Word Guessing

1 code implementation • 29 Jan 2018 • Ravi Kiran Sarvadevabhatla, Shiv Surya, Trisha Mittal, Venkatesh Babu Radhakrishnan

Performance on multi-disciplinary tasks such as Visual Question Answering (VQA) is considered a marker for gauging progress in Computer Vision.

Question Answering, Visual Question Answering
