no code implementations • 5 May 2024 • Changan Chen, Jordi Ramos, Anshul Tomar, Kristen Grauman
We propose the first treatment of sim2real for audio-visual navigation by disentangling it into acoustic field prediction (AFP) and waypoint navigation.
no code implementations • 5 Sep 2023 • Yunhao Yang, Anshul Tomar
The rapid advancement of large language models, such as the Generative Pre-trained Transformer (GPT) series, has had significant implications across various disciplines.
no code implementations • 26 Jan 2021 • Vishal Kaushal, Suraj Kothawade, Anshul Tomar, Rishabh Iyer, Ganesh Ramakrishnan
For long videos, human reference summaries necessary for supervised video summarization techniques are difficult to obtain.