no code implementations • 17 Sep 2023 • Burak Satar, Hongyuan Zhu, Hanwang Zhang, Joo Hwee Lim
Many studies focus on improving pretraining or developing new backbones in text-video retrieval.
no code implementations • 7 Jun 2023 • Burak Satar, Hongyuan Zhu, Hanwang Zhang, Joo Hwee Lim
Text-video retrieval contains various challenges, including biases coming from diverse sources.
no code implementations • 9 Dec 2022 • Manas Gupta, Sarthak Ketanbhai Modi, Hang Zhang, Joon Hei Lee, Joo Hwee Lim
Four of the five Bio-algorithms tested outperform BP by up to 5% accuracy when only 20% of the training dataset is available.
no code implementations • 21 Nov 2022 • Zenglin Shi, Ying Sun, Joo Hwee Lim, Mengmi Zhang
To the best of our knowledge, no existing technique can accomplish all of these objectives simultaneously.
no code implementations • 9 Nov 2022 • Yew Lee Tan, Ernest Yu Kai Chew, Adams Wai-Kin Kong, Jung-jae Kim, Joo Hwee Lim
To generate the portmanteau feature, a non-linear input pipeline with a block matrix initialization is presented.
no code implementations • 3 Aug 2022 • Mei Chee Leong, Haosong Zhang, Hui Li Tan, Liyuan Li, Joo Hwee Lim
Fine-grained action recognition is a challenging task in computer vision.
1 code implementation • 29 Jun 2022 • Burak Satar, Hongyuan Zhu, Hanwang Zhang, Joo Hwee Lim
In this report, we present our approach for the EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022.
Ranked #9 on Multi-Instance Retrieval on EPIC-KITCHENS-100
1 code implementation • 26 Jun 2022 • Burak Satar, Hongyuan Zhu, Xavier Bresson, Joo Hwee Lim
With the emergence of social media, voluminous video clips are uploaded every day, and retrieving the most relevant visual content with a language query becomes critical.
Ranked #13 on Video Retrieval on YouCook2
1 code implementation • 26 Jun 2022 • Burak Satar, Hongyuan Zhu, Hanwang Zhang, Joo Hwee Lim
Most methods consider only one joint embedding space between global visual and textual features without considering the local structures of each modality.
Ranked #12 on Video Retrieval on YouCook2
no code implementations • 28 Nov 2021 • Kenan E. Ak, Joo Hwee Lim, Ying Sun, Jo Yew Tham, Ashraf A. Kassim
A key challenge in e-commerce is that images have multiple attributes that users would like to manipulate, and it is important to estimate discriminative feature representations for each of these attributes.
no code implementations • 12 Oct 2021 • Mei Chee Leong, Hui Li Tan, Haosong Zhang, Liyuan Li, Feng Lin, Joo Hwee Lim
Inspired by the recently proposed hierarchy representation of fine-grained actions in FineGym and the SlowFast network for action recognition, we propose a novel multi-task network that exploits the FineGym hierarchy representation to achieve effective joint learning and prediction for fine-grained human action recognition.
no code implementations • 25 Sep 2019 • Mengmi Zhang, Tao Wang, Joo Hwee Lim, Jiashi Feng
Without tampering with the performance on initial tasks, our method learns novel concepts given a few training examples of each class in new tasks.
1 code implementation • 23 May 2019 • Mengmi Zhang, Tao Wang, Joo Hwee Lim, Gabriel Kreiman, Jiashi Feng
In each classification task, our method learns a set of variational prototypes with their means and variances, where the embeddings of samples from the same class can be represented by a prototypical distribution and class-representative prototypes are kept well separated.
1 code implementation • 31 Jul 2018 • Mengmi Zhang, Keng Teck Ma, Shih-Cheng Yen, Joo Hwee Lim, Qi Zhao, Jiashi Feng
Egocentric spatial memory (ESM) defines a memory system that encodes, stores, recognizes, and recalls spatial information about the environment from an egocentric perspective.
no code implementations • CVPR 2018 • Kenan E. Ak, Ashraf A. Kassim, Joo Hwee Lim, Jo Yew Tham
In this paper, we investigate ways of conducting a detailed fashion search using query images and attributes.
no code implementations • ICLR 2018 • Mengmi Zhang, Keng Teck Ma, Joo Hwee Lim, Shih-Cheng Yen, Qi Zhao, Jiashi Feng
During exploration, our proposed ESM network model updates its belief of the global map based on local observations using a recurrent neural network.
1 code implementation • CVPR 2017 • Mengmi Zhang, Keng Teck Ma, Joo Hwee Lim, Qi Zhao, Jiashi Feng
Through competition with the discriminator, the generator progressively improves the quality of the future frames and thus anticipates future gaze better.