1 code implementation • 27 Mar 2024 • Shweta Singh, Aayan Yadav, Jitesh Jain, Humphrey Shi, Justin Johnson, Karan Desai
With these findings, we advocate using COCO-ReM for future object detection research.
1 code implementation • 21 Dec 2023 • Jitesh Jain, Jianwei Yang, Humphrey Shi
Secondly, we leverage the images from COCO and outputs from off-the-shelf vision perception models to create our COCO Segmentation Text (COST) dataset for training and evaluating MLLMs on the object perception task.
1 code implementation • 8 Jun 2023 • Jiachen Li, Jitesh Jain, Humphrey Shi
In this paper, we propose the Matting Anything Model (MAM), an efficient and versatile framework for estimating the alpha matte of any instance in an image with flexible and interactive visual or linguistic user prompt guidance.
2 code implementations • CVPR 2023 • Jitesh Jain, Jiachen Li, MangTik Chiu, Ali Hassani, Nikita Orlov, Humphrey Shi
However, such panoptic architectures do not truly unify image segmentation because they need to be trained individually on semantic, instance, or panoptic segmentation to achieve the best performance.
Ranked #1 on Panoptic Segmentation on COCO minival
1 code implementation • 5 Aug 2022 • Jitesh Jain, Yuqian Zhou, Ning Yu, Humphrey Shi
We claim that the performance of inpainting algorithms can be better judged by the generated structures and textures.
1 code implementation • arXiv 2021 • Jitesh Jain, Anukriti Singh, Nikita Orlov, Zilong Huang, Jiachen Li, Steven Walton, Humphrey Shi
To achieve this, we propose SeMask, a simple and effective framework that incorporates semantic information into the encoder with the help of a semantic attention operation.
Ranked #10 on Semantic Segmentation on Cityscapes val
1 code implementation • 19 Sep 2020 • Ayush Mangal, Jitesh Jain, Keerat Kaur Guliani, Omkar Bhalerao
While previous approaches used the past as an indicator of the future, we instead explicitly model future frequency and recency in a multi-task fashion with prefetching, leveraging the ability of deep networks to capture future trends and use them for learning eviction and admission.