Search Results for author: Dianmo Sheng

Found 2 papers, 1 paper with code

Towards More Unified In-context Visual Understanding

no code implementations • 5 Dec 2023 • Dianmo Sheng, Dongdong Chen, Zhentao Tan, Qiankun Liu, Qi Chu, Jianmin Bao, Tao Gong, Bin Liu, Shengwei Xu, Nenghai Yu

Thanks to this design, the model is capable of handling in-context vision understanding tasks with multimodal output in a unified pipeline. Experimental results demonstrate that our model achieves competitive performance compared with specialized models and previous ICL baselines.

Decoder Image Captioning +2

X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion

1 code implementation • 7 Dec 2022 • Hanqing Zhao, Dianmo Sheng, Jianmin Bao, Dongdong Chen, Dong Chen, Fang Wen, Lu Yuan, Ce Liu, Wenbo Zhou, Qi Chu, Weiming Zhang, Nenghai Yu

We demonstrate for the first time that using a text2image model to generate images, or a zero-shot recognition model to filter noisily crawled images, for different object categories is a feasible way to make Copy-Paste truly scalable.

Ranked #7 on Instance Segmentation on LVIS v1.0 val

Data Augmentation Instance Segmentation +5
