no code implementations • 5 May 2024 • Zhenyu Lou, Qiongjie Cui, Haofan Wang, Xu Tang, Hong Zhou
Predicting future human pose is a fundamental application for machine intelligence, which drives robots to plan their behavior and paths ahead of time to seamlessly accomplish human-robot collaboration in real-world 3D scenarios.
no code implementations • 16 Mar 2024 • Rui Wang, Hailong Guo, Jiaming Liu, Huaxia Li, Haibo Zhao, Xu Tang, Yao Hu, Hao Tang, Peipei Li
In this paper, we introduce StableGarment, a unified framework to tackle garment-centric (GC) generation tasks, including GC text-to-image, controllable GC text-to-image, stylized GC text-to-image, and robust virtual try-on.
no code implementations • 12 Mar 2024 • Yuxuan Zhang, Lifu Wei, Qing Zhang, Yiren Song, Jiaming Liu, Huaxia Li, Xu Tang, Yao Hu, Haibo Zhao
Current makeup transfer methods are limited to simple makeup styles, making them difficult to apply in real-world scenarios.
2 code implementations • 15 Jan 2024 • Qixun Wang, Xu Bai, Haofan Wang, Zekui Qin, Anthony Chen, Huaxia Li, Xu Tang, Yao Hu
There has been significant progress in personalized image synthesis with methods such as Textual Inversion, DreamBooth, and LoRA.
Ranked #2 on Diffusion Personalization Tuning Free on AgeDB
1 code implementation • 28 Dec 2023 • Shanglin Li, Bohan Zeng, Yutang Feng, Sicheng Gao, Xuhui Liu, Jiaming Liu, Li Lin, Xu Tang, Yao Hu, Jianzhuang Liu, Baochang Zhang
We then propose a Region-IoU scheme for precise image layer extraction from an off-the-shelf segment model.
1 code implementation • 26 Dec 2023 • Yuxuan Zhang, Yiren Song, Jiaming Liu, Rui Wang, Jinpeng Yu, Hao Tang, Huaxia Li, Xu Tang, Yao Hu, Han Pan, Zhongliang Jing
Recent advancements in subject-driven image generation have led to zero-shot generation, yet precise selection and focus on crucial subject representations remain challenging.
1 code implementation • 9 Oct 2023 • Bohan Zeng, Shanglin Li, Yutang Feng, Hong Li, Sicheng Gao, Jiaming Liu, Huaxia Li, Xu Tang, Jianzhuang Liu, Baochang Zhang
Recent advances in 3D generation have been remarkable, with methods such as DreamFusion leveraging large-scale text-to-image diffusion-based models to supervise 3D generation.
no code implementations • 13 Sep 2023 • Xiangrong Zhang, Tianyang Zhang, Guanchun Wang, Peng Zhu, Xu Tang, Xiuping Jia, Licheng Jiao
In this era of rapid technical evolution, this paper aims to present a comprehensive review of recent achievements in deep learning-based remote sensing object detection (RSOD) methods.
1 code implementation • 17 Aug 2023 • Liang Pan, Jingbo Wang, Buzhen Huang, Junyu Zhang, Haofan Wang, Xu Tang, Yangang Wang
Experimental results demonstrate that our framework can synthesize physically plausible long-term human motions in complex 3D scenes.
1 code implementation • 17 May 2023 • Bohan Zeng, Shanglin Li, Xuhui Liu, Sicheng Gao, XiaoLong Jiang, Xu Tang, Yao Hu, Jianzhuang Liu, Baochang Zhang
Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models.
1 code implementation • 23 Apr 2023 • Cilin Yan, Haochen Wang, Jie Liu, XiaoLong Jiang, Yao Hu, Xu Tang, Guoliang Kang, Efstratios Gavves
Click-based interactive segmentation aims to generate target masks via human clicking, which facilitates efficient pixel-level annotation and image editing.
no code implementations • 14 Apr 2023 • Jie Guo, Qimeng Wang, Yan Gao, XiaoLong Jiang, Xu Tang, Yao Hu, Baochang Zhang
CLIP (Contrastive Language-Image Pretraining) is well-developed for open-vocabulary zero-shot image-level recognition, while its applications in pixel-level tasks are less investigated, where most efforts directly adopt CLIP features without deliberative adaptations.
1 code implementation • ICCV 2023 • Haochen Wang, Cilin Yan, Shuai Wang, XiaoLong Jiang, Xu Tang, Yao Hu, Weidi Xie, Efstratios Gavves
Video Instance Segmentation (VIS) aims at segmenting and categorizing objects in videos from a closed set of training categories, lacking the generalization ability to handle novel categories in real-world videos.
no code implementations • 8 Mar 2023 • Yuqun Yang, Xu Tang, Xiangrong Zhang, Jingjing Ma, Licheng Jiao
Therefore, this paper proposes a novel solution, named trend change detection (TCD), that intuitively divides changes into three trends ("appear", "disappear" and "transform") instead of semantic categories.
1 code implementation • CVPR 2023 • Keyan Chen, XiaoLong Jiang, Yao Hu, Xu Tang, Yan Gao, Jianqi Chen, Weidi Xie
In this paper, we consider the problem of simultaneously detecting objects and inferring their visual attributes in an image, even for those with no manual annotations provided at the training stage, resembling an open-vocabulary scenario.
Ranked #1 on Open Vocabulary Attribute Detection on OVAD benchmark (using extra training data)
no code implementations • 21 Apr 2022 • Guanchun Wang, Xiangrong Zhang, Zelin Peng, Xu Tang, Huiyu Zhou, Licheng Jiao
In the exploiting stage, we utilize the extracted NDI to construct a novel negative contrastive learning mechanism and a negative guided instance selection strategy for dealing with the issues of part domination and missing instances, respectively.
no code implementations • 24 Mar 2022 • Yuting Yang, Licheng Jiao, Xu Liu, Fang Liu, Shuyuan Yang, Zhixi Feng, Xu Tang
Three image tasks and two video tasks of computer vision are investigated.
no code implementations • 2 Feb 2022 • Yan Gao, Qimeng Wang, Xu Tang, Haochen Wang, Fei Ding, Jing Li, Yao Hu
Prior works propose to predict Intersection-over-Union (IoU) between bounding boxes and corresponding ground-truths to improve NMS, while accurately predicting IoU is still a challenging problem.
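For readers unfamiliar with the terms in the snippet above, the following is a minimal sketch of standard box IoU and greedy NMS. It is not the method of the listed paper; the `(x1, y1, x2, y2)` box format, the function names `iou` and `nms`, and the 0.5 threshold are assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-Union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it too much."""
    # Visit boxes from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep box i only if it does not overlap any already-kept box above the threshold.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))
```

In this sketch the ranking uses classification scores only; the snippet above refers to work that additionally predicts the IoU with the ground truth to guide that ranking.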
1 code implementation • CVPR 2022 • Yicheng Qian, Weixin Luo, Dongze Lian, Xu Tang, Peilin Zhao, Shenghua Gao
In this paper, we propose a novel sequence verification task that aims to distinguish positive video pairs performing the same action sequence from negative ones with step-level transformations but still conducting the same task.
no code implementations • 25 Jul 2021 • Tianyang Zhang, Xiangrong Zhang, Peng Zhu, Xu Tang, Chen Li, Licheng Jiao, Huiyu Zhou
To address the above problems, we propose an end-to-end multi-category instance segmentation model, namely Semantic Attention and Scale Complementary Network, which mainly consists of a Semantic Attention (SEA) module and a Scale Complementary Mask Branch (SCMB).
1 code implementation • 18 Jun 2021 • Xiaolong Liu, Qimeng Wang, Yao Hu, Xu Tang, Shiwei Zhang, Song Bai, Xiang Bai
Temporal action detection (TAD) aims to determine the semantic label and the temporal interval of every action instance in an untrimmed video.
Ranked #8 on Temporal Action Localization on HACS
no code implementations • IEEE Transactions on Neural Networks and Learning Systems 2021 • Licheng Jiao, Ruohan Zhang, Fang Liu, Shuyuan Yang, Biao Hou, Lingling Li, Xu Tang
Video object detection, a basic task in the computer vision field, is rapidly evolving and widely used.
no code implementations • 26 Aug 2020 • Bi Li, Chengquan Zhang, Zhibin Hong, Xu Tang, Jingtuo Liu, Junyu Han, Errui Ding, Wenyu Liu
Unlike many existing trackers that focus on modeling only the target, in this work, we consider the transient variations of the whole scene.
no code implementations • 27 Jun 2020 • Jiahua Dong, Yang Cong, Gan Sun, Tao Zhang, Xu Tang, Xiaowei Xu
Online metric learning has been widely exploited for large-scale data classification due to the low computational cost.
no code implementations • 19 Dec 2019 • Yang Liu, Xu Tang, Xiang Wu, Junyu Han, Jingtuo Liu, Errui Ding
In this paper, we propose an Online High-quality Anchor Mining Strategy (HAMBox), which explicitly helps outer faces compensate with high-quality anchors.
no code implementations • 15 Apr 2019 • Shuai Chen, Jinpeng Li, Chuanqi Yao, Wenbo Hou, Shuo Qin, Wenyao Jin, Xu Tang
Working with multi-scale features, the designed dual scale residual unit makes dual scale detectors no longer run independently.
4 code implementations • 31 Mar 2019 • Zhihang Li, Xu Tang, Junyu Han, Jingtuo Liu, Ran He
With the rapid development of deep convolutional neural networks, face detection has made great progress in recent years.
no code implementations • 19 Jul 2018 • Lin Cheng, Xu Liu, Lingling Li, Licheng Jiao, Xu Tang
More recently, the two-stage detector Faster R-CNN has been proposed and demonstrated to be a promising tool for object detection in optical remote sensing images, although the sparse and dense distribution of objects in remote sensing images is complex.
1 code implementation • 9 Jul 2018 • Xu Liu, Licheng Jiao, Xu Tang, Qigong Sun, Dan Zhang
Based on sparse scattering coding and a convolutional neural network, a polarimetric convolutional network is proposed to classify PolSAR images by making full use of polarimetric information.
2 code implementations • CVPR 2018 • Zongwei Wang, Xu Tang, Weixin Luo, Shenghua Gao
By grouping faces with target age together, the objective of face aging is equivalent to transferring aging patterns of faces within the target age group to the face whose aged face is to be synthesized.
5 code implementations • ECCV 2018 • Xu Tang, Daniel K. Du, Zeqiang He, Jingtuo Liu
This paper proposes a novel context-assisted single shot face detector, named PyramidBox, to handle the hard face detection problem.
Ranked #4 on Face Detection on FDDB