no code implementations • ECCV 2020 • Lu Zhang, Jianming Zhang, Zhe Lin, Radomír Měch, Huchuan Lu, You He
We reformulate the problem of detecting and tracking salient object spots as a new task called object hotspot tracking.
no code implementations • 15 Mar 2024 • Yizhi Song, Zhifei Zhang, Zhe Lin, Scott Cohen, Brian Price, Jianming Zhang, Soo Ye Kim, He Zhang, Wei Xiong, Daniel Aliaga
Generative object compositing emerges as a promising new avenue for compositional image editing.
1 code implementation • 14 Jan 2024 • Mingzhe Gao, Jieru Zhao, Zhe Lin, Minyi Guo
High-level synthesis (HLS) notably speeds up the hardware design process by avoiding RTL programming.
1 code implementation • 22 Dec 2023 • Nannan Li, Qing Liu, Krishna Kumar Singh, Yilin Wang, Jianming Zhang, Bryan A. Plummer, Zhe Lin
In this paper, we propose UniHuman, a unified model that addresses multiple facets of human image editing in real-world settings.
no code implementations • 8 Dec 2023 • Jaskirat Singh, Jianming Zhang, Qing Liu, Cameron Smith, Zhe Lin, Liang Zheng
To overcome these limitations, we introduce SmartMask, which allows any novice user to create detailed masks for precise object insertion.
no code implementations • 4 Dec 2023 • Kangfu Mei, Luis Figueroa, Zhe Lin, Zhihong Ding, Scott Cohen, Vishal M. Patel
Recovering textures under shadows has remained a challenging problem due to the difficulty of inferring shadow-free scenes from shadow images.
no code implementations • 6 Nov 2023 • Hanrong Ye, Jason Kuen, Qing Liu, Zhe Lin, Brian Price, Dan Xu
On the highly competitive ADE20K and COCO benchmarks, our data generation method markedly improves the performance of state-of-the-art segmentation models in semantic segmentation, panoptic segmentation, and instance segmentation.
1 code implementation • ICCV 2023 • Lingzhi Zhang, Zhengjie Xu, Connelly Barnes, Yuqian Zhou, Qing Liu, He Zhang, Sohrab Amirghodsi, Zhe Lin, Eli Shechtman, Jianbo Shi
Recent advancements in deep generative models have facilitated the creation of photo-realistic images across various tasks.
1 code implementation • 24 Aug 2023 • Ziyan Yang, Kushal Kafle, Zhe Lin, Scott Cohen, Zhihong Ding, Vicente Ordonez
To solve this problem, we propose an auto-regressive model that, given a subject, predicts its relations, objects, and object locations by casting this output as a sequence of tokens.
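The token-sequence formulation can be illustrated with a toy sketch (an assumption about the format, not the authors' implementation): relations, objects, and quantized box coordinates are flattened into one sequence that an auto-regressive decoder could be trained on. The vocabulary and quantization scheme here are illustrative.

```python
# Toy sketch (not the paper's implementation): serialize a subject's
# relations, objects, and boxes as one flat token sequence.
# The <loc_*> vocabulary and 100-bin quantization are assumptions.

def quantize_box(box, bins=100):
    """Map normalized [x1, y1, x2, y2] coords to discrete location tokens."""
    return [f"<loc_{round(v * (bins - 1))}>" for v in box]

def serialize(subject, relations):
    """Flatten (relation, object, box) triples into a token sequence."""
    tokens = [subject]
    for rel, obj, box in relations:
        tokens += [rel, obj] + quantize_box(box)
    return tokens

seq = serialize("person", [("holding", "cup", [0.4, 0.5, 0.6, 0.7])])
print(seq)
```

A decoder trained on such sequences can then emit relations, objects, and locations one token at a time.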
no code implementations • 23 Jul 2023 • Guan Shen, Jieru Zhao, Zeke Wang, Zhe Lin, Wenchao Ding, Chentao Wu, Quan Chen, Minyi Guo
Along with the fast evolution of deep neural networks, the hardware system is also developing rapidly.
1 code implementation • 28 May 2023 • Lu Qi, Jason Kuen, Weidong Guo, Jiuxiang Gu, Zhe Lin, Bo Du, Yu Xu, Ming-Hsuan Yang
Despite progress in accurate visual entity segmentation, satisfying the diverse requirements of image editing applications for region-of-interest selection at different levels remains unsolved.
1 code implementation • CVPR 2023 • Chuong Huynh, Yuqian Zhou, Zhe Lin, Connelly Barnes, Eli Shechtman, Sohrab Amirghodsi, Abhinav Shrivastava
In photo editing, it is common practice to remove visual distractions to improve the overall image quality and highlight the primary subject.
no code implementations • 18 May 2023 • Lihui Qian, Xintong Han, Faqiang Wang, Hongyu Liu, Haoye Dong, Zhiwen Li, Huawei Wei, Zhe Lin, Cheng-Bin Jin
We present XFormer, a novel human mesh and motion capture method that achieves real-time performance on consumer CPUs given only monocular images as input.
Ranked #31 on 3D Human Pose Estimation on 3DPW
1 code implementation • ICCV 2023 • Qiucheng Wu, Yujian Liu, Handong Zhao, Trung Bui, Zhe Lin, Yang Zhang, Shiyu Chang
We then impose spatial attention control by combining the attention over the entire text description with the attention over the local description of the particular object, applied within that object's pixel region.
no code implementations • 6 Apr 2023 • Jing Shi, Wei Xiong, Zhe Lin, Hyun Joon Jung
First, we learn the general concept of the input images by converting them to a textual token with a learnable image encoder.
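The idea of mapping an image to a textual token can be sketched minimally (hypothetical shapes and names, not the paper's architecture): a learnable projection takes a pooled image feature into the word-embedding space, so the result can be spliced into the prompt like a pseudo-word.

```python
import numpy as np

# Minimal sketch (hypothetical shapes and names): a learnable projection
# maps a pooled image feature into the word-embedding space, yielding a
# "textual token" that can replace a placeholder in the prompt embedding.

rng = np.random.default_rng(0)
d_img, d_txt = 512, 768
W = rng.normal(scale=0.02, size=(d_img, d_txt))  # learnable projection

image_feat = rng.normal(size=d_img)       # pooled image-encoder output
pseudo_token = image_feat @ W             # now lives in word-embedding space

prompt_embs = rng.normal(size=(4, d_txt))  # e.g. embeddings of "a photo of <S*>"
prompt_embs[-1] = pseudo_token             # splice in the learned image token
print(prompt_embs.shape)
```

In practice the projection would be trained jointly with the generation objective rather than sampled randomly as here.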
no code implementations • CVPR 2023 • Sijie Zhu, Zhe Lin, Scott Cohen, Jason Kuen, Zhifei Zhang, Chen Chen
Given a background image and a segmented object, the goal is to train a model to predict plausible placements (location and scale) of the object for compositing.
1 code implementation • CVPR 2023 • Mang Tik Chiu, Xuaner Zhang, Zijun Wei, Yuqian Zhou, Eli Shechtman, Connelly Barnes, Zhe Lin, Florian Kainz, Sohrab Amirghodsi, Humphrey Shi
In this paper, we present an automatic wire clean-up system that eases wire segmentation and removal/inpainting, reducing the process to within a few seconds.
1 code implementation • 8 Mar 2023 • Shaoteng Liu, Yuechen Zhang, Wenbo Li, Zhe Lin, Jiaya Jia
This paper presents Video-P2P, a novel framework for real-world video editing with cross-attention control.
1 code implementation • 22 Feb 2023 • Hongyu Liu, Xintong Han, ChengBin Jin, Lihui Qian, Huawei Wei, Zhe Lin, Faqiang Wang, Haoye Dong, Yibing Song, Jia Xu, Qifeng Chen
In this paper, we propose Human MotionFormer, a hierarchical ViT framework that leverages global and local perceptions to capture large and subtle motion matching, respectively.
no code implementations • ICCV 2023 • Lu Qi, Jason Kuen, Tiancheng Shen, Jiuxiang Gu, Wenbo Li, Weidong Guo, Jiaya Jia, Zhe Lin, Ming-Hsuan Yang
Given the high-quality and high-resolution nature of the dataset, we propose CropFormer, which is designed to tackle the intractability of instance-level segmentation on high-resolution images.
no code implementations • CVPR 2023 • Yizhi Song, Zhifei Zhang, Zhe Lin, Scott Cohen, Brian Price, Jianming Zhang, Soo Ye Kim, Daniel Aliaga
Object compositing based on 2D images is a challenging problem since it typically involves multiple processing stages such as color harmonization, geometry correction and shadow generation to generate realistic results.
1 code implementation • CVPR 2023 • Qiucheng Wu, Yujian Liu, Handong Zhao, Ajinkya Kale, Trung Bui, Tong Yu, Zhe Lin, Yang Zhang, Shiyu Chang
Based on this finding, we further propose a simple, lightweight image editing algorithm in which the mixing weights of the two text embeddings are optimized for style matching and content preservation.
no code implementations • 13 Dec 2022 • Haitian Zheng, Zhe Lin, Jingwan Lu, Scott Cohen, Eli Shechtman, Connelly Barnes, Jianming Zhang, Qing Liu, Yuqian Zhou, Sohrab Amirghodsi, Jiebo Luo
Moreover, the object-level discriminators take aligned instances as inputs to enforce the realism of individual objects.
no code implementations • CVPR 2023 • Shaoan Xie, Zhifei Zhang, Zhe Lin, Tobias Hinz, Kun Zhang
By contrast, multi-modal inpainting provides more flexible and useful control over the inpainted content; e.g., a text prompt can be used to describe an object with richer attributes, and a mask can be used to constrain the shape of the inpainted object rather than being treated only as a missing area.
2 code implementations • 6 Dec 2022 • Wenbo Li, Xin Yu, Kun Zhou, Yibing Song, Zhe Lin, Jiaya Jia
To achieve high-quality results with low computational cost, we present a novel pixel spread model (PSM) that iteratively employs decoupled probabilistic modeling, combining the optimization efficiency of GANs with the prediction tractability of probabilistic models.
1 code implementation • 2 Dec 2022 • Yizhi Song, Zhifei Zhang, Zhe Lin, Scott Cohen, Brian Price, Jianming Zhang, Soo Ye Kim, Daniel Aliaga
Object compositing based on 2D images is a challenging problem since it typically involves multiple processing stages such as color harmonization, geometry correction and shadow generation to generate realistic results.
no code implementations • CVPR 2023 • Yu Zeng, Zhe Lin, Jianming Zhang, Qing Liu, John Collomosse, Jason Kuen, Vishal M. Patel
We propose a new framework for conditional image synthesis from semantic layouts at any level of precision, ranging from pure text to a 2D semantic canvas with precise shapes.
1 code implementation • 10 Nov 2022 • Lu Qi, Jason Kuen, Weidong Guo, Tiancheng Shen, Jiuxiang Gu, Jiaya Jia, Zhe Lin, Ming-Hsuan Yang
It improves mask prediction by fusing high-resolution image crops, which provide finer-grained image details, with the full image.
no code implementations • 24 Aug 2022 • Yuchen Liu, Zhixin Shu, Yijun Li, Zhe Lin, Richard Zhang, S. Y. Kung
While concatenating GAN inversion and a 3D-aware, noise-to-image GAN is a straightforward solution, it is inefficient and may lead to a noticeable drop in editing quality.
no code implementations • 17 Aug 2022 • Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, John Collomosse
We develop an approach for text-to-image generation that leverages additional retrieved images, driven by a combination of an implicit visual guidance loss and generative objectives.
no code implementations • 9 Aug 2022 • Dan Ruta, Andrew Gilbert, Saeid Motiian, Baldo Faieta, Zhe Lin, John Collomosse
We present HyperNST, a neural style transfer (NST) technique for the artistic stylization of images, based on hyper-networks and the StyleGAN2 architecture.
no code implementations • 6 Aug 2022 • Lingzhi Zhang, Connelly Barnes, Kevin Wampler, Sohrab Amirghodsi, Eli Shechtman, Zhe Lin, Jianbo Shi
Recently, deep models have established SOTA performance for low-resolution image inpainting, but they lack fidelity at resolutions associated with modern cameras, such as 4K and above, and for large holes.
1 code implementation • 5 Aug 2022 • Lingzhi Zhang, Yuqian Zhou, Connelly Barnes, Sohrab Amirghodsi, Zhe Lin, Eli Shechtman, Jianbo Shi
Inspired by this workflow, we propose a new learning task of automatic segmentation of inpainting perceptual artifacts, and apply the model for inpainting model evaluation and iterative refinement.
no code implementations • 12 Jul 2022 • Yichen Sheng, Yifan Liu, Jianming Zhang, Wei Yin, A. Cengiz Oztireli, He Zhang, Zhe Lin, Eli Shechtman, Bedrich Benes
It can be used to calculate hard shadows in a 2D image based on the projective geometry, providing precise control of the shadows' direction and shape.
no code implementations • CVPR 2022 • Soo Ye Kim, Jianming Zhang, Simon Niklaus, Yifei Fan, Simon Chen, Zhe Lin, Munchurl Kim
Depth maps are used in a wide range of applications from 3D rendering to 2D image effects such as Bokeh.
no code implementations • 16 Apr 2022 • Yu Zeng, Zhe Lin, Vishal M. Patel
Therefore, we propose a new data preparation method and a novel Contextual Object Generator (CogNet) for the object inpainting task.
no code implementations • 31 Mar 2022 • Sijie Zhu, Zhe Lin, Scott Cohen, Jason Kuen, Zhifei Zhang, Chen Chen
Going a step further, this paper proposes GALA (Geometry-and-Lighting-Aware), a generic foreground object search method with discriminative modeling of geometry and lighting compatibility for open-world image compositing.
1 code implementation • CVPR 2022 • Wenbo Li, Zhe Lin, Kun Zhou, Lu Qi, Yi Wang, Jiaya Jia
Recent studies have shown the importance of modeling long-range interactions in the inpainting problem.
Ranked #1 on Image Inpainting on CelebA-HQ
1 code implementation • 22 Mar 2022 • Haitian Zheng, Zhe Lin, Jingwan Lu, Scott Cohen, Eli Shechtman, Connelly Barnes, Jianming Zhang, Ning Xu, Sohrab Amirghodsi, Jiebo Luo
We propose cascaded modulation GAN (CM-GAN), a new network design consisting of an encoder with Fourier convolution blocks that extract multi-scale feature representations from the input image with holes and a dual-stream decoder with a novel cascaded global-spatial modulation block at each scale level.
Ranked #1 on Image Inpainting on Places2
1 code implementation • 17 Mar 2022 • Cusuh Ham, Gemma Canet Tarres, Tu Bui, James Hays, Zhe Lin, John Collomosse
CoGS enables exploration of diverse appearance possibilities for a given sketched object, with decoupled control over the structure and the appearance of the output.
no code implementations • 15 Mar 2022 • Jeya Maria Jose Valanarasu, He Zhang, Jianming Zhang, Yilin Wang, Zhe Lin, Jose Echevarria, Yinglan Ma, Zijun Wei, Kalyan Sunkavalli, Vishal M. Patel
To enable flexible interaction between the user and harmonization, we introduce interactive harmonization, a new setting where harmonization is performed with respect to a selected region in the reference image instead of the entire background.
no code implementations • 10 Mar 2022 • Dan Ruta, Andrew Gilbert, Pranav Aggarwal, Naveen Marri, Ajinkya Kale, Jo Briggs, Chris Speed, Hailin Jin, Baldo Faieta, Alex Filipkowski, Zhe Lin, John Collomosse
We present StyleBabel, a unique open access dataset of natural language captions and free-form tags describing the artistic style of over 135K digital artworks, collected via a novel participatory method from experts studying at specialist art and design schools.
1 code implementation • 25 Jan 2022 • Zhe Lin, Zike Yuan, Jieru Zhao, Wei Zhang, Hui Wang, Yonghong Tian
Specifically, in the graph construction flow, we introduce buffer insertion, datapath merging, graph trimming and feature annotation techniques to transform HLS designs into graph-structured data, which encode both intra-operation micro-architectures and inter-operation interconnects annotated with switching activities.
1 code implementation • COLING 2022 • Zhe Lin, Xiaojun Wan
Zero-shot paraphrase generation has drawn much attention, as large-scale high-quality paraphrase corpora are limited.
no code implementations • CVPR 2022 • Haoyu Ma, Handong Zhao, Zhe Lin, Ajinkya Kale, Zhangyang Wang, Tong Yu, Jiuxiang Gu, Sunav Choudhary, Xiaohui Xie
recommendation, and marketing services.
1 code implementation • CVPR 2022 • Chenglin Yang, Yilin Wang, Jianming Zhang, He Zhang, Zijun Wei, Zhe Lin, Alan Yuille
We propose Lite Vision Transformer (LVT), a novel lightweight transformer network with two enhanced self-attention mechanisms to improve model performance for mobile deployment.
1 code implementation • 9 Dec 2021 • Lu Qi, Jason Kuen, Zhe Lin, Jiuxiang Gu, Fengyun Rao, Dian Li, Weidong Guo, Zhen Wen, Ming-Hsuan Yang, Jiaya Jia
To improve instance-level detection/segmentation performance, existing self-supervised and semi-supervised methods extract either task-unrelated or task-specific training signals from unlabeled data.
no code implementations • CVPR 2022 • Yu Zeng, Zhe Lin, Vishal M. Patel
Our model can be trained in a self-supervised fashion by learning the reconstruction of an image region from the style vector and sketch.
1 code implementation • CVPR 2022 • Tiancheng Shen, Yuechen Zhang, Lu Qi, Jason Kuen, Xingyu Xie, Jianlong Wu, Zhe Lin, Jiaya Jia
Segmenting 4K or 6K ultra-high-resolution images requires extra computational considerations in image segmentation.
1 code implementation • CVPR 2022 • Dat Huynh, Jason Kuen, Zhe Lin, Jiuxiang Gu, Ehsan Elhamifar
To address this, we propose a cross-modal pseudo-labeling framework, which generates training pseudo masks by aligning word semantics in captions with visual features of object masks in images.
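The alignment step can be illustrated with a toy sketch (assumed details, not the paper's implementation): each unlabeled mask receives as pseudo label the caption word whose embedding is most similar to the mask's visual feature.

```python
import numpy as np

# Toy sketch of cross-modal pseudo-labeling (assumed details): assign
# each mask the caption word with the highest cosine similarity between
# the word embedding and the mask's visual feature.

def pseudo_label(mask_feats, word_embs, words):
    """mask_feats: (M, D) visual features; word_embs: (W, D) word embeddings."""
    m = mask_feats / np.linalg.norm(mask_feats, axis=1, keepdims=True)
    w = word_embs / np.linalg.norm(word_embs, axis=1, keepdims=True)
    sim = m @ w.T                          # cosine similarity, shape (M, W)
    return [words[i] for i in sim.argmax(axis=1)]

words = ["dog", "frisbee"]
word_embs = np.array([[1.0, 0.0], [0.0, 1.0]])   # toy embeddings
mask_feats = np.array([[0.9, 0.1], [0.2, 0.8]])  # toy mask features
print(pseudo_label(mask_feats, word_embs, words))
```

Real systems would use embeddings from a pretrained vision-language model in place of these toy vectors.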
1 code implementation • Findings (EMNLP) 2021 • Zhe Lin, Yitao Cai, Xiaojun Wan
Paraphrase generation is an important task in natural language processing.
2 code implementations • CVPR 2021 • Lu Qi, Jason Kuen, Jiuxiang Gu, Zhe Lin, Yi Wang, Yukang Chen, Yanwei Li, Jiaya Jia
However, this option traditionally hurts detection performance considerably.
1 code implementation • Findings (ACL) 2021 • Zhe Lin, Xiaojun Wan
Both automatic and human evaluation show that BTmPG can improve the diversity of paraphrases while preserving the semantics of the original sentence.
1 code implementation • ICCV 2021 • Yifan Jiang, He Zhang, Jianming Zhang, Yilin Wang, Zhe Lin, Kalyan Sunkavalli, Simon Chen, Sohrab Amirghodsi, Sarah Kong, Zhangyang Wang
Image harmonization aims to improve the quality of image compositing by matching the "appearance" (e.g., color tone, brightness, and contrast) between foreground and background images.
2 code implementations • 29 Jul 2021 • Lu Qi, Jason Kuen, Yi Wang, Jiuxiang Gu, Hengshuang Zhao, Zhe Lin, Philip Torr, Jiaya Jia
By removing the need for class label prediction, models trained for this task can focus more on improving segmentation quality.
no code implementations • CVPR 2021 • Khoi Pham, Kushal Kafle, Zhe Lin, Zhihong Ding, Scott Cohen, Quan Tran, Abhinav Shrivastava
In this paper, we introduce a large-scale in-the-wild visual attribute prediction dataset consisting of over 927K attribute annotations for over 260K object instances.
1 code implementation • Findings (ACL) 2021 • Yitao Cai, Zhe Lin, Xiaojun Wan
We argue that the misprediction of concepts is due to the high relevance between English tokens and AMR concepts.
no code implementations • CVPR 2021 • Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, Yilin Wang, Michael Maire, Ajinkya Kale, Baldo Faieta
We first train our model on COCO and evaluate the learned visual representations on various downstream tasks including image classification, object detection, and instance segmentation.
4 code implementations • 26 Apr 2021 • Wei Zeng, Xiaozhe Ren, Teng Su, Hui Wang, Yi Liao, Zhiwei Wang, Xin Jiang, ZhenZhang Yang, Kaisheng Wang, Xiaoda Zhang, Chen Li, Ziyan Gong, Yifan Yao, Xinjing Huang, Jun Wang, Jianfeng Yu, Qi Guo, Yue Yu, Yan Zhang, Jin Wang, Hengtao Tao, Dasen Yan, Zexuan Yi, Fang Peng, Fangqing Jiang, Han Zhang, Lingfeng Deng, Yehong Zhang, Zhe Lin, Chao Zhang, Shaojie Zhang, Mingyue Guo, Shanzhi Gu, Gaojun Fan, YaoWei Wang, Xuefeng Jin, Qun Liu, Yonghong Tian
To enhance the generalization ability of PanGu-$\alpha$, we collect 1.1TB of high-quality Chinese data from a wide range of domains to pretrain the model.
Ranked #1 on Reading Comprehension (One-Shot) on DuReader
1 code implementation • CVPR 2021 • Yuchen Liu, Zhixin Shu, Yijun Li, Zhe Lin, Federico Perazzi, S. Y. Kung
We then propose a novel content-aware method to guide the processes of both pruning and distillation.
no code implementations • 27 Mar 2021 • Shervin Minaee, Ping Luo, Zhe Lin, Kevin Bowyer
In this work, we provide a detailed overview of some of the most representative deep learning based face detection methods by grouping them into a few major categories, and present their core architectural designs and accuracies on popular benchmarks.
no code implementations • ICCV 2021 • Dan Ruta, Saeid Motiian, Baldo Faieta, Zhe Lin, Hailin Jin, Alex Filipkowski, Andrew Gilbert, John Collomosse
We present ALADIN (All Layer AdaIN), a novel architecture for searching images based on the similarity of their artistic style.
1 code implementation • ICCV 2021 • Yu Zeng, Zhe Lin, Huchuan Lu, Vishal M. Patel
The auxiliary branch (i.e., the CR loss) is required only during training; only the inpainting generator is required during inference.
Ranked #8 on Image Inpainting on Places2
no code implementations • ICCV 2021 • Alireza Zaeemzadeh, Shabnam Ghadar, Baldo Faieta, Zhe Lin, Nazanin Rahnavard, Mubarak Shah, Ratheesh Kalarot
For example, a user can ask for retrieving images similar to a query image, but with a different hair color, and no preference for absence/presence of eyeglasses in the results.
no code implementations • ICCV 2021 • Wentao Jiang, Ning Xu, Jiayun Wang, Chen Gao, Jing Shi, Zhe Lin, Si Liu
Given the cycle, we propose several free augmentation strategies to help our model understand various editing requests given the imbalanced dataset.
1 code implementation • 14 Dec 2020 • Haitian Zheng, Zhe Lin, Jingwan Lu, Scott Cohen, Jianming Zhang, Ning Xu, Jiebo Luo
A core problem of this task is how to transfer visual details from the input images to the new semantic layout while making the resulting image visually realistic.
1 code implementation • 13 Dec 2020 • Chenglin Yang, Yilin Wang, Jianming Zhang, He Zhang, Zhe Lin, Alan Yuille
To evaluate segmentation quality near object boundaries, we propose the Meticulosity Quality (MQ) score considering both the mask coverage and boundary precision.
1 code implementation • CVPR 2021 • Qihang Yu, Jianming Zhang, He Zhang, Yilin Wang, Zhe Lin, Ning Xu, Yutong Bai, Alan Yuille
We propose Mask Guided (MG) Matting, a robust matting framework that takes a general coarse mask as guidance.
no code implementations • 11 Dec 2020 • Zhe Lin, Sharad Sinha, Wei Zhang
Following this, we present Hard-ODT, a high-performance, hardware-efficient and scalable online decision tree learning system on a field-programmable gate array (FPGA) with system-level optimization techniques.
1 code implementation • COLING 2020 • Renliang Sun, Zhe Lin, Xiaojun Wan
Our model uses neural networks to learn the different effects of the preceding sentences and the following sentences on the current sentence and applies them to the improved transformer model.
1 code implementation • 25 Nov 2020 • Yu Zeng, Zhe Lin, Huchuan Lu, Vishal M. Patel
Due to the lack of supervision signals for the correspondence between missing regions and known regions, it may fail to find proper reference features, which often leads to artifacts in the results.
no code implementations • 4 Nov 2020 • He Zhang, Jianming Zhang, Federico Perazzi, Zhe Lin, Vishal M. Patel
In this paper, we propose a new method which can automatically generate high-quality image compositing without any user input.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Xuanli He, Quan Hung Tran, Gholamreza Haffari, Walter Chang, Trung Bui, Zhe Lin, Franck Dernoncourt, Nhan Dam
In this paper, we explore the novel problem of graph modification, where the systems need to learn how to update an existing scene graph given a new user's command.
no code implementations • 3 Sep 2020 • Zhe Lin, Sharad Sinha, Wei Zhang
We further present a high-performance, hardware-efficient and scalable online decision tree learning system on a field-programmable gate array (FPGA) with system-level optimization techniques.
no code implementations • 3 Sep 2020 • Zhe Lin, Sharad Sinha, Wei Zhang
As field-programmable gate arrays become prevalent in critical application domains, their power consumption is of high concern.
no code implementations • 3 Sep 2020 • Zhe Lin, Wei Zhang, Sharad Sinha
A flexible architecture for hardware power monitoring is proposed, which can be instrumented in any RTL design for runtime power estimation, dispensing with the need for extra power measurement devices.
1 code implementation • ECCV 2020 • Xihui Liu, Zhe Lin, Jianming Zhang, Handong Zhao, Quan Tran, Xiaogang Wang, Hongsheng Li
We propose a novel algorithm, named Open-Edit, which is the first attempt on open-domain image manipulation with open-vocabulary instructions.
1 code implementation • CVPR 2020 • Chenyun Wu, Zhe Lin, Scott Cohen, Trung Bui, Subhransu Maji
We consider the problem of segmenting image regions given a natural language phrase, and study it on a novel dataset of 77,262 images and 345,486 phrase-region pairs.
Ranked #4 on Referring Expression Segmentation on PhraseCut
1 code implementation • ECCV 2020 • Shikun Liu, Zhe Lin, Yilin Wang, Jianming Zhang, Federico Perazzi, Edward Johns
We present a novel resizing module for neural networks: shape adaptor, a drop-in enhancement built on top of traditional resizing layers, such as pooling, bilinear sampling, and strided convolution.
no code implementations • ECCV 2020 • Liqian Ma, Zhe Lin, Connelly Barnes, Alexei A. Efros, Jingwan Lu
Due to the ubiquity of smartphones, it is popular to take photos of oneself, or "selfies."
no code implementations • ECCV 2020 • Kenan E. Ak, Ning Xu, Zhe Lin, Yilin Wang
To the best of our knowledge, the proposed method is the first to enable adversarial learning in autoregressive models for image generation.
1 code implementation • 7 Jul 2020 • Ping Hu, Federico Perazzi, Fabian Caba Heilbron, Oliver Wang, Zhe Lin, Kate Saenko, Stan Sclaroff
The proposed architecture relies on our fast spatial attention, which is a simple yet efficient modification of the popular self-attention mechanism and captures the same rich spatial context at a small fraction of the computational cost, by changing the order of operations.
Ranked #32 on Semantic Segmentation on DensePASS
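The "order of operations" idea behind fast attention variants can be illustrated generically (a sketch of linearized attention, not necessarily this paper's exact formulation): once the softmax is removed or linearized, matrix associativity lets Q(KᵀV) replace (QKᵀ)V, cutting the cost for N pixels and feature dimension d from O(N²d) to O(Nd²) and avoiding the N×N attention map entirely.

```python
import numpy as np

# Generic illustration of reordering attention (an assumption about the
# mechanism, not this paper's exact formulation): with the softmax
# linearized, (Q K^T) V == Q (K^T V) by associativity, but the second
# form never materializes the N x N attention map.

N, d = 1000, 16
rng = np.random.default_rng(1)
Q, K, V = [rng.normal(size=(N, d)) for _ in range(3)]

slow = (Q @ K.T) @ V   # builds an N x N map: O(N^2 d)
fast = Q @ (K.T @ V)   # only d x d intermediate: O(N d^2)

print(np.allclose(slow, fast))
```

For N much larger than d, as with pixel-level attention, the reordered form is dramatically cheaper in both time and memory.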
1 code implementation • ECCV 2020 • Yu Zeng, Zhe Lin, Jimei Yang, Jianming Zhang, Eli Shechtman, Huchuan Lu
To address this challenge, we propose an iterative inpainting method with a feedback mechanism.
Ranked #6 on Image Inpainting on Places2
no code implementations • CVPR 2020 • Zhuowan Li, Quan Tran, Long Mai, Zhe Lin, Alan Yuille
In this paper, we introduce a new task, context-aware group captioning, which aims to describe a group of target images in the context of another group of related reference images.
1 code implementation • CVPR 2020 • Ping Hu, Fabian Caba Heilbron, Oliver Wang, Zhe Lin, Stan Sclaroff, Federico Perazzi
We present TDNet, a temporally distributed network designed for fast and accurate video semantic segmentation.
Ranked #2 on Video Semantic Segmentation on Cityscapes val
1 code implementation • ICCV 2019 • Jason Kuen, Federico Perazzi, Zhe Lin, Jianming Zhang, Yap-Peng Tan
Large-scale object detection datasets are constantly growing in terms of the number of classes and annotations.
1 code implementation • ICCV 2019 • Yi Zeng, Pingping Zhang, Jianming Zhang, Zhe Lin, Huchuan Lu
This paper pushes forward high-resolution saliency detection, and contributes a new dataset, named High-Resolution Salient Object Detection (HRSOD).
Ranked #11 on RGB Salient Object Detection on DAVIS-S (using extra training data)
1 code implementation • ACL 2019 • Hao Tan, Franck Dernoncourt, Zhe Lin, Trung Bui, Mohit Bansal
To push forward the research in this direction, we first introduce a new language-guided image editing dataset that contains a large number of real image pairs with corresponding editing instructions.
no code implementations • 30 May 2019 • Pranav Aggarwal, Zhe Lin, Baldo Faieta, Saeid Motiian
In this paper, we propose a new method for learning text-visual embedding using both image titles and click-through data from an image search engine.
2 code implementations • ICCV 2019 • Yulun Zhang, Chen Fang, Yilin Wang, Zhaowen Wang, Zhe Lin, Yun Fu, Jimei Yang
An assumption widely used in recent neural style transfer methods is that image styles can be described by global statistics of deep features, such as Gram or covariance matrices.
no code implementations • CVPR 2019 • Jiuxiang Gu, Handong Zhao, Zhe Lin, Sheng Li, Jianfei Cai, Mingyang Ling
Scene graph generation has received growing attention with the advancements in image understanding tasks such as object detection, attribute prediction, and relationship prediction.
2 code implementations • CVPR 2019 • Zhifei Zhang, Zhaowen Wang, Zhe Lin, Hairong Qi
Reference-based super-resolution (RefSR), on the other hand, has proven to be promising in recovering high-resolution (HR) details when a reference (Ref) image with similar content as that of the LR input is given.
Ranked #2 on Image Super-Resolution on CUFED5 - 4x upscaling
no code implementations • CVPR 2019 • Wei Xiong, Jiahui Yu, Zhe Lin, Jimei Yang, Xin Lu, Connelly Barnes, Jiebo Luo
We show that by such disentanglement, the contour completion model predicts reasonable contours of objects, and further substantially improves the performance of image inpainting.
3 code implementations • 2 Jan 2019 • Mengtian Li, Zhe Lin, Radomír Měch, Ersin Yumer, Deva Ramanan
Edges, boundaries and contours are important subjects of study in both computer graphics and computer vision.
1 code implementation • CVPR 2019 • Siyuan Qiao, Zhe Lin, Jianming Zhang, Alan Yuille
By simply replacing standard optimizers with Neural Rejuvenation, we are able to improve the performance of neural networks by a very large margin while using similar training effort and maintaining their original resource usage.
no code implementations • NeurIPS 2018 • Zijun Wei, Boyu Wang, Minh Hoai Nguyen, Jianming Zhang, Zhe Lin, Xiaohui Shen, Radomír Měch, Dimitris Samaras
Detecting segments of interest from an input sequence is a challenging problem which often requires not only good knowledge of individual target segments, but also contextual understanding of the entire input sequence and the relationships between the target segments.
no code implementations • 16 Nov 2018 • Long Nguyen, Jia Zhen, Zhe Lin, Hanxiang Du, Zhou Yang, Wenxuan Guo, Fang Jin
Understanding and accurately predicting within-field spatial variability of crop yield play a key role in site-specific management of crop inputs such as irrigation water and fertilizer for optimized crop production.
no code implementations • 18 Oct 2018 • Lijun Wang, Xiaohui Shen, Jianming Zhang, Oliver Wang, Zhe Lin, Chih-Yao Hsieh, Sarah Kong, Huchuan Lu
To achieve this, we propose a novel neural network model comprised of a depth prediction module, a lens blur module, and a guided upsampling module.
no code implementations • 21 Sep 2018 • Xin Ye, Zhe Lin, Joon-Young Lee, Jianming Zhang, Shibin Zheng, Yezhou Yang
We study the problem of learning a generalizable action policy for an intelligent agent to actively approach an object of interest in an indoor environment solely from its visual inputs.
1 code implementation • ECCV 2018 • Wei-Chih Hung, Jianming Zhang, Xiaohui Shen, Zhe Lin, Joon-Young Lee, Ming-Hsuan Yang
Specifically, given a foreground image and a background image, our proposed method automatically generates a set of blended photos with scores that indicate their aesthetic quality, using the proposed quality network and policy network.
no code implementations • ECCV 2018 • Hengshuang Zhao, Xiaohui Shen, Zhe Lin, Kalyan Sunkavalli, Brian Price, Jiaya Jia
We present a new image search technique that, given a background image, returns compatible foreground objects for image compositing tasks.
no code implementations • ECCV 2018 • Yufei Wang, Zhe Lin, Xiaohui Shen, Jianming Zhang, Scott Cohen
Then, we refine and extend the embedding network to predict an attention map, using a curated dataset with bounding box annotations on 750 concepts.
no code implementations • 30 Jul 2018 • Xin Ye, Zhe Lin, Haoxiang Li, Shibin Zheng, Yezhou Yang
We study the problem of learning a navigation policy for a robot to actively search for an object of interest in an indoor environment solely from its visual inputs.
30 code implementations • ICCV 2019 • Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas Huang
We present a generative image inpainting system to complete images with free-form mask and guidance.
Ranked #3 on Image Inpainting on Places2 val
no code implementations • CVPR 2018 • Shanghang Zhang, Xiaohui Shen, Zhe Lin, Radomír Měch, João P. Costeira, José M. F. Moura
In this paper, we propose a unified framework to estimate a spatially-varying blur map and understand its desirability in terms of image quality at the same time.
no code implementations • CVPR 2018 • Zijun Wei, Jianming Zhang, Xiaohui Shen, Zhe Lin, Radomír Měch, Minh Hoai, Dimitris Samaras
Finding views with good photo composition is a challenging task for machine learning methods.
no code implementations • 10 Apr 2018 • Zhifei Zhang, Zhaowen Wang, Zhe Lin, Hairong Qi
We focus on transferring the high-resolution texture from reference images to the super-resolution process without the constraint of content similarity between reference and target images, which is a key difference from previous example-based methods.
1 code implementation • 24 Feb 2018 • Kaichun Mo, Haoxiang Li, Zhe Lin, Joon-Young Lee
Synthetic data suffers from a domain gap to real-world scenes, while visual inputs rendered from 3D reconstructed scenes have undesired holes and artifacts.
3 code implementations • ICLR 2018 • Jianbo Ye, Xin Lu, Zhe Lin, James Z. Wang
Model pruning has become a useful technique that improves the computational efficiency of deep learning, making it possible to deploy solutions in resource-limited scenarios.
1 code implementation • CVPR 2018 • Jason Kuen, Xiangfei Kong, Zhe Lin, Gang Wang, Jianxiong Yin, Simon See, Yap-Peng Tan
We propose a novel approach for cost-adjustable inference in CNNs - Stochastic Downsampling Point (SDPoint).
28 code implementations • CVPR 2018 • Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas S. Huang
Motivated by these observations, we propose a new deep generative model-based approach which can not only synthesize novel image structures but also explicitly utilize surrounding image features as references during network training to make better predictions.
1 code implementation • CVPR 2018 • Licheng Yu, Zhe Lin, Xiaohui Shen, Jimei Yang, Xin Lu, Mohit Bansal, Tamara L. Berg
In this paper, we address referring expression comprehension: localizing an image region described by a natural language expression.
no code implementations • ECCV 2018 • Yuhang Song, Chao Yang, Zhe Lin, Xiaofeng Liu, Qin Huang, Hao Li, C. -C. Jay Kuo
We study the task of image inpainting, which is to fill in the missing region of an incomplete image with plausible contents.
no code implementations • NeurIPS 2017 • Xiaojie Jin, Huaxin Xiao, Xiaohui Shen, Jimei Yang, Zhe Lin, Yunpeng Chen, Zequn Jie, Jiashi Feng, Shuicheng Yan
The ability to predict the future is important for intelligent systems, e.g., autonomous vehicles and robots, to plan early and make decisions accordingly.
1 code implementation • ICCV 2017 • Wei-Chih Hung, Yi-Hsuan Tsai, Xiaohui Shen, Zhe Lin, Kalyan Sunkavalli, Xin Lu, Ming-Hsuan Yang
We present a scene parsing method that utilizes global context information based on both the parametric and non-parametric models.
no code implementations • ICCV 2017 • Jian Ren, Xiaohui Shen, Zhe Lin, Radomir Mech, David J. Foran
To accommodate our study, we first collect two distinct datasets, a large image dataset from Flickr and annotated by Amazon Mechanical Turk, and a small dataset of real personal albums rated by owners.
no code implementations • ICCV 2017 • Xin Li, Zequn Jie, Wei Wang, Changsong Liu, Jimei Yang, Xiaohui Shen, Zhe Lin, Qiang Chen, Shuicheng Yan, Jiashi Feng
Thus, they suffer from heterogeneous object scales caused by perspective projection of cameras on actual scenes and inevitably encounter parsing failures on distant objects as well as other boundary and recognition errors.
1 code implementation • 19 Jul 2017 • Yufei Wang, Zhe Lin, Xiaohui Shen, Radomir Mech, Gavin Miller, Garrison W. Cottrell
Automatic organization of personal photos is a problem with many real-world applications, and can be divided into two main tasks: recognizing the event type of the photo collection, and selecting interesting images from the collection.
no code implementations • CVPR 2017 • Long Mai, Hailin Jin, Zhe Lin, Chen Fang, Jonathan Brandt, Feng Liu
We train a convolutional neural network to synthesize appropriate visual features that capture the spatial-semantic constraints from the user canvas query.
no code implementations • CVPR 2017 • Yufei Wang, Zhe Lin, Xiaohui Shen, Scott Cohen, Garrison W. Cottrell
Furthermore, our algorithm can generate descriptions with varied length, benefiting from the separate control of the skeleton and attributes.
1 code implementation • ICCV 2017 • Chenxi Liu, Zhe Lin, Xiaohui Shen, Jimei Yang, Xin Lu, Alan Yuille
In this paper we are interested in the problem of image segmentation given natural language descriptions, i.e., referring expressions.
2 code implementations • CVPR 2017 • Yi-Hsuan Tsai, Xiaohui Shen, Zhe Lin, Kalyan Sunkavalli, Xin Lu, Ming-Hsuan Yang
Compositing is one of the most common operations in photo editing.
1 code implementation • 6 Dec 2016 • Ning Yu, Xiaohui Shen, Zhe Lin, Radomir Mech, Connelly Barnes
Our new dataset enables us to formulate the problem as a multi-task learning problem and train a multi-column deep convolutional neural network (CNN) to simultaneously predict the severity of all the defects.
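A multi-task CNN of this kind typically shares a feature trunk and attaches one small prediction head per defect. As a minimal sketch under assumed shapes (the actual architecture and defect list are the paper's), a shared feature vector can be scored by several linear heads at once:

```python
import numpy as np

rng = np.random.default_rng(0)

def multi_task_heads(features, head_weights, head_biases):
    """Apply one linear head per defect type to shared features.

    features: (batch, d) shared CNN features.
    head_weights: list of (d,) weight vectors, one per defect type.
    Returns (batch, n_defects) severity scores squashed into (0, 1).
    """
    scores = np.stack(
        [features @ w + b for w, b in zip(head_weights, head_biases)], axis=1
    )
    return 1.0 / (1.0 + np.exp(-scores))

# Hypothetical example: 16-dim features, 7 defect types, batch of 4.
d, n_defects = 16, 7
feats = rng.normal(size=(4, d))
ws = [rng.normal(size=d) for _ in range(n_defects)]
bs = [0.0] * n_defects
sev = multi_task_heads(feats, ws, bs)
```

Training all heads jointly is what makes it a multi-task problem: the shared trunk receives gradients from every defect's loss.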
no code implementations • ICCV 2017 • Xiaojie Jin, Xin Li, Huaxin Xiao, Xiaohui Shen, Zhe Lin, Jimei Yang, Yunpeng Chen, Jian Dong, Luoqi Liu, Zequn Jie, Jiashi Feng, Shuicheng Yan
In this way, the network can effectively learn to capture video dynamics and temporal context, which are critical clues for video scene parsing, without requiring extra manual annotations.
1 code implementation • CVPR 2017 • Chao Yang, Xin Lu, Zhe Lin, Eli Shechtman, Oliver Wang, Hao Li
Recent advances in deep learning have shown exciting promise in filling large holes in natural images with semantically plausible and context aware details, impacting fundamental image manipulation tasks such as object removal.
no code implementations • 20 Oct 2016 • Omid Bakhshandeh, Trung Bui, Zhe Lin, Walter Chang
One of the most interesting recent open-ended question answering challenges is Visual Question Answering (VQA) which attempts to evaluate a system's visual understanding through its answers to natural language questions about images.
3 code implementations • 1 Aug 2016 • Jianming Zhang, Zhe Lin, Jonathan Brandt, Xiaohui Shen, Stan Sclaroff
We aim to model the top-down attention of a Convolutional Neural Network (CNN) classifier for generating task-specific attention maps.
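The core of excitation backprop is a probabilistic winner-take-all pass: the probability mass at an upper layer is redistributed to lower neurons in proportion to their positive excitatory contribution. A minimal one-layer sketch of that redistribution rule (simplified from the paper's full scheme):

```python
import numpy as np

def excitation_backprop_layer(parent_prob, activations, weights):
    """One step of excitation-backprop-style probability propagation.

    parent_prob: (n_out,) winning-neuron probabilities at the upper layer.
    activations: (n_in,) non-negative activations of the lower layer.
    weights: (n_out, n_in) layer weights; only positive parts excite.

    Returns (n_in,) probabilities that each lower neuron "wins".
    """
    w_pos = np.clip(weights, 0.0, None)
    contrib = w_pos * activations[None, :]            # (n_out, n_in)
    norm = contrib.sum(axis=1, keepdims=True)
    norm[norm == 0] = 1.0                             # avoid divide-by-zero
    return (parent_prob[:, None] * contrib / norm).sum(axis=0)

# Tiny demo: two parent neurons, three child activations.
parent = np.array([0.6, 0.4])
acts = np.array([2.0, 1.0, 4.0])
W = np.array([[1.0, -2.0, 0.5], [0.0, 1.0, 1.0]])
child = excitation_backprop_layer(parent, acts, W)
```

Applied layer by layer down to an intermediate feature map, the resulting marginal probabilities form the task-specific attention map.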
no code implementations • CVPR 2015 • Jianming Zhang, Shugao Ma, Mehrnoosh Sameki, Stan Sclaroff, Margrit Betke, Zhe Lin, Xiaohui Shen, Brian Price, Radomir Mech
We study the problem of Salient Object Subitizing, i.e., predicting the existence and the number of salient objects in an image using holistic cues.
1 code implementation • 8 Jun 2016 • Paul Hongsuck Seo, Zhe Lin, Scott Cohen, Xiaohui Shen, Bohyung Han
We propose a novel attention model that can accurately attend to target objects of various scales and shapes in images.
2 code implementations • 6 Jun 2016 • Shu Kong, Xiaohui Shen, Zhe Lin, Radomir Mech, Charless Fowlkes
In this work, we propose to learn a deep convolutional neural network to rank photo aesthetics, in which the relative ranking of photo aesthetics is directly modeled in the loss function.
Ranked #7 on Aesthetics Quality Assessment on AVA
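Modeling relative ranking directly in the loss usually means a pairwise hinge objective: the network's score for the better-rated photo must exceed the worse-rated one by a margin. A minimal sketch of that loss (the paper's full formulation also includes a regression term):

```python
import numpy as np

def pairwise_ranking_loss(score_hi, score_lo, margin=1.0):
    """Hinge loss pushing the higher-rated photo's score above the
    lower-rated photo's score by at least `margin`."""
    return np.maximum(0.0, margin - (score_hi - score_lo))
```

During training, pairs are sampled so that `score_hi` comes from the image with the higher human aesthetic rating; the loss is zero once the ordering is satisfied with margin.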
no code implementations • CVPR 2016 • Haoxiang Li, Jonathan Brandt, Zhe Lin, Xiaohui Shen, Gang Hua
Our new framework enables efficient use of these complementary multi-level contextual cues to improve overall recognition rates on the photo album person recognition task, as demonstrated through state-of-the-art results on a challenging public dataset.
no code implementations • CVPR 2016 • Yufei Wang, Zhe Lin, Xiaohui Shen, Radomir Mech, Gavin Miller, Garrison W. Cottrell
In this paper, we show that the selection of important images is consistent among different viewers, and that this selection process is related to the event type of the album.
1 code implementation • CVPR 2016 • Jianming Zhang, Stan Sclaroff, Zhe Lin, Xiaohui Shen, Brian Price, Radomir Mech
Our system leverages a Convolutional-Neural-Network model to generate location proposals of salient objects.
no code implementations • CVPR 2016 • Jae-Pil Heo, Zhe Lin, Xiaohui Shen, Jonathan Brandt, Sung-Eui Yoon
We have tested the proposed method with the inverted index and multi-index on a diverse set of benchmarks including up to one billion data points with varying dimensions, and found that our method robustly improves the accuracy of shortlists (up to 127% relatively higher) over the state-of-the-art techniques with a comparable or even faster computational cost.
no code implementations • 22 Dec 2015 • Zhou Ren, Hailin Jin, Zhe Lin, Chen Fang, Alan Yuille
Visual-semantic embedding models have been recently proposed and shown to be effective for image classification and zero-shot learning, by mapping images into a continuous semantic label space.
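Once images and labels live in the same continuous space, zero-shot classification reduces to a nearest-neighbor lookup among label embeddings. A minimal sketch with cosine similarity (embedding values below are made up for illustration):

```python
import numpy as np

def zero_shot_classify(image_emb, label_embs):
    """Pick the label whose embedding is most cosine-similar to the image.

    image_emb: (d,) image embedding; label_embs: (n_labels, d).
    Returns the winning label index and all similarity scores.
    """
    img = image_emb / np.linalg.norm(image_emb)
    labels = label_embs / np.linalg.norm(label_embs, axis=1, keepdims=True)
    sims = labels @ img
    return int(np.argmax(sims)), sims

# Hypothetical 2-D embeddings of an image and two candidate labels.
image = np.array([1.0, 0.0])
label_embs = np.array([[0.0, 1.0], [1.0, 0.1]])
idx, sims = zero_shot_classify(image, label_embs)
```

Unseen classes are handled for free: any label with a semantic embedding can be scored, even if no training image of that class was ever observed.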
no code implementations • ICCV 2015 • Xin Lu, Zhe Lin, Xiaohui Shen, Radomir Mech, James Z. Wang
We propose a deep multi-patch aggregation network training approach, which allows us to train models using multiple patches generated from one image.
Ranked #8 on Aesthetics Quality Assessment on AVA
no code implementations • ICCV 2015 • Jianming Zhang, Stan Sclaroff, Zhe Lin, Xiaohui Shen, Brian Price, Radomir Mech
Powered by this fast MBD transform algorithm, the proposed salient object detection method runs at 80 FPS, and significantly outperforms previous methods with similar speed on four large benchmark datasets, and achieves comparable or better performance than state-of-the-art methods.
Ranked #6 on Video Salient Object Detection on VOS-T (using extra training data)
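The speed comes from approximating the Minimum Barrier Distance (the max-minus-min of intensities along the best path to a seed) with alternating raster scans instead of exact path search. A small, unoptimized sketch of that raster-scan idea (the paper's implementation is far faster):

```python
import numpy as np

def mbd_transform(img, seed_mask, n_iters=3):
    """Raster-scan approximation of the Minimum Barrier Distance transform.

    img: 2-D float array; seed_mask: boolean array marking seed pixels.
    Returns the approximate barrier distance of every pixel to the seeds.
    """
    h, w = img.shape
    dist = np.where(seed_mask, 0.0, np.inf)
    hi = img.copy()   # highest value along the current best path
    lo = img.copy()   # lowest value along the current best path

    def relax(y, x, ny, nx):
        if not np.isfinite(dist[ny, nx]):
            return
        u = max(hi[ny, nx], img[y, x])
        l = min(lo[ny, nx], img[y, x])
        if u - l < dist[y, x]:
            dist[y, x] = u - l
            hi[y, x], lo[y, x] = u, l

    for _ in range(n_iters):
        for y in range(h):              # forward pass: left/up neighbors
            for x in range(w):
                if x > 0: relax(y, x, y, x - 1)
                if y > 0: relax(y, x, y - 1, x)
        for y in range(h - 1, -1, -1):  # backward pass: right/down neighbors
            for x in range(w - 1, -1, -1):
                if x < w - 1: relax(y, x, y, x + 1)
                if y < h - 1: relax(y, x, y + 1, x)
    return dist

# Demo: a bright "barrier" pixel raises the distance of paths through it.
demo = np.array([[0., 0., 0.], [0., 5., 0.], [0., 0., 0.]])
seeds = np.zeros_like(demo, dtype=bool)
seeds[0, 0] = True
dist_map = mbd_transform(demo, seeds)
```

For salient object detection, image-border pixels serve as the seeds, so pixels separated from the border by a strong intensity barrier (likely the salient object) receive large distances.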
no code implementations • CVPR 2016 • Joon-Young Lee, Kalyan Sunkavalli, Zhe Lin, Xiaohui Shen, In So Kweon
We introduce a new technique that automatically generates diverse, visually compelling stylizations for a photograph in an unsupervised manner.
no code implementations • 17 Aug 2015 • Hongyang Li, Huchuan Lu, Zhe Lin, Xiaohui Shen, Brian Price
In this paper, we propose a novel deep neural network framework embedded with low-level features (LCNN) for salient object detection in complex images.
no code implementations • CVPR 2015 • Jimei Yang, Brian Price, Scott Cohen, Zhe Lin, Ming-Hsuan Yang
The transferred local shape masks constitute a patch-level segmentation solution space and we thus develop a novel cascade algorithm, PatchCut, for coarse-to-fine object segmentation.
no code implementations • CVPR 2015 • Peng Wang, Xiaohui Shen, Zhe Lin, Scott Cohen, Brian Price, Alan L. Yuille
By allowing for interactions between the depth and semantic information, the joint network provides more accurate depth prediction than a state-of-the-art CNN trained solely for depth prediction [5].
no code implementations • CVPR 2015 • Haoxiang Li, Zhe Lin, Xiaohui Shen, Jonathan Brandt, Gang Hua
To improve localization effectiveness, and reduce the number of candidates at later stages, we introduce a CNN-based calibration stage after each of the detection stages in the cascade.
2 code implementations • 27 May 2015 • Hongyang Li, Huchuan Lu, Zhe Lin, Xiaohui Shen, Brian Price
For most natural images, some boundary superpixels serve as the background labels, and the saliency of other superpixels is determined by ranking their similarities to the boundary labels based on an inner propagation scheme.
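Ranking similarities to boundary labels over a superpixel graph is commonly done with manifold ranking, which has a closed form. A minimal sketch on a toy graph (the affinity matrix and query vector here are illustrative, not the paper's construction):

```python
import numpy as np

def manifold_rank(W, y, alpha=0.99):
    """Rank graph nodes by relevance to query nodes.

    W: (n, n) symmetric affinity matrix with zero diagonal.
    y: (n,) query indicator (1 for boundary/background nodes).
    Returns scores f = (D - alpha * W)^(-1) y, where D = diag(row sums).
    """
    D = np.diag(W.sum(axis=1))
    return np.linalg.solve(D - alpha * W, y.astype(float))

# Toy chain graph 0-1-2-3 with the query placed at node 0.
W = np.zeros((4, 4))
for i in range(3):
    W[i, i + 1] = W[i + 1, i] = 1.0
seed = np.array([1.0, 0.0, 0.0, 0.0])
f = manifold_rank(W, seed)
```

Nodes similar (well-connected) to the background queries score high; for saliency, such superpixels are treated as background and the rest as salient.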
no code implementations • ICCV 2015 • Peng Wang, Xiaohui Shen, Zhe Lin, Scott Cohen, Brian Price, Alan Yuille
Segmenting semantic objects from images and parsing them into their respective semantic parts are fundamental steps towards detailed object understanding in computer vision.
no code implementations • CVPR 2015 • Chen Fang, Hailin Jin, Jianchao Yang, Zhe Lin
We validate our feature learning paradigm on this dataset and find that the learned feature significantly outperforms the state-of-the-art image features in learning better image similarities.
no code implementations • CVPR 2014 • Haoxiang Li, Zhe Lin, Jonathan Brandt, Xiaohui Shen, Gang Hua
Despite the fact that face detection has been studied intensively over the past several decades, the problem is still not completely solved.
no code implementations • CVPR 2014 • Jae-Pil Heo, Zhe Lin, Sung-Eui Yoon
This result is achieved mainly because our method accurately estimates distances between two data points with the new binary codes and distance metric.
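The paper's encoding and metric are more elaborate, but the baseline it improves on is plain Hamming distance between packed binary codes, which a single XOR-and-popcount computes:

```python
import numpy as np

def hamming_distance(code_a, code_b):
    """Hamming distance between two packed binary codes (uint8 arrays)."""
    return int(np.unpackbits(np.bitwise_xor(code_a, code_b)).sum())

# Two hypothetical 8-bit codes that differ in every bit position.
a = np.array([0xAA], dtype=np.uint8)  # 10101010
b = np.array([0x55], dtype=np.uint8)  # 01010101
```

Distance-estimation methods like this one aim to make such cheap code comparisons track the true Euclidean distances of the original data points more faithfully.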
no code implementations • CVPR 2014 • Brandon M. Smith, Jonathan Brandt, Zhe Lin, Li Zhang
We propose a data-driven approach to facial landmark localization that models the correlations between each landmark and its surrounding appearance features.
no code implementations • 24 Apr 2014 • Zhaowen Wang, Jianchao Yang, Zhe Lin, Jonathan Brandt, Shiyu Chang, Thomas Huang
In this paper, we present an image similarity learning method that can scale well in both the number of images and the dimensionality of image descriptors.
no code implementations • 21 Dec 2013 • Thomas Paine, Hailin Jin, Jianchao Yang, Zhe Lin, Thomas Huang
The ability to train large-scale neural networks has resulted in state-of-the-art performance in many areas of computer vision.
no code implementations • CVPR 2013 • Xiaohui Shen, Zhe Lin, Jonathan Brandt, Ying Wu
In order to overcome these challenges, we present a novel and robust exemplar-based face detector that integrates image retrieval and discriminative learning.
no code implementations • CVPR 2013 • Brandon M. Smith, Li Zhang, Jonathan Brandt, Zhe Lin, Jianchao Yang
Given a test image, our algorithm first selects a subset of exemplar images from the database, then computes a nonrigid warp for each exemplar image to align it with the test image.
no code implementations • CVPR 2013 • Jianchao Yang, Zhe Lin, Scott Cohen
Extensive experiments on benchmark and real-world images demonstrate that our algorithm can produce natural-looking results with sharp edges and preserved fine details, while the current state-of-the-art algorithms are prone to visual artifacts.
no code implementations • CVPR 2013 • Haoxiang Li, Gang Hua, Zhe Lin, Jonathan Brandt, Jianchao Yang
By augmenting each feature with its location, a Gaussian mixture model (GMM) is trained to capture the spatial-appearance distribution of all face images in the training corpus.
no code implementations • CVPR 2013 • Zhuoyuan Chen, Hailin Jin, Zhe Lin, Scott Cohen, Ying Wu
We use approximate nearest neighbor fields to compute an initial motion field and use a robust algorithm to compute a set of similarity transformations as the motion candidates for segmentation.