no code implementations • ECCV 2020 • Lijun Wang, Jianming Zhang, Yifan Wang, Huchuan Lu, Xiang Ruan
This paper proposes a hierarchical loss for monocular depth estimation, which measures the differences between the prediction and ground truth in hierarchical embedding spaces of depth maps.
no code implementations • ECCV 2020 • Lu Zhang, Jianming Zhang, Zhe Lin, Radomír Měch, Huchuan Lu, You He
We reformulate the problem of detecting and tracking salient object spots as a new task called object hotspot tracking.
no code implementations • 8 Apr 2024 • Jing Gu, Yilin Wang, Nanxuan Zhao, Wei Xiong, Qing Liu, Zhifei Zhang, He Zhang, Jianming Zhang, HyunJoon Jung, Xin Eric Wang
Compared with existing methods for personalized subject swapping, SwapAnything has three unique advantages: (1) precise control of arbitrary objects and parts rather than the main subject, (2) more faithful preservation of context pixels, and (3) better adaptation of the personalized concept to the image.
no code implementations • 15 Mar 2024 • Yizhi Song, Zhifei Zhang, Zhe Lin, Scott Cohen, Brian Price, Jianming Zhang, Soo Ye Kim, He Zhang, Wei Xiong, Daniel Aliaga
Generative object compositing emerges as a promising new avenue for compositional image editing.
no code implementations • 14 Mar 2024 • Yiqun Mei, Yu Zeng, He Zhang, Zhixin Shu, Xuaner Zhang, Sai Bi, Jianming Zhang, HyunJoon Jung, Vishal M. Patel
At the core of portrait photography is the search for ideal lighting and viewpoint.
1 code implementation • 22 Dec 2023 • Nannan Li, Qing Liu, Krishna Kumar Singh, Yilin Wang, Jianming Zhang, Bryan A. Plummer, Zhe Lin
In this paper, we propose UniHuman, a unified model that addresses multiple facets of human image editing in real-world settings.
no code implementations • 11 Dec 2023 • Mengwei Ren, Wei Xiong, Jae Shin Yoon, Zhixin Shu, Jianming Zhang, HyunJoon Jung, Guido Gerig, He Zhang
Portrait harmonization aims to composite a subject into a new background, adjusting its lighting and color to ensure harmony with the background scene.
no code implementations • 8 Dec 2023 • Jaskirat Singh, Jianming Zhang, Qing Liu, Cameron Smith, Zhe Lin, Liang Zheng
To overcome these limitations, we introduce SmartMask, which allows any novice user to create detailed masks for precise object insertion.
no code implementations • 4 Dec 2023 • Yao-Chih Lee, Zhoutong Zhang, Kevin Blackburn-Matzen, Simon Niklaus, Jianming Zhang, Jia-Bin Huang, Feng Liu
Specifically, we build a global static scene model using an extended plane-based scene representation to synthesize temporally coherent novel video.
1 code implementation • 30 Nov 2023 • Dina Bashkirova, Arijit Ray, Rupayan Mallick, Sarah Adel Bargal, Jianming Zhang, Ranjay Krishna, Kate Saenko
Although generative editing methods now enable some forms of image editing, relighting is still beyond today's capabilities; existing methods struggle to keep other aspects of the image -- colors, shapes, and textures -- consistent after the edit.
no code implementations • 4 Aug 2023 • Jiaqi Li, Yiran Wang, Zihao Huang, Jinghong Zheng, Ke Xian, Zhiguo Cao, Jianming Zhang
We leverage the structural characteristics of diffusion models to enforce the depth structure of depth models in a plug-and-play manner.
3 code implementations • ICCV 2023 • Yiran Wang, Min Shi, Jiaqi Li, Zihao Huang, Zhiguo Cao, Jianming Zhang, Ke Xian, Guosheng Lin
Video depth estimation aims to infer temporally consistent depth.
Ranked #16 on Monocular Depth Estimation on NYU-Depth V2 (using extra training data)
no code implementations • CVPR 2023 • Yiqun Mei, He Zhang, Xuaner Zhang, Jianming Zhang, Zhixin Shu, Yilin Wang, Zijun Wei, Shi Yan, HyunJoon Jung, Vishal M. Patel
Recent portrait relighting methods have achieved realistic results of portrait lighting effects given a desired lighting representation such as an environment map.
no code implementations • CVPR 2023 • Xingyi Li, Zhiguo Cao, Huiqiang Sun, Jianming Zhang, Ke Xian, Guosheng Lin
To animate the scene, we perform motion estimation and lift the 2D motion into the 3D scene flow.
no code implementations • CVPR 2023 • Yichen Sheng, Jianming Zhang, Julien Philip, Yannick Hold-Geoffroy, Xin Sun, He Zhang, Lu Ling, Bedrich Benes
To compensate for the lack of geometry in 2D image compositing, recent deep learning-based approaches introduced a pixel height representation to generate soft shadows and reflections.
no code implementations • CVPR 2023 • Yizhi Song, Zhifei Zhang, Zhe Lin, Scott Cohen, Brian Price, Jianming Zhang, Soo Ye Kim, Daniel Aliaga
Object compositing based on 2D images is a challenging problem since it typically involves multiple processing stages such as color harmonization, geometry correction and shadow generation to generate realistic results.
no code implementations • CVPR 2023 • Byeong-Uk Lee, Jianming Zhang, Yannick Hold-Geoffroy, In So Kweon
In this paper, we propose a single image scale estimation method based on a novel scale field representation.
no code implementations • ICCV 2023 • Desai Xie, Ping Hu, Xin Sun, Soren Pirk, Jianming Zhang, Radomir Mech, Arie E. Kaufman
Placing and orienting a camera to compose aesthetically meaningful shots of a scene is a key objective not only in real-world photography and cinematography but also in virtual content creation.
no code implementations • ICCV 2023 • Dominique Piché-Meunier, Yannick Hold-Geoffroy, Jianming Zhang, Jean-François Lalonde
Instead, we go further and propose to use a lens-based representation that models the depth of field using two parameters: the blur factor and focus disparity.
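The two-parameter lens representation described above admits a simple linear form: blur grows with the distance, in disparity, from the in-focus plane. A minimal sketch under a thin-lens assumption (the function and parameter names are illustrative, not from the paper):

```python
import numpy as np

def defocus_blur(disparity, blur_factor, focus_disparity):
    """Circle-of-confusion size per pixel under a thin-lens model.

    Blur grows linearly with the distance (in disparity) from the
    in-focus plane; blur_factor folds in aperture and focal length.
    """
    return blur_factor * np.abs(disparity - focus_disparity)

# pixels at the focus disparity receive zero blur
blur = defocus_blur(np.array([0.1, 0.5, 0.9]), blur_factor=10.0, focus_disparity=0.5)
```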
no code implementations • 13 Dec 2022 • Haitian Zheng, Zhe Lin, Jingwan Lu, Scott Cohen, Eli Shechtman, Connelly Barnes, Jianming Zhang, Qing Liu, Yuqian Zhou, Sohrab Amirghodsi, Jiebo Luo
Moreover, the object-level discriminators take aligned instances as input to enforce the realism of individual objects.
1 code implementation • CVPR 2023 • Linyi Jin, Jianming Zhang, Yannick Hold-Geoffroy, Oliver Wang, Kevin Matzen, Matthew Sticha, David F. Fouhey
We propose perspective fields as a representation that models the local perspective properties of an image.
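One component of such a local perspective representation is a per-pixel latitude: the angle between each pixel's viewing ray and the horizontal plane. A hedged sketch for a pinhole camera with zero roll (names and conventions here are illustrative, not the paper's exact definition):

```python
import numpy as np

def perspective_field(h, w, focal, pitch):
    """Per-pixel latitude map for a pinhole camera (illustrative sketch).

    Latitude is the angle between each pixel's viewing ray and the
    horizontal plane; paired with a per-pixel up vector it forms a
    local perspective representation. Roll is assumed zero.
    """
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    rays = np.stack([u - cx, v - cy, np.full((h, w), focal)], axis=-1)
    rays /= np.linalg.norm(rays, axis=-1, keepdims=True)
    c, s = np.cos(pitch), np.sin(pitch)
    R = np.array([[1, 0, 0], [0, c, -s], [0, s, c]])  # rotation about x (pitch)
    rays_world = rays @ R.T
    up = np.array([0.0, -1.0, 0.0])                   # image y points down
    return np.degrees(np.arcsin(rays_world @ up))

# level camera: the principal point lies on the horizon (latitude 0)
lat = perspective_field(5, 5, focal=5.0, pitch=0.0)
```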
1 code implementation • 2 Dec 2022 • Yizhi Song, Zhifei Zhang, Zhe Lin, Scott Cohen, Brian Price, Jianming Zhang, Soo Ye Kim, Daniel Aliaga
Object compositing based on 2D images is a challenging problem since it typically involves multiple processing stages such as color harmonization, geometry correction and shadow generation to generate realistic results.
no code implementations • CVPR 2023 • Yu Zeng, Zhe Lin, Jianming Zhang, Qing Liu, John Collomosse, Jason Kuen, Vishal M. Patel
We propose a new framework for conditional image synthesis from semantic layouts of any precision levels, ranging from pure text to a 2D semantic canvas with precise shapes.
1 code implementation • 28 Aug 2022 • Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Simon Chen, Yifan Liu, Chunhua Shen
To do so, we propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image, and then exploits 3D point cloud data to predict the depth shift and the camera's focal length that allow us to recover 3D scene shapes.
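The recovery step in this two-stage framework can be made concrete: once a scale, shift, and focal length are estimated, standard pinhole back-projection yields the 3D scene shape. A minimal sketch (function and parameter names are illustrative):

```python
import numpy as np

def recover_scene_shape(depth_rel, scale, shift, focal, cx, cy):
    """Unproject affine-invariant depth into a 3D point cloud (sketch).

    depth_rel: HxW depth known only up to scale and shift;
    scale/shift: alignment parameters estimated in the second stage;
    focal: recovered focal length in pixels; (cx, cy): principal point.
    """
    depth = scale * depth_rel + shift              # metric-aligned depth
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / focal                   # pinhole back-projection
    y = (v - cy) * depth / focal
    return np.stack([x, y, depth], axis=-1)        # HxWx3 points

points = recover_scene_shape(np.ones((4, 4)), scale=2.0, shift=0.5,
                             focal=1.0, cx=2.0, cy=2.0)
```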
no code implementations • 17 Aug 2022 • Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, John Collomosse
We develop an approach for text-to-image generation that embraces additional retrieval images, driven by a combination of implicit visual guidance loss and generative objectives.
1 code implementation • 31 Jul 2022 • Yiran Wang, Zhiyu Pan, Xingyi Li, Zhiguo Cao, Ke Xian, Jianming Zhang
Temporal consistency is the key challenge of video depth estimation.
1 code implementation • 29 Jul 2022 • Guangkai Xu, Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Simon Chen, Jia-Wang Bian
Our method leverages a data-driven prior in the form of a single image depth prediction network trained on large-scale datasets, the output of which is used as an input to our model.
1 code implementation • 18 Jul 2022 • Juewen Peng, Jianming Zhang, Xianrui Luo, Hao Lu, Ke Xian, Zhiguo Cao
Partial occlusion is a phenomenon in which blurry objects near the camera become semi-transparent, resulting in the partial appearance of the occluded background.
no code implementations • 12 Jul 2022 • Yichen Sheng, Yifan Liu, Jianming Zhang, Wei Yin, A. Cengiz Oztireli, He Zhang, Zhe Lin, Eli Shechtman, Bedrich Benes
It can be used to calculate hard shadows in a 2D image based on the projective geometry, providing precise control of the shadows' direction and shape.
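The projective idea behind pixel height can be sketched in image space: a pixel at a known vertical distance from its ground contact casts a shadow offset from that contact point in proportion to its height. This is a simplified illustration of the geometry only; the names and the 2D light parameterization are assumptions, not the paper's formulation:

```python
def hard_shadow_position(x, y, pixel_height, shadow_dir):
    """Project one object pixel to its hard-shadow location (simplified).

    pixel_height is the vertical image-space distance from the pixel to
    its ground contact; shadow_dir is the per-unit-height image-space
    offset implied by the light direction (both names illustrative).
    """
    foot_x, foot_y = x, y + pixel_height            # ground-contact point
    return (foot_x + pixel_height * shadow_dir[0],
            foot_y + pixel_height * shadow_dir[1])

# a pixel 5 px above its footprint, light casting shadows to the right
sx, sy = hard_shadow_position(10.0, 20.0, 5.0, (1.0, 0.0))
```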
no code implementations • 12 Jul 2022 • Xiao Pan, Hao Luo, Weihua Chen, Fan Wang, Hao Li, Wei Jiang, Jianming Zhang, Jianyang Gu, Peike Li
To address this issue, we propose the Ranking-based Backward Compatible Learning (RBCL), which directly optimizes the ranking metric between new features and old features.
1 code implementation • CVPR 2022 • Juewen Peng, Zhiguo Cao, Xianrui Luo, Hao Lu, Ke Xian, Jianming Zhang
Based on this formulation, we implement the classical renderer by a scattering-based method and propose a two-stage neural renderer to fix the erroneous areas from the classical renderer.
no code implementations • CVPR 2022 • Soo Ye Kim, Jianming Zhang, Simon Niklaus, Yifei Fan, Simon Chen, Zhe Lin, Munchurl Kim
Depth maps are used in a wide range of applications from 3D rendering to 2D image effects such as Bokeh.
1 code implementation • 22 Mar 2022 • Haitian Zheng, Zhe Lin, Jingwan Lu, Scott Cohen, Eli Shechtman, Connelly Barnes, Jianming Zhang, Ning Xu, Sohrab Amirghodsi, Jiebo Luo
We propose cascaded modulation GAN (CM-GAN), a new network design consisting of an encoder with Fourier convolution blocks that extract multi-scale feature representations from the input image with holes and a dual-stream decoder with a novel cascaded global-spatial modulation block at each scale level.
Ranked #1 on Image Inpainting on Places2
no code implementations • 15 Mar 2022 • Jeya Maria Jose Valanarasu, He Zhang, Jianming Zhang, Yilin Wang, Zhe Lin, Jose Echevarria, Yinglan Ma, Zijun Wei, Kalyan Sunkavalli, Vishal M. Patel
To enable flexible interaction between user and harmonization, we introduce interactive harmonization, a new setting where the harmonization is performed with respect to a selected region in the reference image instead of the entire background.
1 code implementation • CVPR 2022 • Chenglin Yang, Yilin Wang, Jianming Zhang, He Zhang, Zijun Wei, Zhe Lin, Alan Yuille
We propose Lite Vision Transformer (LVT), a novel light-weight transformer network with two enhanced self-attention mechanisms to improve model performance for mobile deployment.
1 code implementation • ICCV 2021 • Yifan Jiang, He Zhang, Jianming Zhang, Yilin Wang, Zhe Lin, Kalyan Sunkavalli, Simon Chen, Sohrab Amirghodsi, Sarah Kong, Zhangyang Wang
Image harmonization aims to improve the quality of image compositing by matching the "appearance" (e.g., color tone, brightness, and contrast) between foreground and background images.
1 code implementation • 23 Jul 2021 • Zhenyu Wu, Zhaowen Wang, Ye Yuan, Jianming Zhang, Zhangyang Wang, Hailin Jin
Existing diversity tests of samples from GANs are usually conducted qualitatively on a small scale, and/or depend on access to the original training data as well as the trained model parameters.
no code implementations • 15 Jul 2021 • Manuel Lagunas, Xin Sun, Jimei Yang, Ruben Villegas, Jianming Zhang, Zhixin Shu, Belen Masia, Diego Gutierrez
We present a single-image data-driven method to automatically relight images with full-body humans in them.
no code implementations • CVPR 2021 • Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, Yilin Wang, Michael Maire, Ajinkya Kale, Baldo Faieta
We first train our model on COCO and evaluate the learned visual representations on various downstream tasks including image classification, object detection, and instance segmentation.
1 code implementation • CVPR 2021 • Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Long Mai, Simon Chen, Chunhua Shen
Despite significant progress in monocular depth estimation in the wild, recent state-of-the-art methods cannot be used to recover accurate 3D scene shape due to an unknown depth shift induced by shift-invariant reconstruction losses used in mixed-data depth prediction training, and possible unknown camera focal length.
Ranked #1 on Indoor Monocular Depth Estimation on DIODE (using extra training data)
1 code implementation • 14 Dec 2020 • Haitian Zheng, Zhe Lin, Jingwan Lu, Scott Cohen, Jianming Zhang, Ning Xu, Jiebo Luo
A core problem of this task is how to transfer visual details from the input images to the new semantic layout while making the resulting image visually realistic.
1 code implementation • 13 Dec 2020 • Chenglin Yang, Yilin Wang, Jianming Zhang, He Zhang, Zhe Lin, Alan Yuille
To evaluate segmentation quality near object boundaries, we propose the Meticulosity Quality (MQ) score considering both the mask coverage and boundary precision.
1 code implementation • CVPR 2021 • Qihang Yu, Jianming Zhang, He Zhang, Yilin Wang, Zhe Lin, Ning Xu, Yutong Bai, Alan Yuille
We propose Mask Guided (MG) Matting, a robust matting framework that takes a general coarse mask as guidance.
no code implementations • 4 Nov 2020 • He Zhang, Jianming Zhang, Federico Perazzi, Zhe Lin, Vishal M. Patel
In this paper, we propose a new method which can automatically generate high-quality image compositing without any user input.
no code implementations • 11 Sep 2020 • Jianan Li, Jimei Yang, Jianming Zhang, Chang Liu, Christina Wang, Tingfa Xu
In this paper, we introduce Attribute-conditioned Layout GAN to incorporate the attributes of design elements for graphic layout generation by forcing both the generator and the discriminator to meet attribute conditions.
1 code implementation • 13 Aug 2020 • Akash Gupta, Rameswar Panda, Sujoy Paul, Jianming Zhang, Amit K. Roy-Chowdhury
While machine learning approaches to visual recognition offer great promise, most of the existing methods rely heavily on the availability of large quantities of labeled training data.
1 code implementation • ECCV 2020 • Xihui Liu, Zhe Lin, Jianming Zhang, Handong Zhao, Quan Tran, Xiaogang Wang, Hongsheng Li
We propose a novel algorithm, named Open-Edit, which is the first attempt on open-domain image manipulation with open-vocabulary instructions.
1 code implementation • ECCV 2020 • Shikun Liu, Zhe Lin, Yilin Wang, Jianming Zhang, Federico Perazzi, Edward Johns
We present a novel resizing module for neural networks: shape adaptor, a drop-in enhancement built on top of traditional resizing layers, such as pooling, bilinear sampling, and strided convolution.
1 code implementation • CVPR 2021 • Yichen Sheng, Jianming Zhang, Bedrich Benes
We demonstrate that our model produces realistic soft shadows in real-time.
1 code implementation • ECCV 2020 • Yu Zeng, Zhe Lin, Jimei Yang, Jianming Zhang, Eli Shechtman, Huchuan Lu
To address this challenge, we propose an iterative inpainting method with a feedback mechanism.
Ranked #6 on Image Inpainting on Places2
no code implementations • 13 Mar 2020 • Andrea Zunino, Sarah Adel Bargal, Riccardo Volpi, Mehrnoosh Sameki, Jianming Zhang, Stan Sclaroff, Vittorio Murino, Kate Saenko
Explanations are defined as regions of visual evidence upon which a deep classification network makes a decision.
no code implementations • 25 Sep 2019 • Zhenyu Wu, Ye Yuan, Zhaowen Wang, Jianming Zhang, Zhangyang Wang, Hailin Jin
Generative adversarial networks (GANs) are nowadays capable of producing images of incredible realism.
1 code implementation • ICCV 2019 • Jason Kuen, Federico Perazzi, Zhe Lin, Jianming Zhang, Yap-Peng Tan
Large-scale object detection datasets are constantly increasing in size in terms of the number of classes and the annotation count.
1 code implementation • 28 Aug 2019 • Siwang Zhou, Yan He, Yonghe Liu, Chengqing Li, Jianming Zhang
Specifically, with our multichannel structure, the image blocks with a variety of sampling rates can be reconstructed in a single model.
1 code implementation • ICCV 2019 • Yi Zeng, Pingping Zhang, Jianming Zhang, Zhe Lin, Huchuan Lu
This paper pushes forward high-resolution saliency detection, and contributes a new dataset, named High-Resolution Salient Object Detection (HRSOD).
Ranked #11 on RGB Salient Object Detection on DAVIS-S (using extra training data)
no code implementations • 5 Jun 2019 • Donghyun Kim, Sarah Adel Bargal, Jianming Zhang, Stan Sclaroff
However, it has been shown that deep models are vulnerable to adversarial examples.
no code implementations • ICLR 2019 • Donghyun Kim, Sarah Adel Bargal, Jianming Zhang, Stan Sclaroff
Deep models are state-of-the-art for many computer vision tasks including image classification and object detection.
1 code implementation • ICLR 2019 • Jianan Li, Tingfa Xu, Jianming Zhang, Aaron Hertzmann, Jimei Yang
Layouts are important for graphic design and scene generation.
no code implementations • 3 Apr 2019 • Peng Zhou, Long Mai, Jianming Zhang, Ning Xu, Zuxuan Wu, Larry S. Davis
Instead of sequentially distilling knowledge only from the last model, we directly leverage all previous model snapshots.
1 code implementation • 21 Jan 2019 • Jianan Li, Jimei Yang, Aaron Hertzmann, Jianming Zhang, Tingfa Xu
Layout is important for graphic design and scene generation.
no code implementations • 6 Dec 2018 • Sarah Adel Bargal, Andrea Zunino, Vitali Petsiuk, Jianming Zhang, Kate Saenko, Vittorio Murino, Stan Sclaroff
We propose Guided Zoom, an approach that utilizes spatial grounding of a model's decision to make more informed predictions.
1 code implementation • CVPR 2019 • Siyuan Qiao, Zhe Lin, Jianming Zhang, Alan Yuille
By simply replacing standard optimizers with Neural Rejuvenation, we are able to improve the performance of neural networks by a very large margin while using similar training effort and maintaining their original resource usage.
no code implementations • NeurIPS 2018 • Zijun Wei, Boyu Wang, Minh Hoai Nguyen, Jianming Zhang, Zhe Lin, Xiaohui Shen, Radomir Mech, Dimitris Samaras
Detecting segments of interest from an input sequence is a challenging problem which often requires not only good knowledge of individual target segments, but also contextual understanding of the entire input sequence and the relationships between the target segments.
no code implementations • 18 Oct 2018 • Lijun Wang, Xiaohui Shen, Jianming Zhang, Oliver Wang, Zhe Lin, Chih-Yao Hsieh, Sarah Kong, Huchuan Lu
To achieve this, we propose a novel neural network model comprised of a depth prediction module, a lens blur module, and a guided upsampling module.
no code implementations • 21 Sep 2018 • Xin Ye, Zhe Lin, Joon-Young Lee, Jianming Zhang, Shibin Zheng, Yezhou Yang
We study the problem of learning a generalizable action policy for an intelligent agent to actively approach an object of interest in an indoor environment solely from its visual inputs.
1 code implementation • ECCV 2018 • Wei-Chih Hung, Jianming Zhang, Xiaohui Shen, Zhe Lin, Joon-Young Lee, Ming-Hsuan Yang
Specifically, given a foreground image and a background image, our proposed method automatically generates a set of blended photos with scores that indicate their aesthetic quality, using the proposed quality network and policy network.
no code implementations • ECCV 2018 • Yufei Wang, Zhe Lin, Xiaohui Shen, Jianming Zhang, Scott Cohen
Then, we refine and extend the embedding network to predict an attention map, using a curated dataset with bounding box annotations on 750 concepts.
no code implementations • ECCV 2018 • Rameswar Panda, Jianming Zhang, Haoxiang Li, Joon-Young Lee, Xin Lu, Amit K. Roy-Chowdhury
While machine learning approaches to visual emotion recognition offer great promise, current methods consider training and testing models on small scale datasets covering limited visual emotion concepts.
no code implementations • CVPR 2018 • Zijun Wei, Jianming Zhang, Xiaohui Shen, Zhe Lin, Radomír Mech, Minh Hoai, Dimitris Samaras
Finding views with good photo composition is a challenging task for machine learning methods.
1 code implementation • 23 May 2018 • Andrea Zunino, Sarah Adel Bargal, Pietro Morerio, Jianming Zhang, Stan Sclaroff, Vittorio Murino
In this work, we utilize the evidence at each neuron to determine the probability of dropout, rather than dropping out neurons uniformly at random as in standard dropout.
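The idea of evidence-guided dropout can be sketched as follows; this is an illustrative variant, not the paper's exact retention schedule, and the function name and base-rate interpolation are assumptions:

```python
import numpy as np

def evidence_dropout(activations, base_rate=0.5, rng=None):
    """Drop neurons with probability guided by their evidence (sketch).

    Neurons carrying more normalized evidence are dropped more often,
    encouraging alternative paths through the network; under uniform
    evidence this reduces to standard dropout at base_rate.
    """
    rng = rng or np.random.default_rng(0)
    evidence = activations / (activations.sum() + 1e-8)   # normalized evidence
    n = activations.size
    p_drop = np.clip(base_rate * n * evidence, 0.0, 0.95)
    mask = rng.random(activations.shape) >= p_drop
    keep_prob = 1.0 - p_drop
    return activations * mask / np.maximum(keep_prob, 1e-8)  # inverted scaling

out = evidence_dropout(np.ones(4))
```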
1 code implementation • CVPR 2018 • Sarah Adel Bargal, Andrea Zunino, Donghyun Kim, Jianming Zhang, Vittorio Murino, Stan Sclaroff
Models are trained to caption or classify activity in videos, but little is known about the evidence used to make such decisions.
no code implementations • 30 Apr 2017 • Danna Gurari, Kun He, Bo Xiong, Jianming Zhang, Mehrnoosh Sameki, Suyog Dutt Jain, Stan Sclaroff, Margrit Betke, Kristen Grauman
We propose the ambiguity problem for the foreground object segmentation task and motivate the importance of estimating and accounting for this ambiguity when designing vision systems.
6 code implementations • CVPR 2017 • Vasili Ramanishka, Abir Das, Jianming Zhang, Kate Saenko
Neural image/video captioning models can generate accurate descriptions, but their internal process of mapping regions to words is a black box and therefore difficult to explain.
3 code implementations • 1 Aug 2016 • Jianming Zhang, Zhe Lin, Jonathan Brandt, Xiaohui Shen, Stan Sclaroff
We aim to model the top-down attention of a Convolutional Neural Network (CNN) classifier for generating task-specific attention maps.
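The probabilistic propagation rule at the heart of this top-down attention scheme (Excitation Backprop) redistributes a "winning probability" from each layer to the one below it through excitatory (positive-weight) connections. A minimal sketch for a single fully connected layer:

```python
import numpy as np

def excitation_backprop(a, W, p_top):
    """One layer of Excitation Backprop (marginal winning probability).

    a : (n_in,) input activations (assumed non-negative, e.g. post-ReLU)
    W : (n_out, n_in) weights; only positive weights carry excitation
    p_top : (n_out,) winning probabilities from the layer above
    Returns (n_in,) probabilities; total probability is conserved.
    """
    Wp = np.maximum(W, 0.0)                      # excitatory connections only
    z = Wp @ a                                   # normalizer per output unit
    ratio = np.where(z > 0, p_top / np.maximum(z, 1e-12), 0.0)
    return a * (Wp.T @ ratio)                    # redistribute probability

a = np.array([1.0, 2.0, 3.0])
W = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
p_in = excitation_backprop(a, W, np.array([0.4, 0.6]))
```

Stacking this rule layer by layer down to an input or intermediate layer yields the task-specific attention map.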
no code implementations • CVPR 2015 • Jianming Zhang, Shugao Ma, Mehrnoosh Sameki, Stan Sclaroff, Margrit Betke, Zhe Lin, Xiaohui Shen, Brian Price, Radomir Mech
We study the problem of Salient Object Subitizing, i.e., predicting the existence and the number of salient objects in an image using holistic cues.
1 code implementation • CVPR 2016 • Jianming Zhang, Stan Sclaroff, Zhe Lin, Xiaohui Shen, Brian Price, Radomir Mech
Our system leverages a Convolutional-Neural-Network model to generate location proposals of salient objects.
no code implementations • 22 Dec 2015 • Shugao Ma, Sarah Adel Bargal, Jianming Zhang, Leonid Sigal, Stan Sclaroff
In contrast, collecting action images from the Web is much easier and training on images requires much less computation.
Ranked #14 on Action Recognition on ActivityNet (using extra training data)
no code implementations • ICCV 2015 • Jianming Zhang, Stan Sclaroff, Zhe Lin, Xiaohui Shen, Brian Price, Radomir Mech
Powered by this fast MBD transform algorithm, the proposed salient object detection method runs at 80 FPS, significantly outperforms previous methods of similar speed on four large benchmark datasets, and achieves comparable or better performance than state-of-the-art methods.
Ranked #6 on Video Salient Object Detection on VOS-T (using extra training data)
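The Minimum Barrier Distance (MBD) transform assigns each pixel the smallest "barrier" — max intensity minus min intensity along a path — over all paths to a seed set. The fast algorithm referenced above approximates this with alternating raster scans; a simplified, unoptimized sketch of that idea:

```python
import numpy as np

def mbd_transform(img, seed_mask, n_passes=2):
    """Raster-scan approximation of the Minimum Barrier Distance transform.

    The barrier cost of a path is max(I) - min(I) along it; each sweep
    relaxes every pixel from its two causal neighbors, alternating scan
    directions (a sketch of the idea, not the paper's optimized version).
    """
    h, w = img.shape
    D = np.where(seed_mask, 0.0, np.inf)   # barrier distance to nearest seed
    U = img.copy()                         # path maximum along current best path
    L = img.copy()                         # path minimum along current best path
    for sweep in range(n_passes):
        fwd = sweep % 2 == 0
        ys = range(h) if fwd else range(h - 1, -1, -1)
        xs = range(w) if fwd else range(w - 1, -1, -1)
        d = -1 if fwd else 1               # causal neighbors for this scan
        for y in ys:
            for x in xs:
                for ny, nx in ((y + d, x), (y, x + d)):
                    if 0 <= ny < h and 0 <= nx < w and np.isfinite(D[ny, nx]):
                        u = max(U[ny, nx], img[y, x])
                        l = min(L[ny, nx], img[y, x])
                        if u - l < D[y, x]:
                            D[y, x], U[y, x], L[y, x] = u - l, u, l
    return D

# seed the image border, as in boundary-prior salient object detection
img = np.array([[0., 0., 0.], [0., 5., 0.], [0., 0., 0.]])
border = np.ones((3, 3), bool)
border[1, 1] = False
D = mbd_transform(img, border)
```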