1 code implementation • 2 Apr 2024 • Hao He, Yinghao Xu, Yuwei Guo, Gordon Wetzstein, Bo Dai, Hongsheng Li, Ceyuan Yang
Controllability plays a crucial role in video generation since it allows users to create desired content.
1 code implementation • 21 Mar 2024 • Yinghao Xu, Zifan Shi, Wang Yifan, Hansheng Chen, Ceyuan Yang, Sida Peng, Yujun Shen, Gordon Wetzstein
We introduce GRM, a large-scale reconstructor capable of recovering a 3D asset from sparse-view images in around 0.1s.
no code implementations • 21 Feb 2024 • Qingyan Bai, Zifan Shi, Yinghao Xu, Hao Ouyang, Qiuyu Wang, Ceyuan Yang, Xuan Wang, Gordon Wetzstein, Yujun Shen, Qifeng Chen
This work presents 3DPE, a practical method that can efficiently edit a face image following given prompts, like reference images or text descriptions, in a 3D-aware manner.
no code implementations • 13 Dec 2023 • Qihang Zhang, Chaoyang Wang, Aliaksandr Siarohin, Peiye Zhuang, Yinghao Xu, Ceyuan Yang, Dahua Lin, Bolei Zhou, Sergey Tulyakov, Hsin-Ying Lee
We are witnessing significant breakthroughs in the technology for generating 3D objects from text.
1 code implementation • 11 Dec 2023 • Ka Leong Cheng, Qiuyu Wang, Zifan Shi, Kecheng Zheng, Yinghao Xu, Hao Ouyang, Qifeng Chen, Yujun Shen
Neural radiance fields, which represent a 3D scene as a color field and a density field, have demonstrated great progress in novel view synthesis yet are unfavorable for editing due to the implicitness.
no code implementations • 4 Dec 2023 • Qihang Zhang, Yinghao Xu, Yujun Shen, Bo Dai, Bolei Zhou, Ceyuan Yang
Generating large-scale 3D scenes cannot be achieved by simply applying existing 3D object synthesis techniques, since 3D scenes usually hold complex spatial configurations and consist of a number of objects at varying scales.
no code implementations • 29 Nov 2023 • Rameen Abdal, Wang Yifan, Zifan Shi, Yinghao Xu, Ryan Po, Zhengfei Kuang, Qifeng Chen, Dit-yan Yeung, Gordon Wetzstein
Instead of rasterizing the shells directly, we sample 3D Gaussians on the shells whose attributes are encoded in the texture features.
no code implementations • 20 Nov 2023 • Peng Wang, Hao Tan, Sai Bi, Yinghao Xu, Fujun Luan, Kalyan Sunkavalli, Wenping Wang, Zexiang Xu, Kai Zhang
We propose a Pose-Free Large Reconstruction Model (PF-LRM) for reconstructing a 3D object from a few unposed images even with little visual overlap, while simultaneously estimating the relative camera poses in ~1.3 seconds on a single A100 GPU.
no code implementations • 15 Nov 2023 • Yinghao Xu, Hao Tan, Fujun Luan, Sai Bi, Peng Wang, Jiahao Li, Zifan Shi, Kalyan Sunkavalli, Gordon Wetzstein, Zexiang Xu, Kai Zhang
We propose DMV3D, a novel 3D generation approach that uses a transformer-based 3D large reconstruction model to denoise multi-view diffusion.
no code implementations • 10 Nov 2023 • Jiahao Li, Hao Tan, Kai Zhang, Zexiang Xu, Fujun Luan, Yinghao Xu, Yicong Hong, Kalyan Sunkavalli, Greg Shakhnarovich, Sai Bi
Text-to-3D with diffusion models has achieved remarkable progress in recent years.
no code implementations • 25 Sep 2023 • Jiapeng Zhu, Yujun Shen, Yinghao Xu, Deli Zhao, Qifeng Chen, Bolei Zhou
This work fills in this gap by proposing in-domain GAN inversion, which consists of a domain-guided encoder and a domain-regularized optimizer, to regularize the inverted code in the native latent space of the pre-trained GAN model.
1 code implementation • 7 Sep 2023 • Jiapeng Zhu, Ceyuan Yang, Kecheng Zheng, Yinghao Xu, Zifan Shi, Yujun Shen
Due to the difficulty in scaling up, generative adversarial networks (GANs) seem to be falling from grace on the task of text-conditioned image synthesis.
no code implementations • 11 Jul 2023 • Yinghao Xu, Wang Yifan, Alexander W. Bergman, Menglei Chai, Bolei Zhou, Gordon Wetzstein
These layers are rendered using alpha compositing with fast differentiable rasterization, and they can be interpreted as a volumetric representation that allocates its capacity to a manifold of finite thickness around the template.
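The layered compositing described above can be sketched with the standard back-to-front "over" operator. This is an illustrative NumPy sketch, not the paper's code; the layer format and function name are assumptions.

```python
import numpy as np

def composite_layers(layers):
    """Alpha-composite semi-transparent layers, ordered back to front.

    layers: list of (rgb, alpha) pairs, where rgb has shape (H, W, 3)
    and alpha has shape (H, W, 1) with values in [0, 1].
    """
    h, w, _ = layers[0][0].shape
    out = np.zeros((h, w, 3))
    for rgb, alpha in layers:
        # "over" operator: new layer weighted by its alpha, previous
        # result attenuated by the remaining transparency
        out = rgb * alpha + out * (1.0 - alpha)
    return out

# Two 2x2 layers: an opaque red background and a half-transparent green layer.
bg = (np.tile([1.0, 0.0, 0.0], (2, 2, 1)), np.ones((2, 2, 1)))
fg = (np.tile([0.0, 1.0, 0.0], (2, 2, 1)), np.full((2, 2, 1), 0.5))
img = composite_layers([bg, fg])
# each pixel becomes [0.5, 0.5, 0.0]
```

In the paper's setting the per-layer colors and alphas come from a differentiable rasterizer rather than fixed arrays, but the compositing step is the same.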
no code implementations • 2 Mar 2023 • Ivan Skorokhodov, Aliaksandr Siarohin, Yinghao Xu, Jian Ren, Hsin-Ying Lee, Peter Wonka, Sergey Tulyakov
Existing 3D-from-2D generators are typically designed for well-curated single-category datasets, where all the objects have (approximately) the same scale, 3D location, and orientation, and the camera always points to the center of the scene.
no code implementations • 20 Jan 2023 • Jianyuan Wang, Lalit Bhagat, Ceyuan Yang, Yinghao Xu, Yujun Shen, Hongdong Li, Bolei Zhou
In this work, we propose a self-supervised approach to improve the spatial steerability of GANs without searching for steerable directions in the latent space or requiring extra annotations.
no code implementations • CVPR 2023 • Zifan Shi, Yujun Shen, Yinghao Xu, Sida Peng, Yiyi Liao, Sheng Guo, Qifeng Chen, Dit-yan Yeung
Existing methods for 3D-aware image synthesis largely depend on the 3D pose distribution pre-estimated on the training set.
no code implementations • 12 Jan 2023 • Yinghao Xu, Yujun Shen, Jiapeng Zhu, Ceyuan Yang, Bolei Zhou
In this work, we demonstrate that such generative features learned from image synthesis exhibit great potential in solving a wide range of computer vision tasks, including both generative ones and, more importantly, discriminative ones.
no code implementations • CVPR 2023 • Yinghao Xu, Menglei Chai, Zifan Shi, Sida Peng, Ivan Skorokhodov, Aliaksandr Siarohin, Ceyuan Yang, Yujun Shen, Hsin-Ying Lee, Bolei Zhou, Sergey Tulyakov
Existing 3D-aware image synthesis approaches mainly focus on generating a single canonical object and show limited capacity in composing a complex scene containing a variety of objects.
1 code implementation • 14 Dec 2022 • Qihang Zhang, Ceyuan Yang, Yujun Shen, Yinghao Xu, Bolei Zhou
Video generation requires synthesizing consistent and persistent frames with dynamic content over time.
Ranked #1 on Video Generation on YouTube Driving
1 code implementation • CVPR 2023 • Qingyan Bai, Ceyuan Yang, Yinghao Xu, Xihui Liu, Yujiu Yang, Yujun Shen
A generative adversarial network (GAN) is formulated as a two-player game between a generator (G) and a discriminator (D), where D is asked to differentiate whether an image comes from real data or is produced by G. Under such a formulation, D acts as the rule maker and hence tends to dominate the competition.
1 code implementation • 27 Oct 2022 • Zifan Shi, Sida Peng, Yinghao Xu, Andreas Geiger, Yiyi Liao, Yujun Shen
In this survey, we thoroughly review the ongoing developments of 3D generative models, including methods that employ 2D and 3D supervision.
no code implementations • 30 Sep 2022 • Zifan Shi, Yinghao Xu, Yujun Shen, Deli Zhao, Qifeng Chen, Dit-yan Yeung
We argue that, considering the two-player game in the formulation of GANs, only making the generator 3D-aware is not enough.
no code implementations • 20 Sep 2022 • Ceyuan Yang, Yujun Shen, Yinghao Xu, Deli Zhao, Bo Dai, Bolei Zhou
Two capacity adjusting schemes are developed for training GANs under different data regimes: i) given a sufficient amount of training data, the discriminator benefits from a progressively increased learning capacity, and ii) when the training data is limited, gradually decreasing the layer width mitigates the over-fitting issue of the discriminator.
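The two regimes above can be sketched as a simple width schedule. This is a hedged illustration of the idea only; the function name, the linear schedule, and the concrete widths are assumptions, not the paper's implementation.

```python
def discriminator_width(step, total_steps, base_width=512, sufficient_data=True):
    """Illustrative capacity schedule for a discriminator layer.

    With sufficient data, capacity grows over training; with limited
    data, width shrinks to mitigate discriminator over-fitting.
    """
    frac = min(step / total_steps, 1.0)
    if sufficient_data:
        # scheme i): progressively increase from half width to full width
        return int(base_width * (0.5 + 0.5 * frac))
    # scheme ii): gradually decrease from full width to half width
    return int(base_width * (1.0 - 0.5 * frac))

# sufficient data: 256 channels at step 0, 512 at the end of training
# limited data:    512 channels at step 0, 256 at the end of training
```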
1 code implementation • CVPR 2022 • Xian Liu, Qianyi Wu, Hang Zhou, Yinghao Xu, Rui Qian, Xinyi Lin, Xiaowei Zhou, Wayne Wu, Bo Dai, Bolei Zhou
To enhance the quality of synthesized gestures, we develop a contrastive learning strategy based on audio-text alignment for better audio representations.
Ranked #3 on Gesture Generation on TED Gesture Dataset
1 code implementation • 21 Mar 2022 • Qingyan Bai, Yinghao Xu, Jiapeng Zhu, Weihao Xia, Yujiu Yang, Yujun Shen
In this work, we propose to involve the padding space of the generator to complement the latent space with spatial information.
1 code implementation • 19 Feb 2022 • Jiapeng Zhu, Yujun Shen, Yinghao Xu, Deli Zhao, Qifeng Chen
Despite the rapid advancement of semantic discovery in the latent space of Generative Adversarial Networks (GANs), existing approaches either are limited to finding global attributes or rely on a number of segmentation masks to identify local attributes.
no code implementations • 19 Jan 2022 • Xian Liu, Yinghao Xu, Qianyi Wu, Hang Zhou, Wayne Wu, Bolei Zhou
Moreover, to enable portrait rendering in one unified neural radiance field, a Torso Deformation module is designed to stabilize the large-scale non-rigid torso motions.
1 code implementation • CVPR 2022 • Yinghao Xu, Sida Peng, Ceyuan Yang, Yujun Shen, Bolei Zhou
The feature field is further accumulated into a 2D feature map as the textural representation, followed by a neural renderer for appearance synthesis.
no code implementations • CVPR 2022 • Yinghao Xu, Fangyun Wei, Xiao Sun, Ceyuan Yang, Yujun Shen, Bo Dai, Bolei Zhou, Stephen Lin
Typically in recent work, the pseudo-labels are obtained by training a model on the labeled data, and then using confident predictions from the model to teach itself.
1 code implementation • CVPR 2022 • Jianyuan Wang, Ceyuan Yang, Yinghao Xu, Yujun Shen, Hongdong Li, Bolei Zhou
We further propose to align the spatial awareness of G with the attention map induced from D. In this way, we effectively lessen the information gap between D and G. Extensive results show that our method pushes the two-player game in GANs closer to the equilibrium, leading to better synthesis performance.
no code implementations • ICCV 2023 • Ceyuan Yang, Yujun Shen, Zhiyi Zhang, Yinghao Xu, Jiapeng Zhu, Zhirong Wu, Bolei Zhou
We then equip the well-learned discriminator backbone with an attribute classifier to ensure that the generator captures the appropriate characters from the reference.
no code implementations • 29 Sep 2021 • Haoyue Bai, Ceyuan Yang, Yinghao Xu, S.-H. Gary Chan, Bolei Zhou
In this paper, we employ interpolated generative models to generate OoD samples at training time via data augmentation.
no code implementations • ICCV 2021 • Bangbang Yang, Yinda Zhang, Yinghao Xu, Yijin Li, Han Zhou, Hujun Bao, Guofeng Zhang, Zhaopeng Cui
In this paper, we present a novel neural scene rendering system, which learns an object-compositional neural radiance field and produces realistic rendering with editing capability for a clustered and real-world scene.
no code implementations • 19 Jun 2021 • Chen Zhang, Yinghao Xu, Yujun Shen
Convolutional Neural Networks (CNNs) have achieved remarkable success in various computer vision tasks but incur tremendous computational cost.
1 code implementation • NeurIPS 2021 • Ceyuan Yang, Yujun Shen, Yinghao Xu, Bolei Zhou
Meanwhile, the learned instance discrimination capability from the discriminator is in turn exploited to encourage the generator for diverse generation.
Ranked #6 on Image Generation on FFHQ 256 x 256 (FD metric)
no code implementations • 18 May 2021 • Chen Zhang, Yinghao Xu, Yujun Shen
Generative Adversarial Networks (GANs) have made great success in synthesizing high-quality images.
3 code implementations • CVPR 2021 • Sida Peng, Yuanqing Zhang, Yinghao Xu, Qianqian Wang, Qing Shuai, Hujun Bao, Xiaowei Zhou
To this end, we propose Neural Body, a new human body representation which assumes that the learned neural representations at different frames share the same set of latent codes anchored to a deformable mesh, so that the observations across frames can be naturally integrated.
1 code implementation • CVPR 2021 • Yinghao Xu, Yujun Shen, Jiapeng Zhu, Ceyuan Yang, Bolei Zhou
Generative Adversarial Networks (GANs) have recently advanced image synthesis by learning the underlying distribution of the observed data.
1 code implementation • 29 Jun 2020 • Yinghao Xu, Ceyuan Yang, Ziwei Liu, Bo Dai, Bolei Zhou
Recent attempts for unsupervised landmark learning leverage synthesized image pairs that are similar in appearance but different in poses.
1 code implementation • 28 Jun 2020 • Ceyuan Yang, Yinghao Xu, Bo Dai, Bolei Zhou
Visual tempo, which describes how fast an action goes, has shown its potential in supervised action recognition.
3 code implementations • CVPR 2020 • Ceyuan Yang, Yinghao Xu, Jianping Shi, Bo Dai, Bolei Zhou
Previous works often capture the visual tempo through sampling raw videos at multiple rates and constructing an input-level frame pyramid, which usually requires a costly multi-branch network to handle.
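The input-level frame pyramid described above can be sketched as sampling the same clip at several temporal strides, one clip per branch. This is a minimal NumPy sketch of the general idea; the function name, stride values, and padding choice are assumptions, not the paper's code.

```python
import numpy as np

def frame_pyramid(video, strides=(1, 2, 4), clip_len=8):
    """Sample one fixed-length clip per temporal stride.

    video: array of shape (T, H, W, C). Each returned clip has shape
    (clip_len, H, W, C); a larger stride covers a longer time span,
    capturing a slower visual tempo.
    """
    clips = []
    for s in strides:
        idx = np.arange(clip_len) * s
        # pad by repeating the last frame if the clip overruns the video
        idx = np.clip(idx, 0, len(video) - 1)
        clips.append(video[idx])
    return clips

video = np.zeros((32, 4, 4, 3))
clips = frame_pyramid(video)
# three clips of shape (8, 4, 4, 3), spanning 8, 15, and 29 frames
```

Feeding each clip to its own backbone branch is what makes this input-level design costly, which motivates the feature-level alternative proposed in the paper.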
Ranked #105 on Action Recognition on Something-Something V2
2 code implementations • ECCV 2020 • Ze Yang, Yinghao Xu, Han Xue, Zheng Zhang, Raquel Urtasun, Li-Wei Wang, Stephen Lin, Han Hu
We present a new object representation, called Dense RepPoints, that utilizes a large set of points to describe an object at multiple levels, including both box level and pixel level.
no code implementations • CVPR 2019 • Yinghao Xu, Xin Dong, Yudian Li, Hao Su
To reduce memory footprint and run-time latency, techniques such as neural network pruning and binarization have been explored separately.
no code implementations • 11 Dec 2018 • Yinghao Xu, Xin Dong, Yudian Li, Hao Su
To reduce memory footprint and run-time latency, techniques such as neural network pruning and binarization have been explored separately.