no code implementations • 15 Apr 2024 • Faraz Faruqi, Yingtao Tian, Vrushank Phadnis, Varun Jampani, Stefanie Mueller
This workshop paper highlights the limitations of generative AI tools in translating digital creations into the physical world and proposes new augmentations to generative AI tools for creating physically viable 3D models.
1 code implementation • 12 Apr 2024 • Mohamed El Banani, Amit Raj, Kevis-Kokitsi Maninis, Abhishek Kar, Yuanzhen Li, Michael Rubinstein, Deqing Sun, Leonidas Guibas, Justin Johnson, Varun Jampani
Given that such models can classify, delineate, and localize objects in 2D, we ask whether they also represent their 3D structure.
no code implementations • 9 Apr 2024 • Ta-Ying Cheng, Prafull Sharma, Andrew Markham, Niki Trigoni, Varun Jampani
We propose ZeST, a method for zero-shot material transfer to an object in the input image given a material exemplar image.
no code implementations • 4 Apr 2024 • Hanzhe Hu, Zhizhuo Zhou, Varun Jampani, Shubham Tulsiani
We present MVD-Fusion: a method for single-view 3D inference via generative modeling of multi-view-consistent RGB-D images.
no code implementations • 2 Apr 2024 • Yunzhi Zhang, Zizhang Li, Amit Raj, Andreas Engelhardt, Yuanzhen Li, Tingbo Hou, Jiajun Wu, Varun Jampani
The framework optimizes for the canonical representation together with the pose for each input image, and a per-image coordinate map that warps 2D pixel coordinates to the 3D canonical frame to account for the shape matching.
no code implementations • 26 Mar 2024 • Astitva Srivastava, Pranav Manu, Amit Raj, Varun Jampani, Avinash Sharma
We achieve this by first learning a latent representation of 3D garments using a novel coarse-to-fine training strategy and a loss for latent disentanglement, promoting better latent interpolation.
no code implementations • 18 Mar 2024 • Vikram Voleti, Chun-Han Yao, Mark Boss, Adam Letts, David Pankratz, Dmitry Tochilkin, Christian Laforte, Robin Rombach, Varun Jampani
In this work, we propose SV3D, which adapts an image-to-video diffusion model for novel multi-view synthesis and 3D generation, leveraging the generalization and multi-view consistency of video models while adding explicit camera control for NVS.
1 code implementation • 4 Mar 2024 • Dmitry Tochilkin, David Pankratz, Zexiang Liu, Zixuan Huang, Adam Letts, Yangguang Li, Ding Liang, Christian Laforte, Varun Jampani, Yan-Pei Cao
This technical report introduces TripoSR, a 3D reconstruction model that leverages a transformer architecture for fast feed-forward 3D generation, producing a 3D mesh from a single image in under 0.5 seconds.
no code implementations • 18 Jan 2024 • Andreas Engelhardt, Amit Raj, Mark Boss, Yunzhi Zhang, Abhishek Kar, Yuanzhen Li, Deqing Sun, Ricardo Martin Brualla, Jonathan T. Barron, Hendrik P. A. Lensch, Varun Jampani
We present SHINOBI, an end-to-end framework for the reconstruction of shape, material, and illumination from object images captured with varying lighting, pose, and background.
no code implementations • 6 Jan 2024 • Shanthika Naik, Kunwar Singh, Astitva Srivastava, Dhawal Sirikonda, Amit Raj, Varun Jampani, Avinash Sharma
We propose a novel self-supervised framework for retargeting non-parameterized 3D garments onto 3D human avatars of arbitrary shapes and poses, enabling 3D virtual try-on (VTON).
no code implementations • 21 Dec 2023 • Zixuan Huang, Stefan Stojanov, Anh Thai, Varun Jampani, James M. Rehg
In contrast, the traditional approach to this problem is regression-based, where deterministic models are trained to directly regress the object shape.
1 code implementation • 14 Dec 2023 • Pakkapon Phongthawee, Worameth Chinchuthakun, Nontaphat Sinsunthithet, Amit Raj, Varun Jampani, Pramook Khungurn, Supasorn Suwajanakorn
To address this problem, we leverage diffusion models trained on billions of standard images to render a chrome ball into the input image.
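A chrome ball works as a light probe because each pixel on the mirrored sphere reflects a known world direction. A minimal numpy sketch of that mapping, assuming an orthographic camera (the function and variable names are ours, not the paper's):

```python
import numpy as np

def chrome_ball_reflection(x, y):
    """Map a point (x, y) on the unit-disk image of a mirrored ball
    to the world direction it reflects, assuming an orthographic
    camera with view direction v = (0, 0, 1) from surface to camera.
    """
    z = np.sqrt(max(0.0, 1.0 - x * x - y * y))
    n = np.array([x, y, z])            # sphere surface normal at this pixel
    v = np.array([0.0, 0.0, 1.0])      # direction from surface to camera
    r = 2.0 * np.dot(n, v) * n - v     # mirror reflection of the view ray
    return r
```

The ball's center reflects the camera direction itself, while pixels near the silhouette reflect directions behind the ball, which is why a single sphere captures (almost) the full environment.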
no code implementations • 11 Dec 2023 • Xiaogang Peng, Yiming Xie, Zizhao Wu, Varun Jampani, Deqing Sun, Huaizu Jiang
We also develop an affordance prediction diffusion model (APDM) to predict the contacting area between the human and object during the interactions driven by the textual prompt.
no code implementations • 7 Dec 2023 • Ethan Weber, Aleksander Hołyński, Varun Jampani, Saurabh Saxena, Noah Snavely, Abhishek Kar, Angjoo Kanazawa
In contrast to related works, we focus on completing scenes rather than deleting foreground objects, and our approach does not require tight 2D object masks or text.
no code implementations • 5 Dec 2023 • Prafull Sharma, Varun Jampani, Yuanzhen Li, Xuhui Jia, Dmitry Lagun, Fredo Durand, William T. Freeman, Mark Matthews
We propose a method to control material attributes of objects like roughness, metallic, albedo, and transparency in real images.
1 code implementation • 4 Dec 2023 • Lu Qi, Lehan Yang, Weidong Guo, Yu Xu, Bo Du, Varun Jampani, Ming-Hsuan Yang
On the other hand, the progressive dichotomy module can efficiently decode the synthesized colormap to high-quality entity-level masks in a depth-first binary search without knowing the cluster numbers.
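The depth-first binary search idea can be illustrated in isolation: recursively halve the colormap range until each segment's values are tight, yielding one mask per segment without fixing the cluster count in advance. A toy numpy sketch under our own assumptions (1-D scalar colormap in [0, 1); the threshold and names are ours):

```python
import numpy as np

def dichotomy_masks(values, lo=0.0, hi=1.0, tol=0.05):
    """Recursively split the colormap range [lo, hi) depth-first.
    A segment becomes one mask once its values are tighter than tol;
    the number of clusters is never specified in advance.
    """
    values = np.asarray(values, dtype=float)
    sel = (values >= lo) & (values < hi)
    if not sel.any():
        return []
    if values[sel].max() - values[sel].min() <= tol:
        return [sel]                      # one entity-level mask
    mid = 0.5 * (lo + hi)                 # split and recurse depth-first
    return (dichotomy_masks(values, lo, mid, tol)
            + dichotomy_masks(values, mid, hi, tol))
```

On a toy colormap with three value groups this produces three disjoint masks that together cover every pixel.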
no code implementations • 29 Nov 2023 • Gen Li, Deqing Sun, Laura Sevilla-Lara, Varun Jampani
We introduce One-shot Open Affordance Learning (OOAL), where a model is trained with just one example per base object category, but is expected to identify novel objects and affordances.
1 code implementation • 28 Nov 2023 • Junyi Zhang, Charles Herrmann, Junhwa Hur, Eric Chen, Varun Jampani, Deqing Sun, Ming-Hsuan Yang
This paper identifies the importance of being geometry-aware for semantic correspondence and reveals a limitation of the features of current foundation models under simple post-processing.
Ranked #1 on Semantic correspondence on PF-PASCAL
no code implementations • 27 Nov 2023 • Rishubh Parihar, Prasanna Balaji, Raghav Magazine, Sarthak Vora, Tejan Karmali, Varun Jampani, R. Venkatesh Babu
We capitalize on disentangled latent spaces of pretrained GANs and train a Denoising Diffusion Probabilistic Model (DDPM) to learn the latent distribution for diverse edits.
2 code implementations • 2023 • Andreas Blattmann, Tim Dockhorn, Sumith Kulal, Daniel Mendelevitch, Maciej Kilian, Dominik Lorenz, Yam Levi, Zion English, Vikram Voleti, Adam Letts, Varun Jampani, Robin Rombach
We then explore the impact of finetuning our base model on high-quality data and train a text-to-video model that is competitive with closed-source video generation.
1 code implementation • 22 Nov 2023 • Viraj Shah, Nataniel Ruiz, Forrester Cole, Erika Lu, Svetlana Lazebnik, Yuanzhen Li, Varun Jampani
Experiments on a wide range of subject and style combinations show that ZipLoRA can generate compelling results with meaningful improvements over baselines in subject and style fidelity while preserving the ability to recontextualize.
1 code implementation • 12 Oct 2023 • Yiming Xie, Varun Jampani, Lei Zhong, Deqing Sun, Huaizu Jiang
We present a novel approach named OmniControl for incorporating flexible spatial control signals into a text-conditioned human motion generation model based on the diffusion process.
2 code implementations • 13 Jul 2023 • Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Wei Wei, Tingbo Hou, Yael Pritch, Neal Wadhwa, Michael Rubinstein, Kfir Aberman
By composing these weights into the diffusion model, coupled with fast finetuning, HyperDreamBooth can generate a person's face in various contexts and styles with high subject detail, while also preserving the model's crucial knowledge of diverse styles and semantic modifications.
no code implementations • 8 Jun 2023 • Manel Baradad, Yuanzhen Li, Forrester Cole, Michael Rubinstein, Antonio Torralba, William T. Freeman, Varun Jampani
To infer object depth on a real image, we place the segmented object into the learned background prompt and run off-the-shelf depth networks.
no code implementations • ICCV 2023 • Zezhou Cheng, Carlos Esteves, Varun Jampani, Abhishek Kar, Subhransu Maji, Ameesh Makadia
Consequently, there is growing interest in extending NeRF models to jointly optimize camera poses and scene representation, offering an alternative to off-the-shelf SfM pipelines, which have well-understood failure modes.
no code implementations • 28 May 2023 • Zhiwei Jia, Pradyumna Narayana, Arjun R. Akula, Garima Pruthi, Hao Su, Sugato Basu, Varun Jampani
Image ad understanding is a crucial task with wide real-world applications.
1 code implementation • NeurIPS 2023 • Junyi Zhang, Charles Herrmann, Junhwa Hur, Luisa Polania Cabrera, Varun Jampani, Deqing Sun, Ming-Hsuan Yang
Text-to-image diffusion models have made significant advances in generating and editing high-quality images.
Ranked #3 on Semantic correspondence on SPair-71k
1 code implementation • NeurIPS 2023 • Weixi Feng, Wanrong Zhu, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Xuehai He, Sugato Basu, Xin Eric Wang, William Yang Wang
When combined with a downstream image generation model, LayoutGPT outperforms text-to-image models/systems by 20-40% and achieves performance comparable to human users in designing visual layouts for numerical and spatial correctness.
1 code implementation • 18 May 2023 • Xuehai He, Weixi Feng, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Pradyumna Narayana, Sugato Basu, William Yang Wang, Xin Eric Wang
Diffusion models, such as Stable Diffusion, have shown incredible performance on text-to-image generation.
no code implementations • 2 May 2023 • Zehao Zhu, Jiashun Wang, Yuzhe Qin, Deqing Sun, Varun Jampani, Xiaolong Wang
We propose a new dataset and a novel approach to learning hand-object interaction priors for hand and articulated object pose estimation.
no code implementations • CVPR 2023 • Zixuan Huang, Varun Jampani, Anh Thai, Yuanzhen Li, Stefan Stojanov, James M. Rehg
We present ShapeClipper, a novel method that reconstructs 3D object shapes from real-world single-view RGB images.
1 code implementation • CVPR 2023 • Harsh Rangwani, Lavish Bansal, Kartik Sharma, Tejan Karmali, Varun Jampani, R. Venkatesh Babu
We find that one reason for degradation is the collapse of latents for each class in the $\mathcal{W}$ latent space.
Ranked #1 on Conditional Image Generation on ImageNet-LT
no code implementations • ICCV 2023 • Kamal Gupta, Varun Jampani, Carlos Esteves, Abhinav Shrivastava, Ameesh Makadia, Noah Snavely, Abhishek Kar
We present a self-supervised technique that directly optimizes on a sparse collection of images of a particular object/object category to obtain consistent dense correspondences across the collection.
no code implementations • ICCV 2023 • Amit Raj, Srinivas Kaza, Ben Poole, Michael Niemeyer, Nataniel Ruiz, Ben Mildenhall, Shiran Zada, Kfir Aberman, Michael Rubinstein, Jonathan Barron, Yuanzhen Li, Varun Jampani
We present DreamBooth3D, an approach to personalize text-to-3D generative models from as few as 3-6 casually captured images of a subject.
no code implementations • CVPR 2023 • Gen Li, Varun Jampani, Deqing Sun, Laura Sevilla-Lara
A key step to acquire this skill is to identify what part of the object affords each action, which is called affordance grounding.
1 code implementation • 9 Feb 2023 • Guandao Yang, Sagie Benaim, Varun Jampani, Kyle Genova, Jonathan T. Barron, Thomas Funkhouser, Bharath Hariharan, Serge Belongie
We use this framework to design Fourier PNFs, which match state-of-the-art performance in signal representation tasks that use neural fields.
1 code implementation • 31 Jan 2023 • Ching-Yao Chuang, Varun Jampani, Yuanzhen Li, Antonio Torralba, Stefanie Jegelka
Machine learning models have been shown to inherit biases from their training datasets.
1 code implementation • CVPR 2023 • Chun-Han Yao, Wei-Chih Hung, Yuanzhen Li, Michael Rubinstein, Ming-Hsuan Yang, Varun Jampani
Automatically estimating 3D skeleton, shape, camera viewpoints, and part articulation from sparse in-the-wild image ensembles is a severely under-constrained and challenging problem.
no code implementations • CVPR 2023 • Arjun R. Akula, Brendan Driscoll, Pradyumna Narayana, Soravit Changpinyo, Zhiwei Jia, Suyash Damle, Garima Pruthi, Sugato Basu, Leonidas Guibas, William T. Freeman, Yuanzhen Li, Varun Jampani
Towards this goal, we introduce MetaCLUE, a set of vision tasks on visual metaphor.
1 code implementation • 9 Dec 2022 • Weixi Feng, Xuehai He, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Pradyumna Narayana, Sugato Basu, Xin Eric Wang, William Yang Wang
In this work, we improve the compositional skills of T2I models, specifically more accurate attribute binding and better image compositions.
no code implementations • 28 Oct 2022 • Jogendra Nath Kundu, Suvaansh Bhambri, Akshay Kulkarni, Hiran Sarkar, Varun Jampani, R. Venkatesh Babu
Universal Domain Adaptation (UniDA) deals with the problem of knowledge transfer between two datasets with domain-shift as well as category-shift.
no code implementations • 19 Oct 2022 • Xuehai He, Diji Yang, Weixi Feng, Tsu-Jui Fu, Arjun Akula, Varun Jampani, Pradyumna Narayana, Sugato Basu, William Yang Wang, Xin Eric Wang
Prompt tuning is a new few-shot transfer learning technique that only tunes the learnable prompt for pre-trained vision and language models such as CLIP.
10 code implementations • CVPR 2023 • Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Yael Pritch, Michael Rubinstein, Kfir Aberman
Once the subject is embedded in the output domain of the model, the unique identifier can be used to synthesize novel photorealistic images of the subject contextualized in different scenes.
1 code implementation • 21 Aug 2022 • Harsh Rangwani, Naman Jaswani, Tejan Karmali, Varun Jampani, R. Venkatesh Babu
Deep long-tailed learning aims to train useful deep networks on practical, real-world imbalanced distributions, wherein most labels of the tail classes are associated with a few samples.
Ranked #1 on Image Generation on LSUN
no code implementations • 7 Aug 2022 • Tejan Karmali, Rishubh Parihar, Susmit Agrawal, Harsh Rangwani, Varun Jampani, Maneesh Singh, R. Venkatesh Babu
The quality of the generated images is predicated on two assumptions: (a) the richness of the hierarchical representations learnt by the generator, and (b) the linearity and smoothness of the style spaces.
3 code implementations • 27 Jul 2022 • Jogendra Nath Kundu, Suvaansh Bhambri, Akshay Kulkarni, Hiran Sarkar, Varun Jampani, R. Venkatesh Babu
The prime challenge in unsupervised domain adaptation (DA) is to mitigate the domain shift between the source and target domains.
no code implementations • 7 Jul 2022 • Chun-Han Yao, Wei-Chih Hung, Yuanzhen Li, Michael Rubinstein, Ming-Hsuan Yang, Varun Jampani
In this work, we propose a practical problem setting to estimate 3D pose and shape of animals given only a few (10-30) in-the-wild images of a particular animal species (say, horse).
1 code implementation • 16 Jun 2022 • Jogendra Nath Kundu, Akshay Kulkarni, Suvaansh Bhambri, Deepesh Mehta, Shreyas Kulkarni, Varun Jampani, R. Venkatesh Babu
Conventional domain adaptation (DA) techniques aim to improve domain transferability by learning domain-invariant representations; while concurrently preserving the task-discriminability knowledge gathered from the labeled source data.
1 code implementation • 31 May 2022 • Mark Boss, Andreas Engelhardt, Abhishek Kar, Yuanzhen Li, Deqing Sun, Jonathan T. Barron, Hendrik P. A. Lensch, Varun Jampani
Our method works on in-the-wild online image collections of an object and produces relightable 3D assets for several use-cases such as AR/VR.
no code implementations • 21 Apr 2022 • Zixuan Huang, Stefan Stojanov, Anh Thai, Varun Jampani, James M. Rehg
We present a novel 3D shape reconstruction method which learns to predict an implicit 3D shape representation from a single RGB image.
1 code implementation • 17 Apr 2022 • Hwanjun Song, Deqing Sun, Sanghyuk Chun, Varun Jampani, Dongyoon Han, Byeongho Heo, Wonjae Kim, Ming-Hsuan Yang
Transformers have been widely used in numerous vision problems especially for visual recognition and detection.
no code implementations • 6 Apr 2022 • Tejan Karmali, Abhinav Atrishi, Sai Sree Harsha, Susmit Agrawal, Varun Jampani, R. Venkatesh Babu
Existing works in self-supervised landmark detection are based on learning dense (pixel-level) feature representations from an image, which are further used to learn landmarks in a semi-supervised manner.
no code implementations • NeurIPS 2021 • Jogendra Nath Kundu, Siddharth Seth, Anirudh Jamkhandi, Pradyumna YM, Varun Jampani, Anirban Chakraborty, R. Venkatesh Babu
To this end, we cast 3D pose learning as a self-supervised adaptation problem that aims to transfer the task knowledge from a labeled source domain to a completely unpaired target.
Ranked #5 on Unsupervised 3D Human Pose Estimation on Human3.6M
no code implementations • NeurIPS 2021 • Mugalodi Rakesh, Jogendra Nath Kundu, Varun Jampani, R. Venkatesh Babu
Articulation-centric 2D/3D pose supervision forms the core training objective in most existing 3D human pose estimation techniques.
no code implementations • CVPR 2022 • Jogendra Nath Kundu, Siddharth Seth, Pradyumna YM, Varun Jampani, Anirban Chakraborty, R. Venkatesh Babu
The advances in monocular 3D human pose estimation are dominated by supervised techniques that require large-scale 2D/3D pose annotations.
Ranked #8 on Unsupervised 3D Human Pose Estimation on Human3.6M
no code implementations • 9 Feb 2022 • Jogendra Nath Kundu, Akshay Kulkarni, Suvaansh Bhambri, Varun Jampani, R. Venkatesh Babu
However, we find that latent features derived from the Fourier-based amplitude spectrum of deep CNN features hold a more tractable mapping with domain discrimination.
no code implementations • CVPR 2022 • Tewodros Habtegebrial, Christiano Gava, Marcel Rogge, Didier Stricker, Varun Jampani
We propose a novel MSI representation called Soft Occlusion MSI (SOMSI) that enables modelling high-dimensional appearance features in MSI while retaining the fast rendering times of a standard MSI.
1 code implementation • NeurIPS 2021 • Gengshan Yang, Deqing Sun, Varun Jampani, Daniel Vlasic, Forrester Cole, Ce Liu, Deva Ramanan
The surface embeddings are implemented as coordinate-based MLPs that are fit to each video via consistency and contrastive reconstruction losses. Experimental results show that ViSER compares favorably against prior work on challenging videos of humans with loose clothing and unusual poses, as well as animal videos from DAVIS and YTVOS.
no code implementations • NeurIPS 2021 • Arjun Akula, Varun Jampani, Soravit Changpinyo, Song-Chun Zhu
Neural module networks (NMN) are a popular approach for solving multi-modal tasks such as visual question answering (VQA) and visual referring expression recognition (REF).
1 code implementation • NeurIPS 2021 • Mark Boss, Varun Jampani, Raphael Braun, Ce Liu, Jonathan T. Barron, Hendrik P. A. Lensch
Decomposing a scene into its shape, reflectance and illumination is a fundamental problem in computer vision and graphics.
1 code implementation • ICLR 2022 • Hwanjun Song, Deqing Sun, Sanghyuk Chun, Varun Jampani, Dongyoon Han, Byeongho Heo, Wonjae Kim, Ming-Hsuan Yang
Transformers are transforming the landscape of computer vision, especially for recognition tasks.
Ranked #12 on Object Detection on COCO 2017 val
1 code implementation • 29 Sep 2021 • Kieran A Murphy, Varun Jampani, Srikumar Ramalingam, Ameesh Makadia
We propose a novel algorithm that relies on a weak form of supervision where the data is partitioned into sets according to certain inactive factors of variation.
no code implementations • ICCV 2021 • Varun Jampani, Huiwen Chang, Kyle Sargent, Abhishek Kar, Richard Tucker, Michael Krainin, Dominik Kaeser, William T. Freeman, David Salesin, Brian Curless, Ce Liu
We present SLIDE, a modular and unified system for single image 3D photography that uses a simple yet effective soft layering strategy to better preserve appearance details in novel views.
1 code implementation • ICCV 2021 • Jogendra Nath Kundu, Akshay Kulkarni, Amit Singh, Varun Jampani, R. Venkatesh Babu
Unsupervised domain adaptation (DA) has gained substantial interest in semantic segmentation.
Ranked #4 on Domain Generalization on GTA5-to-Cityscapes
no code implementations • ICCV 2021 • Chun-Han Yao, Wei-Chih Hung, Varun Jampani, Ming-Hsuan Yang
Reasoning 3D shapes from 2D images is an essential yet challenging task, especially when only single-view images are at our disposal.
2 code implementations • 10 Jun 2021 • Kieran Murphy, Carlos Esteves, Varun Jampani, Srikumar Ramalingam, Ameesh Makadia
Single image pose estimation is a fundamental problem in many vision and robotics tasks, and existing deep learning approaches suffer from not completely modeling and handling: (i) uncertainty about the predictions, and (ii) symmetric objects with multiple (sometimes infinitely many) correct poses.
1 code implementation • CVPR 2021 • Gengshan Yang, Deqing Sun, Varun Jampani, Daniel Vlasic, Forrester Cole, Huiwen Chang, Deva Ramanan, William T. Freeman, Ce Liu
Remarkable progress has been made in 3D reconstruction of rigid structures from a video or a collection of images.
1 code implementation • CVPR 2021 • Deqing Sun, Daniel Vlasic, Charles Herrmann, Varun Jampani, Michael Krainin, Huiwen Chang, Ramin Zabih, William T. Freeman, Ce Liu
Synthetic datasets play a critical role in pre-training CNN models for optical flow, but they are painstaking to generate and hard to adapt to new applications.
1 code implementation • CVPR 2021 • Jingkai Zhou, Varun Jampani, Zhixiong Pi, Qiong Liu, Ming-Hsuan Yang
Inspired by recent advances in attention, DDF decouples a depth-wise dynamic filter into spatial and channel dynamic filters.
Ranked #13 on Semantic Segmentation on MCubeS
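The decoupling can be written out directly: rather than predicting a full per-pixel, per-channel k×k filter, a DDF-style layer combines a per-position spatial filter with a per-channel filter. A naive loop-based numpy sketch of applying such a decoupled filter (the multiplicative combination and all names are our illustrative assumptions):

```python
import numpy as np

def decoupled_dynamic_filter(x, spatial, channel, k=3):
    """Apply a decoupled dynamic filter.
    x:       (C, H, W) feature map
    spatial: (H, W, k*k) per-position filter, shared across channels
    channel: (C, k*k) per-channel filter, shared across positions
    The effective filter at (c, i, j) is spatial[i, j] * channel[c],
    storing O(H*W*k*k + C*k*k) values instead of O(C*H*W*k*k).
    """
    C, H, W = x.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            patch = xp[:, i:i + k, j:j + k].reshape(C, k * k)
            w = spatial[i, j][None, :] * channel       # (C, k*k) effective filter
            out[:, i, j] = (patch * w).sum(axis=1)
    return out
```

Setting the spatial filter to a centered delta and the channel filter to ones recovers the identity, a quick sanity check that the indexing is right.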
2 code implementations • CVPR 2021 • Gen Li, Varun Jampani, Laura Sevilla-Lara, Deqing Sun, Jonghyun Kim, Joongkyu Kim
By integrating the SGC and GPA together, we propose the Adaptive Superpixel-guided Network (ASGNet), which is a lightweight model and adapts to object scale and shape variation.
Ranked #58 on Few-Shot Semantic Segmentation on COCO-20i (5-shot)
1 code implementation • CVPR 2022 • Kieran A. Murphy, Varun Jampani, Srikumar Ramalingam, Ameesh Makadia
We propose a novel algorithm that utilizes a weak form of supervision where the data is partitioned into sets according to certain inactive (common) factors of variation which are invariant across elements of each set.
no code implementations • 1 Jan 2021 • Kieran A Murphy, Varun Jampani, Srikumar Ramalingam, Ameesh Makadia
In this work, we operate in the setting where limited information is known about the data in the form of groupings, or set membership, and the task is to learn representations which isolate the factors of variation that are common across the groupings.
1 code implementation • ICCV 2021 • Andrew Liu, Richard Tucker, Varun Jampani, Ameesh Makadia, Noah Snavely, Angjoo Kanazawa
We introduce the problem of perpetual view generation - long-range generation of novel views corresponding to an arbitrarily long camera trajectory given a single image.
1 code implementation • ICCV 2021 • Mark Boss, Raphael Braun, Varun Jampani, Jonathan T. Barron, Ce Liu, Hendrik P. A. Lensch
This problem is inherently more challenging when the illumination is not a single light source under laboratory conditions but is instead an unconstrained environmental illumination.
Ranked #5 on Image Relighting on Stanford-ORB
no code implementations • 25 Aug 2020 • Jialiang Wang, Varun Jampani, Deqing Sun, Charles Loop, Stan Birchfield, Jan Kautz
End-to-end deep learning methods have advanced stereo vision in recent years and obtained excellent results when the training and test data are similar.
1 code implementation • NeurIPS 2020 • Tewodros Habtegebrial, Varun Jampani, Orazio Gallo, Didier Stricker
We propose to push the envelope further, and introduce Generative View Synthesis (GVS), which can synthesize multiple photorealistic views of a scene given a single semantic map.
2 code implementations • ECCV 2020 • Wentao Yuan, Ben Eckart, Kihwan Kim, Varun Jampani, Dieter Fox, Jan Kautz
Point cloud registration is a fundamental problem in 3D computer vision, graphics and robotics.
no code implementations • ECCV 2020 • Jogendra Nath Kundu, Mugalodi Rakesh, Varun Jampani, Rahul Mysore Venkatesh, R. Venkatesh Babu
We present a self-supervised human mesh recovery framework to infer human pose and shape from monocular images in the absence of any paired supervision.
no code implementations • 29 Jun 2020 • Hassan Abu Alhaija, Siva Karthik Mustikovela, Justus Thies, Varun Jampani, Matthias Nießner, Andreas Geiger, Carsten Rother
Neural rendering techniques promise efficient photo-realistic image synthesis while at the same time providing rich control over scene parameters by learning the physical image formation process.
1 code implementation • CVPR 2020 • K L Navaneet, Ansu Mathew, Shashank Kashyap, Wei-Chih Hung, Varun Jampani, R. Venkatesh Babu
We learn both 3D point cloud reconstruction and pose estimation networks in a self-supervised manner, making use of a differentiable point cloud renderer to train with 2D supervision.
no code implementations • CVPR 2020 • Jogendra Nath Kundu, Siddharth Seth, Varun Jampani, Mugalodi Rakesh, R. Venkatesh Babu, Anirban Chakraborty
Camera captured human pose is an outcome of several sources of variation.
2 code implementations • CVPR 2020 • Siva Karthik Mustikovela, Varun Jampani, Shalini De Mello, Sifei Liu, Umar Iqbal, Carsten Rother, Jan Kautz
Training deep neural networks to estimate the viewpoint of objects requires large labeled training datasets.
1 code implementation • CVPR 2020 • Mark Boss, Varun Jampani, Kihwan Kim, Hendrik P. A. Lensch, Jan Kautz
Extensive experiments on both synthetic and real-world datasets show that our network trained on a synthetic dataset can generalize well to real-world images.
1 code implementation • ECCV 2020 • Xueting Li, Sifei Liu, Kihwan Kim, Shalini De Mello, Varun Jampani, Ming-Hsuan Yang, Jan Kautz
To the best of our knowledge, we are the first to attempt to solve the single-view reconstruction problem without a category-specific template mesh or semantic keypoints.
1 code implementation • ICCV 2019 • Huaizu Jiang, Deqing Sun, Varun Jampani, Zhaoyang Lv, Erik Learned-Miller, Jan Kautz
We introduce a compact network for holistic scene flow estimation, called SENSE, which shares common encoder features among four closely-related tasks: optical flow estimation, disparity estimation from stereo, occlusion estimation, and semantic segmentation.
no code implementations • ICCV 2019 • Sifei Liu, Xueting Li, Varun Jampani, Shalini De Mello, Jan Kautz
We experiment with semantic segmentation networks, where we use our propagation module to jointly train on different data -- images, superpixels and point clouds.
4 code implementations • ICCV 2019 • Towaki Takikawa, David Acuna, Varun Jampani, Sanja Fidler
Here, we propose a new two-stream CNN architecture for semantic segmentation that explicitly wires shape information as a separate processing branch, i.e., a shape stream, that processes information in parallel to the classical stream.
Ranked #24 on Semantic Segmentation on Cityscapes test
1 code implementation • CVPR 2019 • Wei-Chih Hung, Varun Jampani, Sifei Liu, Pavlo Molchanov, Ming-Hsuan Yang, Jan Kautz
Parts provide a good intermediate representation of objects that is robust with respect to the camera, pose and appearance variations.
Ranked #4 on Unsupervised Keypoint Estimation on CUB
2 code implementations • CVPR 2019 • Hang Su, Varun Jampani, Deqing Sun, Orazio Gallo, Erik Learned-Miller, Jan Kautz
In addition, we also demonstrate that PAC can be used as a drop-in replacement for convolution layers in pre-trained networks, resulting in consistent performance improvements.
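The drop-in property follows from PAC's form: a fixed spatial kernel is modulated per tap by a Gaussian on guidance features, so constant guidance recovers ordinary convolution. A loop-based, single-channel numpy sketch under our own simplifications (scalar guidance, no learned offsets; names are ours):

```python
import numpy as np

def pixel_adaptive_conv(x, f, W, sigma=1.0):
    """Pixel-adaptive convolution on a single-channel map.
    x: (H, W) input, f: (H, W) guidance feature, W: (k, k) fixed kernel.
    Each kernel tap is reweighted by exp(-(f_i - f_j)^2 / (2 sigma^2));
    with constant guidance f this reduces to standard convolution.
    """
    k = W.shape[0]
    pad = k // 2
    xp = np.pad(x, pad)                  # zero-pad the input
    fp = np.pad(f, pad, mode='edge')     # edge-pad the guidance
    H, Wd = x.shape
    out = np.zeros_like(x, dtype=float)
    for i in range(H):
        for j in range(Wd):
            fpatch = fp[i:i + k, j:j + k]
            gauss = np.exp(-((fpatch - f[i, j]) ** 2) / (2 * sigma ** 2))
            out[i, j] = (xp[i:i + k, j:j + k] * gauss * W).sum()
    return out
```

With uniform guidance the Gaussian term is 1 everywhere and the output matches a plain zero-padded convolution, which is what makes PAC a drop-in replacement for pre-trained convolution layers.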
2 code implementations • ECCV 2018 • Varun Jampani, Deqing Sun, Ming-Yu Liu, Ming-Hsuan Yang, Jan Kautz
Superpixels provide an efficient low/mid-level representation of image data, which greatly reduces the number of image primitives for subsequent vision tasks.
no code implementations • CVPR 2018 • Wei-Chih Tu, Ming-Yu Liu, Varun Jampani, Deqing Sun, Shao-Yi Chien, Ming-Hsuan Yang, Jan Kautz
Specifically, we propose a new loss function that takes the segmentation error into account for affinity learning.
1 code implementation • CVPR 2019 • Anurag Ranjan, Varun Jampani, Lukas Balles, Kihwan Kim, Deqing Sun, Jonas Wulff, Michael J. Black
We address the unsupervised learning of several interconnected problems in low-level vision: single view depth prediction, camera motion estimation, optical flow, and segmentation of a video into the static scene and moving regions.
Ranked #66 on Monocular Depth Estimation on KITTI Eigen split
1 code implementation • ECCV 2018 • Sifei Liu, Guangyu Zhong, Shalini De Mello, Jinwei Gu, Varun Jampani, Ming-Hsuan Yang, Jan Kautz
Our approach is based on a temporal propagation network (TPN), which models the transition-related affinity between a pair of frames in a purely data-driven manner.
1 code implementation • 18 Apr 2018 • Jonathan Tremblay, Aayush Prakash, David Acuna, Mark Brophy, Varun Jampani, Cem Anil, Thang To, Eric Cameracci, Shaad Boochoon, Stan Birchfield
We present a system for training deep neural networks for object detection using synthetic images.
2 code implementations • CVPR 2018 • Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, Jan Kautz
We present a network architecture for processing point clouds that directly operates on a collection of points represented as a sparse set of samples in a high-dimensional lattice.
Ranked #30 on Semantic Segmentation on ScanNet
no code implementations • 22 Dec 2017 • Laura Sevilla-Lara, Yiyi Liao, Fatma Guney, Varun Jampani, Andreas Geiger, Michael J. Black
Here we take a deeper look at the combination of flow and action recognition, and investigate why optical flow is helpful, what makes a flow method good for action recognition, and how we can make it better.
5 code implementations • CVPR 2018 • Huaizu Jiang, Deqing Sun, Varun Jampani, Ming-Hsuan Yang, Erik Learned-Miller, Jan Kautz
Finally, the two input images are warped and linearly fused to form each intermediate frame.
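The final fusion step is a simple time-weighted blend of the two warped inputs. A minimal numpy sketch of that blend, omitting the learned visibility maps the full method additionally uses (names are ours):

```python
import numpy as np

def fuse_intermediate(warp0, warp1, t):
    """Linearly fuse two warped frames into the frame at time t in [0, 1].
    warp0: frame 0 warped to time t; warp1: frame 1 warped to time t.
    The frame closer in time gets the larger weight.
    """
    return (1.0 - t) * warp0 + t * warp1
```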
no code implementations • 31 Aug 2017 • Varun Jampani
We propose inference techniques for both generative and discriminative vision models.
1 code implementation • ICCV 2017 • Raghudeep Gadde, Varun Jampani, Peter V. Gehler
A key insight of this work is that fast optical flow methods can be combined with many different CNN architectures for improved performance and end-to-end training.
no code implementations • CVPR 2017 • Varun Jampani, Raghudeep Gadde, Peter V. Gehler
We propose a 'Video Propagation Network' that processes video frames in an adaptive manner.
Ranked #72 on Semi-Supervised Video Object Segmentation on DAVIS 2016
no code implementations • 21 Jun 2016 • Raghudeep Gadde, Varun Jampani, Renaud Marlet, Peter V. Gehler
This paper introduces a fast and efficient segmentation technique for 2D images and 3D point clouds of building facades.
no code implementations • CVPR 2016 • Laura Sevilla-Lara, Deqing Sun, Varun Jampani, Michael J. Black
Existing optical flow methods make generic, spatially homogeneous, assumptions about the spatial structure of the flow.
1 code implementation • 20 Nov 2015 • Raghudeep Gadde, Varun Jampani, Martin Kiefel, Daniel Kappler, Peter V. Gehler
We introduce a new 'bilateral inception' module that can be inserted in existing CNN architectures and performs bilateral filtering, at multiple feature-scales, between superpixels in an image.
no code implementations • CVPR 2016 • Varun Jampani, Martin Kiefel, Peter V. Gehler
The ability to learn more general forms of high-dimensional filters can be used in several diverse applications.
no code implementations • 20 Dec 2014 • Martin Kiefel, Varun Jampani, Peter V. Gehler
This paper presents a convolutional layer that is able to process sparse input features.
no code implementations • 27 Oct 2014 • Varun Jampani, S. M. Ali Eslami, Daniel Tarlow, Pushmeet Kohli, John Winn
Generative models provide a powerful framework for probabilistic reasoning.
1 code implementation • 4 Feb 2014 • Varun Jampani, Sebastian Nowozin, Matthew Loper, Peter V. Gehler
Computer vision is hard because of a large variability in lighting, shape, and texture; in addition the image signal is non-additive due to occlusion.