Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks

Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs. However, for many tasks, paired training data will not be available. We present an approach for learning to translate an image from a source domain $X$ to a target domain $Y$ in the absence of paired examples. Our goal is to learn a mapping $G: X \rightarrow Y$ such that the distribution of images from $G(X)$ is indistinguishable from the distribution $Y$ using an adversarial loss. Because this mapping is highly under-constrained, we couple it with an inverse mapping $F: Y \rightarrow X$ and introduce a cycle consistency loss to push $F(G(X)) \approx X$ (and vice versa). Qualitative results are presented on several tasks where paired training data does not exist, including collection style transfer, object transfiguration, season transfer, photo enhancement, etc. Quantitative comparisons against several prior methods demonstrate the superiority of our approach.
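The cycle consistency idea in the abstract, $F(G(X)) \approx X$ and vice versa, can be illustrated with a small numerical sketch. Everything here is an assumption made for illustration: the abstract does not specify the distance used, so the sketch uses a mean absolute (L1) difference; the toy affine functions `G`, `F_exact`, and `F_poor` are hypothetical stand-ins for learned generator networks; and the adversarial loss term is left out entirely.

```python
import numpy as np

def cycle_consistency_loss(G, F, x_batch, y_batch):
    """Sum of the two reconstruction errors, measured as mean absolute difference.

    The L1 distance is an assumption of this sketch; G and F are any callables
    mapping arrays from X to Y and from Y to X respectively.
    """
    forward = np.mean(np.abs(F(G(x_batch)) - x_batch))   # x -> G(x) -> F(G(x))
    backward = np.mean(np.abs(G(F(y_batch)) - y_batch))  # y -> F(y) -> G(F(y))
    return forward + backward

# Hypothetical toy mappings standing in for learned generators.
def G(x):
    return 2.0 * x + 1.0        # "X -> Y"

def F_exact(y):
    return (y - 1.0) / 2.0      # true inverse of G

def F_poor(y):
    return y - 1.0              # wrong scale, so the cycle does not close

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(4, 8, 8, 3))  # batch of toy "images" from X
y = rng.uniform(1.0, 3.0, size=(4, 8, 8, 3))  # batch of toy "images" from Y

loss_exact = cycle_consistency_loss(G, F_exact, x, y)
loss_poor = cycle_consistency_loss(G, F_poor, x, y)
print(f"exact inverse: {loss_exact:.3e}, poor inverse: {loss_poor:.3f}")
```

When `F` truly inverts `G`, the loss is zero up to floating-point rounding; when it does not, the loss is clearly positive. This is the signal that, in training, would push the two mappings toward being inverses of each other.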

ICCV 2017

Results from the Paper


 Ranked #1 on Image-to-Image Translation on zebra2horse (Frechet Inception Distance metric)

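| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Multimodal Unsupervised Image-To-Image Translation | Cats-and-Dogs | CycleGAN | CIS | 0.076 | #3 |
| Multimodal Unsupervised Image-To-Image Translation | Cats-and-Dogs | CycleGAN | IS | 0.813 | #3 |
| Facial Expression Translation | CelebA | CycleGAN | AMT | 34.6 | #3 |
| Image-to-Image Translation | Cityscapes Labels-to-Photo | CycleGAN | Class IOU | 0.11 | #2 |
| Image-to-Image Translation | Cityscapes Labels-to-Photo | CycleGAN | Per-class Accuracy | 17% | #2 |
| Image-to-Image Translation | Cityscapes Labels-to-Photo | CycleGAN | Per-pixel Accuracy | 52% | #9 |
| Image-to-Image Translation | Cityscapes Photo-to-Labels | CycleGAN | Per-pixel Accuracy | 58% | #2 |
| Image-to-Image Translation | Cityscapes Photo-to-Labels | CycleGAN | Per-class Accuracy | 22% | #2 |
| Image-to-Image Translation | Cityscapes Photo-to-Labels | CycleGAN | Class IOU | 0.16 | #2 |
| Multimodal Unsupervised Image-To-Image Translation | Edge-to-Handbags | CycleGAN | Quality | 40.8% | #3 |
| Multimodal Unsupervised Image-To-Image Translation | Edge-to-Handbags | CycleGAN | Diversity | 0.012 | #4 |
| Multimodal Unsupervised Image-To-Image Translation | Edge-to-Shoes | CycleGAN | Quality | 36.0% | #4 |
| Multimodal Unsupervised Image-To-Image Translation | Edge-to-Shoes | CycleGAN | Diversity | 0.010 | #4 |
| Multimodal Unsupervised Image-To-Image Translation | EPFL NIR-VIS | cycGAN | PSNR | 17.38 | #2 |
| Unsupervised Image-To-Image Translation | Freiburg Forest Dataset | cycGAN | PSNR | 18.57 | #2 |
| Image-to-Image Translation | horse2zebra | CycleGAN | Frechet Inception Distance | 89.7 | #2 |
| Image-to-Image Translation | horse2zebra | CycleGAN | Number of params | 28.2M | #3 |
| Image-to-Image Translation | photo2vangogh | CycleGAN | Frechet Inception Distance | 151.4 | #1 |
| Image-to-Image Translation | photo2vangogh | CycleGAN | Number of params | 28.2M | #3 |
| Image-to-Image Translation | RaFD | CycleGAN | Classification Error | 5.99% | #3 |
| Image-to-Image Translation | vangogh2photo | CycleGAN | Frechet Inception Distance | 163.4 | #2 |
| Image-to-Image Translation | vangogh2photo | CycleGAN | Number of params | 28.2M | #3 |
| Image-to-Image Translation | zebra2horse | CycleGAN | Frechet Inception Distance | 110.5 | #1 |
| Image-to-Image Translation | zebra2horse | CycleGAN | Number of params | 28.2M | #3 |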

Methods