Large Scale GAN Training for High Fidelity Natural Image Synthesis

ICLR 2019 · Andrew Brock, Jeff Donahue, Karen Simonyan

Despite recent progress in generative image modeling, successfully generating high-resolution, diverse samples from complex datasets such as ImageNet remains an elusive goal. To this end, we train Generative Adversarial Networks at the largest scale yet attempted, and study the instabilities specific to such scale. We find that applying orthogonal regularization to the generator renders it amenable to a simple "truncation trick," allowing fine control over the trade-off between sample fidelity and variety by reducing the variance of the Generator's input. Our modifications lead to models which set the new state of the art in class-conditional image synthesis. When trained on ImageNet at 128x128 resolution, our models (BigGANs) achieve an Inception Score (IS) of 166.5 and Fréchet Inception Distance (FID) of 7.4, improving over the previous best IS of 52.52 and FID of 18.6.
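
The two ideas named in the abstract are compact enough to sketch directly. Below is a minimal illustration in PyTorch, not the authors' released implementation: `truncated_noise` implements the truncation trick by resampling latent components whose magnitude exceeds a threshold, and `orthogonal_penalty` implements the paper's modified orthogonal regularization, beta * ||WᵀW ⊙ (1 − I)||²_F (here computed on the row Gram matrix). The default threshold and beta values are illustrative assumptions.

```python
# Minimal sketch of the truncation trick and the modified orthogonal
# regularization from the abstract. Illustrative only, not the authors' code;
# the threshold and beta defaults are assumptions.
import torch


def truncated_noise(batch_size: int, dim: int, threshold: float = 0.5) -> torch.Tensor:
    """Truncation trick: sample z ~ N(0, I) and resample every component
    whose magnitude exceeds `threshold`. Lower thresholds reduce the variance
    of the generator's input, trading sample variety for fidelity."""
    z = torch.randn(batch_size, dim)
    while True:
        out_of_range = z.abs() > threshold
        if not out_of_range.any():
            return z
        z[out_of_range] = torch.randn(int(out_of_range.sum()))


def orthogonal_penalty(weight: torch.Tensor, beta: float = 1e-4) -> torch.Tensor:
    """Modified orthogonal regularization: penalize only the off-diagonal
    entries of the Gram matrix, nudging rows of W toward orthogonality
    without forcing them to unit norm."""
    w = weight.view(weight.size(0), -1)  # flatten conv kernels to a matrix
    gram = w @ w.t()                     # pairwise row inner products
    off_diag = gram * (1.0 - torch.eye(gram.size(0), device=gram.device))
    return beta * off_diag.pow(2).sum()
```

In use, the penalty would be summed over the generator's weights and added to its training loss, while `truncated_noise(64, 128)` would replace the usual Gaussian sampling at generation time, with the threshold controlling the fidelity/variety trade-off.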

Results from the Paper
| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Conditional Image Generation | ArtBench-10 (32x32) | BigGAN + DiffAug | FID | 4.055 | #3 |
| Image Generation | CIFAR-10 | BigGAN (DINOv2) | NFE | 1 | #1 |
| Image Generation | CIFAR-10 | BigGAN (DINOv2) | FD | 326.66 | #7 |
| Conditional Image Generation | CIFAR-10 | BigGAN | Inception score | 9.22 | #6 |
| Conditional Image Generation | CIFAR-10 | BigGAN | FID | 14.73 | #13 |
| Image Generation | CIFAR-10 | BigGAN | Inception score | 9.22 | #21 |
| Image Generation | CIFAR-10 | BigGAN | FID | 14.73 | #110 |
| Image Generation | ImageNet 128x128 | BigGAN | FID | 8.7 | #13 |
| Image Generation | ImageNet 128x128 | BigGAN | IS | 98.8 | #8 |
| Conditional Image Generation | ImageNet 128x128 | BigGAN-deep | FID | 5.7 | #7 |
| Conditional Image Generation | ImageNet 128x128 | BigGAN-deep | Inception score | 124.5 | #7 |
| Image Generation | ImageNet 128x128 | BigGAN-deep | FID | 5.7 | #9 |
| Image Generation | ImageNet 128x128 | BigGAN-deep | IS | 124.5 | #5 |
| Conditional Image Generation | ImageNet 128x128 | BigGAN | FID | 8.7 | #14 |
| Conditional Image Generation | ImageNet 128x128 | BigGAN | Inception score | 98.8 | #11 |
| Image Generation | ImageNet 256x256 | BigGAN-deep | FID | 8.1 | #40 |
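
Most rows above report FID, which measures the distance between Gaussian fits to Inception features of real and generated images. For reference, here is a minimal sketch of the standard FID computation, assuming the Inception activations have already been extracted into NumPy arrays (the extraction step and the choice of Inception network are omitted):

```python
# Sketch of the standard FID computation from pre-extracted Inception
# activations (one row per image). Feature extraction is omitted; array
# names are placeholders.
import numpy as np
from scipy import linalg


def fid(feats_real: np.ndarray, feats_fake: np.ndarray) -> float:
    """FID = ||mu_r - mu_f||^2 + Tr(C_r + C_f - 2 (C_r C_f)^(1/2))."""
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)
    covmean, _ = linalg.sqrtm(cov_r @ cov_f, disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # drop negligible imaginary parts from numerics
    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(cov_r + cov_f - 2.0 * covmean))
```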

Methods