Paper Reading: Resolution-robust Large Mask Inpainting with Fourier Convolutions

This paper [1] proposes a method for inpainting high-resolution images with large missing areas. The authors argue that the main difficulty of the task is the limited receptive field of conventional convolutional neural networks. The paper claims three contributions, but I find fast Fourier convolution (FFC) [2] to be the most essential one: according to the paper, it gives the network a receptive field that covers the entire image. The experimental results are very good. It is surprising that the FFC paper came out in 2020 yet has only a few citations.

Summary

Goal: develop a modern image inpainting system

Challenges:

  • large missing areas
  • complex geometric structures
  • high-resolution images

Reason: both the inpainting network and the loss function lack a sufficiently large effective receptive field

Solutions:

  • fast Fourier convolution
  • high receptive field perceptual loss
  • large training masks

Methodology

[Figure: scheme of the proposed method]

Fast Fourier Convolutions [2]

[Figure: image-20211019173512461]
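The image-wide receptive field can be illustrated with a toy 1D example. This is only a sketch of the underlying idea, not the FFC layer itself: it assumes a single channel, a naive DFT, and fixed hypothetical per-frequency weights (`weights`) instead of learned ones. It perturbs one input sample and records which output positions change, once for a pointwise operation in the frequency domain and once for an ordinary 3-tap convolution.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform of a 1D sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT; returns complex values."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def spectral_pointwise(signal, weights):
    """Scale each frequency independently, then transform back to the signal domain."""
    spectrum = dft(signal)
    scaled = [w * s for w, s in zip(weights, spectrum)]
    return [v.real for v in idft(scaled)]


def local_conv3(signal, kernel):
    """Ordinary 3-tap convolution with zero padding, for comparison."""
    padded = [0.0] + list(signal) + [0.0]
    return [
        sum(kernel[j] * padded[i + j] for j in range(3))
        for i in range(len(signal))
    ]


def changed_positions(f, signal, index, eps=1.0, tol=1e-9):
    """Output indices that move when a single input sample is perturbed."""
    bumped = list(signal)
    bumped[index] += eps
    before, after = f(signal), f(bumped)
    return [i for i, (a, b) in enumerate(zip(before, after)) if abs(a - b) > tol]


signal = [0.0] * 16
# Hypothetical weights: real and symmetric in k, so the output stays real.
weights = [1.0 / (1 + min(k, 16 - k)) for k in range(16)]
kernel = [0.25, 0.5, 0.25]

spectral_reach = changed_positions(lambda s: spectral_pointwise(s, weights), signal, 8)
local_reach = changed_positions(lambda s: local_conv3(s, kernel), signal, 8)

print("frequency-domain op touches:", spectral_reach)
print("3-tap convolution touches:  ", local_reach)
```

With these weights, perturbing sample 8 changes every output position of the frequency-domain operation, while the 3-tap convolution only changes positions 7 to 9. A single spectral layer therefore sees the whole signal, whereas a stack of small spatial kernels needs many layers to do the same.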

Loss Functions

Final loss: \(\mathcal{L}_{final}=\kappa \mathcal{L}_{Adv}+\alpha \mathcal{L}_{HRFPL} + \beta \mathcal{L}_{DiscPL} + \gamma R_1\)

  • $\mathcal{L}_{Adv}$: adversarial loss

  • $\mathcal{L}_{HRFPL}$: high receptive field perceptual loss

    \(\mathcal{L}_{HRFPL}(x,\hat{x})=\mathcal{M}([\phi_{HRF}(x)-\phi_{HRF}(\hat{x})]^2)\),

    where $x$ is the target image, $\hat{x}$ is the inpainted result, $\phi_{HRF}$ is a pre-trained network with a high receptive field, $[\cdot]^2$ is an element-wise square, and $\mathcal{M}$ is a sequential two-stage mean (the inter-layer mean of intra-layer means).

  • $\mathcal{L}_{DiscPL}$: discriminator-based perceptual loss (or feature matching loss) [3]

  • $R_1$: gradient penalty [4]
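The two-stage mean and the weighted sum above can be sketched in plain Python. This is a minimal illustration under stated assumptions: the per-layer features are hypothetical hand-written lists rather than outputs of a real pre-trained network, the adversarial, feature-matching and $R_1$ terms are passed in as already-computed numbers, and the weights in the usage lines are arbitrary placeholders, not the paper's settings.

```python
def hrf_perceptual_loss(feats_x, feats_x_hat):
    """Two-stage mean of squared feature differences.

    feats_x and feats_x_hat are lists with one flat list of activations per layer.
    Stage 1 averages the squared differences inside each layer;
    stage 2 averages those per-layer values across layers.
    """
    layer_means = []
    for layer_x, layer_x_hat in zip(feats_x, feats_x_hat):
        squared = [(a - b) ** 2 for a, b in zip(layer_x, layer_x_hat)]
        layer_means.append(sum(squared) / len(squared))
    return sum(layer_means) / len(layer_means)


def final_loss(adv, hrf_perceptual, disc_perceptual, r1, kappa, alpha, beta, gamma):
    """Weighted sum of the four loss terms."""
    return kappa * adv + alpha * hrf_perceptual + beta * disc_perceptual + gamma * r1


# Hypothetical features from two layers of different sizes.
feats_target = [[1.0, 2.0], [0.0, 0.0, 0.0, 4.0]]
feats_output = [[1.0, 0.0], [0.0, 0.0, 0.0, 0.0]]

perceptual = hrf_perceptual_loss(feats_target, feats_output)

# Placeholder term values and weights, chosen only to make the arithmetic easy to follow.
total = final_loss(adv=0.5, hrf_perceptual=perceptual, disc_perceptual=1.0, r1=2.0,
                   kappa=1.0, alpha=2.0, beta=3.0, gamma=0.5)

print(perceptual, total)
```

Here the per-layer means are 2.0 and 4.0, so the two-stage mean is 3.0. A single flat mean over all six activations would give about 3.33 instead, because the larger layer would dominate; averaging within layers first gives every layer equal weight.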

Generation of Masks

[Figure: image-20211019170856702]
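To give a feel for what "large training masks" means in practice, here is a simplified stand-in that only draws a few large random rectangles. It is not the authors' mask generator; the function name `random_box_mask` and all parameters (`num_boxes`, `min_frac`, `max_frac`) are assumptions made for illustration.

```python
import random


def random_box_mask(height, width, rng, num_boxes=3, min_frac=0.3, max_frac=0.6):
    """Binary mask (1 = missing pixel) built from a few large random rectangles."""
    mask = [[0] * width for _ in range(height)]
    for _ in range(num_boxes):
        # Each box spans a random fraction of the image along each axis.
        box_h = max(1, int(height * rng.uniform(min_frac, max_frac)))
        box_w = max(1, int(width * rng.uniform(min_frac, max_frac)))
        top = rng.randint(0, height - box_h)
        left = rng.randint(0, width - box_w)
        for r in range(top, top + box_h):
            for c in range(left, left + box_w):
                mask[r][c] = 1
    return mask


def masked_fraction(mask):
    """Share of pixels marked as missing."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total


rng = random.Random(0)  # fixed seed so the example is repeatable
mask = random_box_mask(64, 64, rng)
coverage = masked_fraction(mask)
print(f"masked fraction: {coverage:.2f}")
```

Raising `min_frac` and `max_frac` makes the missing regions larger, which is the knob this sketch uses to mimic training on more aggressive masks.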

Experiments

Ablation Study on Fast Fourier Convolutions

[Figure: image-20211019174309601]

[Figure: image-20211019173624223]

[Figure: image-20211019173637378]

Ablation Study on Losses

[Figure: image-20211019174435621]

Ablation Study on Masks

[Figure: image-20211019174446320]