mmedit.models.editors.pggan

Package Contents

Classes

ProgressiveGrowingGAN

Progressive Growing Unconditional GAN.

PGGANDiscriminator

Discriminator for PGGAN.

PGGANGenerator

Generator for PGGAN.

EqualizedLR

Equalized Learning Rate.

EqualizedLRConvDownModule

Equalized LR (Conv + Downsample) Module.

EqualizedLRConvModule

Equalized LR ConvModule.

EqualizedLRConvUpModule

Equalized LR (Upsample + Conv) Module.

EqualizedLRLinearModule

Equalized LR LinearModule.

MiniBatchStddevLayer

Minibatch standard deviation.

PGGANNoiseTo2DFeat

Converts a noise vector into a 2D feature map.

PixelNorm

Pixel Normalization.

Functions

equalized_lr(module[, name, gain, mode, lr_mul])

Equalized Learning Rate.

class mmedit.models.editors.pggan.ProgressiveGrowingGAN(generator, discriminator, data_preprocessor, nkimgs_per_scale, noise_size=None, interp_real=None, transition_kimgs: int = 600, prev_stage: int = 0, ema_config: Optional[Dict] = None)[source]

Bases: mmedit.models.base_models.BaseGAN

Progressive Growing Unconditional GAN.

In this GAN model, we implement the progressive growing training schedule proposed in Progressive Growing of GANs for Improved Quality, Stability, and Variation, ICLR 2018.

We highly recommend using GrowScaleImgDataset to reduce the computational load of data pre-processing.

Notes for using PGGAN:

  1. In the official implementation, Tero uses gradient penalty with norm_mode="HWC".

  2. We do not implement minibatch_repeats, which is used in the official TensorFlow implementation.

Notes for resuming progressive growing GANs: Users should specify prev_stage in train_cfg. Otherwise, the model may reset the optimizer state, which degrades performance. For example, if your model is resumed from the 256 stage, you should set train_cfg=dict(prev_stage=256).
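As a hedged illustration of the resuming note, the fragment below sets prev_stage when resuming from the 256 stage. It follows the constructor signature above, where prev_stage is a direct argument; the note itself refers to train_cfg, so the correct place depends on the installed version. The registry type names, the string-keyed form of nkimgs_per_scale, and all numeric values are assumptions used as placeholders, not recommended settings.

```python
# Sketch of a model config for resuming PGGAN training from the 256 stage.
# All values are illustrative placeholders (assumptions), not tuned settings.
model = dict(
    type='ProgressiveGrowingGAN',
    generator=dict(type='PGGANGenerator', noise_size=512, out_scale=1024),
    discriminator=dict(type='PGGANDiscriminator', in_scale=1024),
    data_preprocessor=dict(),  # placeholder: fill in for your setup
    # Assumed format: scale -> number of kimgs shown at that scale.
    nkimgs_per_scale={'256': 1200, '512': 1200, '1024': 12000},
    transition_kimgs=600,
    prev_stage=256,  # stage the checkpoint was saved at
)
```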

Parameters
  • generator (dict) – Config for generator.

  • discriminator (dict) – Config for discriminator.

forward(inputs: mmedit.utils.typing.ForwardInputs, data_samples: Optional[list] = None, mode: Optional[str] = None) mmedit.utils.typing.SampleList[source]

Sample images from noises by using the generator.

Parameters
  • inputs (ForwardInputs) – Dict containing the necessary information (e.g. noise, num_batches, mode) to generate images.

  • data_samples (Optional[list]) – Data samples collated by data_preprocessor. Defaults to None.

  • mode (Optional[str]) – mode is not used in ProgressiveGrowingGAN. Defaults to None.

Returns

A list of EditDataSample containing the generated results.

Return type

SampleList

train_discriminator(inputs: torch.Tensor, data_samples: List[mmedit.structures.EditDataSample], optimizer_wrapper: mmengine.optim.OptimWrapper) Dict[str, torch.Tensor][source]

Train discriminator.

Parameters
  • inputs (dict) – Inputs from dataloader.

  • data_samples (List[EditDataSample]) – Data samples from dataloader. Not used in generator’s training.

  • optimizer_wrapper (OptimWrapper) – OptimWrapper instance used to update model parameters.

Returns

A dict of tensor for logging.

Return type

Dict[str, Tensor]

disc_loss(disc_pred_fake: torch.Tensor, disc_pred_real: torch.Tensor, fake_data: torch.Tensor, real_data: torch.Tensor) Tuple[torch.Tensor, dict][source]

Get the discriminator loss. PGGAN uses WGAN-GP’s loss and a discriminator shift loss to train the discriminator.

Parameters
  • disc_pred_fake (Tensor) – Discriminator’s prediction of the fake images.

  • disc_pred_real (Tensor) – Discriminator’s prediction of the real images.

  • fake_data (Tensor) – Generated images, used to calculate gradient penalty.

  • real_data (Tensor) – Real images, used to calculate gradient penalty.

Returns

Loss value and a dict of log variables.

Return type

Tuple[Tensor, dict]
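The loss described above can be sketched in plain PyTorch as follows. This is a minimal illustration, not the library's implementation: the function name wgan_gp_disc_loss, the disc callable, and the weights gp_weight=10.0 and shift_weight=0.001 (the values suggested in the PGGAN paper) are assumptions. The gradient norm is taken per sample over all of C, H, and W, which is our reading of norm_mode="HWC".

```python
import torch


def wgan_gp_disc_loss(disc, real_data, fake_data,
                      gp_weight=10.0, shift_weight=0.001):
    """Sketch of a WGAN-GP discriminator loss with a shift (drift) term."""
    fake_data = fake_data.detach()  # do not backpropagate into the generator
    disc_pred_real = disc(real_data)
    disc_pred_fake = disc(fake_data)

    # Wasserstein critic loss: raise real scores, lower fake scores.
    loss_wgan = disc_pred_fake.mean() - disc_pred_real.mean()

    # Gradient penalty on random interpolations between real and fake images.
    alpha = torch.rand(real_data.size(0), 1, 1, 1, device=real_data.device)
    interp = (alpha * real_data + (1 - alpha) * fake_data).requires_grad_(True)
    grad = torch.autograd.grad(disc(interp).sum(), interp,
                               create_graph=True)[0]
    grad_norm = grad.flatten(1).norm(2, dim=1)  # one norm per sample
    loss_gp = gp_weight * ((grad_norm - 1) ** 2).mean()

    # Shift loss keeps the real scores from drifting far away from zero.
    loss_shift = shift_weight * (disc_pred_real ** 2).mean()

    loss = loss_wgan + loss_gp + loss_shift
    log_vars = dict(loss_disc=loss.item(), loss_gp=loss_gp.item(),
                    loss_shift=loss_shift.item())
    return loss, log_vars
```

Any callable that maps an image batch to one score per sample can be passed as disc, which makes the sketch easy to check with a toy linear critic.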

train_generator(inputs: torch.Tensor, data_samples: List[mmedit.structures.EditDataSample], optimizer_wrapper: mmengine.optim.OptimWrapper) Dict[str, torch.Tensor][source]

Train generator.

Parameters
  • inputs (dict) – Inputs from dataloader.

  • data_samples (List[EditDataSample]) – Data samples from dataloader. Not used in generator’s training.

  • optimizer_wrapper (OptimWrapper) – OptimWrapper instance used to update model parameters.

Returns

A dict of tensor for logging.

Return type

Dict[str, Tensor]

gen_loss(disc_pred_fake: torch.Tensor) Tuple[torch.Tensor, dict][source]

Generator loss for PGGAN. PGGAN uses WGAN’s loss to train the generator.

Parameters
  • disc_pred_fake (Tensor) – Discriminator’s prediction of the fake images.

  • recon_imgs (Tensor) – Reconstructive images.

Returns

Loss value and a dict of log variables.

Return type

Tuple[Tensor, dict]

train_step(data: dict, optim_wrapper: mmengine.optim.OptimWrapperDict)[source]

Train step function.

This function implements the standard training iteration for asynchronous adversarial training. Namely, in each iteration, we first update the discriminator and then compute the loss for the generator with the newly updated discriminator.

As for distributed training, we use the reducer from ddp to synchronize the necessary params in current computational graph.

Parameters
  • data (dict) – Input data from dataloader.

  • optim_wrapper (OptimWrapperDict) – Dict containing the optimizer wrappers for the generator and discriminator.

  • ddp_reducer (Reducer | None, optional) – Reducer from ddp. It is used to prepare for backward() in ddp. Defaults to None.

  • running_status (dict | None, optional) – Contains necessary basic information for training, e.g., iteration number. Defaults to None.

Returns

Contains ‘log_vars’, ‘num_samples’, and ‘results’.

Return type

dict

class mmedit.models.editors.pggan.PGGANDiscriminator(in_scale, label_size=0, base_channels=8192, max_channels=512, in_channels=3, channel_decay=1.0, mbstd_cfg=dict(group_size=4), fused_convdown=True, conv_module_cfg=None, fused_convdown_cfg=None, fromrgb_layer_cfg=None, downsample_cfg=None)[source]

Bases: torch.nn.Module

Discriminator for PGGAN.

Parameters
  • in_scale (int) – The scale of the input image.

  • label_size (int, optional) – Size of the label vector. Defaults to 0.

  • base_channels (int, optional) – The basic channel number of the discriminator. The other layers contain channels based on this number. Defaults to 8192.

  • max_channels (int, optional) – Maximum channels for the feature maps in the discriminator block. Defaults to 512.

  • in_channels (int, optional) – Number of channels in input images. Defaults to 3.

  • channel_decay (float, optional) – Decay for channels of feature maps. Defaults to 1.0.

  • mbstd_cfg (dict, optional) – Configs for minibatch-stddev layer. Defaults to dict(group_size=4).

  • fused_convdown (bool, optional) – Whether to use fused downconv. Defaults to True.

  • conv_module_cfg (dict, optional) – Config for the convolution module used in this discriminator. Defaults to None.

  • fused_convdown_cfg (dict, optional) – Config for the fused downconv module used in this discriminator. Defaults to None.

  • fromrgb_layer_cfg (dict, optional) – Config for the fromrgb layer. Defaults to None.

  • downsample_cfg (dict, optional) – Config for the downsampling operation. Defaults to None.

_default_fromrgb_cfg
_default_conv_module_cfg
_default_convdown_cfg
_num_out_channels(log_scale: int) int[source]

Calculate the number of output channels of the current network from logarithm of current scale.

Parameters

log_scale (int) – The logarithm of the current scale.

Returns

The number of output channels.

Return type

int

_get_fromrgb_layer(in_channels: int, log2_scale: int) torch.nn.Module[source]

Get the ‘fromrgb’ layer from logarithm of current scale.

Parameters
  • in_channels (int) – The number of input channels.

  • log2_scale (int) – The logarithm of the current scale.

Returns

The built from-rgb layer.

Return type

nn.Module

_get_convdown_block(in_channels: int, log2_scale: int) torch.nn.Module[source]

Get the downsample layer from logarithm of current scale.

Parameters
  • in_channels (int) – The number of input channels.

  • log2_scale (int) – The logarithm of the current scale.

Returns

The built Conv layer.

Return type

nn.Module

forward(x, transition_weight=1.0, curr_scale=- 1)[source]

Forward function.

Parameters
  • x (torch.Tensor) – Input image tensor.

  • transition_weight (float, optional) – The weight used in resolution transition. Defaults to 1.0.

  • curr_scale (int, optional) – The scale for the current inference or training. Defaults to -1.

Returns

Predicted score for the input image.

Return type

Tensor

class mmedit.models.editors.pggan.PGGANGenerator(noise_size, out_scale, label_size=0, base_channels=8192, channel_decay=1.0, max_channels=512, fused_upconv=True, conv_module_cfg=None, fused_upconv_cfg=None, upsample_cfg=None)[source]

Bases: torch.nn.Module

Generator for PGGAN.

Parameters
  • noise_size (int) – Size of the input noise vector.

  • out_scale (int) – Output scale for the generated image.

  • label_size (int, optional) – Size of the label vector. Defaults to 0.

  • base_channels (int, optional) – The basic channel number of the generator. The other layers contain channels based on this number. Defaults to 8192.

  • channel_decay (float, optional) – Decay for channels of feature maps. Defaults to 1.0.

  • max_channels (int, optional) – Maximum channels for the feature maps in the generator block. Defaults to 512.

  • fused_upconv (bool, optional) – Whether to use fused upconv. Defaults to True.

  • conv_module_cfg (dict, optional) – Config for the convolution module used in this generator. Defaults to None.

  • fused_upconv_cfg (dict, optional) – Config for the fused upconv module used in this generator. Defaults to None.

  • upsample_cfg (dict, optional) – Config for the upsampling operation. Defaults to None.

_default_fused_upconv_cfg
_default_conv_module_cfg
_default_upsample_cfg
_get_torgb_layer(in_channels: int)[source]

Get the to-rgb layer based on in_channels.

Parameters

in_channels (int) – Number of input channels.

Returns

To-rgb layer.

Return type

nn.Module

_num_out_channels(log_scale: int)[source]

Calculate the number of output channels based on logarithm of current scale.

Parameters

log_scale (int) – The logarithm of the current scale.

Returns

The current number of output channels.

Return type

int
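To make the channel calculation concrete, here is a minimal sketch. It assumes the rule used in the official PGGAN code, where the channel count halves with each log scale (modulated by channel_decay) and is capped at max_channels; the standalone function num_out_channels is hypothetical and may not match the library's method exactly.

```python
def num_out_channels(log_scale, base_channels=8192, channel_decay=1.0,
                     max_channels=512):
    """Assumed rule: base_channels / 2^(log_scale * decay), capped."""
    channels = int(base_channels / (2.0 ** (log_scale * channel_decay)))
    return min(channels, max_channels)


# Channel counts for log scales 1..10 with the default arguments.
channels_per_log_scale = {s: num_out_channels(s) for s in range(1, 11)}
```

With the defaults, the cap of 512 applies up to log scale 4, after which the count halves at every step.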

_get_upconv_block(in_channels, log_scale)[source]

Get the conv block for upsampling.

Parameters
  • in_channels (int) – The number of input channels.

  • log_scale (int) – The logarithm of the current scale.

Returns

The conv block for upsampling.

Return type

nn.Module

forward(noise, label=None, num_batches=0, return_noise=False, transition_weight=1.0, curr_scale=- 1)[source]

Forward function.

Parameters
  • noise (torch.Tensor | callable | None) – You can directly give a batch of noise through a torch.Tensor or offer a callable function to sample a batch of noise data. Otherwise, the None indicates to use the default noise sampler.

  • label (Tensor, optional) – Label vector with shape [N, C]. Defaults to None.

  • num_batches (int, optional) – The batch size. Defaults to 0.

  • return_noise (bool, optional) – If True, noise_batch will be returned in a dict with fake_img. Defaults to False.

  • transition_weight (float, optional) – The weight used in resolution transition. Defaults to 1.0.

  • curr_scale (int, optional) – The scale for the current inference or training. Defaults to -1.

Returns

If return_noise is False, only the output image is returned. Otherwise, a dict containing fake_img and noise_batch is returned.

Return type

torch.Tensor | dict
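The role of transition_weight during a resolution transition can be sketched as a linear blend between the upsampled image from the previous scale and the image from the newly added block. This is an illustration under assumptions: the helper fade_in is hypothetical, nearest-neighbour upsampling is assumed, and we assume that transition_weight=1.0 means the new block is used alone, which is consistent with the default value.

```python
import torch
import torch.nn.functional as F


def fade_in(low_res_rgb, high_res_rgb, transition_weight=1.0):
    """Blend the previous-scale output with the new-scale output."""
    # Bring the previous-scale image to the new resolution.
    upsampled = F.interpolate(low_res_rgb, scale_factor=2, mode='nearest')
    # weight 0.0 -> only the old path, weight 1.0 -> only the new path.
    return upsampled + transition_weight * (high_res_rgb - upsampled)


low = torch.zeros(1, 3, 4, 4)   # stand-in for the 4x4 to-rgb output
high = torch.ones(1, 3, 8, 8)   # stand-in for the 8x8 to-rgb output
blended = fade_in(low, high, transition_weight=0.25)
```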

class mmedit.models.editors.pggan.EqualizedLR(name='weight', gain=2 ** 0.5, mode='fan_in', lr_mul=1.0)[source]

Equalized Learning Rate.

This trick is proposed in: Progressive Growing of GANs for Improved Quality, Stability, and Variation

The general idea is to rescale the weights dynamically during training, rather than at initialization, so that the variance of the responses in each layer keeps the intended statistical properties.

Note that this function is always combined with a convolution module which is initialized with \(\mathcal{N}(0, 1)\).

Parameters
  • name (str, optional) – The name of weights. Defaults to 'weight'.

  • mode (str, optional) – The mode of computing fan which is the same as kaiming_init in pytorch. You can choose one from ['fan_in', 'fan_out']. Defaults to 'fan_in'.

compute_weight(module)[source]

Compute weight with equalized learning rate.

Parameters

module (nn.Module) – A module that is wrapped with equalized lr.

Returns

Updated weight.

Return type

torch.Tensor
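The scaling applied by the equalized learning rate can be illustrated without any framework. The sketch below assumes the He-style constant gain * sqrt(1 / fan) * lr_mul, with fan computed from the weight shape in the same way as PyTorch's fan_in and fan_out; equalized_lr_scale is a hypothetical helper, not part of this package.

```python
import math


def equalized_lr_scale(weight_shape, gain=2 ** 0.5, mode='fan_in',
                       lr_mul=1.0):
    """Return the runtime multiplier for a weight of the given shape."""
    if mode not in ('fan_in', 'fan_out'):
        raise ValueError(f'Unsupported mode: {mode}')
    out_channels, in_channels, *kernel = weight_shape
    receptive_field = math.prod(kernel)  # 1 for linear weights
    channels = in_channels if mode == 'fan_in' else out_channels
    fan = channels * receptive_field
    return gain * math.sqrt(1.0 / fan) * lr_mul


# A 3x3 convolution with 32 input and 64 output channels: fan_in = 288.
conv_scale = equalized_lr_scale((64, 32, 3, 3))
# A linear layer with 8 input and 10 output features: fan_in = 8.
linear_scale = equalized_lr_scale((10, 8))
```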

__call__(module, inputs)[source]

Standard interface for forward pre hooks.

static apply(module, name, gain=2 ** 0.5, mode='fan_in', lr_mul=1.0)[source]

Apply function.

This function is to register an equalized learning rate hook in an nn.Module.

Parameters
  • module (nn.Module) – Module to be wrapped.

  • name (str, optional) – The name of weights. Defaults to 'weight'.

  • mode (str, optional) – The mode of computing fan which is the same as kaiming_init in pytorch. You can choose one from ['fan_in', 'fan_out']. Defaults to 'fan_in'.

Returns

Module that is registered with equalized lr hook.

Return type

nn.Module
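How such a hook can be registered is sketched below in plain PyTorch, following the same pattern as torch.nn.utils.weight_norm. It is a simplified illustration, not the code of EqualizedLR.apply: the function apply_equalized_lr_sketch is hypothetical, only fan_in is handled, and the trainable parameter is assumed to be stored under '<name>_orig'.

```python
import math

import torch
import torch.nn as nn


def apply_equalized_lr_sketch(module, name='weight', gain=2 ** 0.5,
                              lr_mul=1.0):
    """Register a forward pre-hook that rescales the weight at runtime."""
    weight = getattr(module, name)
    # Keep the trainable tensor under '<name>_orig' (assumed naming).
    del module._parameters[name]
    module.register_parameter(name + '_orig', nn.Parameter(weight.data))

    fan_in = weight[0].numel()  # in_channels * kernel area
    scale = gain / math.sqrt(fan_in) * lr_mul

    def hook(mod, inputs):
        # Recompute the effective weight before every forward pass.
        setattr(mod, name, getattr(mod, name + '_orig') * scale)

    module.register_forward_pre_hook(hook)
    hook(module, None)  # make the attribute available before first use
    return module


conv = nn.Conv2d(32, 64, 3, padding=1)
nn.init.normal_(conv.weight)  # N(0, 1) init, as the note above requires
conv = apply_equalized_lr_sketch(conv)
out = conv(torch.randn(1, 32, 8, 8))
```

Because the optimizer only sees weight_orig, every layer receives updates of comparable magnitude regardless of its fan-in.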

class mmedit.models.editors.pggan.EqualizedLRConvDownModule(*args, downsample=dict(type='fused_pool'), **kwargs)[source]

Bases: EqualizedLRConvModule

Equalized LR (Conv + Downsample) Module.

In this module, we inherit EqualizedLRConvModule and adopt downsampling after convolution. As for downsampling, we provide two modes of “avgpool” and “fused_pool”. “avgpool” denotes the commonly used average pooling operation, while “fused_pool” represents fusing downsampling and convolution. The fusion is modified from the official Tensorflow implementation in: https://github.com/tkarras/progressive_growing_of_gans/blob/master/networks.py#L109

Parameters

downsample (dict | None, optional) – Config for downsampling operation. If None, downsampling is ignored. Currently, we support the types of ["avgpool", "fused_pool"]. Defaults to dict(type='fused_pool').

forward(x, **kwargs)[source]

Forward function.

Parameters

x (Tensor) – Input tensor with shape (n, c, h, w).

Returns

Normalized tensor.

Return type

torch.Tensor

static fused_avgpool_hook(module, inputs)[source]

Standard interface for forward pre hooks.

class mmedit.models.editors.pggan.EqualizedLRConvModule(*args, equalized_lr_cfg=dict(mode='fan_in'), **kwargs)[source]

Bases: mmcv.cnn.bricks.ConvModule

Equalized LR ConvModule.

In this module, we inherit default mmcv.cnn.ConvModule and adopt equalized lr in convolution. The equalized learning rate is proposed in: Progressive Growing of GANs for Improved Quality, Stability, and Variation

Note that, the initialization of self.conv will be overwritten as \(\mathcal{N}(0, 1)\).

Parameters

equalized_lr_cfg (dict | None, optional) – Config for EqualizedLR. If None, equalized learning rate is ignored. Defaults to dict(mode='fan_in').

_init_conv_weights()[source]

Initialize conv weights as described in PGGAN.

class mmedit.models.editors.pggan.EqualizedLRConvUpModule(*args, upsample=dict(type='nearest', scale_factor=2), **kwargs)[source]

Bases: EqualizedLRConvModule

Equalized LR (Upsample + Conv) Module.

In this module, we inherit EqualizedLRConvModule and adopt upsampling before convolution. As for upsampling, in addition to the sampling layer in MMCV, we also offer the “fused_nn” type. “fused_nn” denotes fusing upsampling and convolution. The fusion is modified from the official Tensorflow implementation in: https://github.com/tkarras/progressive_growing_of_gans/blob/master/networks.py#L86

Parameters
upsample (dict | None, optional) – Config for upsampling operation. If None, upsampling is ignored. If you need the fused version as in the official PGGAN in TensorFlow, you should set it as dict(type='fused_nn'). Defaults to dict(type='nearest', scale_factor=2).

forward(x, **kwargs)[source]

Forward function.

Parameters

x (Tensor) – Input tensor with shape (n, c, h, w).

Returns

Forward results.

Return type

Tensor

static fused_nn_hook(module, inputs)[source]

Standard interface for forward pre hooks.

class mmedit.models.editors.pggan.EqualizedLRLinearModule(*args, equalized_lr_cfg=dict(mode='fan_in'), **kwargs)[source]

Bases: torch.nn.Linear

Equalized LR LinearModule.

In this module, we adopt equalized lr in nn.Linear. The equalized learning rate is proposed in: Progressive Growing of GANs for Improved Quality, Stability, and Variation

Note that, the initialization of self.weight will be overwritten as \(\mathcal{N}(0, 1)\).

Parameters

equalized_lr_cfg (dict | None, optional) – Config for EqualizedLR. If None, equalized learning rate is ignored. Defaults to dict(mode='fan_in').

_init_linear_weights()[source]

Initialize linear weights as described in PGGAN.

class mmedit.models.editors.pggan.MiniBatchStddevLayer(group_size=4, eps=1e-08, gather_all_batch=False)[source]

Bases: torch.nn.Module

Minibatch standard deviation.

Parameters
  • group_size (int, optional) – The size of groups in batch dimension. Defaults to 4.

  • eps (float, optional) – Epsilon value to avoid computation error. Defaults to 1e-8.

  • gather_all_batch (bool, optional) – Whether to gather the batch from all GPUs. Defaults to False.

forward(x)[source]

Forward function.

Parameters

x (Tensor) – Input tensor with shape (n, c, h, w).

Returns

Forward results.

Return type

Tensor
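The operation can be sketched as a standalone function. This is a minimal single-GPU illustration that assumes the formulation from the PGGAN paper: the standard deviation is computed over each group of samples, averaged into one scalar per group, and appended as an extra feature channel. The function minibatch_stddev is hypothetical, does not cover gather_all_batch, and requires the batch size to be divisible by the group size.

```python
import torch


def minibatch_stddev(x, group_size=4, eps=1e-8):
    """Append one channel holding the per-group standard deviation."""
    n, c, h, w = x.shape
    group_size = min(n, group_size)
    assert n % group_size == 0, 'batch size must be divisible by group size'
    y = x.reshape(group_size, -1, c, h, w)       # (g, n/g, c, h, w)
    y = y - y.mean(dim=0, keepdim=True)          # center within each group
    y = torch.sqrt(y.pow(2).mean(dim=0) + eps)   # stddev: (n/g, c, h, w)
    y = y.mean(dim=(1, 2, 3), keepdim=True)      # one scalar per group
    y = y.repeat(group_size, 1, h, w)            # (n, 1, h, w)
    return torch.cat([x, y], dim=1)              # (n, c + 1, h, w)


features = torch.randn(8, 16, 4, 4)
out = minibatch_stddev(features)
```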

class mmedit.models.editors.pggan.PGGANNoiseTo2DFeat(noise_size, out_channels, act_cfg=dict(type='LeakyReLU', negative_slope=0.2), norm_cfg=dict(type='PixelNorm'), normalize_latent=True, order=('linear', 'act', 'norm'))[source]

Bases: torch.nn.Module

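Converts a noise vector into a 4x4 2D feature map (see forward below). The description that follows is the one inherited from torch.nn.Module, the base class for all neural network modules.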

Your models should also subclass this class.

Modules can also contain other Modules, allowing to nest them in a tree structure. You can assign the submodules as regular attributes:

import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

Submodules assigned in this way will be registered, and will have their parameters converted too when you call to(), etc.

Note

As per the example above, an __init__() call to the parent class must be made before assignment on the child.

Variables

training (bool) – Boolean represents whether this module is in training or evaluation mode.

forward(x)[source]

Forward function.

Parameters

x (Tensor) – Input noise tensor with shape (n, c).

Returns

Forward results with shape (n, c, 4, 4).

Return type

Tensor
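The mapping from a noise vector of shape (n, c) to a feature map of shape (n, c, 4, 4) can be sketched as below. This is an assumption-laden illustration based only on the constructor defaults above (normalize_latent=True and the order linear, act, norm): the class NoiseTo2DFeatSketch is hypothetical and uses a plain nn.Linear in place of the equalized LR linear module.

```python
import torch
import torch.nn as nn


class NoiseTo2DFeatSketch(nn.Module):
    """Hypothetical noise-to-feature block: linear -> act -> norm."""

    def __init__(self, noise_size, out_channels, negative_slope=0.2,
                 eps=1e-6):
        super().__init__()
        self.out_channels = out_channels
        self.eps = eps
        # One linear layer produces all 4 * 4 spatial positions at once.
        self.linear = nn.Linear(noise_size, out_channels * 16)
        self.act = nn.LeakyReLU(negative_slope)

    def pixel_norm(self, x):
        # Normalize each position to unit mean square over channels.
        return x / torch.sqrt(x.pow(2).mean(dim=1, keepdim=True) + self.eps)

    def forward(self, x):
        x = self.pixel_norm(x)  # corresponds to normalize_latent=True
        x = self.linear(x).reshape(-1, self.out_channels, 4, 4)
        x = self.act(x)
        return self.pixel_norm(x)


block = NoiseTo2DFeatSketch(noise_size=32, out_channels=8)
feat = block(torch.randn(2, 32))
```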

class mmedit.models.editors.pggan.PixelNorm(in_channels=None, eps=1e-06)[source]

Bases: torch.nn.Module

Pixel Normalization.

This module is proposed in: Progressive Growing of GANs for Improved Quality, Stability, and Variation

Parameters

eps (float, optional) – Epsilon value. Defaults to 1e-6.

_abbr_ = pn
forward(x)[source]

Forward function.

Parameters

x (torch.Tensor) – Tensor to be normalized.

Returns

Normalized tensor.

Return type

torch.Tensor
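Pixel normalization can be written in a few lines. The sketch below assumes the formulation from the PGGAN paper, where each spatial position is divided by the root mean square of its channel vector; the function pixel_norm is hypothetical and may differ numerically from the library module in how eps is applied.

```python
import torch


def pixel_norm(x, eps=1e-6):
    """Scale every pixel's channel vector to unit mean square."""
    return x / torch.sqrt(x.pow(2).mean(dim=1, keepdim=True) + eps)


x = torch.full((1, 4, 2, 2), 3.0)
y = pixel_norm(x)
```

After normalization, the mean of the squared values over the channel dimension is approximately one at every position.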

mmedit.models.editors.pggan.equalized_lr(module, name='weight', gain=2 ** 0.5, mode='fan_in', lr_mul=1.0)[source]

Equalized Learning Rate.

This trick is proposed in: Progressive Growing of GANs for Improved Quality, Stability, and Variation

The general idea is to rescale the weights dynamically during training, rather than at initialization, so that the variance of the responses in each layer keeps the intended statistical properties.

Note that this function is always combined with a convolution module which is initialized with \(\mathcal{N}(0, 1)\).

Parameters
  • module (nn.Module) – Module to be wrapped.

  • name (str, optional) – The name of weights. Defaults to 'weight'.

  • mode (str, optional) – The mode of computing fan which is the same as kaiming_init in pytorch. You can choose one from ['fan_in', 'fan_out']. Defaults to 'fan_in'.

Returns

Module that is registered with equalized lr hook.

Return type

nn.Module
