
mmedit.models.editors.sagan

Package Contents

Classes

SAGAN

Implementation of Self-Attention Generative Adversarial Networks.

ProjDiscriminator

Discriminator for SNGAN / Proj-GAN. The implementation refers to

SNGANGenerator

Generator for SNGAN / Proj-GAN. The implementation refers to

SNGANDiscHeadResBlock

The first ResBlock used in discriminator of sngan / proj-gan. Compared

SNGANDiscResBlock

ResBlock used in the discriminator of SNGAN / Proj-GAN.

SNGANGenResBlock

ResBlock used in Generator of SNGAN / Proj-GAN.

class mmedit.models.editors.sagan.SAGAN(generator: ModelType, discriminator: Optional[ModelType] = None, data_preprocessor: Optional[Union[dict, mmengine.Config]] = None, generator_steps: int = 1, discriminator_steps: int = 1, noise_size: Optional[int] = 128, num_classes: Optional[int] = None, ema_config: Optional[Dict] = None)[source]

Bases: mmedit.models.base_models.BaseConditionalGAN

Implementation of Self-Attention Generative Adversarial Networks (SAGAN, https://arxiv.org/abs/1805.08318), Spectral Normalization for Generative Adversarial Networks (SNGAN), and cGANs with Projection Discriminator (Proj-GAN).

The detailed architecture can be found in SNGANGenerator and ProjDiscriminator.

Parameters
  • generator (ModelType) – The config or model of the generator.

  • discriminator (Optional[ModelType]) – The config or model of the discriminator. Defaults to None.

  • data_preprocessor (Optional[Union[dict, Config]]) – The pre-process config or GenDataPreprocessor.

  • generator_steps (int) – Number of times the generator is completely updated before the discriminator is updated. Defaults to 1.

  • discriminator_steps (int) – Number of times the discriminator is completely updated before the generator is updated. Defaults to 1.

  • noise_size (Optional[int]) – Size of the input noise vector. Defaults to 128.

  • num_classes (Optional[int]) – The number of classes you would like to generate. Defaults to None.

  • ema_config (Optional[Dict]) – The config for generator’s exponential moving average setting. Defaults to None.
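As a rough illustration of how these arguments fit together, the dict below sketches a config for a 32x32 class-conditional model. The registry names used in `type` and every numeric value are assumptions chosen for illustration, not a shipped config.

```python
# Hypothetical config sketch; all values are illustrative assumptions.
img_size = 32      # assumed image resolution
num_classes = 10   # assumed number of classes

model = dict(
    type='SAGAN',
    num_classes=num_classes,
    noise_size=128,
    generator_steps=1,
    discriminator_steps=1,
    data_preprocessor=dict(type='GenDataPreprocessor'),
    generator=dict(
        type='SNGANGenerator',
        output_scale=img_size,
        num_classes=num_classes,
        noise_size=128),
    discriminator=dict(
        type='ProjDiscriminator',
        input_scale=img_size,
        num_classes=num_classes),
)
```

The generator's `output_scale` and the discriminator's `input_scale` are kept equal here so that generated images match the resolution the discriminator expects.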

disc_loss(disc_pred_fake: torch.Tensor, disc_pred_real: torch.Tensor) Tuple[torch.Tensor, dict][source]

Get disc loss. SAGAN, SNGAN and Proj-GAN use hinge loss to train the discriminator.

Parameters
  • disc_pred_fake (Tensor) – Discriminator’s prediction of the fake images.

  • disc_pred_real (Tensor) – Discriminator’s prediction of the real images.

Returns

Loss value and a dict of log variables.

Return type

Tuple[Tensor, dict]
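The hinge objective for the discriminator can be sketched without any framework. The function below is a minimal plain-Python illustration that works on lists of raw scores; the function name and the log-variable keys are hypothetical and are not taken from the library.

```python
def hinge_disc_loss(pred_fake, pred_real):
    """Hinge loss for the discriminator on lists of raw scores.

    Real samples are pushed above +1 and fake samples below -1.
    """
    loss_real = sum(max(0.0, 1.0 - s) for s in pred_real) / len(pred_real)
    loss_fake = sum(max(0.0, 1.0 + s) for s in pred_fake) / len(pred_fake)
    loss = loss_real + loss_fake
    # Key names below are hypothetical, chosen only for this sketch.
    log_vars = {'loss_disc_real': loss_real, 'loss_disc_fake': loss_fake}
    return loss, log_vars


loss, log_vars = hinge_disc_loss(pred_fake=[-2.0, 0.0], pred_real=[2.0, 0.5])
```

Scores that already satisfy the margin (real above 1, fake below -1) contribute nothing to the loss.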

gen_loss(disc_pred_fake: torch.Tensor) Tuple[torch.Tensor, dict][source]

Get gen loss. SAGAN, SNGAN and Proj-GAN use hinge loss to train the generator.

Parameters

disc_pred_fake (Tensor) – Discriminator’s prediction of the fake images.

Returns

Loss value and a dict of log variables.

Return type

Tuple[Tensor, dict]
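The generator side of the hinge objective is simply the negated mean of the discriminator's scores on fake images. A minimal plain-Python sketch follows; the function name and the log key are hypothetical.

```python
def hinge_gen_loss(pred_fake):
    """Generator hinge loss: the negated mean score of fake samples."""
    loss = -sum(pred_fake) / len(pred_fake)
    # The key name is hypothetical, chosen only for this sketch.
    return loss, {'loss_gen': loss}


loss, log_vars = hinge_gen_loss([1.0, -3.0])
```

Unlike the discriminator loss, no margin is applied here: the generator keeps being rewarded for raising the discriminator's score.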

train_discriminator(inputs: dict, data_samples: List[mmedit.structures.EditDataSample], optimizer_wrapper: mmengine.optim.OptimWrapper) Dict[str, torch.Tensor][source]

Train discriminator.

Parameters
  • inputs (dict) – Inputs from dataloader.

  • data_samples (List[EditDataSample]) – Data samples from dataloader.

  • optimizer_wrapper (OptimWrapper) – OptimWrapper instance used to update model parameters.

Returns

A dict of tensors for logging.

Return type

Dict[str, Tensor]

train_generator(inputs: dict, data_samples: List[mmedit.structures.EditDataSample], optimizer_wrapper: mmengine.optim.OptimWrapper) Dict[str, torch.Tensor][source]

Train generator.

Parameters
  • inputs (dict) – Inputs from dataloader.

  • data_samples (List[EditDataSample]) – Data samples from dataloader. Not used in the generator’s training.

  • optimizer_wrapper (OptimWrapper) – OptimWrapper instance used to update model parameters.

Returns

A dict of tensors for logging.

Return type

Dict[str, Tensor]

class mmedit.models.editors.sagan.ProjDiscriminator(input_scale, num_classes=0, base_channels=128, input_channels=3, attention_cfg=dict(type='SelfAttentionBlock'), attention_after_nth_block=-1, channels_cfg=None, downsample_cfg=None, from_rgb_cfg=dict(type='SNGANDiscHeadResBlock'), blocks_cfg=dict(type='SNGANDiscResBlock'), act_cfg=dict(type='ReLU'), with_spectral_norm=True, sn_style='torch', sn_eps=1e-12, init_cfg=dict(type='BigGAN'), pretrained=None)[source]

Bases: torch.nn.Module

Discriminator for SNGAN / Proj-GAN. The implementation refers to https://github.com/pfnet-research/sngan_projection/tree/master/dis_models

The overall structure of the projection discriminator can be split into a from_rgb layer, a group of ResBlocks, a linear decision layer, and a projection layer. To support defining custom layers, we introduce from_rgb_cfg and blocks_cfg.

The design of the model structure depends heavily on the output resolution. Therefore, we provide channels_cfg and downsample_cfg to control the input channels and the downsample behavior of the intermediate blocks.

downsample_cfg: In the default config of SNGAN / Proj-GAN, whether to apply downsampling in each intermediate block is quite flexible and depends on the resolution of the output image. Therefore, we let users define downsample_cfg themselves to control the structure of the discriminator.

channels_cfg: In the default config of SNGAN / Proj-GAN, the number of ResBlocks and the channels of those blocks depend on the resolution of the output image. Therefore, we allow users to define channels_cfg to try their own models. We also provide a default config so that users can build the model from the output resolution alone.
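To make the role of downsample_cfg concrete, the sketch below traces the feature-map resolution through a stack of blocks. It assumes each downsample step halves the spatial size (a factor of 2), and the example list is an illustrative choice rather than the library's default; `trace_resolution` is a hypothetical helper.

```python
def trace_resolution(input_scale, downsample_cfg, factor=2):
    """Return the spatial size after each block.

    Assumption: a block with downsample enabled divides the size by `factor`.
    """
    sizes = []
    size = input_scale
    for apply_downsample in downsample_cfg:
        if apply_downsample:
            size //= factor
        sizes.append(size)
    return sizes


# Illustrative list form: downsample in the first two blocks only.
sizes = trace_resolution(32, [True, True, False, False])
```

With this list, a 32x32 input is reduced to 8x8 after the second block and stays at that size in the remaining blocks.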

Parameters
  • input_scale (int) – The scale of the input image.

  • num_classes (int, optional) – The number of classes you would like to generate. If num_classes=0, no label projection is used. Defaults to 0.

  • base_channels (int, optional) – The basic channel number of the discriminator. The other layers contain channels based on this number. Defaults to 128.

  • input_channels (int, optional) – Channels of the input image. Defaults to 3.

  • attention_cfg (dict, optional) – Config for the self-attention block. Defaults to dict(type='SelfAttentionBlock').

  • attention_after_nth_block (int | list[int], optional) – The ConvBlock (counting the head block) after which a self-attention block is added. If an int is passed, only one attention block is added. If a list is passed, self-attention blocks are added after multiple ConvBlocks. Note that if a value is smaller than 1, the self-attention block for that index is ignored. Defaults to -1.

  • channels_cfg (list | dict[list], optional) – Config for input channels of the intermediate blocks. If a list is passed, each element is the multiplier applied to base_channels to get the input channels of the corresponding block. For block i, the input and output channels are given by channels_cfg[i] and channels_cfg[i+1]. If a dict is provided, each key should be an output scale and the corresponding value a list that defines the channels. Default: Please refer to _defualt_channels_cfg.

  • downsample_cfg (list[bool] | dict[list], optional) – Config for downsample behavior of the intermediate layers. If a list is passed, downsample_cfg[idx] == True means downsampling is applied in the idx-th block, and vice versa. If a dict is provided, each key should be the input scale of the image and the corresponding value a list that defines the downsample behavior. Default: Please refer to _defualt_downsample_cfg.

  • from_rgb_cfg (dict, optional) – Config for the first layer to convert rgb image to feature map. Defaults to dict(type='SNGANDiscHeadResBlock').

  • blocks_cfg (dict, optional) – Config for the intermediate blocks. Defaults to dict(type='SNGANDiscResBlock').

  • act_cfg (dict, optional) – Activation config for the final output layer. Defaults to dict(type='ReLU').

  • with_spectral_norm (bool, optional) – Whether to use spectral norm for all conv blocks. Defaults to True.

  • sn_style (str, optional) – The style of spectral normalization. If set to ajbrock, implementation by ajbrock(https://github.com/ajbrock/BigGAN-PyTorch/blob/master/layers.py) will be adopted. If set to torch, implementation by PyTorch will be adopted. Defaults to torch.

  • sn_eps (float, optional) – eps for spectral normalization operation. Defaults to 1e-12.

  • init_cfg (dict, optional) – Config for weight initialization. Defaults to dict(type='BigGAN').

  • pretrained (str | dict, optional) – Path for the pretrained model or dict containing information for pretrained models whose necessary key is ‘ckpt_path’. Besides, you can also provide ‘prefix’ to load the generator part from the whole state dict. Defaults to None.
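The list and dict forms of channels_cfg described above can be sketched as follows. Both helpers are hypothetical and only mirror the stated rule that block i maps channels_cfg[i] to channels_cfg[i+1], each scaled by base_channels; the multiplier values are illustrative.

```python
def select_cfg(cfg, scale):
    """Pick the list for `scale` when a dict keyed by scale is given."""
    return cfg[scale] if isinstance(cfg, dict) else cfg


def block_channels(base_channels, channels_cfg):
    """Return (in_channels, out_channels) for each intermediate block."""
    return [
        (base_channels * channels_cfg[i], base_channels * channels_cfg[i + 1])
        for i in range(len(channels_cfg) - 1)
    ]


# Illustrative multipliers, keyed by scale in the dict form.
cfg_by_scale = {32: [1, 1, 1], 64: [1, 2, 4]}
pairs = block_channels(128, select_cfg(cfg_by_scale, 64))
```

A list of n multipliers therefore describes n - 1 blocks under this rule.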

_defualt_channels_cfg
_defualt_downsample_cfg
forward(x, label=None)[source]

Forward function. If self.num_classes is larger than 0, label projection would be used.

Parameters
  • x (torch.Tensor) – Fake or real image tensor.

  • label (torch.Tensor, optional) – Label corresponding to the input image. Note that if self.num_classes is larger than 0, label should not be None. Defaults to None.

Returns

Prediction for the reality of the input image.

Return type

torch.Tensor
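The way a linear decision layer and a label projection combine into one score can be sketched in plain Python. This follows the projection-discriminator idea of adding the inner product between a class embedding and the pooled features to an unconditional linear output; the function and argument names are hypothetical, and lists stand in for tensors.

```python
def projection_score(features, weight, bias, class_embeddings=None, label=None):
    """Score one sample from its pooled feature vector.

    features: pooled feature vector (list of floats).
    weight, bias: parameters of the linear decision layer.
    class_embeddings: one embedding per class, or None when unconditional.
    """
    score = sum(w * h for w, h in zip(weight, features)) + bias
    if class_embeddings is not None:
        if label is None:
            raise ValueError('label is required when class embeddings are used')
        embedding = class_embeddings[label]
        # Label projection: inner product of the class embedding and features.
        score += sum(e * h for e, h in zip(embedding, features))
    return score


unconditional = projection_score([1.0, 2.0], [0.5, -1.0], 0.25)
conditional = projection_score(
    [1.0, 2.0], [0.5, -1.0], 0.25,
    class_embeddings=[[1.0, 0.0], [0.0, 1.0]], label=1)
```

When no class embeddings are supplied, the sketch reduces to the plain linear decision, which mirrors the num_classes=0 case described above.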

init_weights(pretrained=None, strict=True)[source]

Init weights for SNGAN-Proj and SAGAN. If pretrained=None, weight initialization follows the INIT_TYPE in init_cfg=dict(type=INIT_TYPE).

For SNGAN-Proj (INIT_TYPE.upper() in ['SNGAN', 'SNGAN-PROJ', 'GAN-PROJ']), we follow the initialization method of the official Chainer implementation (https://github.com/pfnet-research/sngan_projection).

For SAGAN (INIT_TYPE.upper() == 'SAGAN'), we follow the initialization method of the official TensorFlow implementation (https://github.com/brain-research/self-attention-gan).

Besides reimplementing the official code’s initialization, we provide BigGAN-style and PyTorch-StudioGAN-style initialization (INIT_TYPE.upper() == 'BIGGAN' and INIT_TYPE.upper() == 'STUDIO'). Please refer to https://github.com/ajbrock/BigGAN-PyTorch and https://github.com/POSTECH-CVLab/PyTorch-StudioGAN.

Parameters

pretrained (str | dict, optional) – Path for the pretrained model or dict containing information for pretrained models whose necessary key is ‘ckpt_path’. Besides, you can also provide ‘prefix’ to load the generator part from the whole state dict. Defaults to None.
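The case-insensitive matching on INIT_TYPE described above can be sketched as a small dispatcher. `init_family` is a hypothetical helper, and raising ValueError for unknown types is an assumption of this sketch, not documented behavior.

```python
def init_family(init_type):
    """Map an init_cfg type string to the initialization family it selects."""
    key = init_type.upper()
    if key in ('SNGAN', 'SNGAN-PROJ', 'GAN-PROJ'):
        return 'sngan-proj'
    if key == 'SAGAN':
        return 'sagan'
    if key == 'BIGGAN':
        return 'biggan'
    if key == 'STUDIO':
        return 'studio'
    # Assumption: unknown types are rejected; the real behavior may differ.
    raise ValueError(f'Unsupported init type: {init_type!r}')


family = init_family(dict(type='BigGAN')['type'])
```

Because the comparison is done on the upper-cased string, 'BigGAN', 'biggan' and 'BIGGAN' all select the same family.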

class mmedit.models.editors.sagan.SNGANGenerator(output_scale, num_classes=0, base_channels=64, out_channels=3, input_scale=4, noise_size=128, attention_cfg=dict(type='SelfAttentionBlock'), attention_after_nth_block=0, channels_cfg=None, blocks_cfg=dict(type='SNGANGenResBlock'), act_cfg=dict(type='ReLU'), use_cbn=True, auto_sync_bn=True, with_spectral_norm=False, with_embedding_spectral_norm=None, sn_style='torch', norm_eps=0.0001, sn_eps=1e-12, init_cfg=dict(type='BigGAN'), pretrained=None, rgb_to_bgr=False)[source]

Bases: torch.nn.Module

Generator for SNGAN / Proj-GAN. The implementation refers to https://github.com/pfnet-research/sngan_projection/tree/master/gen_models

In our implementation, there are two notable design choices: channels_cfg and blocks_cfg.

channels_cfg: In the default config of SNGAN / Proj-GAN, the number of ResBlocks and the channels of those blocks depend on the resolution of the output image. Therefore, we allow users to define channels_cfg to try their own models. We also provide a default config so that users can build the model from the output resolution alone.

blocks_cfg: In the reference code, the generator consists of a group of ResBlocks. In our implementation, to make the model more general, we support user-defined blocks_cfg and load the blocks by calling the build_module method.

Parameters
  • output_scale (int) – Output scale for the generated image.

  • num_classes (int, optional) – The number of classes you would like to generate. This argument influences the structure of the intermediate blocks and the label sampling operation in forward (e.g. if num_classes=0, ConditionalNormalization layers degrade to unconditional ones). This argument is passed to the intermediate blocks by overwriting their config. Defaults to 0.

  • base_channels (int, optional) – The basic channel number of the generator. The other layers contain channels based on this number. Defaults to 64.

  • out_channels (int, optional) – Channels of the output images. Defaults to 3.

  • input_scale (int, optional) – Input scale for the features. Defaults to 4.

  • noise_size (int, optional) – Size of the input noise vector. Defaults to 128.

  • attention_cfg (dict, optional) – Config for the self-attention block. Defaults to dict(type='SelfAttentionBlock').

  • attention_after_nth_block (int | list[int], optional) – The ConvBlock after which a self-attention block is added. If an int is passed, only one attention block is added. If a list is passed, self-attention blocks are added after multiple ConvBlocks. Note that if a value is smaller than 1, the self-attention block for that index is ignored. Defaults to 0.

  • channels_cfg (list | dict[list], optional) – Config for input channels of the intermediate blocks. If a list is passed, each element is the multiplier applied to base_channels to get the input channels of the corresponding block. For block i, the input and output channels are given by channels_cfg[i] and channels_cfg[i+1]. If a dict is provided, each key should be an output scale and the corresponding value a list that defines the channels. Default: Please refer to _default_channels_cfg.

  • blocks_cfg (dict, optional) – Config for the intermediate blocks. Defaults to dict(type='SNGANGenResBlock').

  • act_cfg (dict, optional) – Activation config for the final output layer. Defaults to dict(type='ReLU').

  • use_cbn (bool, optional) – Whether to use conditional normalization. This argument is passed to the norm layers. Defaults to True.

  • auto_sync_bn (bool, optional) – Whether to convert Batch Norm layers to synchronized ones when distributed training is on. Defaults to True.

  • with_spectral_norm (bool, optional) – Whether to use spectral norm for conv blocks. Defaults to False.

  • with_embedding_spectral_norm (bool, optional) – Whether use spectral norm for embedding layers in normalization blocks or not. If not specified (set as None), with_embedding_spectral_norm would be set as the same value as with_spectral_norm. Defaults to None.

  • sn_style (str, optional) – The style of spectral normalization. If set to ajbrock, implementation by ajbrock(https://github.com/ajbrock/BigGAN-PyTorch/blob/master/layers.py) will be adopted. If set to torch, implementation by PyTorch will be adopted. Defaults to torch.

  • norm_eps (float, optional) – eps for Normalization layers (both conditional and non-conditional ones). Defaults to 1e-4.

  • sn_eps (float, optional) – eps for spectral normalization operation. Defaults to 1e-12.

  • init_cfg (dict, optional) – Config for weight initialization. Defaults to dict(type='BigGAN').

  • pretrained (str | dict, optional) – Path for the pretrained model or dict containing information for pretrained models whose necessary key is ‘ckpt_path’. Besides, you can also provide ‘prefix’ to load the generator part from the whole state dict. Defaults to None.

  • rgb_to_bgr (bool, optional) – Whether to reformat the output channels with order bgr. We provide several pre-trained BigGAN weights whose output channels order is rgb. You can set this argument to True to use the weights. Defaults to False.
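The fallback described for with_embedding_spectral_norm amounts to a one-line resolution step. A minimal sketch, using a hypothetical helper name:

```python
def resolve_embedding_sn(with_spectral_norm, with_embedding_spectral_norm=None):
    """Fall back to with_spectral_norm when the embedding flag is None."""
    if with_embedding_spectral_norm is None:
        return with_spectral_norm
    return with_embedding_spectral_norm


# The embedding flag is left unset, so it follows with_spectral_norm.
use_sn_on_embeddings = resolve_embedding_sn(with_spectral_norm=True)
```

Passing an explicit False keeps spectral norm off for the embedding layers even when it is enabled for the conv blocks.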

_default_channels_cfg
forward(noise, num_batches=0, label=None, return_noise=False)[source]

Forward function.

Parameters
  • noise (torch.Tensor | callable | None) – You can directly give a batch of noise through a torch.Tensor or offer a callable function to sample a batch of noise data. Otherwise, the None indicates to use the default noise sampler.

  • num_batches (int, optional) – The batch size. Defaults to 0.

  • label (torch.Tensor | callable | None) – You can directly give a batch of label through a torch.Tensor or offer a callable function to sample a batch of label data. Otherwise, the None indicates to use the default label sampler.

  • return_noise (bool, optional) – If True, noise_batch will be returned in a dict with fake_img. Defaults to False.

Returns

If return_noise is False, only the output image is returned. Otherwise, a dict containing fake_image, noise_batch and label_batch is returned.

Return type

torch.Tensor | dict
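The three accepted forms of the noise argument (a ready batch, a callable sampler, or None) can be sketched as below. Nested lists stand in for tensors, `resolve_noise` is a hypothetical helper, and both the callable's `(num_batches, noise_size)` shape argument and the standard normal default sampler are assumptions of this sketch.

```python
import random


def resolve_noise(noise, num_batches=0, noise_size=128):
    """Turn the `noise` argument into a concrete batch of noise vectors."""
    if noise is None:
        if num_batches <= 0:
            raise ValueError('num_batches must be positive when sampling noise')
        # Assumed default sampler: standard normal noise.
        return [[random.gauss(0.0, 1.0) for _ in range(noise_size)]
                for _ in range(num_batches)]
    if callable(noise):
        if num_batches <= 0:
            raise ValueError('num_batches must be positive when sampling noise')
        # Assumed convention: the sampler receives the desired shape.
        return noise((num_batches, noise_size))
    # A ready-made batch is passed through unchanged.
    return noise


sampled = resolve_noise(None, num_batches=2, noise_size=4)
zeros = resolve_noise(lambda shape: [[0.0] * shape[1] for _ in range(shape[0])],
                      num_batches=3, noise_size=2)
```

The label argument accepts the same three forms, so an analogous helper could be written for it.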

init_weights(pretrained=None, strict=True)[source]

Init weights for SNGAN-Proj and SAGAN. If pretrained=None, weight initialization would follow the INIT_TYPE in init_cfg=dict(type=INIT_TYPE).

For SNGAN-Proj (INIT_TYPE.upper() in ['SNGAN', 'SNGAN-PROJ', 'GAN-PROJ']), we follow the initialization method of the official Chainer implementation (https://github.com/pfnet-research/sngan_projection).

For SAGAN (INIT_TYPE.upper() == 'SAGAN'), we follow the initialization method of the official TensorFlow implementation (https://github.com/brain-research/self-attention-gan).

Besides reimplementing the official code’s initialization, we provide BigGAN-style and PyTorch-StudioGAN-style initialization (INIT_TYPE.upper() == 'BIGGAN' and INIT_TYPE.upper() == 'STUDIO'). Please refer to https://github.com/ajbrock/BigGAN-PyTorch and https://github.com/POSTECH-CVLab/PyTorch-StudioGAN.

Parameters

pretrained (str | dict, optional) – Path for the pretrained model or dict containing information for pretrained models whose necessary key is ‘ckpt_path’. Besides, you can also provide ‘prefix’ to load the generator part from the whole state dict. Defaults to None.

class mmedit.models.editors.sagan.SNGANDiscHeadResBlock(in_channels, out_channels, conv_cfg=None, act_cfg=dict(type='ReLU'), with_spectral_norm=True, sn_eps=1e-12, sn_style='torch', init_cfg=dict(type='BigGAN'))[source]

Bases: torch.nn.Module

The first ResBlock used in the discriminator of SNGAN / Proj-GAN. Compared to SNGANDiscResBlock, this module has a different forward order.

Parameters
  • in_channels (int) – Input channels.

  • out_channels (int) – Output channels.

  • downsample (bool, optional) – Whether to apply the downsample operation in this module. Defaults to False.

  • conv_cfg (dict | None) – Config for conv blocks of this module. If None is passed, _default_conv_cfg is used. Defaults to None.

  • act_cfg (dict, optional) – Config for the activation function. Defaults to dict(type='ReLU').

  • with_spectral_norm (bool, optional) – Whether to use spectral norm for conv blocks and norm layers. Defaults to True.

  • sn_style (str, optional) – The style of spectral normalization. If set to ajbrock, implementation by ajbrock(https://github.com/ajbrock/BigGAN-PyTorch/blob/master/layers.py) will be adopted. If set to torch, implementation by PyTorch will be adopted. Defaults to torch.

  • sn_eps (float, optional) – eps for spectral normalization operation. Defaults to 1e-12.

  • init_cfg (dict, optional) – Config for weight initialization. Defaults to dict(type='BigGAN').
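For readers unfamiliar with what with_spectral_norm and sn_eps control, the sketch below estimates the largest singular value of a weight matrix by power iteration and divides the weights by it. It is a plain-Python illustration, not the code behind either sn_style; the function names are hypothetical, and using eps as a lower bound on the vector norm is an assumption about where such an eps typically enters.

```python
import math


def _normalize(vec, eps):
    """Scale vec to unit length; eps guards against division by zero."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / max(norm, eps) for v in vec]


def spectral_normalize(weight, n_iters=50, eps=1e-12):
    """Return (weight / sigma, sigma), with sigma found by power iteration.

    weight is a list of rows; n_iters must be at least 1.
    """
    rows, cols = len(weight), len(weight[0])
    u = _normalize([1.0] * rows, eps)
    for _ in range(n_iters):
        # v is proportional to W^T u, then u is proportional to W v.
        v = _normalize(
            [sum(weight[r][c] * u[r] for r in range(rows)) for c in range(cols)],
            eps)
        u = _normalize(
            [sum(weight[r][c] * v[c] for c in range(cols)) for r in range(rows)],
            eps)
    # sigma = u^T W v approximates the largest singular value.
    sigma = sum(
        u[r] * sum(weight[r][c] * v[c] for c in range(cols))
        for r in range(rows))
    return [[w / sigma for w in row] for row in weight], sigma


normalized, sigma = spectral_normalize([[3.0, 0.0], [0.0, 1.0]])
```

After normalization the largest singular value of the matrix is approximately 1, which bounds how much the layer can stretch its input.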

_default_conv_cfg
forward(x: torch.Tensor) torch.Tensor[source]

Forward function.

Parameters

x (Tensor) – Input tensor with shape (n, c, h, w).

Returns

Forward results.

Return type

Tensor

forward_shortcut(x: torch.Tensor) torch.Tensor[source]

Forward the shortcut branch.

Parameters

x (Tensor) – Input tensor with shape (n, c, h, w).

Returns

Forward results.

Return type

Tensor

init_weights()[source]

Initialize weights.

class mmedit.models.editors.sagan.SNGANDiscResBlock(in_channels, out_channels, hidden_channels=None, downsample=False, act_cfg=dict(type='ReLU'), conv_cfg=None, with_spectral_norm=True, sn_style='torch', sn_eps=1e-12, init_cfg=dict(type='BigGAN'))[source]

Bases: torch.nn.Module

ResBlock used in the discriminator of SNGAN / Proj-GAN.

Parameters
  • in_channels (int) – Input channels.

  • out_channels (int) – Output channels.

  • hidden_channels (int, optional) – Input channels of the second conv layer of the block. If None is given, it is set to out_channels. Defaults to None.

  • downsample (bool, optional) – Whether to apply the downsample operation in this module. Defaults to False.

  • act_cfg (dict, optional) – Config for the activation function. Defaults to dict(type='ReLU').

  • conv_cfg (dict | None) – Config for conv blocks of this module. If None is passed, _default_conv_cfg is used. Defaults to None.

  • with_spectral_norm (bool, optional) – Whether to use spectral norm for conv blocks and norm layers. Defaults to True.

  • sn_eps (float, optional) – eps for spectral normalization operation. Defaults to 1e-12.

  • sn_style (str, optional) – The style of spectral normalization. If set to ajbrock, implementation by ajbrock(https://github.com/ajbrock/BigGAN-PyTorch/blob/master/layers.py) will be adopted. If set to torch, implementation by PyTorch will be adopted. Defaults to torch.

  • init_cfg (dict, optional) – Config for weight initialization. Defaults to dict(type='BigGAN').

_default_conv_cfg
forward(x)[source]

Forward function.

Parameters

x (Tensor) – Input tensor with shape (n, c, h, w).

Returns

Forward results.

Return type

Tensor

forward_shortcut(x: torch.Tensor) torch.Tensor[source]

Forward the shortcut branch.

Parameters

x (Tensor) – Input tensor with shape (n, c, h, w).

Returns

Forward results.

Return type

Tensor

init_weights()[source]

Initialize weights.

class mmedit.models.editors.sagan.SNGANGenResBlock(in_channels, out_channels, hidden_channels=None, num_classes=0, use_cbn=True, use_norm_affine=False, act_cfg=dict(type='ReLU'), norm_cfg=dict(type='BN'), upsample_cfg=dict(type='nearest', scale_factor=2), upsample=True, auto_sync_bn=True, conv_cfg=None, with_spectral_norm=False, with_embedding_spectral_norm=None, sn_style='torch', norm_eps=0.0001, sn_eps=1e-12, init_cfg=dict(type='BigGAN'))[source]

Bases: torch.nn.Module

ResBlock used in Generator of SNGAN / Proj-GAN.

Parameters
  • in_channels (int) – Input channels.

  • out_channels (int) – Output channels.

  • hidden_channels (int, optional) – Input channels of the second conv layer of the block. If None is given, it is set to out_channels. Defaults to None.

  • num_classes (int, optional) – Number of classes you would like to generate. This argument is passed to the norm layers and influences the structure and behavior of the normalization process. Defaults to 0.

  • use_cbn (bool, optional) – Whether to use conditional normalization. This argument is passed to the norm layers. Defaults to True.

  • use_norm_affine (bool, optional) – Whether to use learnable affine parameters in the norm operation when cbn is off. Defaults to False.

  • act_cfg (dict, optional) – Config for the activation function. Defaults to dict(type='ReLU').

  • upsample_cfg (dict, optional) – Config for the upsample method. Defaults to dict(type='nearest', scale_factor=2).

  • upsample (bool, optional) – Whether to apply the upsample operation in this module. Defaults to True.

  • auto_sync_bn (bool, optional) – Whether to convert Batch Norm layers to synchronized ones when distributed training is on. Defaults to True.

  • conv_cfg (dict | None) – Config for conv blocks of this module. If None is passed, _default_conv_cfg is used. Defaults to None.

  • with_spectral_norm (bool, optional) – Whether to use spectral norm for conv blocks and norm layers. Defaults to False.

  • with_embedding_spectral_norm (bool, optional) – Whether to use spectral norm for embedding layers in normalization blocks. If not specified (set as None), with_embedding_spectral_norm is set to the same value as with_spectral_norm. Defaults to None.

  • sn_style (str, optional) – The style of spectral normalization. If set to ajbrock, implementation by ajbrock(https://github.com/ajbrock/BigGAN-PyTorch/blob/master/layers.py) will be adopted. If set to torch, implementation by PyTorch will be adopted. Defaults to torch.

  • norm_eps (float, optional) – eps for Normalization layers (both conditional and non-conditional ones). Defaults to 1e-4.

  • sn_eps (float, optional) – eps for spectral normalization operation. Defaults to 1e-12.

  • init_cfg (dict, optional) – Config for weight initialization. Defaults to dict(type='BigGAN').

_default_conv_cfg
forward(x, y=None)[source]

Forward function.

Parameters
  • x (Tensor) – Input tensor with shape (n, c, h, w).

  • y (Tensor) – Input label with shape (n, ). Defaults to None.

Returns

Forward results.

Return type

Tensor
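The role of the label y in this block can be illustrated with a toy conditional normalization: the input is normalized, then scaled and shifted with parameters looked up by class. This is a plain-Python sketch over a single list of values, not the block's actual layer; `conditional_norm` and its arguments are hypothetical, and the default eps mirrors norm_eps.

```python
import math


def conditional_norm(x, label=None, gammas=None, betas=None, eps=1e-4):
    """Normalize x, then apply a per-class scale and shift if a label is given."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    normed = [(v - mean) / math.sqrt(var + eps) for v in x]
    if label is None or gammas is None or betas is None:
        # Without a label the layer degrades to unconditional normalization.
        return normed
    gamma, beta = gammas[label], betas[label]
    return [gamma * v + beta for v in normed]


plain = conditional_norm([1.0, 3.0])
conditioned = conditional_norm(
    [1.0, 3.0], label=1, gammas=[1.0, 2.0], betas=[0.0, 0.5])
```

Each class selects its own scale and shift, which is how the label influences the generated features.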

forward_shortcut(x)[source]

Forward the shortcut branch.

Parameters

x (Tensor) – Input tensor with shape (n, c, h, w).

Returns

Forward results.

Return type

Tensor

init_weights()[source]

Initialize weights for the model.
