Shortcuts

mmedit.datasets.transforms.aug_shape

Module Contents

Classes

Flip

Flip the input data with a probability.

RandomRotation

Rotate the image by a randomly-chosen angle, measured in degree.

RandomTransposeHW

Randomly transpose images in H and W dimensions with a probability.

Resize

Resize data to a specific size for training or resize the images to fit

NumpyPad

Numpy Padding.

class mmedit.datasets.transforms.aug_shape.Flip(keys, flip_ratio=0.5, direction='horizontal')[source]

Bases: mmcv.transforms.BaseTransform

Flip the input data with a probability.

Reverse the order of elements in the given data with a specific direction. The shape of the data is preserved, but the elements are reordered. Required keys are the keys in attributes “keys”, added or modified keys are “flip”, “flip_direction” and the keys in attributes “keys”. It also supports flipping a list of images with the same flip.

Required Keys:

  • [KEYS]

Modified Keys:

  • [KEYS]

Parameters
  • keys (Union[str, List[str]]) – The images to be flipped.

  • flip_ratio (float) – The probability to flip the images. Default: 0.5.

  • direction (str) – Flip images horizontally or vertically. Options are “horizontal” | “vertical”. Default: “horizontal”.

_directions = ['horizontal', 'vertical'][source]
transform(results)[source]

transform function.

Parameters

results (dict) – A dict containing the necessary information and data for augmentation.

Returns

A dict containing the processed data and information.

Return type

dict

__repr__()[source]

Return repr(self).

class mmedit.datasets.transforms.aug_shape.RandomRotation(keys, degrees)[source]

Bases: mmcv.transforms.BaseTransform

Rotate the image by a randomly-chosen angle, measured in degree.

Parameters
  • keys (list[str]) – The images to be rotated.

  • degrees (tuple[float] | tuple[int] | float | int) – If it is a tuple, it represents a range (min, max). If it is a float or int, the range is constructed as (-degrees, degrees).

transform(results)[source]

transform function.

Parameters

results (dict) – A dict containing the necessary information and data for augmentation.

Returns

A dict containing the processed data and information.

Return type

dict

__repr__()[source]

Return repr(self).

class mmedit.datasets.transforms.aug_shape.RandomTransposeHW(keys, transpose_ratio=0.5)[source]

Bases: mmcv.transforms.BaseTransform

Randomly transpose images in H and W dimensions with a probability.

(TransposeHW = horizontal flip + anti-clockwise rotation by 90 degrees) When used with horizontal/vertical flips, it serves as a way of rotation augmentation. It also supports randomly transposing a list of images.

Required keys are the keys in attributes “keys”, added or modified keys are “transpose” and the keys in attributes “keys”.

Parameters
  • keys (list[str]) – The images to be transposed.

  • transpose_ratio (float) – The probability to transpose the images. Default: 0.5.

transform(results)[source]

transform function.

Parameters

results (dict) – A dict containing the necessary information and data for augmentation.

Returns

A dict containing the processed data and information.

Return type

dict

__repr__()[source]

Return repr(self).

class mmedit.datasets.transforms.aug_shape.Resize(keys: Union[str, List[str]] = 'img', scale=None, keep_ratio=False, size_factor=None, max_size=None, interpolation='bilinear', backend=None, output_keys=None)[source]

Bases: mmcv.transforms.BaseTransform

Resize data to a specific size for training or resize the images to fit the network input regulation for testing.

When used for resizing images to fit network input regulation, the case is that a network may have several downsample and then upsample operation, then the input height and width should be divisible by the downsample factor of the network. For example, the network would downsample the input for 5 times with stride 2, then the downsample factor is 2^5 = 32 and the height and width should be divisible by 32.

Required keys are the keys in attribute “keys”, added or modified keys are “keep_ratio”, “scale_factor”, “interpolation” and the keys in attribute “keys”.

Required Keys:

  • Required keys are the keys in attribute “keys”

Modified Keys:

  • Modified the keys in attribute “keys” or save as new key ([OUT_KEY])

Added Keys:

  • [OUT_KEY]_shape

  • keep_ratio

  • scale_factor

  • interpolation

All keys in “keys” should have the same shape. “test_trans” is used to record the test transformation to align the input’s shape.

Parameters
  • keys (str | list[str]) – The image(s) to be resized.

  • scale (float | tuple[int]) – If scale is tuple[int], target spatial size (h, w). Otherwise, target spatial size is scaled by input size. Note that when it is used, size_factor and max_size are useless. Default: None

  • keep_ratio (bool) – If set to True, images will be resized without changing the aspect ratio. Otherwise, it will resize images to a given size. Default: False. Note that it is used together with scale.

  • size_factor (int) – Let the output shape be a multiple of size_factor. Default:None. Note that when it is used, scale should be set to None and keep_ratio should be set to False.

  • max_size (int) – The maximum size of the longest side of the output. Default:None. Note that it is used together with size_factor.

  • interpolation (str) – Algorithm used for interpolation: “nearest” | “bilinear” | “bicubic” | “area” | “lanczos”. Default: “bilinear”.

  • backend (str | None) – The image resize backend type. Options are cv2, pillow, None. If backend is None, the global imread_backend specified by mmcv.use_backend() will be used. Default: None.

  • output_keys (list[str] | None) – The resized images. Default: None Note that if it is not None, its length should be equal to keys.

_resize(img)[source]

Resize function.

Parameters

img (np.ndarray) – Image.

Returns

Resized image.

Return type

img (np.ndarray)

transform(results: Dict) Dict[source]

Transform function to resize images.

Parameters

results (dict) – A dict containing the necessary information and data for augmentation.

Returns

A dict containing the processed data and information.

Return type

dict

__repr__()[source]

Return repr(self).

class mmedit.datasets.transforms.aug_shape.NumpyPad(keys, padding, **kwargs)[source]

Bases: mmcv.transforms.BaseTransform

Numpy Padding.

In this augmentation, numpy padding is adopted to customize padding augmentation. Please carefully read the numpy manual in: https://numpy.org/doc/stable/reference/generated/numpy.pad.html

If you just hope a single dimension to be padded, you must set padding like this:

padding = ((2, 2), (0, 0), (0, 0))

In this case, if you adopt an input with three dimension, only the first dimension will be padded.

Parameters
  • keys (Union[str, List[str]]) – The images to be padded.

  • padding (int | tuple(int)) – Please refer to the args pad_width in numpy.pad.

transform(results)[source]

Call function.

Parameters

results (dict) – A dict containing the necessary information and data for augmentation.

Returns

A dict containing the processed data and information.

Return type

dict

__repr__() str[source]

Return repr(self).

Read the Docs v: latest
Versions
master
latest
stable
zyh-re-docs
zyh-doc-notfound-extend
zyh-api-rendering
Downloads
pdf
epub
On Read the Docs
Project Home
Builds

Free document hosting provided by Read the Docs.