mmedit.datasets.transforms.aug_pixel

Module Contents

Classes

BinarizeImage

Binarize image.

Clip

Clip the pixels.

ColorJitter

An interface for torch color jitter so that it can be invoked in the mmediting pipeline.

RandomAffine

Apply random affine to input images.

RandomMaskDilation

Randomly dilate binary masks.

UnsharpMasking

Apply unsharp masking to an image or a sequence of images.

class mmedit.datasets.transforms.aug_pixel.BinarizeImage(keys, binary_thr, a_min=0, a_max=1, dtype=np.uint8)[source]

Bases: mmcv.transforms.BaseTransform

Binarize image.

Parameters
  • keys (Sequence[str]) – The images to be binarized.

  • binary_thr (float) – Threshold for binarization.

  • a_min (int) – Lower limit of the pixel values.

  • a_max (int) – Upper limit of the pixel values.

  • dtype (np.dtype) – Set the data type of the output. Default: np.uint8
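The parameters above can be illustrated with a minimal NumPy sketch. It assumes that pixels strictly greater than binary_thr are mapped to a_max and all others to a_min; the helper name binarize and the strict comparison are assumptions for illustration, not the library's confirmed implementation.

```python
import numpy as np


def binarize(img, binary_thr, a_min=0, a_max=1, dtype=np.uint8):
    """Map pixels above ``binary_thr`` to ``a_max`` and the rest to ``a_min``.

    Hypothetical helper: the strict ``>`` comparison is an assumption.
    """
    mask = img > binary_thr                  # boolean mask of "on" pixels
    out = mask * (a_max - a_min) + a_min     # rescale {0, 1} to {a_min, a_max}
    return out.astype(dtype)


img = np.array([[0.1, 0.6], [0.5, 0.9]], dtype=np.float32)
binary = binarize(img, binary_thr=0.5)
```

With a_min=0 and a_max=255 the same helper would produce a mask in the usual 8-bit range.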

_binarize(img)[source]

Binarize image.

Parameters

img (np.ndarray) – Input image.

Returns

Output image.

Return type

img (np.ndarray)

transform(results)[source]

The transform function of BinarizeImage.

Parameters

results (dict) – A dict containing the necessary information and data for augmentation.

Returns

A dict containing the processed data and information.

Return type

dict

__repr__()[source]

Return repr(self).

class mmedit.datasets.transforms.aug_pixel.Clip(keys, a_min=0, a_max=255)[source]

Bases: mmcv.transforms.BaseTransform

Clip the pixels.

Modified keys are the attributes specified in “keys”.

Parameters
  • keys (list[str]) – The keys whose values are clipped.

  • a_min (int) – Lower limit of the pixel values.

  • a_max (int) – Upper limit of the pixel values.
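As a rough sketch of what clipping the values under each key could look like, the snippet below applies np.clip to a results dict. The helper clip_pixels and the handling of lists of arrays are assumptions for illustration; any rounding step is left out.

```python
import numpy as np


def clip_pixels(value, a_min=0, a_max=255):
    """Clip one array, or each array in a list, to [a_min, a_max].

    Hypothetical helper, not the library's own function.
    """
    if isinstance(value, np.ndarray):
        return np.clip(value, a_min, a_max)
    return [np.clip(v, a_min, a_max) for v in value]


results = {"img": np.array([-3.0, 12.5, 300.0])}
for key in ["img"]:                          # plays the role of ``keys``
    results[key] = clip_pixels(results[key])
```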

_clip(input_)[source]

transform(results)[source]

The transform function of Clip.

Parameters

results (dict) – A dict containing the necessary information and data for augmentation.

Returns

A dict in which the values of the specified keys are rounded and clipped.

Return type

dict

__repr__()[source]

Return repr(self).

class mmedit.datasets.transforms.aug_pixel.ColorJitter(keys, channel_order='rgb', **kwargs)[source]

Bases: mmcv.transforms.BaseTransform

An interface for torch color jitter so that it can be invoked in the mmediting pipeline.

Randomly change the brightness, contrast and saturation of an image. Modified keys are the attributes specified in “keys”.

Required Keys:

  • [KEYS]

Modified Keys:

  • [KEYS]

Parameters
  • keys (list[str]) – The images to be color-jittered.

  • channel_order (str) – Order of channel, candidates are ‘bgr’ and ‘rgb’. Default: ‘rgb’.

Notes

**kwargs follows the argument list of torchvision.transforms.ColorJitter:

  • brightness (float or tuple of float (min, max)) – How much to jitter brightness. brightness_factor is chosen uniformly from [max(0, 1 - brightness), 1 + brightness] or the given [min, max]. Should be non-negative numbers.

  • contrast (float or tuple of float (min, max)) – How much to jitter contrast. contrast_factor is chosen uniformly from [max(0, 1 - contrast), 1 + contrast] or the given [min, max]. Should be non-negative numbers.

  • saturation (float or tuple of float (min, max)) – How much to jitter saturation. saturation_factor is chosen uniformly from [max(0, 1 - saturation), 1 + saturation] or the given [min, max]. Should be non-negative numbers.

  • hue (float or tuple of float (min, max)) – How much to jitter hue. hue_factor is chosen uniformly from [-hue, hue] or the given [min, max]. Should have 0 <= hue <= 0.5 or -0.5 <= min <= max <= 0.5.
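The sampling intervals described in these notes can be sketched in plain Python. The helpers factor_range and hue_range are hypothetical names used only to show how a single float or a (min, max) tuple turns into an interval; they are not part of torchvision or mmedit.

```python
import random


def factor_range(value, center=1.0, lower_bound=0.0):
    """Interval for brightness, contrast or saturation factors."""
    if isinstance(value, (tuple, list)):
        low, high = value                    # explicit (min, max) is used as given
    else:
        low, high = max(lower_bound, center - value), center + value
    return float(low), float(high)


def hue_range(hue):
    """Interval for the hue factor: [-hue, hue] or the given (min, max)."""
    if isinstance(hue, (tuple, list)):
        return float(hue[0]), float(hue[1])
    return -float(hue), float(hue)


brightness_range = factor_range(0.5)         # (0.5, 1.5)
contrast_range = factor_range(1.5)           # (0.0, 2.5): lower end clamped at 0
brightness_factor = random.uniform(*brightness_range)
```

Note that a value larger than 1 does not produce a negative lower bound, because the interval is clamped at 0.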

_color_jitter(image, this_seed)[source]

Color Jitter Function.

Parameters
  • image (np.ndarray) – Image.

  • this_seed (int) – Seed of torch.

Returns

The output image.

Return type

image (np.ndarray)

transform(results: Dict) → Dict[source]

The transform function of ColorJitter.

Parameters

results (dict) – The result dict.

Returns

The result dict.

Return type

dict

__repr__()[source]

Return repr(self).

class mmedit.datasets.transforms.aug_pixel.RandomAffine(keys, degrees, translate=None, scale=None, shear=None, flip_ratio=None)[source]

Bases: mmcv.transforms.BaseTransform

Apply random affine to input images.

This class is adapted from https://github.com/pytorch/vision/blob/v0.5.0/torchvision/transforms/transforms.py#L1015

Note that random flip is added in https://github.com/Yaoyi-Li/GCA-Matting/blob/master/dataloader/data_generator.py#L70 (see the explanation of flip_ratio below).

Required keys are the keys in attribute “keys”, modified keys are keys in attribute “keys”.

Parameters
  • keys (Sequence[str]) – The images to which the affine transformation is applied.

  • degrees (float | tuple[float]) – Range of degrees to select from. If it is a float instead of a tuple like (min, max), the range of degrees will be (-degrees, +degrees). Set to 0 to deactivate rotations.

  • translate (tuple, optional) – Tuple of maximum absolute fraction for horizontal and vertical translations. For example translate=(a, b), then horizontal shift is randomly sampled in the range -img_width * a < dx < img_width * a and vertical shift is randomly sampled in the range -img_height * b < dy < img_height * b. Default: None.

  • scale (tuple, optional) – Scaling factor interval, e.g. (a, b); the scale is then randomly sampled from the range a <= scale <= b. Default: None.

  • shear (float | tuple[float], optional) – Range of shear degrees to select from. If shear is a float, a shear parallel to the x axis and a shear parallel to the y axis in the range (-shear, +shear) will be applied. Else if shear is a tuple of 2 values, a x-axis shear and a y-axis shear in (shear[0], shear[1]) will be applied. Default: None.

  • flip_ratio (float, optional) – Probability of the image being flipped. The flips in horizontal direction and vertical direction are independent. The image may be flipped in both directions. Default: None.
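The ranges described by these parameters can be sketched with the standard library alone. The function sample_affine_params below is a hypothetical simplification: it assumes img_size is (width, height), draws a single scale factor and a single shear angle, and represents the two independent flips as booleans. The library's own _get_params may differ in all of these details.

```python
import random


def sample_affine_params(degrees, translate=None, scale=None, shear=None,
                         flip_ratio=None, img_size=(256, 256)):
    """Draw one set of affine parameters (illustrative sketch only)."""
    width, height = img_size                 # assumed order: (width, height)

    # A single float means the symmetric range (-degrees, +degrees).
    if not isinstance(degrees, (tuple, list)):
        degrees = (-degrees, degrees)
    angle = random.uniform(degrees[0], degrees[1])

    # Translation is bounded by a fraction of the image size per axis.
    if translate is not None:
        max_dx, max_dy = translate[0] * width, translate[1] * height
        translation = (random.uniform(-max_dx, max_dx),
                       random.uniform(-max_dy, max_dy))
    else:
        translation = (0.0, 0.0)

    scale_factor = random.uniform(scale[0], scale[1]) if scale is not None else 1.0

    if shear is None:
        shear_angle = 0.0
    else:
        if not isinstance(shear, (tuple, list)):
            shear = (-shear, shear)
        shear_angle = random.uniform(shear[0], shear[1])

    # Horizontal and vertical flips are drawn independently.
    if flip_ratio is None:
        flip = (False, False)
    else:
        flip = (random.random() < flip_ratio, random.random() < flip_ratio)

    return dict(angle=angle, translation=translation, scale=scale_factor,
                shear=shear_angle, flip=flip)


random.seed(0)
params = sample_affine_params(30, translate=(0.1, 0.2), scale=(0.8, 1.2),
                              shear=10, flip_ratio=0.5, img_size=(100, 50))
```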

static _get_params(degrees, translate, scale_ranges, shears, flip_ratio, img_size)[source]

Get parameters for affine transformation.

Returns

Params to be passed to the affine transformation.

Return type

paras (tuple)

static _get_inverse_affine_matrix(center, angle, translate, scale, shear, flip)[source]

Helper method to compute inverse matrix for affine transformation.

As explained in PIL.Image.rotate, we need to compute the INVERSE of the affine transformation matrix: M = T * C * RSS * C^-1, where T is the translation matrix:

[1, 0, tx | 0, 1, ty | 0, 0, 1];

C is the translation matrix that keeps the center:

[1, 0, cx | 0, 1, cy | 0, 0, 1];

RSS is the rotation with scale and shear matrix.

It differs from the original function in torchvision in two ways:

  1. The order is changed to flip -> scale -> rotation -> shear.

  2. x and y have different scale factors.

RSS(shear, a, scale, f) =

[cos(a + shear)*scale_x*f, -sin(a + shear)*scale_y, 0 | sin(a)*scale_x*f, cos(a)*scale_y, 0 | 0, 0, 1]

Thus, the inverse is M^-1 = C * RSS^-1 * C^-1 * T^-1.
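The relation between M and its inverse can be checked numerically. The sketch below builds T, C and RSS from the formulas in this docstring with NumPy and compares both sides; the helper names, the example values, and the choice of f as +1 or -1 are assumptions for illustration.

```python
import math

import numpy as np


def translation_matrix(tx, ty):
    """3x3 homogeneous translation matrix."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])


def rss_matrix(angle, shear, scale_x, scale_y, flip):
    """RSS as written in the docstring; angles in radians, flip assumed +1 or -1."""
    return np.array([
        [math.cos(angle + shear) * scale_x * flip,
         -math.sin(angle + shear) * scale_y, 0.0],
        [math.sin(angle) * scale_x * flip,
         math.cos(angle) * scale_y, 0.0],
        [0.0, 0.0, 1.0],
    ])


T = translation_matrix(5.0, -3.0)            # example shift (tx, ty)
C = translation_matrix(64.0, 32.0)           # example center (cx, cy)
RSS = rss_matrix(math.radians(30), math.radians(10), 1.2, 0.9, -1)

forward = T @ C @ RSS @ np.linalg.inv(C)     # M = T * C * RSS * C^-1
inverse = (C @ np.linalg.inv(RSS)
           @ np.linalg.inv(C) @ np.linalg.inv(T))  # M^-1 = C * RSS^-1 * C^-1 * T^-1
```

Multiplying forward by inverse gives the identity matrix up to floating-point error, which is the property the formula relies on.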

transform(results)[source]

The transform function of RandomAffine.

Parameters

results (dict) – A dict containing the necessary information and data for augmentation.

Returns

A dict containing the processed data and information.

Return type

dict

__repr__()[source]

Return repr(self).

class mmedit.datasets.transforms.aug_pixel.RandomMaskDilation(keys, binary_thr=0.0, kernel_min=9, kernel_max=49)[source]

Bases: mmcv.transforms.BaseTransform

Randomly dilate binary masks.

Parameters
  • keys (Sequence[str]) – The masks to be dilated.

  • binary_thr (float) – Threshold for obtaining binary mask. Default: 0.

  • kernel_min (int) – Min size of dilation kernel. Default: 9.

  • kernel_max (int) – Max size of dilation kernel. Default: 49.
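A NumPy-only sketch of random dilation is given below. It assumes a square kernel of ones whose side is drawn uniformly from [kernel_min, kernel_max], and it binarizes the mask before dilating; the helper names, the use of np.random.default_rng, and the order of the two steps are assumptions, and a real pipeline would more likely rely on an image library for the dilation itself.

```python
import numpy as np


def dilate_square(mask, kernel_size):
    """Binary dilation with a kernel_size x kernel_size square of ones."""
    before = kernel_size // 2                # pixels covered before the anchor
    after = kernel_size - 1 - before         # pixels covered after the anchor
    padded = np.pad(mask, ((before, after), (before, after)), mode="constant")
    height, width = mask.shape
    out = np.zeros_like(mask)
    # Take the maximum over every shifted copy of the mask inside the kernel.
    for dy in range(kernel_size):
        for dx in range(kernel_size):
            out = np.maximum(out, padded[dy:dy + height, dx:dx + width])
    return out


def random_dilate(mask, binary_thr=0.0, kernel_min=9, kernel_max=49, rng=None):
    """Binarize ``mask`` and dilate it with a randomly sized square kernel."""
    rng = np.random.default_rng() if rng is None else rng
    kernel_size = int(rng.integers(kernel_min, kernel_max + 1))  # inclusive max
    binary = (mask > binary_thr).astype(np.float32)
    return dilate_square(binary, kernel_size), kernel_size


mask = np.zeros((7, 7), dtype=np.float32)
mask[3, 3] = 1.0                             # a single foreground pixel
dilated, kernel_size = random_dilate(mask, kernel_min=3, kernel_max=3)
```

With the kernel size fixed to 3, the single pixel grows into a 3x3 block centered on it.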

_random_dilate(img)[source]

transform(results)[source]

The transform function of RandomMaskDilation.

Parameters

results (dict) – A dict containing the necessary information and data for augmentation.

Returns

A dict containing the processed data and information.

Return type

dict

__repr__()[source]

Return repr(self).

class mmedit.datasets.transforms.aug_pixel.UnsharpMasking(kernel_size, sigma, weight, threshold, keys)[source]

Bases: mmcv.transforms.BaseTransform

Apply unsharp masking to an image or a sequence of images.

Parameters
  • kernel_size (int) – The kernel_size of the Gaussian kernel.

  • sigma (float) – The standard deviation of the Gaussian.

  • weight (float) – The weight of the “details” in the final output.

  • threshold (float) – Pixel differences larger than this value are regarded as “details”.

  • keys (list[str]) – The keys whose values are processed.

Added keys are “xxx_unsharp”, where “xxx” are the attributes specified in “keys”.
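One common formulation of unsharp masking is sketched below in NumPy for a single 2-D image. It assumes an odd kernel_size, pixel values in [0, 1], a threshold expressed on the 0 to 255 scale, and a softened detail mask for blending; these choices and all helper names are assumptions, not the confirmed behaviour of this class.

```python
import numpy as np


def gaussian_kernel_1d(kernel_size, sigma):
    """Normalized 1-D Gaussian kernel."""
    offsets = np.arange(kernel_size) - (kernel_size - 1) / 2.0
    kernel = np.exp(-(offsets ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(img, kernel_size, sigma):
    """Separable Gaussian blur of a 2-D array with reflect padding."""
    kernel = gaussian_kernel_1d(kernel_size, sigma)
    padded = np.pad(img, kernel_size // 2, mode="reflect")
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)


def unsharp_mask(img, kernel_size, sigma, weight, threshold):
    """Sharpen ``img`` (values assumed in [0, 1]) where details exceed ``threshold``."""
    residue = img - gaussian_blur(img, kernel_size, sigma)       # the "details"
    mask = (np.abs(residue) * 255 > threshold).astype(np.float32)
    soft_mask = gaussian_blur(mask, kernel_size, sigma)          # smooth the blend
    sharpened = np.clip(img + weight * residue, 0.0, 1.0)
    return soft_mask * sharpened + (1.0 - soft_mask) * img


img = np.full((8, 8), 0.25, dtype=np.float32)
img[:, 4:] = 0.75                            # vertical step edge
results = {"img": img}
results["img_unsharp"] = unsharp_mask(
    results["img"], kernel_size=5, sigma=1.0, weight=0.5, threshold=10)
```

The output is stored under "img_unsharp" to mirror the added-key convention described above; flat regions are left unchanged while the dark side of the edge gets darker and the bright side brighter.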

_unsharp_masking(imgs)[source]

Unsharp masking function.

transform(results)[source]

The transform function of UnsharpMasking.

Parameters

results (dict) – A dict containing the necessary information and data for augmentation.

Returns

A dict containing the processed data and information.

Return type

dict

__repr__()[source]

Return repr(self).
