
mmedit.models.data_preprocessors

Package Contents

Classes

EditDataPreprocessor

Basic data pre-processor used for collating and copying data to the target device in mmediting.

GenDataPreprocessor

Image pre-processor for generative models. This class provides normalization and BGR-to-RGB conversion for image tensor inputs.

MattorPreprocessor

DataPreprocessor for matting models.

Functions

split_batch(batch_tensor, padded_sizes)

Reverse operation of stack_batch.

stack_batch(tensor_list[, pad_size_divisor, pad_args])

Stack multiple tensors to form a batch and pad the images to the maximum shape among them, using right-bottom padding.

class mmedit.models.data_preprocessors.EditDataPreprocessor(mean: Sequence[Union[float, int]] = (0, 0, 0), std: Sequence[Union[float, int]] = (255, 255, 255), pad_size_divisor: int = 1, input_view=(-1, 1, 1), output_view=None, pad_args: dict = dict())[source]

Bases: mmengine.model.BaseDataPreprocessor

Basic data pre-processor used for collating and copying data to the target device in mmediting.

EditDataPreprocessor performs data pre-processing according to the following steps:

  • Collates the data sampled from dataloader.

  • Copies data to the target device.

  • Stacks the input tensor at the first dimension.

It also post-processes the output tensor of the model.

TODO: Most editing methods already crop inputs to the same size, so batched padding would be faster.

Parameters
  • mean (Sequence[float or int]) – The pixel mean of R, G, B channels. Defaults to (0, 0, 0). With the default mean and std, images are normalized to [0, 1].

  • std (Sequence[float or int]) – The pixel standard deviation of R, G, B channels. Defaults to (255, 255, 255). With the default mean and std, images are normalized to [0, 1].

  • pad_size_divisor (int) – The size of padded image should be divisible by pad_size_divisor. Defaults to 1.

  • input_view (Tuple | List) – Tensor view of mean and std for input (without batch). Defaults to (-1, 1, 1) for (C, H, W).

  • output_view (Tuple | List | None) – Tensor view of mean and std for output (without batch). If None, output_view is set to input_view. Defaults to None.

  • pad_args (dict) – Arguments passed to F.pad. Defaults to dict().
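The effect of mean and std can be illustrated with a minimal sketch. It assumes the usual per-channel rule (pixel - mean) / std and uses nested Python lists as a stand-in for a (C, H, W) tensor; the helper name `normalize` is hypothetical and not part of mmedit. In the real class, input_view=(-1, 1, 1) reshapes mean and std so that they broadcast over the height and width dimensions, which the explicit loop over channels imitates here.

```python
from typing import List, Sequence

# Nested lists standing in for a (C, H, W) tensor (illustration only).
Image = List[List[List[float]]]


def normalize(image: Image,
              mean: Sequence[float] = (0, 0, 0),
              std: Sequence[float] = (255, 255, 255)) -> Image:
    """Apply (pixel - mean[c]) / std[c] to every pixel of channel c."""
    return [
        [[(pixel - mean[c]) / std[c] for pixel in row] for row in channel]
        for c, channel in enumerate(image)
    ]


# A three-channel image of height 1 and width 2 with 8-bit pixel values.
image = [[[0, 255]], [[51, 102]], [[255, 0]]]

# With the default mean and std, 8-bit values are mapped into [0, 1].
normalized = normalize(image)
```

With the defaults, the smallest 8-bit value maps to 0 and the largest to 1, which matches the "[0, 1]" note in the parameter descriptions.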

forward(data: Sequence[dict], training: bool = False) → Tuple[torch.Tensor, Optional[list]]

Pre-process the data into the model input format.

After the data pre-processing of collate_data(), forward will stack the input tensor list to a batch tensor at the first dimension.

Parameters
  • data (Sequence[dict]) – Data sampled from the dataloader.

  • training (bool) – Whether to enable training time augmentation. Defaults to False.

Returns

Data in the same format as the model input.

Return type

Tuple[torch.Tensor, Optional[list]]

destructor(batch_tensor: torch.Tensor)

Destructor of the data pre-processor. Undoes padding and normalization, and dissolves the batch.

Parameters

batch_tensor (Tensor) – Batched output.

Returns

Destructed output.

Return type

Tensor
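As a rough illustration of the normalization part of this step, the sketch below inverts (pixel - mean) / std by computing pixel * std + mean per channel. This is an assumption about what "destruct normalization" means, not a copy of the mmedit implementation; the helper name `denormalize` is hypothetical, nested lists stand in for a (C, H, W) tensor, and removal of padding is left to split_batch.

```python
from typing import List, Sequence

# Nested lists standing in for a (C, H, W) tensor (illustration only).
Image = List[List[List[float]]]


def denormalize(image: Image,
                mean: Sequence[float] = (0, 0, 0),
                std: Sequence[float] = (255, 255, 255)) -> Image:
    """Invert (pixel - mean[c]) / std[c] by computing pixel * std[c] + mean[c]."""
    return [
        [[pixel * std[c] + mean[c] for pixel in row] for row in channel]
        for c, channel in enumerate(image)
    ]


# A normalized three-channel output of height 1 and width 2.
normalized = [[[0.0, 1.0]], [[0.5, 0.0]], [[1.0, 0.5]]]

# With mean 0 and std 255, values in [0, 1] return to the 8-bit range.
restored = denormalize(normalized)
```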

mmedit.models.data_preprocessors.split_batch(batch_tensor: torch.Tensor, padded_sizes: torch.Tensor)[source]

Reverse operation of stack_batch.

Parameters
  • batch_tensor (Tensor) – The 4D or 5D tensor, with batch_tensor.dim == tensor_list[0].dim + 1.

  • padded_sizes (Tensor) – The padded sizes of each tensor.

Returns

tensor_list – A list of tensors with the same dim.

Return type

List[Tensor]
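The idea of undoing the padding can be sketched in plain Python. The example assumes that padded_sizes holds, for each item, the number of rows and columns that were appended at the bottom and right; that reading is an assumption drawn from the description above. The function name `split_batch_sketch` is hypothetical, and 2D lists stand in for tensors.

```python
from typing import List, Sequence, Tuple

Grid = List[List[int]]  # 2D list standing in for an (H, W) tensor


def split_batch_sketch(batch: Sequence[Grid],
                       padded_sizes: Sequence[Tuple[int, int]]) -> List[Grid]:
    """Crop bottom rows and right columns that were added as padding."""
    tensors = []
    for grid, (pad_h, pad_w) in zip(batch, padded_sizes):
        height = len(grid) - pad_h
        width = len(grid[0]) - pad_w
        tensors.append([row[:width] for row in grid[:height]])
    return tensors


batch = [
    [[1, 2, 0], [3, 4, 0]],   # 2x2 content padded by one column
    [[5, 6, 7], [8, 9, 10]],  # 2x3 content, no padding
]
tensors = split_batch_sketch(batch, padded_sizes=[(0, 1), (0, 0)])
```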

mmedit.models.data_preprocessors.stack_batch(tensor_list: List[torch.Tensor], pad_size_divisor: int = 1, pad_args: dict = dict())[source]

Stack multiple tensors to form a batch and pad the images to the maximum shape among them, using right-bottom padding.

If pad_size_divisor > 0, add padding to ensure the shape of each dim is divisible by pad_size_divisor.

Parameters
  • tensor_list (List[Tensor]) – A list of tensors with the same dim.

  • pad_size_divisor (int) – If pad_size_divisor > 0, add padding to ensure the shape of each dim is divisible by pad_size_divisor. The required value depends on the model; many models need shapes divisible by 32. Defaults to 1.

  • pad_args (dict) – The padding arguments.

Returns
  • batch_tensor (Tensor) – The 4D or 5D tensor, with batch_tensor.dim == tensor_list[0].dim + 1.

  • padded_sizes (Tensor) – The padded sizes of each tensor.

Return type

batch_tensor (Tensor), padded_sizes (Tensor)
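The padding rule can be sketched without torch. The example below pads 2D lists at the right and bottom up to the largest height and width in the list, rounded up to a multiple of pad_size_divisor. The function name `stack_batch_sketch` and the `pad_value` argument are hypothetical (the real function forwards pad_args to F.pad), and recording the padding as (rows, columns) pairs is an assumption about what padded_sizes contains.

```python
import math
from typing import List, Sequence, Tuple

Grid = List[List[int]]  # 2D list standing in for an (H, W) tensor


def stack_batch_sketch(grids: Sequence[Grid],
                       pad_size_divisor: int = 1,
                       pad_value: int = 0
                       ) -> Tuple[List[Grid], List[Tuple[int, int]]]:
    """Pad every grid at the right and bottom to one common shape."""
    max_h = max(len(g) for g in grids)
    max_w = max(len(g[0]) for g in grids)
    if pad_size_divisor > 0:
        # Round the target shape up to a multiple of the divisor.
        max_h = math.ceil(max_h / pad_size_divisor) * pad_size_divisor
        max_w = math.ceil(max_w / pad_size_divisor) * pad_size_divisor

    batch, padded_sizes = [], []
    for g in grids:
        pad_h = max_h - len(g)
        pad_w = max_w - len(g[0])
        rows = [list(row) + [pad_value] * pad_w for row in g]
        rows += [[pad_value] * max_w for _ in range(pad_h)]
        batch.append(rows)
        padded_sizes.append((pad_h, pad_w))
    return batch, padded_sizes


grids = [[[1, 2], [3, 4]], [[5, 6, 7]]]
batch, padded_sizes = stack_batch_sketch(grids, pad_size_divisor=2)
```

Here the largest shape is 2x3, which the divisor of 2 rounds up to 2x4, so the first grid gains two columns and the second gains one row and one column.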

class mmedit.models.data_preprocessors.GenDataPreprocessor(mean: Sequence[Union[float, int]] = (127.5, 127.5, 127.5), std: Sequence[Union[float, int]] = (127.5, 127.5, 127.5), pad_size_divisor: int = 1, pad_value: Union[float, int] = 0, bgr_to_rgb: bool = False, rgb_to_bgr: bool = False, non_image_keys: Optional[Tuple[str, List[str]]] = None, non_concentate_keys: Optional[Tuple[str, List[str]]] = None)[source]

Bases: mmengine.model.ImgDataPreprocessor

Image pre-processor for generative models. This class provides normalization and BGR-to-RGB conversion for image tensor inputs. The input of this class should be a dict whose keys are inputs and data_samples.

Besides processing tensor inputs, this class supports dict inputs:

  • If the value is a Tensor and the corresponding key is not contained in _NON_IMAGE_KEYS, it is processed as an image tensor.

  • If the value is a Tensor and the corresponding key belongs to _NON_IMAGE_KEYS, it remains unchanged.

  • If the value is a string or integer, it remains unchanged.

Parameters
  • mean (Sequence[float or int], optional) – The pixel mean of image channels. If bgr_to_rgb=True, it is the mean value of R, G, B channels. If it is not specified, images will not be normalized. Defaults to (127.5, 127.5, 127.5).

  • std (Sequence[float or int], optional) – The pixel standard deviation of image channels. If bgr_to_rgb=True, it is the standard deviation of R, G, B channels. If it is not specified, images will not be normalized. Defaults to (127.5, 127.5, 127.5).

  • pad_size_divisor (int) – The size of padded image should be divisible by pad_size_divisor. Defaults to 1.

  • pad_value (float or int) – The padded pixel value. Defaults to 0.

  • bgr_to_rgb (bool) – Whether to convert the image from BGR to RGB. Defaults to False.

  • rgb_to_bgr (bool) – Whether to convert the image from RGB to BGR. Defaults to False.

_NON_IMAGE_KEYS = ['noise']
_NON_CONCENTATE_KEYS = ['num_batches', 'mode', 'sample_kwargs', 'eq_cfg']
cast_data(data: CastData) → CastData

Copy data to the target device.

Parameters

data (dict) – Data returned by DataLoader.

Returns

Inputs and data samples on the target device.

Return type

CollatedResult

_preprocess_image_tensor(inputs: torch.Tensor) → torch.Tensor

Process image tensor.

Parameters

inputs (Tensor) – List of image tensors to process.

Returns

Processed and stacked image tensor.

Return type

Tensor

process_dict_inputs(batch_inputs: dict) → dict

Preprocess dict type inputs.

Parameters

batch_inputs (dict) – Input dict.

Returns

Preprocessed dict.

Return type

dict
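The routing rules for dict inputs described in the class summary can be sketched as follows. This is a minimal illustration, not the mmedit implementation: Python lists stand in for tensors, the names `process_dict_inputs_sketch` and `to_minus_one_one` are hypothetical, and the input values are placeholders. The normalization uses the class defaults of 127.5 for both mean and std.

```python
from typing import Any, Callable, Dict, List

NON_IMAGE_KEYS = ['noise']  # mirrors _NON_IMAGE_KEYS from the class


def process_dict_inputs_sketch(
        batch_inputs: Dict[str, Any],
        preprocess_image: Callable[[List[float]], List[float]]
) -> Dict[str, Any]:
    """Process tensor-like values as images unless their key is excluded."""
    processed = {}
    for key, value in batch_inputs.items():
        # Lists play the role of tensors in this sketch.
        if isinstance(value, list) and key not in NON_IMAGE_KEYS:
            processed[key] = preprocess_image(value)
        else:
            # Non-image tensors, strings and integers pass through unchanged.
            processed[key] = value
    return processed


def to_minus_one_one(pixels: List[float]) -> List[float]:
    """Normalize with mean = std = 127.5, mapping [0, 255] to [-1, 1]."""
    return [(p - 127.5) / 127.5 for p in pixels]


inputs = {'img': [0, 127.5, 255], 'noise': [0.3, -0.1],
          'mode': 'placeholder', 'num_batches': 4}
outputs = process_dict_inputs_sketch(inputs, to_minus_one_one)
```

Only the value under 'img' is normalized; the 'noise' list, the string and the integer are returned as they were passed in.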

forward(data: dict, training: bool = False) → dict

Performs normalization, padding and BGR-to-RGB conversion based on BaseDataPreprocessor.

Parameters
  • data (dict) – Input data to process.

  • training (bool) – Whether to enable training time augmentation. This is ignored for GenDataPreprocessor. Defaults to False.

Returns

Data in the same format as the model input.

Return type

dict

destructor(batch_tensor: torch.Tensor)

Destructor of the data pre-processor. Undoes padding and normalization, and dissolves the batch.

Parameters

batch_tensor (Tensor) – Batched output.

Returns

Destructed output.

Return type

Tensor

class mmedit.models.data_preprocessors.MattorPreprocessor(mean: float = [123.675, 116.28, 103.53], std: float = [58.395, 57.12, 57.375], bgr_to_rgb: bool = True, proc_inputs: str = 'normalize', proc_trimap: str = 'rescale_to_zero_one', proc_gt: str = 'rescale_to_zero_one')[source]

Bases: mmengine.model.BaseDataPreprocessor

DataPreprocessor for matting models.

See base class BaseDataPreprocessor for detailed information.

The workflow is as follows:

  • Collate and move data to the target device.

  • Convert inputs from bgr to rgb if the shape of input is (3, H, W).

  • Normalize image with defined std and mean.

  • Stack inputs to batch_inputs.

Parameters
  • mean (Sequence[float or int]) – The pixel mean of R, G, B channels. Defaults to [123.675, 116.28, 103.53].

  • std (Sequence[float or int]) – The pixel standard deviation of R, G, B channels. Defaults to [58.395, 57.12, 57.375].

  • bgr_to_rgb (bool) – Whether to convert the image from BGR to RGB. Defaults to True.

  • proc_inputs (str) – Method used to process inputs. Defaults to 'normalize'. The only available option is normalize.

  • proc_trimap (str) – Method used to process trimap tensors. Defaults to 'rescale_to_zero_one'. Available options are rescale_to_zero_one and as-is.

  • proc_gt (str) – Method used to process gt tensors. Defaults to 'rescale_to_zero_one'. Available options are rescale_to_zero_one and ignore.
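A configuration using these parameters might look like the sketch below. The values are the defaults from the signature above; the `type` key follows the usual registry-style config convention and is an assumption here. The helper `rescale_to_zero_one` is hypothetical and only illustrates one plausible reading of that option, namely dividing 8-bit values by 255.

```python
from typing import List

# Config fragment built from the documented defaults.
data_preprocessor = dict(
    type='MattorPreprocessor',  # assumed registry-style key
    mean=[123.675, 116.28, 103.53],
    std=[58.395, 57.12, 57.375],
    bgr_to_rgb=True,
    proc_inputs='normalize',
    proc_trimap='rescale_to_zero_one',
    proc_gt='rescale_to_zero_one',
)


def rescale_to_zero_one(values: List[float]) -> List[float]:
    """Assumed meaning of the option: map 8-bit values into [0, 1]."""
    return [v / 255 for v in values]


# A trimap row with background (0), unknown (128) and foreground (255).
trimap_row = [0, 128, 255]
rescaled = rescale_to_zero_one(trimap_row)
```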

_proc_inputs(inputs: List[torch.Tensor])
_proc_trimap(trimaps: List[torch.Tensor])
_proc_gt(data_samples, key)
forward(data: Sequence[dict], training: bool = False) → Tuple[torch.Tensor, list]

Pre-process input images, trimaps, ground-truth as configured.

Parameters
  • data (Sequence[dict]) – Data sampled from the dataloader.

  • training (bool) – Whether to enable training time augmentation. Defaults to False.

Returns

Batched inputs and list of data samples.

Return type

Tuple[torch.Tensor, list]

collate_data(data: Sequence[dict]) → Tuple[list, list, list]

Collate and move data to the target device.

See base class BaseDataPreprocessor for detailed information.
