mmedit.models.data_preprocessors
Package Contents
Classes
EditDataPreprocessor | Basic data pre-processor used for collating and copying data to the target device in mmediting.
GenDataPreprocessor | Image pre-processor for generative models.
MattorPreprocessor | DataPreprocessor for matting models.
Functions
split_batch | Reverse operation of stack_batch.
stack_batch | Stack multiple tensors to form a batch and pad the images to the max shape.
- class mmedit.models.data_preprocessors.EditDataPreprocessor(mean: Sequence[Union[float, int]] = (0, 0, 0), std: Sequence[Union[float, int]] = (255, 255, 255), pad_size_divisor: int = 1, input_view=(-1, 1, 1), output_view=None, pad_args: dict = dict())
Bases: mmengine.model.BaseDataPreprocessor
Basic data pre-processor used for collating and copying data to the target device in mmediting.
EditDataPreprocessor performs data pre-processing according to the following steps:
Collates the data sampled from the dataloader.
Copies data to the target device.
Stacks the input tensors at the first dimension.
It also performs post-processing of the model's output tensor.
- TODO: Most editing methods crop their inputs to the same size, so batched padding would be faster.
- Parameters
mean (Sequence[float or int]) – The pixel mean of R, G, B channels. Defaults to (0, 0, 0). If mean and std are not specified, the default values normalize images to [0, 1].
std (Sequence[float or int]) – The pixel standard deviation of R, G, B channels. Defaults to (255, 255, 255). If mean and std are not specified, the default values normalize images to [0, 1].
pad_size_divisor (int) – The size of the padded image should be divisible by pad_size_divisor. Defaults to 1.
input_view (Tuple | List) – Tensor view of mean and std for input (without batch). Defaults to (-1, 1, 1) for (C, H, W).
output_view (Tuple | List | None) – Tensor view of mean and std for output (without batch). If None, output_view=input_view. Defaults to None.
pad_args (dict) – Args of F.pad. Defaults to dict().
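The effect of the default mean and std can be checked with plain arithmetic. The sketch below is not the library implementation: it uses nested Python lists in place of tensors, the helper name normalize_chw is hypothetical, and it assumes the usual (x - mean) / std formula applied per channel, which is what broadcasting a (-1, 1, 1) view of mean and std over a (C, H, W) image would give.

```python
def normalize_chw(image, mean=(0, 0, 0), std=(255, 255, 255)):
    """Per-channel (x - mean) / std on a nested list shaped (C, H, W).

    Hypothetical helper for illustration only; the real class works on
    torch tensors.
    """
    return [
        [[(px - m) / s for px in row] for row in channel]
        for channel, m, s in zip(image, mean, std)
    ]


# A 3-channel image of height 1 and width 2 with 8-bit pixel values.
image = [[[0, 255]], [[51, 102]], [[255, 0]]]
normalized = normalize_chw(image)
```

With the defaults, every 8-bit value ends up in [0, 1]; passing a different mean and std shifts and scales each channel independently.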
- forward(data: Sequence[dict], training: bool = False) → Tuple[torch.Tensor, Optional[list]]
Pre-process the data into the model input format.
After the data pre-processing of collate_data(), forward will stack the input tensor list to a batch tensor at the first dimension.
- Parameters
data (Sequence[dict]) – data sampled from dataloader.
training (bool) – Whether to enable training time augmentation. Default: False.
- Returns
Data in the same format as the model input.
- Return type
Tuple[torch.Tensor, Optional[list]]
- mmedit.models.data_preprocessors.split_batch(batch_tensor: torch.Tensor, padded_sizes: torch.Tensor)
Reverse operation of stack_batch.
- Parameters
batch_tensor (Tensor) – The 4D-tensor or 5D-tensor. Tensor.dim == tensor_list[0].dim + 1
padded_sizes (Tensor) – The padded sizes of each tensor.
- Returns
tensor_list, a list of tensors with the same dim.
- Return type
List[Tensor]
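To make the "reverse of stack_batch" idea concrete, here is a minimal sketch on 2D nested lists instead of tensors. It is not the library code: split_batch_sketch is a hypothetical name, and the sketch assumes that padded_sizes holds, for each item, the number of rows and columns that were appended at the bottom and right.

```python
def split_batch_sketch(batch, padded_sizes):
    """Crop each padded 2D item back to its original size.

    batch: list of equally sized 2D nested lists.
    padded_sizes: list of (pad_h, pad_w) pairs, assumed to be the amount
    of padding added at the bottom and right of each item.
    """
    items = []
    for padded, (pad_h, pad_w) in zip(batch, padded_sizes):
        height = len(padded) - pad_h
        width = len(padded[0]) - pad_w
        items.append([row[:width] for row in padded[:height]])
    return items


batch = [
    [[1, 2, 0], [3, 4, 0]],  # a 2x2 item padded by one column
    [[5, 6, 7], [0, 0, 0]],  # a 1x3 item padded by one row
]
restored = split_batch_sketch(batch, [(0, 1), (1, 0)])
```

Each restored item has its own size again, so the result is a list rather than a single stacked array.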
- mmedit.models.data_preprocessors.stack_batch(tensor_list: List[torch.Tensor], pad_size_divisor: int = 1, pad_args: dict = dict())
Stack multiple tensors to form a batch, padding each image on the right and bottom to the maximum shape among them.
If pad_size_divisor > 0, add padding to ensure the shape of each dim is divisible by pad_size_divisor.
- Parameters
tensor_list (List[Tensor]) – A list of tensors with the same dim.
pad_size_divisor (int) – If pad_size_divisor > 0, add padding to ensure the shape of each dim is divisible by pad_size_divisor. This depends on the model, and many models need input sizes divisible by 32. Defaults to 1.
pad_args (dict) – The padding args.
- Returns
batch_tensor, the 4D or 5D tensor (Tensor.dim == tensor_list[0].dim + 1), and padded_sizes, the padded sizes of each tensor.
- Return type
Tuple[Tensor, Tensor]
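The padding rule described for stack_batch can be sketched without any tensor library. The code below is an illustration under stated assumptions, not the mmedit implementation: stack_batch_sketch and pad_value are hypothetical names, items are 2D nested lists, and padded_sizes is assumed to record the (rows, columns) added to each item.

```python
import math


def stack_batch_sketch(items, pad_size_divisor=1, pad_value=0):
    """Pad 2D nested lists on the right and bottom, then collect them.

    Returns (batch, padded_sizes), where padded_sizes[i] is the assumed
    (pad_h, pad_w) that was added to item i.
    """
    max_h = max(len(item) for item in items)
    max_w = max(len(item[0]) for item in items)
    if pad_size_divisor > 0:
        # Round the target shape up to the next multiple of the divisor.
        max_h = math.ceil(max_h / pad_size_divisor) * pad_size_divisor
        max_w = math.ceil(max_w / pad_size_divisor) * pad_size_divisor

    batch, padded_sizes = [], []
    for item in items:
        pad_h = max_h - len(item)
        pad_w = max_w - len(item[0])
        rows = [row + [pad_value] * pad_w for row in item]   # pad right
        rows += [[pad_value] * max_w for _ in range(pad_h)]  # pad bottom
        batch.append(rows)
        padded_sizes.append((pad_h, pad_w))
    return batch, padded_sizes


batch, padded_sizes = stack_batch_sketch(
    [[[1, 2], [3, 4]], [[5, 6, 7]]], pad_size_divisor=2
)
```

Here the largest shape is 2x3, which the divisor of 2 rounds up to 2x4, so both items are padded to 2x4 before being collected.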
- class mmedit.models.data_preprocessors.GenDataPreprocessor(mean: Sequence[Union[float, int]] = (127.5, 127.5, 127.5), std: Sequence[Union[float, int]] = (127.5, 127.5, 127.5), pad_size_divisor: int = 1, pad_value: Union[float, int] = 0, bgr_to_rgb: bool = False, rgb_to_bgr: bool = False, non_image_keys: Optional[Tuple[str, List[str]]] = None, non_concentate_keys: Optional[Tuple[str, List[str]]] = None)
Bases: mmengine.model.ImgDataPreprocessor
Image pre-processor for generative models. This class provides normalization and BGR to RGB conversion for image tensor inputs. The input of this class should be a dict whose keys are inputs and data_samples.
Besides processing tensor inputs, this class supports dict inputs:
If the value is a Tensor and the corresponding key is not contained in _NON_IMAGE_KEYS, it is processed as an image tensor.
If the value is a Tensor and the corresponding key belongs to _NON_IMAGE_KEYS, it remains unchanged.
If the value is a string or an integer, it remains unchanged.
- Parameters
mean (Sequence[float or int], optional) – The pixel mean of image channels. If bgr_to_rgb=True, it means the mean value of R, G, B channels. If it is None, images will not be normalized. Defaults to (127.5, 127.5, 127.5).
std (Sequence[float or int], optional) – The pixel standard deviation of image channels. If bgr_to_rgb=True, it means the standard deviation of R, G, B channels. If it is None, images will not be normalized. Defaults to (127.5, 127.5, 127.5).
pad_size_divisor (int) – The size of the padded image should be divisible by pad_size_divisor. Defaults to 1.
pad_value (float or int) – The padded pixel value. Defaults to 0.
bgr_to_rgb (bool) – Whether to convert image from BGR to RGB. Defaults to False.
rgb_to_bgr (bool) – Whether to convert image from RGB to BGR. Defaults to False.
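A short worked example may help with the default mean and std of 127.5 and with the two channel-order flags. This is a sketch under assumptions rather than the class's code: it assumes the usual (x - mean) / std normalization and that BGR/RGB conversion amounts to reversing the channel axis; both helper names are hypothetical.

```python
def to_generative_range(pixel, mean=127.5, std=127.5):
    """Assumed normalization: (x - mean) / std for a single pixel value."""
    return (pixel - mean) / std


def swap_bgr_rgb(image):
    """Reverse the channel order of a nested list shaped (3, H, W).

    The same reversal converts BGR to RGB and RGB to BGR.
    """
    return image[::-1]


scaled = [to_generative_range(v) for v in (0, 127.5, 255)]
rgb = swap_bgr_rgb([["b"], ["g"], ["r"]])
```

Under these assumptions the defaults map 8-bit pixel values onto [-1, 1], the range many generative models expect.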
- _NON_IMAGE_KEYS = ['noise']
- _NON_CONCENTATE_KEYS = ['num_batches', 'mode', 'sample_kwargs', 'eq_cfg']
- cast_data(data: CastData) → CastData
Copy data to the target device.
- Parameters
data (dict) – Data returned by DataLoader.
- Returns
Inputs and data sample at target device.
- Return type
CollatedResult
- _preprocess_image_tensor(inputs: torch.Tensor) → torch.Tensor
Process image tensor.
- Parameters
inputs (Tensor) – List of image tensors to process.
- Returns
Processed and stacked image tensor.
- Return type
Tensor
- process_dict_inputs(batch_inputs: dict) → dict
Preprocess dict type inputs.
- Parameters
batch_inputs (dict) – Input dict.
- Returns
Preprocessed dict.
- Return type
dict
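The routing rules for dict inputs (image tensors are processed, keys in _NON_IMAGE_KEYS and non-tensor values pass through) can be sketched as follows. Everything here is illustrative: FakeTensor is a hypothetical stand-in for torch.Tensor so the example has no dependencies, process_dict_inputs_sketch is not the library method, and the sample keys and values are made up.

```python
NON_IMAGE_KEYS = ["noise"]  # mirrors the _NON_IMAGE_KEYS class attribute


class FakeTensor(list):
    """Hypothetical stand-in for torch.Tensor."""


def process_dict_inputs_sketch(batch_inputs, process_image):
    """Apply process_image only to tensor values whose key is an image key."""
    processed = {}
    for key, value in batch_inputs.items():
        if isinstance(value, FakeTensor) and key not in NON_IMAGE_KEYS:
            processed[key] = process_image(value)
        else:
            # Non-image tensors, strings and integers remain unchanged.
            processed[key] = value
    return processed


def scale(tensor):
    """Assumed image processing: normalize with mean = std = 127.5."""
    return FakeTensor((v - 127.5) / 127.5 for v in tensor)


out = process_dict_inputs_sketch(
    {
        "img": FakeTensor([0, 255]),
        "noise": FakeTensor([0.3]),
        "mode": "some_mode",
        "num_batches": 4,
    },
    scale,
)
```

Only the value under "img" is normalized; the noise tensor, the string and the integer come back exactly as they went in.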
- forward(data: dict, training: bool = False) → dict
Performs normalization, padding and BGR to RGB conversion based on BaseDataPreprocessor.
- Parameters
data (dict) – Input data to process.
training (bool) – Whether to enable training time augmentation. This is ignored for GenDataPreprocessor. Defaults to False.
- Returns
Data in the same format as the model input.
- Return type
dict
- class mmedit.models.data_preprocessors.MattorPreprocessor(mean: float = [123.675, 116.28, 103.53], std: float = [58.395, 57.12, 57.375], bgr_to_rgb: bool = True, proc_inputs: str = 'normalize', proc_trimap: str = 'rescale_to_zero_one', proc_gt: str = 'rescale_to_zero_one')
Bases: mmengine.model.BaseDataPreprocessor
DataPreprocessor for matting models.
See base class BaseDataPreprocessor for detailed information.
Workflow:
Collate and move data to the target device.
Convert inputs from bgr to rgb if the shape of input is (3, H, W).
Normalize image with defined std and mean.
Stack inputs to batch_inputs.
- Parameters
mean (Sequence[float or int]) – The pixel mean of R, G, B channels. Defaults to [123.675, 116.28, 103.53].
std (Sequence[float or int]) – The pixel standard deviation of R, G, B channels. Defaults to [58.395, 57.12, 57.375].
bgr_to_rgb (bool) – Whether to convert image from BGR to RGB. Defaults to True.
proc_inputs (str) – Method used to process inputs. Default: 'normalize'. The only available option is normalize.
proc_trimap (str) – Method used to process trimap tensors. Default: 'rescale_to_zero_one'. Available options are rescale_to_zero_one and as-is.
proc_gt (str) – Method used to process gt tensors. Default: 'rescale_to_zero_one'. Available options are rescale_to_zero_one and ignore.
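The workflow and options above can be illustrated on a single pixel. This sketch is not the MattorPreprocessor code: both function names are hypothetical, the normalization is assumed to be (x - mean) / std applied after the optional BGR to RGB swap, and rescale_to_zero_one is assumed to mean dividing 8-bit values by 255.

```python
MEAN = [123.675, 116.28, 103.53]  # R, G, B defaults from the signature
STD = [58.395, 57.12, 57.375]


def preprocess_pixel_bgr(pixel_bgr, bgr_to_rgb=True):
    """Normalize one 3-channel pixel that arrives in BGR order."""
    pixel = pixel_bgr[::-1] if bgr_to_rgb else pixel_bgr
    return [(v - m) / s for v, m, s in zip(pixel, MEAN, STD)]


def rescale_to_zero_one(value):
    """Assumed meaning: map an 8-bit value in [0, 255] to [0, 1]."""
    return value / 255.0


# A BGR pixel whose channels equal the per-channel means.
normalized = preprocess_pixel_bgr([103.53, 116.28, 123.675])
trimap = [rescale_to_zero_one(v) for v in (0, 255)]
```

A pixel equal to the channel means normalizes to zero in every channel, and the trimap extremes 0 and 255 become 0.0 and 1.0 under the assumed rescaling.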
- forward(data: Sequence[dict], training: bool = False) → Tuple[torch.Tensor, list]
Pre-process input images, trimaps, ground-truth as configured.
- Parameters
data (Sequence[dict]) – data sampled from dataloader.
training (bool) – Whether to enable training time augmentation. Default: False.
- Returns
Batched inputs and list of data samples.
- Return type
Tuple[torch.Tensor, list]