docarray.document.mixins.image module
- class docarray.document.mixins.image.ImageDataMixin
Bases: object
Provide helper functions for Document to support image data.
- set_image_tensor_channel_axis(original_channel_axis, new_channel_axis)
Move the channel axis of the image tensor in place.
- Parameters:
original_channel_axis (int) – the original axis of the channel
new_channel_axis (int) – the new axis of the channel
- Return type:
T
- Returns:
itself after processing
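To make the effect of moving the channel axis concrete, here is a minimal, dependency-free sketch that computes only the resulting shape. It assumes the move follows `numpy.moveaxis`-style semantics; the helper name `moved_shape` is hypothetical and not part of docarray.

```python
def moved_shape(shape, original_channel_axis, new_channel_axis):
    """Return the shape after moving one axis (numpy.moveaxis-style, assumed)."""
    ndim = len(shape)
    # Normalize negative axis ids such as -1 to their positive equivalents.
    src = original_channel_axis % ndim
    dst = new_channel_axis % ndim
    dims = list(shape)
    moved = dims.pop(src)    # take the channel dimension out
    dims.insert(dst, moved)  # and re-insert it at the new position
    return tuple(dims)


print(moved_shape((224, 224, 3), -1, 0))  # (3, 224, 224)
print(moved_shape((3, 224, 224), 0, -1))  # (224, 224, 3)
```

The first call corresponds to converting a channel-last (H, W, C) tensor to channel-first (C, H, W); the second call reverses it.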
- load_pil_image_to_datauri(image)
Convert a Pillow image into a data URI with the header data:image/png.
- Parameters:
image (PILImage) – a Pillow image
- Returns:
itself after processing
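For readers unfamiliar with data URIs, the sketch below shows what a `data:image/png` URI looks like when built from raw PNG bytes. It assumes base64 encoding of the payload, which is the usual form for binary data; the helper `png_bytes_to_datauri` is hypothetical and only illustrates the format, not the library's internals.

```python
import base64


def png_bytes_to_datauri(png_bytes: bytes) -> str:
    """Encode raw PNG bytes as a base64 data URI with the data:image/png header."""
    payload = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{payload}"


# The 8-byte PNG signature stands in for a full encoded image here.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
uri = png_bytes_to_datauri(PNG_SIGNATURE)
print(uri)  # data:image/png;base64,iVBORw0KGgo=
```

In practice the bytes would come from saving a Pillow image to an in-memory buffer in PNG format.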
- convert_blob_to_image_tensor(width=None, height=None, channel_axis=-1)
Convert an image blob to an ndarray tensor.
- Parameters:
width (Optional[int]) – the width of the image tensor
height (Optional[int]) – the height of the image tensor
channel_axis (int) – the axis id of the color channel; -1 indicates that the color channel is the last axis
- Return type:
T
- Returns:
itself after processing
- convert_image_tensor_to_uri(channel_axis=-1, image_format='png')
Assuming tensor is a _valid_ image, set uri accordingly.
- Parameters:
channel_axis (int) – the axis id of the color channel; -1 indicates that the color channel is the last axis
image_format (str) – either png or jpeg
- Return type:
T
- Returns:
itself after processing
- convert_image_tensor_to_blob(channel_axis=-1, image_format='png')
Assuming tensor is a _valid_ image, set blob accordingly.
- Parameters:
channel_axis (int) – the axis id of the color channel; -1 indicates that the color channel is the last axis
image_format (str) – either png or jpeg
- Return type:
T
- Returns:
itself after processing
- set_image_tensor_resample(ratio, channel_axis=-1)
Resample the image tensor to a different size in place.
- Parameters:
ratio (float) – the scale ratio of the resampled image tensor
channel_axis (int) – the axis id of the color channel; -1 indicates that the color channel is the last axis
- Return type:
T
- Returns:
itself after processing
- set_image_tensor_shape(shape, channel_axis=-1)
Resample the image tensor to a different size in place.
If your current image tensor has shape [H,W,C], then the new tensor will have shape [*shape, C].
- Parameters:
shape (Tuple[int, int]) – the new shape of the image tensor
channel_axis (int) – the axis id of the color channel; -1 indicates that the color channel is the last axis
- Return type:
T
- Returns:
itself after processing
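The shape rule above can be sketched as a small, dependency-free helper. For the channel-last case it reproduces the documented `[H,W,C]` to `[*shape, C]` mapping. The behavior for other values of `channel_axis` (the channel dimension stays at its original position) is an assumption, and `resized_shape` is a hypothetical name.

```python
def resized_shape(current_shape, shape, channel_axis=-1):
    """Return the shape of a 3-axis image tensor after resizing to `shape`.

    Assumes the channel dimension keeps its original position; only the
    two spatial dimensions are replaced by `shape`.
    """
    axis = channel_axis % len(current_shape)  # normalize negative axis ids
    channels = current_shape[axis]
    new_dims = list(shape)
    new_dims.insert(axis, channels)
    return tuple(new_dims)


print(resized_shape((480, 640, 3), (224, 224)))                  # (224, 224, 3)
print(resized_shape((3, 480, 640), (224, 224), channel_axis=0))  # (3, 224, 224)
```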
- save_image_tensor_to_file(file, channel_axis=-1, image_format='png')
Save tensor into a file.
- Parameters:
file (Union[str, BinaryIO]) – file or filename to which the data is saved
channel_axis (int) – the axis id of the color channel; -1 indicates that the color channel is the last axis
image_format (str) – either png or jpeg
- Return type:
T
- Returns:
itself after processing
- load_uri_to_image_tensor(width=None, height=None, channel_axis=-1, **kwargs)
Convert the image-like uri into tensor.
- Parameters:
width (Optional[int]) – the width of the image tensor
height (Optional[int]) – the height of the image tensor
channel_axis (int) – the axis id of the color channel; -1 indicates that the color channel is the last axis
kwargs – keyword arguments to pass to _uri_to_blob, such as timeout
- Return type:
T
- Returns:
itself after processing
- set_image_tensor_inv_normalization(channel_axis=-1, img_mean=(0.485, 0.456, 0.406), img_std=(0.229, 0.224, 0.225))
Invert the normalization of a float32 image tensor, converting it back into a uint8 image tensor in place.
- Parameters:
channel_axis (int) – the axis id of the color channel; -1 indicates that the color channel is the last axis
img_mean (Tuple[float]) – the mean of all images
img_std (Tuple[float]) – the standard deviation of all images
- Return type:
T
- Returns:
itself after processing
- set_image_tensor_normalization(channel_axis=-1, img_mean=(0.485, 0.456, 0.406), img_std=(0.229, 0.224, 0.225))
Normalize a uint8 image tensor into a float32 image tensor in place.
Applies normalization to the color channels of the images. By default, the normalization uses mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225], which are the standard ImageNet values computed on millions of images. If you want to train from scratch on your own dataset, you can calculate a new mean and std. Otherwise, using the ImageNet pretrained model with its own mean and std is recommended.
- Parameters:
channel_axis (int) – the axis id of the color channel; -1 indicates that the color channel is the last axis
img_mean (Tuple[float]) – the means of all images: [mean_r, mean_g, mean_b]
img_std (Tuple[float]) – the standard deviations of all images: [std_r, std_g, std_b]
- Return type:
T
- Returns:
itself after processing
Warning
Do NOT generalize this function to grayscale or black-and-white images; it does not make sense for non-RGB images. In typical MNIST examples, for instance, the mean and standard deviation are one-dimensional because the inputs are grayscale and have no RGB channels.
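The per-channel arithmetic behind normalization and its inverse can be sketched on a single RGB pixel without any dependencies. This assumes the common convention of first scaling uint8 values to [0, 1], then subtracting the mean and dividing by the standard deviation. The helper names are hypothetical, and the rounding and clipping in the inverse step are choices made for this sketch rather than confirmed library behavior.

```python
IMG_MEAN = (0.485, 0.456, 0.406)
IMG_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, mean=IMG_MEAN, std=IMG_STD):
    """Map one uint8 RGB pixel to normalized floats, channel by channel."""
    return tuple((v / 255.0 - m) / s for v, m, s in zip(rgb, mean, std))


def inv_normalize_pixel(values, mean=IMG_MEAN, std=IMG_STD):
    """Undo normalize_pixel, rounding and clipping back to the uint8 range."""
    return tuple(
        min(255, max(0, round((x * s + m) * 255.0)))
        for x, m, s in zip(values, mean, std)
    )


pixel = (124, 116, 104)  # close to the default per-channel means
normalized = normalize_pixel(pixel)
restored = inv_normalize_pixel(normalized)
print(normalized)  # three small floats near zero
print(restored)    # (124, 116, 104)
```

A pixel near the dataset mean maps to values near zero, which is the point of the transformation: after normalization each channel is roughly zero-centered with unit scale.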
- convert_image_tensor_to_sliding_windows(window_shape=(64, 64), strides=None, padding=False, channel_axis=-1, as_chunks=False)
Convert tensor into a sliding-window view with the given window shape, in place.
- Parameters:
window_shape (Tuple[int, int]) – the desired size of each window. If window_shape is a sequence like (h, w), each window has that height and width. If window_shape is an int, each window has height and width equal to that value.
strides (Optional[Tuple[int, int]]) – the strides between two neighboring sliding windows. strides is a sequence like (h, w), where h and w denote the strides along the vertical and horizontal axes. Defaults to window_shape when not given.
padding (bool) – If False, only patches that are fully contained in the input image are included. If True, all patches whose starting point is inside the input are included, and areas outside the input default to zero. The padding argument has no effect on the size of each patch; it determines how many patches are extracted. Default is False.
channel_axis (int) – the axis id of the color channel; -1 indicates that the color channel is the last axis
as_chunks (bool) – If set, each sliding window is stored as a chunk of the current Document
- Return type:
T
- Returns:
Document itself after processing
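To illustrate how `window_shape`, `strides`, and `padding` interact, the following dependency-free sketch counts the windows along each spatial axis. It is derived from the parameter descriptions above, not from the library source, and `window_grid` is a hypothetical helper.

```python
import math


def window_grid(image_hw, window_shape=(64, 64), strides=None, padding=False):
    """Return (rows, cols): how many windows are extracted along each axis."""
    strides = strides or window_shape  # strides default to the window shape
    counts = []
    for size, win, step in zip(image_hw, window_shape, strides):
        if padding:
            # Every start position 0, step, 2*step, ... inside the image counts.
            counts.append(math.ceil(size / step))
        else:
            # Only windows that lie fully inside the image count.
            counts.append(max(0, (size - win) // step + 1))
    return tuple(counts)


print(window_grid((224, 224)))                    # (3, 3)
print(window_grid((224, 224), padding=True))      # (4, 4)
print(window_grid((224, 224), strides=(32, 32)))  # (6, 6)
```

With the default 64x64 windows on a 224x224 image, a fourth window per axis would extend past the border, so it only appears when `padding=True`. Halving the stride makes neighboring windows overlap and doubles the count per axis.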