Title: | Models, Datasets and Transformations for Images |
---|---|
Description: | Provides access to datasets, models and preprocessing facilities for deep learning with images. Integrates seamlessly with the 'torch' package and its 'API' borrows heavily from the 'PyTorch' vision package. |
Authors: | Daniel Falbel [aut, cre], Christophe Regouby [ctb], RStudio [cph] |
Maintainer: | Daniel Falbel <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.6.0.9000 |
Built: | 2024-11-16 06:06:29 UTC |
Source: | https://github.com/mlverse/torchvision |
Loads an image using the jpeg or png packages, depending on the file extension.
base_loader(path)
path |
path to the image to load from |
CIFAR10 Dataset.
Downloads and prepares the CIFAR100 dataset.
cifar10_dataset(root, train = TRUE, transform = NULL, target_transform = NULL, download = FALSE)
cifar100_dataset(root, train = TRUE, transform = NULL, target_transform = NULL, download = FALSE)
root |
(string): Root directory of dataset where directory
|
train |
(bool, optional): If TRUE, creates dataset from training set, otherwise creates from test set. |
transform |
(callable, optional): A function/transform that takes in a PIL image and returns a transformed version. E.g, |
target_transform |
(callable, optional): A function/transform that takes in the target and transforms it. |
download |
(bool, optional): If true, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it is not downloaded again. |
Draws bounding boxes on top of one image tensor
draw_bounding_boxes( image, boxes, labels = NULL, colors = NULL, fill = FALSE, width = 1, font = c("serif", "plain"), font_size = 10 )
image |
: Tensor of shape (C x H x W) and dtype uint8. |
boxes |
: Tensor of size (N, 4) containing bounding boxes in (xmin, ymin, xmax, ymax) format. Note that
the boxes are absolute coordinates with respect to the image. In other words: |
labels |
: character vector containing the labels of bounding boxes. |
colors |
: character vector containing the colors of the boxes or single color for all boxes. The color can be represented as strings e.g. "red" or "#FF00FF". By default, viridis colors are generated for boxes. |
fill |
: If |
width |
: Width of text shift to the bounding box. |
font |
: NULL for the current font family, or a character vector of length 2 for Hershey vector fonts. |
font_size |
: The requested font size in points. |
torch_tensor of size (C, H, W) of dtype uint8: Image Tensor with bounding boxes plotted.
Other image display: draw_keypoints(), draw_segmentation_masks(), tensor_image_browse(), tensor_image_display(), vision_make_grid()
if (torch::torch_is_installed()) {
  ## Not run:
  image <- torch::torch_randint(170, 250, size = c(3, 360, 360))$to(torch::torch_uint8())
  x <- torch::torch_randint(low = 1, high = 160, size = c(12, 1))
  y <- torch::torch_randint(low = 1, high = 260, size = c(12, 1))
  boxes <- torch::torch_cat(c(x, y, x + 20, y + 10), dim = 2)
  bboxed <- draw_bounding_boxes(image, boxes, colors = "black", fill = TRUE)
  tensor_image_browse(bboxed)
  ## End(Not run)
}
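The boxes argument of draw_bounding_boxes() expects absolute corner coordinates in (xmin, ymin, xmax, ymax) order. As a minimal sketch in base R, boxes stored as a top-left corner plus width and height could be converted to that layout as follows. The helper name xywh_to_xyxy is hypothetical and not part of the package.

```r
# Hypothetical helper (not part of torchvision): convert rows of
# (x, y, width, height) into rows of (xmin, ymin, xmax, ymax),
# all in absolute pixel coordinates.
xywh_to_xyxy <- function(xywh) {
  stopifnot(is.matrix(xywh), ncol(xywh) == 4)
  cbind(
    xmin = xywh[, 1],
    ymin = xywh[, 2],
    xmax = xywh[, 1] + xywh[, 3],
    ymax = xywh[, 2] + xywh[, 4]
  )
}

xywh <- rbind(c(10, 20, 30, 40), c(100, 50, 20, 10))
boxes_xyxy <- xywh_to_xyxy(xywh)
```

The resulting N x 4 matrix could then be converted with torch::torch_tensor(boxes_xyxy) before it is passed as boxes.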
Draws Keypoints, an object describing a body part (like rightArm or leftShoulder), on given RGB tensor image.
draw_keypoints( image, keypoints, connectivity = NULL, colors = NULL, radius = 2, width = 3 )
image |
: Tensor of shape (3, H, W) and dtype uint8 |
keypoints |
: Tensor of shape (N, K, 2) the K keypoints location for each of the N detected poses instance, |
connectivity |
: Vector of pair of keypoints to be connected (currently unavailable) |
colors |
: character vector containing the colors of the boxes or single color for all boxes. The color can be represented as strings e.g. "red" or "#FF00FF". By default, viridis colors are generated for keypoints |
radius |
: radius of the plotted keypoint. |
width |
: width of line connecting keypoints. |
Image Tensor of dtype uint8 with keypoints drawn.
Other image display: draw_bounding_boxes(), draw_segmentation_masks(), tensor_image_browse(), tensor_image_display(), vision_make_grid()
if (torch::torch_is_installed()) {
  ## Not run:
  image <- torch::torch_randint(190, 255, size = c(3, 360, 360))$to(torch::torch_uint8())
  keypoints <- torch::torch_randint(low = 60, high = 300, size = c(4, 5, 2))
  keypoint_image <- draw_keypoints(image, keypoints)
  tensor_image_browse(keypoint_image)
  ## End(Not run)
}
Draw segmentation masks with their respective colors on top of a given RGB tensor image
draw_segmentation_masks(image, masks, alpha = 0.8, colors = NULL)
image |
: torch_tensor of shape (3, H, W) and dtype uint8. |
masks |
: torch_tensor of shape (num_masks, H, W) or (H, W) and dtype bool. |
alpha |
: number between 0 and 1 denoting the transparency of the masks. |
colors |
: character vector containing the colors of the boxes or single color for all boxes. The color can be represented as strings e.g. "red" or "#FF00FF". By default, viridis colors are generated for masks |
torch_tensor of shape (3, H, W) and dtype uint8 of the image with segmentation masks drawn on top.
Other image display: draw_bounding_boxes(), draw_keypoints(), tensor_image_browse(), tensor_image_display(), vision_make_grid()
if (torch::torch_is_installed()) {
  image <- torch::torch_randint(170, 250, size = c(3, 360, 360))$to(torch::torch_uint8())
  mask <- torch::torch_tril(torch::torch_ones(c(360, 360)))$to(torch::torch_bool())
  masked_image <- draw_segmentation_masks(image, mask, alpha = 0.2)
  tensor_image_browse(masked_image)
}
A generic data loader for images stored in folders. See Details for more information.
image_folder_dataset( root, transform = NULL, target_transform = NULL, loader = NULL, is_valid_file = NULL )
root |
Root directory path. |
transform |
A function/transform that takes in a PIL image and returns a transformed version. E.g, |
target_transform |
A function/transform that takes in the target and transforms it. |
loader |
A function to load an image given its path. |
is_valid_file |
A function that takes the path of an image file and checks whether it is a valid file (used to check for corrupt files) |
This function assumes that the images for each class are contained in subdirectories of root. The names of these subdirectories are stored in the classes attribute of the returned object.
An example folder structure might look as follows:
root/dog/xxx.png
root/dog/xxy.png
root/dog/xxz.png
root/cat/123.png
root/cat/nsdf3.png
root/cat/asd932_.png
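To illustrate the layout above, the following base R sketch builds a throwaway folder tree and derives class names and integer labels from the subdirectory names. It is only an illustration of the convention, not the package's implementation. The directory name image_folder_sketch and the sorted class order are assumptions made for this example.

```r
# Build a temporary folder tree that mirrors the example structure.
root <- file.path(tempdir(), "image_folder_sketch")
unlink(root, recursive = TRUE)
for (cls in c("dog", "cat")) {
  dir.create(file.path(root, cls), recursive = TRUE)
}
file.create(file.path(root, "dog", c("xxx.png", "xxy.png", "xxz.png")))
file.create(file.path(root, "cat", c("123.png", "nsdf3.png", "asd932_.png")))

# Class names come from the subdirectory names (sorted here by assumption).
classes <- sort(basename(list.dirs(root, recursive = FALSE)))

# Pair every file with the integer index of its class.
files <- list.files(root, recursive = TRUE)
labels <- match(dirname(files), classes)
```

With a real image tree, the same root path would be passed to image_folder_dataset(root).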
Prepares the Kuzushiji-MNIST dataset and optionally downloads it.
kmnist_dataset( root, train = TRUE, transform = NULL, target_transform = NULL, download = FALSE )
root |
(string): Root directory of dataset where
|
train |
(bool, optional): If TRUE, creates dataset from |
transform |
(callable, optional): A function/transform that takes in a PIL image and returns a transformed version. E.g, |
target_transform |
(callable, optional): A function/transform that takes in the target and transforms it. |
download |
(bool, optional): If true, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it is not downloaded again. |
Load an image located at path using the {magick} package.
magick_loader(path)
path |
path to the image to load from. |
Prepares the MNIST dataset and optionally downloads it.
mnist_dataset( root, train = TRUE, transform = NULL, target_transform = NULL, download = FALSE )
root |
(string): Root directory of dataset where
|
train |
(bool, optional): If True, creates dataset from
|
transform |
(callable, optional): A function/transform that takes in a PIL image and returns a transformed version. E.g,
|
target_transform |
(callable, optional): A function/transform that takes in the target and transforms it. |
download |
(bool, optional): If true, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it is not downloaded again. |
AlexNet model architecture from the One weird trick... paper.
model_alexnet(pretrained = FALSE, progress = TRUE, ...)
pretrained |
(bool): If TRUE, returns a model pre-trained on ImageNet. |
progress |
(bool): If TRUE, displays a progress bar of the download to stderr. |
... |
other parameters passed to the model initializer. currently only
|
Other models: model_inception_v3(), model_mobilenet_v2(), model_resnet, model_vgg
Architecture from Rethinking the Inception Architecture for Computer Vision. The required minimum input size of the model is 75x75.
model_inception_v3(pretrained = FALSE, progress = TRUE, ...)
pretrained |
(bool): If |
progress |
(bool): If |
... |
Used to pass keyword arguments to the Inception module:
|
Important: In contrast to the other models the inception_v3 expects tensors with a size of N x 3 x 299 x 299, so ensure your images are sized accordingly.
Other models: model_alexnet(), model_mobilenet_v2(), model_resnet, model_vgg
Constructs a MobileNetV2 architecture from MobileNetV2: Inverted Residuals and Linear Bottlenecks.
model_mobilenet_v2(pretrained = FALSE, progress = TRUE, ...)
pretrained |
(bool): If TRUE, returns a model pre-trained on ImageNet. |
progress |
(bool): If TRUE, displays a progress bar of the download to stderr. |
... |
Other parameters passed to the model implementation. |
Other models: model_alexnet(), model_inception_v3(), model_resnet, model_vgg
ResNet models implementation from Deep Residual Learning for Image Recognition and later related papers (see Functions)
model_resnet18(pretrained = FALSE, progress = TRUE, ...)
model_resnet34(pretrained = FALSE, progress = TRUE, ...)
model_resnet50(pretrained = FALSE, progress = TRUE, ...)
model_resnet101(pretrained = FALSE, progress = TRUE, ...)
model_resnet152(pretrained = FALSE, progress = TRUE, ...)
model_resnext50_32x4d(pretrained = FALSE, progress = TRUE, ...)
model_resnext101_32x8d(pretrained = FALSE, progress = TRUE, ...)
model_wide_resnet50_2(pretrained = FALSE, progress = TRUE, ...)
model_wide_resnet101_2(pretrained = FALSE, progress = TRUE, ...)
pretrained |
(bool): If TRUE, returns a model pre-trained on ImageNet. |
progress |
(bool): If TRUE, displays a progress bar of the download to stderr. |
... |
Other parameters passed to the resnet model. |
model_resnet18(): ResNet 18-layer model
model_resnet34(): ResNet 34-layer model
model_resnet50(): ResNet 50-layer model
model_resnet101(): ResNet 101-layer model
model_resnet152(): ResNet 152-layer model
model_resnext50_32x4d(): ResNeXt-50 32x4d model from "Aggregated Residual Transformation for Deep Neural Networks" with 32 groups, each having a width of 4.
model_resnext101_32x8d(): ResNeXt-101 32x8d model from "Aggregated Residual Transformation for Deep Neural Networks" with 32 groups, each having a width of 8.
model_wide_resnet50_2(): Wide ResNet-50-2 model from "Wide Residual Networks" with width per group of 128.
model_wide_resnet101_2(): Wide ResNet-101-2 model from "Wide Residual Networks" with width per group of 128.
Other models: model_alexnet(), model_inception_v3(), model_mobilenet_v2(), model_vgg
VGG models implementations based on Very Deep Convolutional Networks For Large-Scale Image Recognition
model_vgg11(pretrained = FALSE, progress = TRUE, ...)
model_vgg11_bn(pretrained = FALSE, progress = TRUE, ...)
model_vgg13(pretrained = FALSE, progress = TRUE, ...)
model_vgg13_bn(pretrained = FALSE, progress = TRUE, ...)
model_vgg16(pretrained = FALSE, progress = TRUE, ...)
model_vgg16_bn(pretrained = FALSE, progress = TRUE, ...)
model_vgg19(pretrained = FALSE, progress = TRUE, ...)
model_vgg19_bn(pretrained = FALSE, progress = TRUE, ...)
pretrained |
(bool): If TRUE, returns a model pre-trained on ImageNet |
progress |
(bool): If TRUE, displays a progress bar of the download to stderr |
... |
other parameters passed to the VGG model implementation. |
model_vgg11(): VGG 11-layer model (configuration "A")
model_vgg11_bn(): VGG 11-layer model (configuration "A") with batch normalization
model_vgg13(): VGG 13-layer model (configuration "B")
model_vgg13_bn(): VGG 13-layer model (configuration "B") with batch normalization
model_vgg16(): VGG 16-layer model (configuration "D")
model_vgg16_bn(): VGG 16-layer model (configuration "D") with batch normalization
model_vgg19(): VGG 19-layer model (configuration "E")
model_vgg19_bn(): VGG 19-layer model (configuration "E") with batch normalization
Other models: model_alexnet(), model_inception_v3(), model_mobilenet_v2(), model_resnet
Display an image tensor in the browser
tensor_image_browse(image, browser = getOption("browser"))
image |
|
browser |
argument passed to browseURL |
Other image display: draw_bounding_boxes(), draw_keypoints(), draw_segmentation_masks(), tensor_image_display(), vision_make_grid()
Display image tensor onto the X11 device
tensor_image_display(image, animate = TRUE)
image |
|
animate |
support animations in the X11 display |
Other image display: draw_bounding_boxes(), draw_keypoints(), draw_segmentation_masks(), tensor_image_browse(), vision_make_grid()
Prepares the Tiny ImageNet dataset and optionally downloads it.
tiny_imagenet_dataset(root, split = "train", download = FALSE, ...)
root |
directory path to download the dataset. |
split |
dataset split, |
download |
whether to download or not the dataset. |
... |
other arguments passed to |
Adjust the brightness of an image
transform_adjust_brightness(img, brightness_factor)
img |
A |
brightness_factor |
(float): How much to adjust the brightness. Can be any non negative number. 0 gives a black image, 1 gives the original image while 2 increases the brightness by a factor of 2. |
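The effect of brightness_factor can be sketched in base R on a small matrix of intensities. This is a simplified illustration that assumes float pixel values in [0, 1] and a plain scale-then-clamp rule. The function name adjust_brightness_sketch is hypothetical and is not the package's implementation.

```r
# Hypothetical sketch: scale intensities by the factor, then clamp to [0, 1].
adjust_brightness_sketch <- function(img, brightness_factor) {
  stopifnot(brightness_factor >= 0)
  pmin(pmax(img * brightness_factor, 0), 1)
}

img <- matrix(c(0.1, 0.4, 0.5, 0.8), nrow = 2)
black  <- adjust_brightness_sketch(img, 0)  # every pixel becomes 0
same   <- adjust_brightness_sketch(img, 1)  # image is unchanged
bright <- adjust_brightness_sketch(img, 2)  # doubled, saturating at 1
```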
Other transforms: transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Adjust the contrast of an image
transform_adjust_contrast(img, contrast_factor)
img |
A |
contrast_factor |
(float): How much to adjust the contrast. Can be any non negative number. 0 gives a solid gray image, 1 gives the original image while 2 increases the contrast by a factor of 2. |
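A rough base R sketch of what contrast_factor does is shown below. It assumes a single-channel image with values in [0, 1] and pulls each pixel toward or away from the mean intensity. The function name adjust_contrast_sketch is hypothetical, and the exact blending rule used by the package may differ.

```r
# Hypothetical sketch: move each pixel relative to the mean gray level,
# then clamp to [0, 1]. A factor of 0 collapses the image to its mean.
adjust_contrast_sketch <- function(gray, contrast_factor) {
  stopifnot(contrast_factor >= 0)
  m <- mean(gray)
  pmin(pmax(m + contrast_factor * (gray - m), 0), 1)
}

gray <- matrix(c(0.2, 0.4, 0.6, 0.8), nrow = 2)
flat   <- adjust_contrast_sketch(gray, 0)  # solid gray at the mean
same   <- adjust_contrast_sketch(gray, 1)  # unchanged
strong <- adjust_contrast_sketch(gray, 2)  # distances from the mean doubled
```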
Other transforms: transform_adjust_brightness(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Also known as Power Law Transform. Intensities in RGB mode are adjusted based on the following equation:
transform_adjust_gamma(img, gamma, gain = 1)
img |
A |
gamma |
(float): Non negative real number, same as |
gain |
(float): The constant multiplier. |
See Gamma Correction for more details.
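The equation itself is not reproduced in the text above. A minimal sketch, assuming the usual power-law form I_out = 255 * gain * (I_in / 255)^gamma as documented for PyTorch's adjust_gamma and 8-bit intensities, can be written in base R. The function name adjust_gamma_sketch is hypothetical.

```r
# Hypothetical sketch of a power-law (gamma) transform on 8-bit intensities.
# Assumes I_out = 255 * gain * (I_in / 255)^gamma, clamped to [0, 255].
adjust_gamma_sketch <- function(img, gamma, gain = 1) {
  stopifnot(gamma >= 0)
  out <- 255 * gain * (img / 255)^gamma
  pmin(pmax(out, 0), 255)
}

img <- c(0, 64, 128, 255)
darker  <- adjust_gamma_sketch(img, gamma = 2)    # gamma > 1 darkens mid-tones
lighter <- adjust_gamma_sketch(img, gamma = 0.5)  # gamma < 1 brightens them
```

Under this form the end points 0 and 255 are left unchanged when gain is 1, and only the intermediate intensities move.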
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
The image hue is adjusted by converting the image to HSV and cyclically shifting the intensities in the hue channel (H). The image is then converted back to original image mode.
transform_adjust_hue(img, hue_factor)
img |
A |
hue_factor |
(float): How much to shift the hue channel. Should be in
|
hue_factor is the amount of shift in H channel and must be in the interval [-0.5, 0.5].
See Hue for more details.
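The cyclic shift described above can be sketched in base R. The example assumes hue values normalized to [0, 1), which is an assumption of this sketch, and the function name shift_hue_sketch is hypothetical.

```r
# Hypothetical sketch: cyclically shift normalized hue values in [0, 1).
shift_hue_sketch <- function(h, hue_factor) {
  stopifnot(hue_factor >= -0.5, hue_factor <= 0.5)
  # The modulo wraps values that leave [0, 1) back into the interval.
  (h + hue_factor) %% 1
}

h <- c(0.0, 0.25, 0.9)
shifted <- shift_hue_sketch(h, 0.25)
```

The last value wraps around past 1, which is what makes the shift cyclic rather than a simple clamp.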
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Adjust the color saturation of an image
transform_adjust_saturation(img, saturation_factor)
img |
A |
saturation_factor |
(float): How much to adjust the saturation. 0 will give a black and white image, 1 will give the original image while 2 will enhance the saturation by a factor of 2. |
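As a rough base R illustration of saturation_factor, the sketch below blends a single RGB pixel with its gray level. It assumes values in [0, 1] and the common luma weights 0.299, 0.587 and 0.114, both of which are assumptions of this example. The function name adjust_saturation_sketch is hypothetical.

```r
# Hypothetical sketch: blend an RGB pixel with its gray level.
# A factor of 0 yields gray, 1 yields the original pixel.
adjust_saturation_sketch <- function(rgb, saturation_factor) {
  stopifnot(length(rgb) == 3, saturation_factor >= 0)
  gray <- sum(c(0.299, 0.587, 0.114) * rgb)
  pmin(pmax(gray + saturation_factor * (rgb - gray), 0), 1)
}

pixel <- c(0.8, 0.4, 0.2)
bw   <- adjust_saturation_sketch(pixel, 0)  # all three channels equal
same <- adjust_saturation_sketch(pixel, 1)  # unchanged
```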
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Apply affine transformation on an image keeping image center invariant
transform_affine( img, angle, translate, scale, shear, resample = 0, fillcolor = NULL )
img |
A |
angle |
(float or int): rotation angle value in degrees, counter-clockwise. |
translate |
(sequence of int) – horizontal and vertical translations (post-rotation translation) |
scale |
(float) – overall scale |
shear |
(float or sequence) – shear angle value in degrees between -180 to 180, clockwise direction. If a sequence is specified, the first value corresponds to a shear parallel to the x-axis, while the second value corresponds to a shear parallel to the y-axis. |
resample |
(int, optional): An optional resampling filter. See interpolation modes. |
fillcolor |
(tuple or int): Optional fill color (Tuple for RGB Image and int for grayscale) for the area outside the transform in the output image (Pillow>=5.0.0). This option is not supported for Tensor input. Fill value for the area outside the transform in the output image is always 0. |
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
The image can be a Magick Image or a torch Tensor, in which case it is expected to have [..., H, W] shape, where ... means an arbitrary number of leading dimensions.
transform_center_crop(img, size)
img |
A |
size |
(sequence or int): Desired output size of the crop. If size is an int instead of sequence like c(h, w), a square crop (size, size) is made. If provided a tuple or list of length 1, it will be interpreted as
|
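How a center crop selects its window can be sketched in base R on a plain H x W matrix. The offsets are rounded down here, which is an assumption of this sketch and may differ by one pixel from the package for odd differences. The function name center_crop_sketch is hypothetical.

```r
# Hypothetical sketch: crop the central size[1] x size[2] window of a matrix.
center_crop_sketch <- function(img, size) {
  if (length(size) == 1) size <- c(size, size)  # int means a square crop
  h <- nrow(img)
  w <- ncol(img)
  stopifnot(size[1] <= h, size[2] <= w)
  top  <- floor((h - size[1]) / 2)  # rows skipped above the crop
  left <- floor((w - size[2]) / 2)  # columns skipped left of the crop
  img[(top + 1):(top + size[1]), (left + 1):(left + size[2]), drop = FALSE]
}

img <- matrix(seq_len(6 * 8), nrow = 6, ncol = 8)
crop   <- center_crop_sketch(img, c(2, 4))
square <- center_crop_sketch(img, 4)
```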
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Randomly change the brightness, contrast, saturation and hue of an image
transform_color_jitter( img, brightness = 0, contrast = 0, saturation = 0, hue = 0 )
img |
A |
brightness |
(float or tuple of float (min, max)): How much to jitter brightness. |
contrast |
(float or tuple of float (min, max)): How much to jitter contrast. |
saturation |
(float or tuple of float (min, max)): How much to jitter saturation. |
hue |
(float or tuple of float (min, max)): How much to jitter hue.
|
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Convert a tensor image to the given dtype and scale the values accordingly
transform_convert_image_dtype(img, dtype = torch::torch_float())
img |
A |
dtype |
(torch.dtype): Desired data type of the output. |
When converting from a smaller to a larger integer dtype the maximum values are not mapped exactly. If converted back and forth, this mismatch has no effect.
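The idea of scaling values along with the dtype can be sketched in base R for the common case of 8-bit integers and floats in [0, 1]. Both helper names are hypothetical, and the rounding rule used for the float-to-integer direction is a simplification chosen for this sketch rather than the package's exact rule.

```r
# Hypothetical helpers: rescale between 8-bit integers and floats in [0, 1].
uint8_to_float <- function(x) x / 255
float_to_uint8 <- function(x) {
  # Clamp to [0, 1], rescale to [0, 255], and round to the nearest integer.
  as.integer(round(pmin(pmax(x, 0), 1) * 255))
}

pixels <- 0:255
as_float   <- uint8_to_float(pixels)
round_trip <- float_to_uint8(as_float)
```

In this sketch a round trip through the float representation returns the original integer values.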
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Crop the given image at specified location and output size
transform_crop(img, top, left, height, width)
img |
A |
top |
(int): Vertical component of the top left corner of the crop box. |
left |
(int): Horizontal component of the top left corner of the crop box. |
height |
(int): Height of the crop box. |
width |
(int): Width of the crop box. |
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Crop the given image into four corners and the central crop. This transform returns a tuple of images and there may be a mismatch in the number of inputs and targets your Dataset returns.
transform_five_crop(img, size)
img |
A |
size |
(sequence or int): Desired output size. If size is a sequence like c(h, w), output size will be matched to this. If size is an int, smaller edge of the image will be matched to this number. i.e, if height > width, then image will be rescaled to (size * height / width, size). |
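The five crop positions (four corners plus the center) can be sketched in base R as top-left offsets for a given image and crop size. The sketch assumes size is given as c(h, w), uses zero-based offsets, and rounds the center offset down. The function name five_crop_boxes is hypothetical.

```r
# Hypothetical sketch: zero-based (top, left) offsets of the five crops.
five_crop_boxes <- function(height, width, size) {
  stopifnot(length(size) == 2)
  ch <- size[1]
  cw <- size[2]
  stopifnot(ch <= height, cw <= width)
  rbind(
    top_left     = c(top = 0, left = 0),
    top_right    = c(0, width - cw),
    bottom_left  = c(height - ch, 0),
    bottom_right = c(height - ch, width - cw),
    center       = c(floor((height - ch) / 2), floor((width - cw) / 2))
  )
}

boxes <- five_crop_boxes(height = 10, width = 12, size = c(4, 6))
```

Each row, combined with the crop height and width, describes one of the five windows that would be cut from the image.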
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Convert image to grayscale
transform_grayscale(img, num_output_channels)
img |
A |
num_output_channels |
(int): (1 or 3) number of channels desired for output image |
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Horizontally flip a Magick Image or Tensor
transform_hflip(img)
img |
A |
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Given transformation_matrix and mean_vector, flattens the torch_tensor, subtracts mean_vector from it, computes the dot product with the transformation matrix, and then reshapes the tensor to its original shape.
transform_linear_transformation(img, transformation_matrix, mean_vector)
img |
A |
transformation_matrix |
(Tensor): tensor |
mean_vector |
(Tensor): tensor D, D = C x H x W. |
Whitening transformation: Suppose X is a column vector of zero-centered data. Then compute the data covariance matrix [D x D] with torch.mm(X.t(), X), perform SVD on this matrix and pass it as transformation_matrix.
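The flatten, subtract, multiply and reshape steps can be sketched in base R on a plain array. `linear_transform()` is a hypothetical helper for illustration, not the package's implementation. R flattens arrays in column-major order, so the sketch assumes the matrix and mean vector follow that same ordering.

```r
# Hypothetical helper (base R only); not part of torchvision.
linear_transform <- function(img, transformation_matrix, mean_vector) {
  d <- length(img)  # D = C x H x W
  stopifnot(all(dim(transformation_matrix) == c(d, d)), length(mean_vector) == d)
  flat <- as.vector(img) - mean_vector              # flatten, then subtract the mean
  out <- as.vector(flat %*% transformation_matrix)  # row vector times the D x D matrix
  array(out, dim = dim(img))                        # reshape to the original shape
}

img <- array(seq_len(2 * 3 * 3), dim = c(2, 3, 3))  # C x H x W, so D = 18
d <- length(img)

identity_out <- linear_transform(img, diag(d), rep(0, d))          # leaves img unchanged
centered_out <- linear_transform(img, diag(d), rep(mean(img), d))  # only removes the mean
scaled_out   <- linear_transform(img, 2 * diag(d), rep(1, d))      # 2 * (img - 1)
```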
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Given mean: (mean[1],...,mean[n]) and std: (std[1],...,std[n]) for n channels, this transform will normalize each channel of the input torch_tensor, i.e., output[channel] = (input[channel] - mean[channel]) / std[channel]
transform_normalize(img, mean, std, inplace = FALSE)
img |
A |
mean |
(sequence): Sequence of means for each channel. |
std |
(sequence): Sequence of standard deviations for each channel. |
inplace |
(bool,optional): Bool to make this operation in-place. |
This transform acts out of place, i.e., it does not mutate the input tensor.
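The per-channel formula can be checked by hand with a base-R sketch on a C x H x W array. `normalize_channels()` is a hypothetical helper that only mirrors the arithmetic; it is not the package function and does not operate on torch tensors.

```r
# Hypothetical helper (base R only); not part of torchvision.
normalize_channels <- function(img, mean, std) {
  stopifnot(length(dim(img)) == 3,
            dim(img)[1] == length(mean),
            length(mean) == length(std))
  out <- img  # R copies on modification, so the input array is left untouched
  for (ch in seq_along(mean)) {
    # output[channel] = (input[channel] - mean[channel]) / std[channel]
    out[ch, , ] <- (img[ch, , ] - mean[ch]) / std[ch]
  }
  out
}

img <- array(0, dim = c(3, 2, 2))
img[1, , ] <- 0.5
img[2, , ] <- 1
# Channel 3 stays at 0.

out <- normalize_channels(img, mean = c(0.5, 0.5, 0.5), std = c(0.5, 0.5, 0.5))
# Channel 1 becomes 0, channel 2 becomes 1, channel 3 becomes -1.
```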
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
The image can be a Magick Image or a torch Tensor, in which case it is expected to have [..., H, W] shape, where ... means an arbitrary number of leading dimensions.
transform_pad(img, padding, fill = 0, padding_mode = "constant")
img |
A |
padding |
(int or tuple or list): Padding on each border. If a single int is provided this is used to pad all borders. If tuple of length 2 is provided this is the padding on left/right and top/bottom respectively. If a tuple of length 4 is provided this is the padding for the left, right, top and bottom borders respectively. |
fill |
(int or str or tuple): Pixel fill value for constant fill. Default is 0. If a tuple of length 3, it is used to fill R, G, B channels respectively. This value is only used when the padding_mode is constant. Only int value is supported for Tensors. |
padding_mode |
Type of padding. Should be: constant, edge, reflect or symmetric. Default is constant. Mode symmetric is not yet supported for Tensor inputs.
|
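How the three accepted lengths of `padding` expand into per-border amounts can be sketched in base R for the constant mode. `expand_padding()` and `pad_constant()` are hypothetical helpers, the image is a plain H x W matrix, and the border order for length 4 follows the argument description above (left, right, top, bottom).

```r
# Hypothetical helpers (base R only); not part of torchvision.
expand_padding <- function(padding) {
  switch(as.character(length(padding)),
    "1" = c(left = padding, right = padding, top = padding, bottom = padding),
    "2" = c(left = padding[1], right = padding[1],
            top = padding[2], bottom = padding[2]),
    "4" = setNames(padding, c("left", "right", "top", "bottom")),
    stop("padding must have length 1, 2 or 4")
  )
}

# Constant-mode padding of a matrix: build the larger canvas, then paste the image in.
pad_constant <- function(img, padding, fill = 0) {
  p <- expand_padding(padding)
  out <- matrix(fill,
                nrow = nrow(img) + p[["top"]] + p[["bottom"]],
                ncol = ncol(img) + p[["left"]] + p[["right"]])
  out[p[["top"]] + seq_len(nrow(img)), p[["left"]] + seq_len(ncol(img))] <- img
  out
}

img <- matrix(1, nrow = 2, ncol = 3)
padded <- pad_constant(img, padding = c(1, 2), fill = 0)  # 1 left/right, 2 top/bottom
```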
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Perspective transformation of an image
transform_perspective( img, startpoints, endpoints, interpolation = 2, fill = NULL )
img |
A |
startpoints |
(list of list of ints): List containing four lists of two
integers corresponding to four corners
|
endpoints |
(list of list of ints): List containing four lists of two
integers corresponding to four corners
|
interpolation |
(int, optional) Desired interpolation. An integer
|
fill |
(int or str or tuple): Pixel fill value for constant fill. Default is 0. If a tuple of length 3, it is used to fill R, G, B channels respectively. This value is only used when the padding_mode is constant. Only int value is supported for Tensors. |
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Random affine transformation of the image keeping center invariant
transform_random_affine( img, degrees, translate = NULL, scale = NULL, shear = NULL, resample = 0, fillcolor = 0 )
img |
A |
degrees |
(sequence or float or int): Range of degrees to select from. If degrees is a number instead of sequence like c(min, max), the range of degrees will be (-degrees, +degrees). |
translate |
(tuple, optional): tuple of maximum absolute fraction for
horizontal and vertical translations. For example |
scale |
(tuple, optional): scaling factor interval, e.g c(a, b), then scale is randomly sampled from the range a <= scale <= b. Will keep original scale by default. |
shear |
(sequence or float or int, optional): Range of degrees to select
from. If shear is a number, a shear parallel to the x axis in the range
(-shear, +shear) will be applied. Else if shear is a tuple or list of 2
values a shear parallel to the x axis in the range |
resample |
(int, optional): An optional resampling filter. See interpolation modes. |
fillcolor |
(tuple or int): Optional fill color (Tuple for RGB Image and int for grayscale) for the area outside the transform in the output image (Pillow>=5.0.0). This option is not supported for Tensor input. Fill value for the area outside the transform in the output image is always 0. |
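The way `degrees` and `scale` define sampling ranges can be sketched in base R. `resolve_range()` and `sample_affine_params()` are hypothetical helpers. Uniform sampling within each range is an assumption, and translation and shear are left out.

```r
# Hypothetical helpers (base R only); not part of torchvision.
# A single number d stands for the range (-d, +d); a pair c(min, max) is used as is.
resolve_range <- function(degrees) {
  if (length(degrees) == 1) {
    stopifnot(degrees >= 0)
    return(c(-degrees, degrees))
  }
  stopifnot(length(degrees) == 2, degrees[1] <= degrees[2])
  degrees
}

sample_affine_params <- function(degrees, scale = NULL) {
  angle_range <- resolve_range(degrees)
  list(
    # Assumption: the angle is drawn uniformly from the resolved range.
    angle = runif(1, angle_range[1], angle_range[2]),
    # scale = NULL keeps the original scale, i.e. a factor of 1.
    scale = if (is.null(scale)) 1 else runif(1, scale[1], scale[2])
  )
}

set.seed(11)
params <- sample_affine_params(degrees = 30, scale = c(0.8, 1.2))
```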
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Apply a list of transformations randomly with a given probability
transform_random_apply(img, transforms, p = 0.5)
img |
A |
transforms |
(list or tuple): list of transformations. |
p |
(float): probability. |
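The control flow can be sketched in base R with ordinary functions standing in for transforms. `random_apply()` is a hypothetical helper. The sketch assumes a single random draw decides whether the whole list is applied, rather than one draw per transform.

```r
# Hypothetical helper (base R only); not part of torchvision.
random_apply <- function(x, transforms, p = 0.5) {
  if (runif(1) >= p) return(x)     # skip the whole list with probability 1 - p
  for (f in transforms) x <- f(x)  # otherwise apply every transform, in order
  x
}

transforms <- list(function(v) v + 1, function(v) v * 2)

always <- random_apply(1, transforms, p = 1)  # (1 + 1) * 2
never  <- random_apply(1, transforms, p = 0)  # input returned unchanged
```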
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Apply single transformation randomly picked from a list
transform_random_choice(img, transforms)
img |
A |
transforms |
(list or tuple): list of transformations. |
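Picking exactly one transform per call can be sketched in base R. `random_choice()` is a hypothetical helper, ordinary functions stand in for transforms, and each candidate is assumed to be equally likely.

```r
# Hypothetical helper (base R only); not part of torchvision.
random_choice <- function(x, transforms) {
  # Assumption: every transform in the list has the same chance of being picked.
  f <- transforms[[sample.int(length(transforms), 1)]]
  f(x)
}

transforms <- list(add_one = function(v) v + 1, double = function(v) v * 2)

set.seed(1)
# Each call yields either 11 (add_one) or 20 (double), never both combined.
results <- vapply(1:20, function(i) random_choice(10, transforms), numeric(1))
```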
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
The image can be a Magick Image or a Tensor, in which case it is expected to have [..., H, W] shape, where ... means an arbitrary number of leading dimensions.
transform_random_crop( img, size, padding = NULL, pad_if_needed = FALSE, fill = 0, padding_mode = "constant" )
img |
A |
size |
(sequence or int): Desired output size. If size is a sequence like c(h, w), the output size will be matched to this. If size is an int, the smaller edge of the image will be matched to this number, i.e., if height > width, then the image will be rescaled to (size * height / width, size). |
padding |
(int or tuple or list): Padding on each border. If a single int is provided this is used to pad all borders. If tuple of length 2 is provided this is the padding on left/right and top/bottom respectively. If a tuple of length 4 is provided this is the padding for the left, right, top and bottom borders respectively. |
pad_if_needed |
(boolean): It will pad the image if smaller than the desired size to avoid raising an exception. Since cropping is done after padding, the padding seems to be done at a random offset. |
fill |
(int or str or tuple): Pixel fill value for constant fill. Default is 0. If a tuple of length 3, it is used to fill R, G, B channels respectively. This value is only used when the padding_mode is constant. Only int value is supported for Tensors. |
padding_mode |
Type of padding. Should be: constant, edge, reflect or symmetric. Default is constant. Mode symmetric is not yet supported for Tensor inputs.
|
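Choosing a random crop window can be sketched in base R on a plain H x W matrix. `random_crop()` here is a hypothetical helper, not the package function: it assumes the top-left corner is drawn uniformly over all positions where the window fits, and it leaves padding out.

```r
# Hypothetical helper (base R only); not part of torchvision.
random_crop <- function(img, size) {
  if (length(size) == 1) size <- c(size, size)  # a single number means a square crop
  stopifnot(size[1] <= nrow(img), size[2] <= ncol(img))
  # Assumption: the top-left corner is uniform over every position that fits.
  top  <- sample.int(nrow(img) - size[1] + 1, 1)
  left <- sample.int(ncol(img) - size[2] + 1, 1)
  img[top:(top + size[1] - 1), left:(left + size[2] - 1), drop = FALSE]
}

set.seed(42)
img <- matrix(seq_len(30), nrow = 5, ncol = 6)
patch <- random_crop(img, size = c(3, 4))  # a contiguous 3 x 4 window of img
```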
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
'Random Erasing Data Augmentation' by Zhong et al. See https://arxiv.org/pdf/1708.04896
transform_random_erasing( img, p = 0.5, scale = c(0.02, 0.33), ratio = c(0.3, 3.3), value = 0, inplace = FALSE )
img |
A |
p |
probability that the random erasing operation will be performed. |
scale |
range of proportion of erased area against input image. |
ratio |
range of aspect ratio of erased area. |
value |
erasing value. Default is 0. If a single int, it is used to erase all pixels. If a tuple of length 3, it is used to erase R, G, B channels respectively. If the string 'random', each pixel is erased with a random value. |
inplace |
boolean to make this transform inplace. Default set to FALSE. |
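How `p`, `scale` and `ratio` interact can be sketched in base R on a plain H x W matrix. `sample_erase_box()` and `random_erase()` are hypothetical helpers. The log-uniform draw of the aspect ratio and the limit of ten attempts are assumptions, not documented behaviour of the package.

```r
# Hypothetical helpers (base R only); not part of torchvision.
sample_erase_box <- function(height, width, scale = c(0.02, 0.33),
                             ratio = c(0.3, 3.3), max_tries = 10) {
  area <- height * width
  for (i in seq_len(max_tries)) {
    erase_area <- runif(1, scale[1], scale[2]) * area
    # Assumption: the aspect ratio is drawn log-uniformly from the given range.
    aspect <- exp(runif(1, log(ratio[1]), log(ratio[2])))
    h <- round(sqrt(erase_area * aspect))
    w <- round(sqrt(erase_area / aspect))
    if (h >= 1 && w >= 1 && h < height && w < width) {
      return(list(top = sample.int(height - h + 1, 1),
                  left = sample.int(width - w + 1, 1),
                  height = h, width = w))
    }
  }
  NULL  # no box fitted within max_tries
}

random_erase <- function(img, p = 0.5, value = 0, ...) {
  if (runif(1) >= p) return(img)  # erase only with probability p
  box <- sample_erase_box(nrow(img), ncol(img), ...)
  if (is.null(box)) return(img)   # leave the image unchanged if no box was found
  rows <- box$top:(box$top + box$height - 1)
  cols <- box$left:(box$left + box$width - 1)
  img[rows, cols] <- value
  img
}

set.seed(7)
img <- matrix(1, nrow = 32, ncol = 32)
erased <- random_erase(img, p = 1, value = 0)
```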
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Convert image to grayscale with a probability of p.
transform_random_grayscale(img, p = 0.1)
img |
A |
p |
(float): probability that image should be converted to grayscale (default 0.1). |
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Horizontally flip an image randomly with a given probability. The image can be a Magick Image or a torch Tensor, in which case it is expected to have [..., H, W] shape, where ... means an arbitrary number of leading dimensions.
transform_random_horizontal_flip(img, p = 0.5)
img |
A |
p |
(float): probability of the image being flipped. Default value is 0.5 |
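The flip itself reverses the width axis. A base-R sketch on a plain H x W matrix shows this. `random_hflip()` is a hypothetical helper, not the package function.

```r
# Hypothetical helper (base R only); not part of torchvision.
random_hflip <- function(img, p = 0.5) {
  if (runif(1) >= p) return(img)                # keep the image with probability 1 - p
  img[, rev(seq_len(ncol(img))), drop = FALSE]  # reverse the column (width) axis
}

img <- matrix(1:6, nrow = 2)         # rows: (1, 3, 5) and (2, 4, 6)
flipped <- random_hflip(img, p = 1)  # rows: (5, 3, 1) and (6, 4, 2)
kept    <- random_hflip(img, p = 0)  # unchanged
```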
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Apply a list of transformations in a random order
transform_random_order(img, transforms)
img |
A |
transforms |
(list or tuple): list of transformations. |
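Applying every transform once, in a shuffled order, can be sketched in base R. `random_order()` is a hypothetical helper and ordinary functions stand in for transforms. The example uses two steps that do not commute, so the order is visible in the result.

```r
# Hypothetical helper (base R only); not part of torchvision.
random_order <- function(x, transforms) {
  # sample.int(n) returns a random permutation of 1..n
  for (i in sample.int(length(transforms))) x <- transforms[[i]](x)
  x
}

transforms <- list(function(v) v + 1, function(v) v * 2)

set.seed(1)
# (3 + 1) * 2 = 8 when the addition runs first, 3 * 2 + 1 = 7 otherwise.
results <- vapply(1:20, function(i) random_order(3, transforms), numeric(1))
```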
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Performs a random perspective transformation of the given image with a given probability
transform_random_perspective( img, distortion_scale = 0.5, p = 0.5, interpolation = 2, fill = 0 )
img |
A |
distortion_scale |
(float): argument to control the degree of distortion and ranges from 0 to 1. Default is 0.5. |
p |
(float): probability of the image being transformed. Default is 0.5. |
interpolation |
(int, optional) Desired interpolation. An integer
|
fill |
(int or str or tuple): Pixel fill value for constant fill. Default is 0. If a tuple of length 3, it is used to fill R, G, B channels respectively. This value is only used when the padding_mode is constant. Only int value is supported for Tensors. |
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Crop the given image to a random size and aspect ratio. The image can be a Magick Image or a Tensor, in which case it is expected to have [..., H, W] shape, where ... means an arbitrary number of leading dimensions.
transform_random_resized_crop( img, size, scale = c(0.08, 1), ratio = c(3/4, 4/3), interpolation = 2 )
img |
A |
size |
(sequence or int): Desired output size. If size is a sequence like c(h, w), the output size will be matched to this. If size is an int, the smaller edge of the image will be matched to this number, i.e., if height > width, then the image will be rescaled to (size * height / width, size). |
scale |
(tuple of float): range of the size of the crop, relative to the original size. |
ratio |
(tuple of float): range of the aspect ratio of the crop, relative to the original aspect ratio. |
interpolation |
(int, optional) Desired interpolation. An integer
|
A crop of random size (default: 0.08 to 1.0 of the original size) and random aspect ratio (default: 3/4 to 4/3 of the original aspect ratio) is made. This crop is finally resized to the given size. This is popularly used to train the Inception networks.
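Sampling the crop box from `scale` and `ratio` can be sketched in base R. `sample_resized_crop_box()` is a hypothetical helper. The log-uniform aspect-ratio draw, the retry limit and the whole-image fallback are simplifying assumptions, and the final resize step is not shown.

```r
# Hypothetical helper (base R only); not part of torchvision.
sample_resized_crop_box <- function(height, width, scale = c(0.08, 1),
                                    ratio = c(3 / 4, 4 / 3), max_tries = 10) {
  area <- height * width
  for (i in seq_len(max_tries)) {
    target_area <- runif(1, scale[1], scale[2]) * area
    # Assumption: the aspect ratio (width / height) is drawn log-uniformly.
    aspect <- exp(runif(1, log(ratio[1]), log(ratio[2])))
    w <- round(sqrt(target_area * aspect))
    h <- round(sqrt(target_area / aspect))
    if (w >= 1 && h >= 1 && w <= width && h <= height) {
      return(list(top = sample.int(height - h + 1, 1),
                  left = sample.int(width - w + 1, 1),
                  height = h, width = w))
    }
  }
  # Fallback (assumption): use the whole image when no box fits.
  list(top = 1, left = 1, height = height, width = width)
}

set.seed(3)
box <- sample_resized_crop_box(240, 320)  # 1-based box inside a 240 x 320 image
```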
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Rotate the image by angle
transform_random_rotation( img, degrees, resample = 0, expand = FALSE, center = NULL, fill = NULL )
img |
A |
degrees |
(sequence or float or int): Range of degrees to select from. If degrees is a number instead of sequence like c(min, max), the range of degrees will be (-degrees, +degrees). |
resample |
(int, optional): An optional resampling filter. See interpolation modes. |
expand |
(bool, optional): Optional expansion flag. If true, expands the output to make it large enough to hold the entire rotated image. If false or omitted, make the output image the same size as the input image. Note that the expand flag assumes rotation around the center and no translation. |
center |
(list or tuple, optional): Optional center of rotation, c(x, y). Origin is the upper left corner. Default is the center of the image. |
fill |
(n-tuple or int or float): Pixel fill value for area outside the rotated image. If int or float, the value is used for all bands respectively. Defaults to 0 for all bands. This option is only available for Pillow>=5.2.0. This option is not supported for Tensor input. Fill value for the area outside the transform in the output image is always 0. |
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
The image can be a Magick Image or a torch Tensor, in which case it is expected to have [..., H, W] shape, where ... means an arbitrary number of leading dimensions.
transform_random_vertical_flip(img, p = 0.5)
img |
A |
p |
(float): probability of the image being flipped. Default value is 0.5 |
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
The image can be a Magick Image or a torch Tensor, in which case it is expected to have [..., H, W] shape, where ... means an arbitrary number of leading dimensions.
transform_resize(img, size, interpolation = 2)
img |
A |
size |
(sequence or int): Desired output size. If size is a sequence like c(h, w), the output size will be matched to this. If size is an int, the smaller edge of the image will be matched to this number, i.e., if height > width, then the image will be rescaled to (size * height / width, size). |
interpolation |
(int, optional) Desired interpolation. An integer
|
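The rule for the `size` argument can be sketched as a small base-R function that only computes the output dimensions. `resize_output_size()` is a hypothetical helper, and rounding the scaled edge down with `floor()` is an assumption.

```r
# Hypothetical helper (base R only); not part of torchvision.
resize_output_size <- function(height, width, size) {
  # A pair c(h, w) is used directly as the output size.
  if (length(size) == 2) return(c(height = size[1], width = size[2]))
  # A single number is matched to the smaller edge; the other edge keeps the aspect ratio.
  if (height > width) {
    c(height = floor(size * height / width), width = size)
  } else {
    c(height = size, width = floor(size * width / height))
  }
}

portrait  <- resize_output_size(400, 200, size = 100)        # height 200, width 100
landscape <- resize_output_size(200, 400, size = 100)        # height 100, width 200
explicit  <- resize_output_size(200, 400, size = c(50, 60))  # height 50, width 60
```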
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Crop an image and resize it to a desired size
transform_resized_crop(img, top, left, height, width, size, interpolation = 2)
img |
A Magick Image, array, or torch_tensor. |
top |
(int): Vertical component of the top left corner of the crop box. |
left |
(int): Horizontal component of the top left corner of the crop box. |
height |
(int): Height of the crop box. |
width |
(int): Width of the crop box. |
size |
(sequence or int): Desired output size. If size is a sequence like c(h, w), the output size is matched to it. If size is an int, the smaller edge of the image is matched to this number, i.e., if height > width, the image is rescaled to (size * height / width, size). |
interpolation |
(int, optional): Desired interpolation, specified as an integer. |
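The two steps, cropping a box and then resizing it, can be sketched in base R on a plain matrix. `crop_then_resize()` is a hypothetical helper, not a torchvision function. It assumes 1-based `top` and `left`, and it uses nearest-neighbour sampling for brevity instead of the default interpolation mode.

```r
# Hypothetical helper (not part of torchvision): crop a box, then resize it
# with nearest-neighbour sampling. Assumes 1-based `top` and `left`.
crop_then_resize <- function(m, top, left, height, width, size) {
  box <- m[top:(top + height - 1), left:(left + width - 1), drop = FALSE]
  # Map each output row/column back to a source row/column of the box.
  rows <- ceiling(seq_len(size[1]) * height / size[1])
  cols <- ceiling(seq_len(size[2]) * width / size[2])
  box[rows, cols, drop = FALSE]
}

m <- matrix(1:36, nrow = 6, byrow = TRUE)  # 6 x 6 single-channel "image"
out <- crop_then_resize(m, top = 2, left = 3, height = 2, width = 2, size = c(4, 4))
dim(out)  # 4 4
```

Here the 2 x 2 box is upsampled to 4 x 4, so each source pixel is repeated in a 2 x 2 block.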
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
RGB to grayscale conversion uses the ITU-R 601-2 luma transform: L = R * 0.2989 + G * 0.5870 + B * 0.1140.
transform_rgb_to_grayscale(img)
img |
A Magick Image, array, or torch_tensor. |
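The luma formula can be applied directly to a base R array. This is a minimal sketch, assuming a channels-first (C x H x W) array with channels in R, G, B order. `rgb_to_luma()` is a hypothetical helper, not the package function.

```r
# Hypothetical helper: ITU-R 601-2 luma for a 3 x H x W array (R, G, B channels).
rgb_to_luma <- function(img) {
  stopifnot(length(dim(img)) == 3, dim(img)[1] == 3)
  0.2989 * img[1, , ] + 0.5870 * img[2, , ] + 0.1140 * img[3, , ]
}

red <- array(0, dim = c(3, 2, 2))
red[1, , ] <- 1                      # pure red image
rgb_to_luma(red)                     # every pixel is 0.2989

white <- array(1, dim = c(3, 2, 2))  # pure white image
rgb_to_luma(white)                   # every pixel is 0.9999
```

The three weights sum to 0.9999, so a white pixel maps to slightly less than 1.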
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rotate(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Angular rotation of an image
transform_rotate( img, angle, resample = 0, expand = FALSE, center = NULL, fill = NULL )
img |
A Magick Image, array, or torch_tensor. |
angle |
(float or int): rotation angle value in degrees, counter-clockwise. |
resample |
(int, optional): An optional resampling filter. See interpolation modes. |
expand |
(bool, optional): Optional expansion flag. If true, expands the output to make it large enough to hold the entire rotated image. If false or omitted, makes the output image the same size as the input image. Note that the expand flag assumes rotation around the center and no translation. |
center |
(list or tuple, optional): Optional center of rotation, c(x, y). Origin is the upper left corner. Default is the center of the image. |
fill |
(n-tuple or int or float): Pixel fill value for area outside the rotated image. If int or float, the value is used for all bands respectively. Defaults to 0 for all bands. This option is only available for Pillow>=5.2.0. This option is not supported for Tensor input. Fill value for the area outside the transform in the output image is always 0. |
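To see what `expand = TRUE` implies, the canvas needed to hold the whole rotated image can be computed from the angle. The sketch below uses a hypothetical helper, `expanded_size()`, which is not part of torchvision. It assumes rotation about the image center, and the rounding is a choice of this sketch, not documented package behavior.

```r
# Hypothetical helper: canvas size needed to hold a width x height image
# rotated by `angle` degrees about its center.
expanded_size <- function(width, height, angle) {
  theta <- angle * pi / 180  # degrees to radians
  c(width  = round(abs(width * cos(theta)) + abs(height * sin(theta))),
    height = round(abs(width * sin(theta)) + abs(height * cos(theta))))
}

expanded_size(100, 50, 90)   # width 50, height 100
expanded_size(100, 100, 45)  # width 141, height 141
```

With `expand = FALSE` the output keeps the input size, so the corners of a 45 degree rotation fall outside the canvas.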
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_ten_crop(), transform_to_tensor(), transform_vflip()
Crop the given image into four corners and the central crop, plus the flipped version of these (horizontal flipping is used by default). This transform returns a tuple of images, so the number of inputs may not match the number of targets your Dataset returns.
transform_ten_crop(img, size, vertical_flip = FALSE)
img |
A Magick Image, array, or torch_tensor. |
size |
(sequence or int): Desired output size. If size is a sequence like c(h, w), the output size is matched to it. If size is an int, the smaller edge of the image is matched to this number, i.e., if height > width, the image is rescaled to (size * height / width, size). |
vertical_flip |
(bool): Use vertical flipping instead of horizontal |
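The five crop boxes behind the ten crops can be located with simple index arithmetic. This sketch assumes that `size` gives the crop size c(h, w), as in the PyTorch function of the same name, that corners are 1-based, and that the central box is rounded down. `five_crop_corners()` is a hypothetical helper, not a torchvision function.

```r
# Hypothetical helper: 1-based (top, left) corners of the five crop boxes.
# Assumes `size` is the crop size c(h, w) and that it fits inside the image.
five_crop_corners <- function(img_h, img_w, size) {
  if (length(size) == 1) size <- c(size, size)
  crop_h <- size[1]
  crop_w <- size[2]
  stopifnot(crop_h <= img_h, crop_w <= img_w)
  rbind(
    top_left     = c(top = 1, left = 1),
    top_right    = c(1, img_w - crop_w + 1),
    bottom_left  = c(img_h - crop_h + 1, 1),
    bottom_right = c(img_h - crop_h + 1, img_w - crop_w + 1),
    center       = c(floor((img_h - crop_h) / 2) + 1, floor((img_w - crop_w) / 2) + 1)
  )
}

corners <- five_crop_corners(10, 10, 4)
corners["bottom_right", ]  # top 7, left 7
nrow(corners) * 2          # 10 crops once each box is also taken from the flipped image
```

Because one input yields ten outputs, a target usually has to be repeated or the predictions averaged across crops.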
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_to_tensor(), transform_vflip()
Converts a Magick Image or array (H x W x C) in the range [0, 255] to a torch_tensor of shape (C x H x W) in the range [0.0, 1.0]. In the other cases, tensors are returned without scaling.
transform_to_tensor(img)
img |
A Magick Image, array, or torch_tensor. |
Because the input image is scaled to [0.0, 1.0], this transformation should not be used when transforming target image masks.
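The layout change and rescaling can be reproduced on a base R array. This is only a sketch of the arithmetic; it does not create a torch_tensor.

```r
# An H x W x C array of 8-bit values: 2 x 3 pixels with 3 channels.
hwc <- array(c(0, 51, 102, 153, 204, 255), dim = c(2, 3, 3))

# Move channels first (C x H x W) and rescale [0, 255] to [0, 1].
chw <- aperm(hwc, c(3, 1, 2)) / 255

dim(chw)    # 3 2 3
range(chw)  # 0 1
```

Applying the same scaling to a mask of integer class ids would turn, for example, id 2 into 2/255, which is why masks should not go through this transform.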
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_vflip()
Vertically flip an image or Tensor.
transform_vflip(img)
img |
A Magick Image, array, or torch_tensor. |
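A vertical flip reverses the order of the rows (the height dimension) and leaves the columns alone. The base R sketch below shows this on a small matrix. It illustrates the operation only and does not call the package.

```r
# A 3 x 2 single-channel "image"; rows run from top to bottom.
img <- matrix(1:6, nrow = 3, byrow = TRUE)

# Vertical flip: reverse the row order, keep the columns as they are.
flipped <- img[nrow(img):1, , drop = FALSE]

flipped[1, ]  # 5 6, the former bottom row
```

Flipping twice returns the original image.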
Other transforms: transform_adjust_brightness(), transform_adjust_contrast(), transform_adjust_gamma(), transform_adjust_hue(), transform_adjust_saturation(), transform_affine(), transform_center_crop(), transform_color_jitter(), transform_convert_image_dtype(), transform_crop(), transform_five_crop(), transform_grayscale(), transform_hflip(), transform_linear_transformation(), transform_normalize(), transform_pad(), transform_perspective(), transform_random_affine(), transform_random_apply(), transform_random_choice(), transform_random_crop(), transform_random_erasing(), transform_random_grayscale(), transform_random_horizontal_flip(), transform_random_order(), transform_random_perspective(), transform_random_resized_crop(), transform_random_rotation(), transform_random_vertical_flip(), transform_resize(), transform_resized_crop(), transform_rgb_to_grayscale(), transform_rotate(), transform_ten_crop(), transform_to_tensor()
Arranges a batch of (image) tensors in a grid, with optional padding between images. Expects a 4d mini-batch tensor of shape (B x C x H x W).
vision_make_grid( tensor, scale = TRUE, num_rows = 8, padding = 2, pad_value = 0 )
tensor |
tensor to arrange in grid. |
scale |
whether to normalize (min-max-scale) the input tensor. |
num_rows |
number of rows making up the grid (default 8). |
padding |
amount of padding between batch images (default 2). |
pad_value |
pixel value to use for padding. |
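The min-max scaling that `scale = TRUE` refers to can be written in one line of base R. `min_max_scale()` below is a hypothetical helper. The package's own implementation may differ in detail, for example by guarding against a zero range.

```r
# Hypothetical helper: min-max scale values into [0, 1].
# Assumes the input is not constant (max(x) > min(x)).
min_max_scale <- function(x) (x - min(x)) / (max(x) - min(x))

x <- c(-2, 0, 6)
min_max_scale(x)  # 0.00 0.25 1.00
```

Scaling this way keeps the relative contrast between pixels while mapping the batch onto a displayable range.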
Other image display: draw_bounding_boxes(), draw_keypoints(), draw_segmentation_masks(), tensor_image_browse(), tensor_image_display()