Title: | Additional Operators for Image Models |
---|---|
Description: | Implements additional operators for computer vision models, including operators necessary for image segmentation and object detection deep learning models. |
Authors: | Daniel Falbel [aut, cre], RStudio [cph] |
Maintainer: | Daniel Falbel <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.5.0.9000 |
Built: | 2024-11-02 05:12:58 UTC |
Source: | https://github.com/mlverse/torchvisionlib |
Performs Deformable Convolution v2, described in Deformable ConvNets v2: More Deformable, Better Results, if `mask` is not `NULL`, and performs Deformable Convolution, described in Deformable Convolutional Networks, if `mask` is `NULL`.
```r
ops_deform_conv2d(
  input,
  offset,
  weight,
  bias = NULL,
  stride = c(1, 1),
  padding = c(0, 0),
  dilation = c(1, 1),
  mask = NULL
)
```
input | (`Tensor[batch_size, in_channels, in_height, in_width]`): input tensor |
offset | (`Tensor[batch_size, 2 * offset_groups * kernel_height * kernel_width, out_height, out_width]`): offsets to be applied for each position in the convolution kernel. |
weight | (`Tensor[out_channels, in_channels / groups, kernel_height, kernel_width]`): convolution weights, split into groups of size `in_channels / groups`. |
bias | (`Tensor[out_channels]`): optional bias of shape `(out_channels)`. Default: `NULL` |
stride | (int or `Tuple[int, int]`): distance between convolution centers. Default: 1 |
padding | (int or `Tuple[int, int]`): height/width of padding of zeroes around each image. Default: 0 |
dilation | (int or `Tuple[int, int]`): the spacing between kernel elements. Default: 1 |
mask | (`Tensor[batch_size, offset_groups * kernel_height * kernel_width, out_height, out_width]`): masks to be applied for each position in the convolution kernel. Default: `NULL` |
`Tensor[batch_sz, out_channels, out_h, out_w]`: result of convolution.
```r
if (torchvisionlib_is_installed()) {
  library(torch)
  input <- torch_rand(4, 3, 10, 10)
  kh <- kw <- 3
  weight <- torch_rand(5, 3, kh, kw)
  # offset and mask should have the same spatial size as the output
  # of the convolution. In this case, for an input of 10, stride of 1
  # and kernel size of 3, without padding, the output size is 8
  offset <- torch_rand(4, 2 * kh * kw, 8, 8)
  mask <- torch_rand(4, kh * kw, 8, 8)
  out <- ops_deform_conv2d(input, offset, weight, mask = mask)
  print(out$shape)
}
```
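When `mask` is `NULL`, the same call performs plain Deformable Convolution (v1), driven by the offsets alone. A minimal sketch of that case, reusing the shapes from the example above:

```r
if (torchvisionlib_is_installed()) {
  library(torch)
  input <- torch_rand(4, 3, 10, 10)
  kh <- kw <- 3
  weight <- torch_rand(5, 3, kh, kw)
  # With mask = NULL (the default), only the offsets deform the
  # sampling grid, as in Deformable Convolutional Networks (v1).
  offset <- torch_rand(4, 2 * kh * kw, 8, 8)
  out <- ops_deform_conv2d(input, offset, weight)
  print(out$shape)  # expected: [4, 5, 8, 8]
}
```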
Performs non-maximum suppression (NMS) on the boxes according to their intersection-over-union (IoU).
```r
ops_nms(boxes, scores, iou_threshold)
```
boxes | (`Tensor[N, 4]`): boxes to perform NMS on. They are expected to be in (x1, y1, x2, y2) format with 0 <= x1 < x2 and 0 <= y1 < y2. |
scores | (`Tensor[N]`): scores for each one of the boxes. |
iou_threshold | (float): discards all overlapping boxes with IoU > iou_threshold. |
NMS iteratively removes lower-scoring boxes which have an IoU greater than `iou_threshold` with another (higher-scoring) box.
If multiple boxes have the exact same score and satisfy the IoU criterion with respect to a reference box, the selected box is not guaranteed to be the same between CPU and GPU. This is similar to the behavior of argsort in PyTorch when repeated values are present.
int64 tensor with the indices of the elements that have been kept by NMS, sorted in decreasing order of scores
```r
if (torchvisionlib_is_installed()) {
  ops_nms(torch::torch_rand(3, 4), torch::torch_rand(3), 0.5)
}
```
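The random boxes above exercise the API but don't show the suppression itself. Here is a small hand-crafted sketch (coordinates and scores are made up for illustration): two heavily overlapping boxes and one disjoint box, so at an IoU threshold of 0.5 the lower-scoring box of the overlapping pair should be suppressed.

```r
if (torchvisionlib_is_installed()) {
  library(torch)
  library(torchvisionlib)
  # Boxes in (x1, y1, x2, y2) format; the first two overlap with
  # IoU = 81 / 119, about 0.68, which is above the 0.5 threshold.
  boxes <- torch_tensor(rbind(
    c(0, 0, 10, 10),   # score 0.9
    c(1, 1, 11, 11),   # score 0.8, suppressed by the first box
    c(50, 50, 60, 60)  # score 0.7, disjoint from the others
  ))
  scores <- torch_tensor(c(0.9, 0.8, 0.7))
  keep <- ops_nms(boxes, scores, 0.5)
  print(keep)  # indices of the first and third boxes, by decreasing score
}
```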
Performs the Position-Sensitive Region of Interest (RoI) Align operator mentioned in Light-Head R-CNN.
```r
ops_ps_roi_align(
  input,
  boxes,
  output_size,
  spatial_scale = 1,
  sampling_ratio = -1
)

nn_ps_roi_align(output_size, spatial_scale = 1, sampling_ratio = -1)
```
input | (`Tensor[N, C, H, W]`): input tensor |
boxes | (`Tensor[K, 5]` or `List[Tensor[L, 4]]`): the box coordinates in (x1, y1, x2, y2) format where the regions will be taken from. If a single tensor is passed, its first column should contain the batch index of the corresponding element; if a list of tensors is passed, each tensor corresponds to the boxes for one element in the batch. |
output_size | (int or `Tuple[int, int]`): the size of the output after the pooling is performed, as (height, width). |
spatial_scale | (float): a scaling factor that maps the box coordinates to the input coordinates. For example, if your boxes are defined on the scale of a 224x224 image and your input is a 112x112 feature map (resulting from a 0.5x scaling of the original image), you'll want to set this to 0.5. Default: 1.0 |
sampling_ratio | (int): number of sampling points in the interpolation grid used to compute the output value of each pooled output bin. If > 0, then exactly `sampling_ratio x sampling_ratio` sampling points per bin are used. If <= 0, then an adaptive number of grid points is used (computed as `ceil(roi_width / output_width)`, and likewise for height). Default: -1 |
`Tensor[K, C / (output_size[1] * output_size[2]), output_size[1], output_size[2]]`: the pooled RoIs.

`nn_ps_roi_align()`: the `torch::nn_module()` wrapper for `ops_ps_roi_align()`.
```r
if (torchvisionlib_is_installed()) {
  library(torch)
  library(torchvisionlib)
  input <- torch_randn(1, 3, 28, 28)
  boxes <- list(torch_tensor(matrix(c(1, 1, 5, 5), ncol = 4)))
  roi <- nn_ps_roi_align(output_size = c(1, 1))
  roi(input, boxes)
}
```
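Note how the channel dimension is divided by the number of output bins. A minimal sketch (shapes chosen for illustration): with 12 input channels and `output_size = c(2, 2)`, each pooled RoI should come out with 12 / (2 * 2) = 3 channels.

```r
if (torchvisionlib_is_installed()) {
  library(torch)
  library(torchvisionlib)
  # C = 12 must be divisible by the number of bins,
  # output_size[1] * output_size[2] = 4, leaving 3 output channels.
  input <- torch_randn(1, 12, 28, 28)
  boxes <- list(torch_tensor(matrix(c(1, 1, 9, 9), ncol = 4)))
  out <- ops_ps_roi_align(input, boxes, output_size = c(2, 2))
  print(out$shape)  # expected: [1, 3, 2, 2]
}
```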
`torchvisionlib_is_installed()` checks whether an installation of the torchvisionlib binaries was found; `install_torchvisionlib()` installs the additional libraries.
```r
torchvisionlib_is_installed()

install_torchvisionlib(url = Sys.getenv("TORCHVISIONLIB_URL", unset = NA))
```
url | URL for the binaries. Can also be a file path to the binaries. |
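A minimal usage sketch, assuming that with the default `url` the package resolves a suitable prebuilt binary for your platform (set `TORCHVISIONLIB_URL` only to point at a custom build):

```r
library(torchvisionlib)

# Install the binaries on first use, then confirm they were found.
if (!torchvisionlib_is_installed()) {
  install_torchvisionlib()
}
torchvisionlib_is_installed()
```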
Reads JPEGs directly into torch tensors.
```r
vision_read_jpeg(path)
```
path | path to the JPEG file |
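A minimal sketch; "image.jpg" is a placeholder path, so substitute any JPEG file on disk:

```r
if (torchvisionlib_is_installed()) {
  library(torchvisionlib)
  # "image.jpg" is a hypothetical path used for illustration.
  img <- vision_read_jpeg("image.jpg")
  print(img$shape)  # expected: a [channels, height, width] image tensor
}
```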