--- title: "Indexing tensors" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Indexing tensors} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", eval = identical(Sys.getenv("TORCH_TEST", unset = "0"), "1"), purl = FALSE ) ``` ```{r setup} library(torch) ``` In this article we describe the indexing operator for torch tensors and how it compares to the R indexing operator for arrays. Torch's indexing semantics are closer to numpy's semantics than R's. You will find a lot of similarities between this article and the `numpy` indexing article available [here](https://docs.scipy.org/doc/numpy-1.10.0/user/basics.indexing.html). ## Single element indexing Single element indexing for a 1-D tensors works mostly as expected. Like R, it is 1-based. Unlike R though, it accepts negative indices for indexing from the end of the array. (In R, negative indices are used to remove elements.) ```{r} x <- torch_tensor(1:10) x[1] x[-1] ``` You can also subset matrices and higher dimensions arrays using the same syntax: ```{r} x <- x$reshape(shape = c(2,5)) x x[1,3] x[1,-1] ``` Note that if one indexes a multidimensional tensor with fewer indices than dimensions, torch's behaviour differs from R, which flattens the array. In torch, the missing indices are considered complete slices `:`. ```{r} x[1] ``` ## Slicing and striding It is possible to slice and stride arrays to extract sub-arrays of the same number of dimensions, but of different sizes than the original. This is best illustrated by a few examples: ```{r} x <- torch_tensor(1:10) x x[2:5] x[1:(-7)] ``` You can also use the `1:10:2` syntax which means: In the range from 1 to 10, take every second item. For example: ```{r} x[1:5:2] ``` Another special syntax is the `N`, meaning the size of the specified dimension. ```{r} x[5:N] ``` > Note: the slicing behavior relies on [Non Standard Evaluation](https://adv-r.hadley.nz/evaluation.html#evaluation). It requires that the expression is passed to the `[` not exactly the resulting R vector. To allow dynamic dynamic indices, you can create a new slice using the `slc` function. For example: ```{r} x[1:5:2] ``` is equivalent to: ```{r} x[slc(start = 1, end = 5, step = 2)] ``` ## Getting the complete dimension Like in R, you can take all elements in a dimension by leaving an index empty. Consider a matrix: ```{r} x <- torch_randn(2, 3) x ``` The following syntax will give you the first row: ```{r} x[1,] ``` And this would give you the first 2 columns: ```{r} x[,1:2] ``` ## Dropping dimensions By default, when indexing by a single integer, this dimension will be dropped to avoid the singleton dimension: ```{r} x <- torch_randn(2, 3) x[1,]$shape ``` You can optionally use the `drop = FALSE` argument to avoid dropping the dimension. ```{r} x[1,,drop = FALSE]$shape ``` ## Adding a new dimension It's possible to add a new dimension to a tensor using index-like syntax: ```{r} x <- torch_tensor(c(10)) x$shape x[, newaxis]$shape x[, newaxis, newaxis]$shape ``` You can also use `NULL` instead of `newaxis`: ```{r} x[,NULL]$shape ``` ## Dealing with variable number of indices Sometimes we don't know how many dimensions a tensor has, but we do know what to do with the last available dimension, or the first one. To subsume all others, we can use `..`: ```{r} z <- torch_tensor(1:125)$reshape(c(5,5,5)) z[1,..] z[..,1] ``` ## Indexing with vectors Vector indexing is also supported but care must be taken regarding performance as, in general its much less performant than slice based indexing. > Note: Starting from version 0.5.0, vector indexing in torch follows R semantics, prior to that the behavior was similar to [numpy's advanced indexing](https://numpy.org/doc/stable/reference/arrays.indexing.html#advanced-indexing). To use the old behavior, consider using `?torch_index`, `?torch_index_put` or `torch_index_put_`. ```{r} x <- torch_randn(4,4) x[c(1,3), c(1,3)] ``` You can also use boolean vectors, for example: ```{r} x[c(TRUE, FALSE, TRUE, FALSE), c(TRUE, FALSE, TRUE, FALSE)] ``` The above examples also work if the index were long or boolean tensors, instead of R vectors. It's also possible to index with multi-dimensional boolean tensors: ```{r} x <- torch_tensor(rbind( c(1,2,3), c(4,5,6) )) x[x>3] ```