PyTorch Grid Sample

  • 1. torch.nn.functional.grid_sample
    • 1.1. Parameters
    • 1.2. Returns
  • 2. Grid Sample Native Functions
    • 2.1. align_corners (bool, optional)
  • References

1. torch.nn.functional.grid_sample

torch.nn.functional.grid_sample(input, grid, mode='bilinear', padding_mode='zeros', align_corners=None)

https://docs.pytorch.org/docs/stable/generated/torch.nn.functional.grid_sample.html

Compute grid sample.

Given an input and a flow-field grid, computes the output using input values and pixel locations from grid.

Currently, only spatial (4-D) and volumetric (5-D) input are supported.

In the spatial (4-D) case, for input with shape $(N, C, H_\text{in}, W_\text{in})$ and grid with shape $(N, H_\text{out}, W_\text{out}, 2)$, the output will have shape $(N, C, H_\text{out}, W_\text{out})$.

For each output location output[n, :, h, w], the size-2 vector grid[n, h, w] specifies the input pixel location (x, y) that is used to interpolate the output value output[n, :, h, w].
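As a minimal sketch of the shape rule and the (x, y) ordering of the last grid dimension, the following builds an identity grid by hand and checks that sampling reproduces the input. The tensor sizes and variable names are illustrative assumptions, and align_corners=True is chosen here only so that -1 and 1 land exactly on the corner pixel centers.

```python
import torch
import torch.nn.functional as F

N, C, H_in, W_in = 1, 3, 4, 5
inp = torch.arange(N * C * H_in * W_in, dtype=torch.float32).reshape(N, C, H_in, W_in)

# Normalized coordinates of every pixel center (valid for align_corners=True).
ys = torch.linspace(-1.0, 1.0, H_in)
xs = torch.linspace(-1.0, 1.0, W_in)
grid_y, grid_x = torch.meshgrid(ys, xs, indexing="ij")

# The last dimension is ordered (x, y): x indexes width, y indexes height.
grid = torch.stack((grid_x, grid_y), dim=-1).unsqueeze(0)  # (N, H_out, W_out, 2)

out = F.grid_sample(inp, grid, mode="bilinear", padding_mode="zeros", align_corners=True)
print(out.shape)  # same spatial size as the grid: (N, C, H_out, W_out)
```

Swapping grid_x and grid_y in the stack would sample a transposed set of locations, which is a common source of confusion with this function.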

In the case of 5-D inputs, grid[n, d, h, w] specifies the x, y, z pixel locations for interpolating output[n, :, d, h, w]. The mode argument specifies the interpolation method used to sample the input pixels: nearest, bilinear, or (for 4-D input only) bicubic.

grid specifies the sampling pixel locations normalized by the input spatial dimensions, so most of its values should lie in the range [-1, 1]. For example, x = -1, y = -1 is the left-top pixel of input, and x = 1, y = 1 is the right-bottom pixel of input.

If grid has values outside the range of [-1, 1], the corresponding outputs are handled as defined by padding_mode.

Options are

  • padding_mode="zeros": use 0 for out-of-bound grid locations.
  • padding_mode="border": use border values for out-of-bound grid locations.
  • padding_mode="reflection": use values at locations reflected by the border for out-of-bound grid locations. A location far away from the border keeps being reflected until it is in bounds, e.g., (normalized) pixel location x = -3.5 reflects by border -1 and becomes x' = 1.5, then reflects by border 1 and becomes x'' = 0.5.
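The three padding modes can be compared on a single out-of-bound location. This is a small sketch under stated assumptions: a one-row input with made-up values, align_corners=True so that -1 and 1 sit on the first and last pixel centers, and a hypothetical helper sample_at. It samples at x = -3.5 and, for reference, at x = 0.5, which is where reflecting 1.5 about border 1 lands.

```python
import torch
import torch.nn.functional as F

# One row of five pixels; the values are arbitrary but easy to tell apart.
inp = torch.tensor([10.0, 20.0, 30.0, 40.0, 50.0]).reshape(1, 1, 1, 5)

def sample_at(x, padding_mode):
    """Sample inp at normalized location (x, y=0) and return a Python float."""
    grid = torch.tensor([[[[x, 0.0]]]])  # shape (N=1, H_out=1, W_out=1, 2)
    out = F.grid_sample(inp, grid, mode="bilinear",
                        padding_mode=padding_mode, align_corners=True)
    return out.item()

far_outside = -3.5
zeros_val = sample_at(far_outside, "zeros")         # out of bounds, filled with 0
border_val = sample_at(far_outside, "border")       # clamped to the first pixel
reflect_val = sample_at(far_outside, "reflection")  # reflected back into range
inside_val = sample_at(0.5, "zeros")                # in-bounds reference location

print(zeros_val, border_val, reflect_val, inside_val)
```

With these inputs the reflected sample should match the in-bounds sample at x = 0.5, while "zeros" and "border" give 0 and the first pixel value respectively.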

Note

This function is often used in conjunction with affine_grid to build Spatial Transformer Networks.

Spatial Transformer Networks
https://arxiv.org/abs/1506.02025
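A short sketch of pairing affine_grid with grid_sample, the combination used in Spatial Transformer Networks. The input size and the two affine matrices (identity and a horizontal flip) are assumptions picked for illustration; the same align_corners value is passed to both calls so that the grid and the sampler agree on what -1 and 1 mean.

```python
import torch
import torch.nn.functional as F

inp = torch.arange(24, dtype=torch.float32).reshape(1, 1, 4, 6)

# One 2x3 affine matrix per batch element, shape (N, 2, 3).
identity = torch.tensor([[[1.0, 0.0, 0.0],
                          [0.0, 1.0, 0.0]]])
flip_x = torch.tensor([[[-1.0, 0.0, 0.0],
                        [0.0, 1.0, 0.0]]])

# affine_grid produces a flow field of shape (N, H_out, W_out, 2).
grid_id = F.affine_grid(identity, size=inp.shape, align_corners=False)
grid_flip = F.affine_grid(flip_x, size=inp.shape, align_corners=False)

out_id = F.grid_sample(inp, grid_id, align_corners=False)
out_flip = F.grid_sample(inp, grid_flip, align_corners=False)

print(out_id[0, 0, 0])    # first row of the identity-sampled output
print(out_flip[0, 0, 0])  # first row of the horizontally flipped output
```

In a real Spatial Transformer the matrices would be predicted by a small localization network rather than written by hand.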

Note

When using the CUDA backend, this operation may induce nondeterministic behaviour in its backward pass that is not easily switched off. Please see the notes on Reproducibility for background.

Note

NaN values in grid would be interpreted as -1.

1.1. Parameters

  • input (Tensor): input of shape $(N, C, H_\text{in}, W_\text{in})$ (4-D case) or $(N, C, D_\text{in}, H_\text{in}, W_\text{in})$ (5-D case)

  • grid (Tensor): flow-field of shape $(N, H_\text{out}, W_\text{out}, 2)$ (4-D case) or $(N, D_\text{out}, H_\text{out}, W_\text{out}, 3)$ (5-D case)

  • mode (str): interpolation mode to calculate output values: 'bilinear' | 'nearest' | 'bicubic'. Default: 'bilinear'.
    Note: mode='bicubic' supports only 4-D input. When mode='bilinear' and the input is 5-D, the interpolation mode used internally will actually be trilinear. However, when the input is 4-D, the interpolation mode will legitimately be bilinear.

  • padding_mode (str): padding mode for outside grid values 'zeros' | 'border' | 'reflection'. Default: 'zeros'

  • align_corners (bool, optional): Geometrically, we consider the pixels of the input as squares rather than points.
    If set to True, the extrema (-1 and 1) are considered as referring to the center points of the input's corner pixels.
    If set to False, they are instead considered as referring to the corner points of the input's corner pixels, making the sampling more resolution agnostic.
    This option parallels the align_corners option in interpolate, and so whichever option is used here should also be used there to resize the input image before grid sampling. Default: False

    legitimately [lɪ'dʒɪtɪmətlɪ]
    adv. properly, rightfully
    geometrically [ˌdʒi:ə'metrɪklɪ]
    adv. in terms of geometry
    agnostic [æɡˈnɒstɪk]
    adj. agnostic (here: independent of, as in "resolution agnostic")
    n. an agnostic
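For the 5-D case, a brief sketch of the shapes listed in the parameters may help. The sizes below are arbitrary assumptions, and the grid is filled with random locations only to exercise the call; mode="bilinear" is passed even though the interpolation over a volume is trilinear.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

vol = torch.rand(1, 2, 5, 6, 7)            # (N, C, D_in, H_in, W_in), values in [0, 1)
grid = torch.rand(1, 2, 3, 4, 3) * 2 - 1   # (N, D_out, H_out, W_out, 3), values in [-1, 1)

# Each grid[n, d, h, w] holds an (x, y, z) location in normalized coordinates.
out = F.grid_sample(vol, grid, mode="bilinear", padding_mode="border", align_corners=False)

print(out.shape)  # (N, C, D_out, H_out, W_out)
```

The output takes its batch and channel sizes from the input and its depth, height, and width from the grid.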

1.2. Returns

  • output (Tensor): output Tensor

Warning

When align_corners = True, the grid positions depend on the pixel size relative to the input image size, and so the locations sampled by grid_sample() will differ for the same input given at different resolutions (that is, after being upsampled or downsampled). The default behavior up to version 1.2.0 was align_corners = True. Since then, the default behavior has been changed to align_corners = False, in order to bring it in line with the default for interpolate().
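The resolution dependence can be illustrated without any tensors. The sketch below is plain Python with hypothetical helper names; it assumes the usual convention that pixel i covers the index range [i - 0.5, i + 0.5], so the whole image spans [-0.5, size - 0.5]. It maps the same normalized coordinate to a pixel index at two resolutions and expresses the result as a fraction of the image extent.

```python
def unnormalize(coord, size, align_corners):
    """Map a normalized coordinate in [-1, 1] to a fractional pixel index."""
    if align_corners:
        return (coord + 1.0) / 2.0 * (size - 1)
    return ((coord + 1.0) * size - 1.0) / 2.0

def fraction_of_extent(coord, size, align_corners):
    """Sampled position as a fraction of the full image extent (0 = left edge, 1 = right edge)."""
    pixel_index = unnormalize(coord, size, align_corners)
    return (pixel_index + 0.5) / size

x = 0.5
for align_corners in (False, True):
    low_res = fraction_of_extent(x, 4, align_corners)
    high_res = fraction_of_extent(x, 8, align_corners)
    print(f"align_corners={align_corners}: size 4 -> {low_res}, size 8 -> {high_res}")
```

With align_corners=False the fraction is the same at both sizes, while with align_corners=True it shifts as the resolution changes, which is the effect the warning describes.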

Note

mode='bicubic' is implemented using the cubic convolution algorithm with $\alpha = -0.75$. The constant $\alpha$ may differ from package to package. For example, PIL and OpenCV use -0.5 and -0.75 respectively. This algorithm may "overshoot" the range of values it's interpolating. For example, it may produce negative values or values greater than 255 when interpolating input in [0, 255]. Clamp the results with torch.clamp() to ensure they are within the valid range.
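A small sketch of the clamping advice, assuming a made-up 4x4 image with a hard vertical edge and an identity grid at four times the resolution so that samples fall between input pixels. Near a sharp edge like this, the raw bicubic output may fall outside [0, 255]; the clamped result stays inside it.

```python
import torch
import torch.nn.functional as F

# A 4x4 "image" with a hard vertical edge, using the 8-bit range [0, 255].
img = torch.zeros(1, 1, 4, 4)
img[..., 2:] = 255.0

# Identity affine grid at 16x16, so most samples land between input pixel centers.
theta = torch.tensor([[[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]]])
grid = F.affine_grid(theta, size=(1, 1, 16, 16), align_corners=False)

raw = F.grid_sample(img, grid, mode="bicubic", padding_mode="border", align_corners=False)
clamped = torch.clamp(raw, min=0.0, max=255.0)

print("raw range:    ", raw.min().item(), raw.max().item())
print("clamped range:", clamped.min().item(), clamped.max().item())
```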

2. Grid Sample Native Functions

aten/src/ATen/native/GridSampler.h
https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/GridSampler.h

aten/src/ATen/native/GridSampler.cpp
https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/GridSampler.cpp

2.1. align_corners (bool, optional)

Geometrically, we consider the pixels of the input as squares rather than points.

If set to True, the extrema (-1 and 1) are considered as referring to the center points of the input's corner pixels.

If set to False, they are instead considered as referring to the corner points of the input's corner pixels, making the sampling more resolution agnostic.

This option parallels the align_corners option in interpolate(), and so whichever option is used here should also be used there to resize the input image before grid sampling. Default: False

Unnormalizes a coordinate from the -1 to +1 scale to its pixel index value, where we view each pixel as an area between (idx - 0.5) and (idx + 0.5).

if align_corners: -1 and +1 get sent to the center points of the input's corner pixels

     -1 --> 0
     +1 --> (size - 1)
     scale_factor = ((size - 1) - 0) / ((+1) - (-1)) = (size - 1) / 2

if not align_corners: -1 and +1 get sent to the corner points of the input's corner pixels

     -1 --> -0.5
     +1 --> (size - 1) + 0.5 == size - 0.5
     scale_factor = ((size - 0.5) - (-0.5)) / ((+1) - (-1)) = size / 2
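The two mappings above can be written as one small function. This is a plain-Python sketch for illustration, not code taken from GridSampler.h; the function name unnormalize and the example size are assumptions.

```python
def unnormalize(coord, size, align_corners):
    """Map a normalized coordinate on the -1..+1 scale to a fractional pixel index."""
    if align_corners:
        # -1 -> 0 and +1 -> size - 1
        scale_factor = (size - 1) / 2
        return (coord + 1) * scale_factor
    # -1 -> -0.5 and +1 -> size - 0.5
    scale_factor = size / 2
    return (coord + 1) * scale_factor - 0.5

size = 5
for align_corners in (True, False):
    left = unnormalize(-1.0, size, align_corners)
    right = unnormalize(1.0, size, align_corners)
    print(f"align_corners={align_corners}: -1 -> {left}, +1 -> {right}")
```

In both cases the coordinate 0 maps to the middle of the image, (size - 1) / 2; only the treatment of the extrema differs.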

References

[1] Yongqiang Cheng (程永强)
[2] torch.nn.functional.grid_sample
[3] torch.nn.functional.grid_sample
