PyTorch Notes (1): Matrix Multiplication in PyTorch with torch.matmul(x, y) / x @ y

Code

```python
import torch

x = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8]])
y = torch.tensor([2, 3, 1, 0])  # y.shape == (4,)
print(torch.matmul(x, y))
print(x @ y)
```

Output:

```
tensor([11, 35])
tensor([11, 35])
```

```python
import torch

x = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8]])
y = torch.tensor([2, 3, 1, 0])  # y.shape == (4,)
y = y.view(4, 1)                # y.shape == (4, 1)
# y is now:
# tensor([[2],
#         [3],
#         [1],
#         [0]])
print(torch.matmul(x, y))
print(x @ y)
```

Output:

```
tensor([[11],
        [35]])
tensor([[11],
        [35]])
```

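In the code above, `torch.matmul(x, y)` and `x @ y` both compute a matrix product (the `@` operator is equivalent to calling `torch.matmul`). The two cases are analyzed in detail below.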

Code 1: `torch.matmul(x, y)`

Input tensors:

x is a 2D tensor of shape (2, 4):
```
tensor([[1, 2, 3, 4],
        [5, 6, 7, 8]])
```
y is a 1D tensor of shape (4,):
```
tensor([2, 3, 1, 0])
```

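Computation logic:

In PyTorch, when the first argument to matmul is a 2D tensor and the second is a 1D tensor, the result is the matrix-vector product:

The 1D tensor y is treated as a column vector of shape (4, 1) and multiplied by the matrix x.
The extra dimension is then removed, so the result is a 1D tensor of shape (2,).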
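The rule above can be reproduced by hand. The following is a minimal sketch, not PyTorch's internal implementation: it appends a dimension to y with `unsqueeze`, multiplies, and drops the dimension again with `squeeze`. The names `direct`, `y_col`, and `manual` are illustrative only.

```python
import torch

x = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8]])  # shape (2, 4)
y = torch.tensor([2, 3, 1, 0])                  # shape (4,)

# Built-in matrix-vector product: result has shape (2,)
direct = torch.matmul(x, y)

# Manual version of the rule: append a dimension, multiply, drop it again
y_col = y.unsqueeze(-1)                         # shape (4, 1)
manual = torch.matmul(x, y_col).squeeze(-1)     # (2, 1) -> (2,)

print(direct.shape, manual.shape)
print(torch.equal(direct, manual))
```

Both paths should give a tensor of shape (2,) with identical values.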

Matrix multiplication formula:
$$\text{result}[i] = \sum_j x[i, j] \cdot y[j]$$

Step-by-step calculation:

For the first row [1, 2, 3, 4]:
$$(1 \cdot 2) + (2 \cdot 3) + (3 \cdot 1) + (4 \cdot 0) = 2 + 6 + 3 + 0 = 11$$
For the second row [5, 6, 7, 8]:
$$(5 \cdot 2) + (6 \cdot 3) + (7 \cdot 1) + (8 \cdot 0) = 10 + 18 + 7 + 0 = 35$$
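The row-by-row arithmetic can also be checked directly in code. This sketch assumes nothing beyond the tensors from the example: it multiplies each row of x elementwise by y (via broadcasting) and sums over the index j. The names `products` and `by_formula` are illustrative.

```python
import torch

x = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8]])
y = torch.tensor([2, 3, 1, 0])

# Broadcasting multiplies every row of x elementwise by y: x[i, j] * y[j]
products = x * y                  # shape (2, 4)

# Summing over j (dim=1) gives result[i]
by_formula = products.sum(dim=1)  # shape (2,)

print(products)
print(by_formula)
print(torch.equal(by_formula, torch.matmul(x, y)))
```

The intermediate `products` tensor holds exactly the terms written out in the sums above, one row per output element.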

Output:

```python
torch.matmul(x, y)
# tensor([11, 35])
```

Code 2: `y = y.view(4,1)` followed by `torch.matmul(x, y)`

Input tensors:

x is the same 2D tensor of shape (2, 4).
y is reshaped into a 2D tensor of shape (4, 1):
```
tensor([[2],
        [3],
        [1],
        [0]])
```

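Computation logic:

In this case matmul performs an ordinary matrix multiplication on inputs of shape (2, 4) and (4, 1):

The rule for matrix multiplication is that the number of columns of the first matrix must equal the number of rows of the second.
The result tensor has shape (2, 1).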
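The dimension rule can be illustrated with a matching and a non-matching pair of shapes. In this sketch the tensor `bad` and the flag `mismatch_raised` are hypothetical names introduced only for the demonstration; when the inner dimensions disagree, PyTorch raises a `RuntimeError`.

```python
import torch

x = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8]])  # shape (2, 4)
y = torch.tensor([2, 3, 1, 0]).view(4, 1)        # shape (4, 1)

# Inner dimensions match (4 and 4), so the product has shape (2, 1)
print(torch.matmul(x, y).shape)

# Hypothetical mismatched operand: shape (3, 1), so 4 != 3
bad = torch.tensor([[2], [3], [1]])
try:
    torch.matmul(x, bad)
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print("shape mismatch:", err)
```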

Matrix multiplication formula:
$$\text{result}[i, k] = \sum_j x[i, j] \cdot y[j, k]$$

Step-by-step calculation:

For the first row [1, 2, 3, 4] and the column vector [[2], [3], [1], [0]]:
$$(1 \cdot 2) + (2 \cdot 3) + (3 \cdot 1) + (4 \cdot 0) = 2 + 6 + 3 + 0 = 11$$
For the second row [5, 6, 7, 8] and the column vector [[2], [3], [1], [0]]:
$$(5 \cdot 2) + (6 \cdot 3) + (7 \cdot 1) + (8 \cdot 0) = 10 + 18 + 7 + 0 = 35$$
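To make the index formula concrete, here is a deliberately naive sketch that evaluates result[i, k] with explicit Python loops and compares it with `x @ y`. It is meant only as an illustration of the formula, not as how PyTorch computes the product; the names `rows`, `inner`, `cols`, and `result` are illustrative.

```python
import torch

x = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8]])  # shape (2, 4)
y = torch.tensor([2, 3, 1, 0]).view(4, 1)        # shape (4, 1)

rows, inner = x.shape   # 2, 4
cols = y.shape[1]       # 1

# result[i, k] = sum over j of x[i, j] * y[j, k]
result = torch.zeros(rows, cols, dtype=x.dtype)
for i in range(rows):
    for k in range(cols):
        for j in range(inner):
            result[i, k] += x[i, j] * y[j, k]

print(result)
print(torch.equal(result, x @ y))
```

Because y has a single column, k only takes the value 0, which is why the output keeps a second dimension of size 1.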

Output:

```python
torch.matmul(x, y)
# tensor([[11],
#         [35]])
```

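Summary: differences between the two cases

When y is a 1D tensor:
- torch.matmul(x, y) returns a 1D tensor of shape (2,).
- This is equivalent to treating y as a column vector, multiplying it by the matrix x, and dropping the extra dimension from the result.
When y is a 2D tensor:
- torch.matmul(x, y) returns a 2D tensor of shape (2, 1).
- The multiplication follows the standard dimension rule for 2D matrices.

The two results contain the same values but differ in shape: the number of dimensions of the output follows from the number of dimensions of the input y.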
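In practice the two results can be converted into each other with `squeeze` and `unsqueeze`. The sketch below is one way to check this, using `torch.equal`, which returns True only when both shape and values agree; the names `out_1d` and `out_2d` are illustrative.

```python
import torch

x = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8]])
y = torch.tensor([2, 3, 1, 0])

out_1d = x @ y              # y is 1D  -> shape (2,)
out_2d = x @ y.view(4, 1)   # y is 2D  -> shape (2, 1)

# Same values, different shapes: not equal until the shapes agree
print(torch.equal(out_1d, out_2d))
print(torch.equal(out_1d, out_2d.squeeze(1)))
print(torch.equal(out_1d.unsqueeze(1), out_2d))
```

Which form to use depends on what the downstream code expects: a plain vector, or a single-column matrix.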
