Convolution Operation and Grouped Convolution

A filter is not a single kernel but a stack of kernels: one filter contains one or more kernels. How many kernels and how many filters there are depends on the input and output feature maps. For example, take a 32x32 RGB image. It has 3 channels, so the input feature map has shape (1, 3, 32, 32). The input format is (batch_size, in_channels, H_in, W_in) and the output format is (batch_size, out_channels, H_out, W_out). H_out and W_out follow from the usual formula (ignoring dilation):

H_out = floor((H_in + 2p - H_k) / s) + 1
W_out = floor((W_in + 2p - W_k) / s) + 1

Here p is the padding (default 0), s is the stride (default 1), and H_k, W_k are the kernel height and width.
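The height and width calculation can be sketched in a few lines of Python. This assumes the standard formula floor((size_in + 2p - k) / s) + 1 with no dilation. The function name conv_output_size and the 3x3 kernel are illustrative choices, not part of the original post.

```python
def conv_output_size(size_in, kernel, padding=0, stride=1):
    """Output length along one spatial axis (height or width), ignoring dilation."""
    return (size_in + 2 * padding - kernel) // stride + 1


# A 32x32 input with an assumed 3x3 kernel:
print(conv_output_size(32, kernel=3))                       # 30
print(conv_output_size(32, kernel=3, padding=1))            # 32
print(conv_output_size(32, kernel=3, padding=1, stride=2))  # 16
```

The same function is applied once for the height and once for the width.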

So we have the height and width of the output feature map, but what about the output channels? How do we get the output channels from the input channels? In other words, what does the convolution operation actually do?

First, I'll give the conclusion and explain it afterwards.

The weight has size (number of filters, kernels per filter, H_k, W_k), so the format of the weight tensor is (C_out, C_in, H_k, W_k).

That means we have C_out filters, and each filter has C_in kernels. If this is not clear, look through this link; it walks through the specific operations.
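To make "C_out filters, each with C_in kernels" concrete, here is a minimal sketch that rebuilds one output channel by hand. The sizes (8 filters, 3x3 kernels) are assumptions for illustration. Each kernel of filter 0 is applied to its own input channel and the results are summed, which should match what torch.nn.functional.conv2d computes for that channel (bias omitted).

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

x = torch.randn(1, 3, 32, 32)     # (batch_size, C_in, H_in, W_in)
weight = torch.randn(8, 3, 3, 3)  # (C_out, C_in, H_k, W_k): 8 filters, 3 kernels each

y = F.conv2d(x, weight)           # no bias, padding 0, stride 1
print(y.shape)                    # torch.Size([1, 8, 30, 30])

# Output channel 0: apply kernel c of filter 0 to input channel c, then sum over c.
manual = sum(
    F.conv2d(x[:, c:c + 1], weight[0:1, c:c + 1])
    for c in range(x.shape[1])
)
matches = torch.allclose(y[:, 0:1], manual, atol=1e-4)
print(matches)
```

So one filter always produces exactly one output channel, no matter how many kernels it contains.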

As we go deeper into a convolutional network, the number of channels grows very quickly, and the complexity grows with it. The spatial dimensions (height and width) also have some effect on the complexity, but in deeper layers they are not really the cause for concern. In bigger neural networks, the filters therefore dominate the cost. Grouped convolution was proposed to address this: the input channels are split into groups, and each filter only holds kernels for the channels in its own group. See this link for more details.
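The effect of grouping on the weight tensor can be checked with the groups argument of nn.Conv2d. This is a small sketch with assumed sizes (4 input channels, 8 output channels, 3x3 kernels, 2 groups). With groups=2, each filter holds C_in // groups kernels, so the number of weights is halved.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 4, 6, 6)  # (batch_size, C_in, H, W)

# Same layer twice, once as a standard convolution and once with 2 groups.
standard = nn.Conv2d(4, 8, kernel_size=3, padding=1, groups=1, bias=False)
grouped = nn.Conv2d(4, 8, kernel_size=3, padding=1, groups=2, bias=False)

print(standard.weight.shape)    # torch.Size([8, 4, 3, 3])
print(grouped.weight.shape)     # torch.Size([8, 2, 3, 3])
print(standard.weight.numel())  # 288
print(grouped.weight.numel())   # 144
print(grouped(x).shape)         # torch.Size([1, 8, 6, 6])
```

Both in_channels and out_channels must be divisible by groups, and the output shape is the same as for the standard layer.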

You can run this code to verify the shapes.

```python
import torch
import torch.nn as nn

# Assume the input feature map has shape (batch_size, in_channels, H, W)
batch_size = 1
in_channels = 4
out_channels = 2
H = 6
W = 6

# Define a 1x1 convolution layer with in_channels input channels and out_channels output channels
conv = nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=1, padding=0)

# Apply the 1x1 convolution to the input feature map
x = torch.randn(batch_size, in_channels, H, W)
y = conv(x)

# The input feature map has shape (batch_size, in_channels, H, W)
print(x.shape)  # torch.Size([1, 4, 6, 6])
# The output feature map has shape (batch_size, out_channels, H, W)
print(y.shape)  # torch.Size([1, 2, 6, 6])
# The weight has shape (out_channels, in_channels // groups, *kernel_size)
weight_size = conv.weight.size()
print('Weight shape:', weight_size)  # torch.Size([2, 4, 1, 1])
```