Convolution operation and Grouped Convolution

filter is not the kernel,but the kernels.that's mean a filter include one or two or more kernels.that's depend the input feature map and the output feature maps. for example, if we have an image, the shape of image is (32,32), has 3 channels,that's RGB.so the input feature maps is (1,3,32,32).the format of input feature maps is (batch_size,in_channels,H_in,W_in),the output feature maps is(batch_size,out_channels,H_out,W_out),there is a formulation for out_H,out_W.

p is padding,default is 0. s is stride,default is 1.

so, we get the the Height and Width of output feature map,but how about the output channels?how do we get the output channels from the input channels.Or,In other words,what's the convolution operation?

first,i'll give the conclusion and explain it later.

so the weight size is (filters, kernels of filter,H_k,W_k),the format of weight vector is (C_out,C_in,H_k,W_k)

that's mean we have C_out filters, and each filter has C_in kernels.if you don't understand, look through this link,it will tell you the specific operations.

as we go deeper into the convolution this dimension of channels increases very rapidly thus increases complexity. The spatial dimensions(means height and weight) have some degree of effect on the complexity but in deeper layers, they are not really the cause of concern. Thus in bigger neural networks, the filter groups will dominate.so,the grouped convolution was proposed,you can access to this link for more details.

you can try this code for validation.

python 复制代码
import torch.nn as nn
import torch

# 假设输入特征图的大小为 (batch_size, in_channels, H, W)
batch_size = 1
in_channels = 4
out_channels = 2
H = 6
W = 6

# 定义1x1卷积层,输入通道数为in_channels,输出通道数为out_channels
conv = nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=1, padding=0)

# 对输入特征图进行1x1卷积操作
x = torch.randn(batch_size, in_channels, H, W)
y = conv(x)

# 输入特征图的大小为 (batch_size, in_channels, H, W)
print(x.shape)  # torch.Size([1, 4, 6, 6])
# 输出特征图的大小为 (batch_size, out_channels, H, W)
print(y.size())   # torch.Size([1, 2, 6, 6])
# 获取卷积核的尺寸 (out_channels, in_channels // groups, *kernel_size)
weight_size = conv.weight.size()
print('卷积核的尺寸为:', weight_size)  # torch.Size([2, 4, 1, 1])
相关推荐
Hcoco_me26 分钟前
RNN(循环神经网络)
人工智能·rnn·深度学习
二哈喇子!3 小时前
PyTorch生态与昇腾平台适配:环境搭建与详细安装指南
人工智能·pytorch·python
柠柠酱3 小时前
【深度学习Day5】决战 CIFAR-10:手把手教你搭建第一个“正经”的卷积神经网络 (附调参心法)
深度学习
gravity_w3 小时前
Hugging Face使用指南
人工智能·经验分享·笔记·深度学习·语言模型·nlp
Yeats_Liao5 小时前
MindSpore开发之路(二十六):系列总结与学习路径展望
人工智能·深度学习·学习·机器学习
UnderTurrets5 小时前
A_Survey_on_3D_object_Affordance
pytorch·深度学习·计算机视觉·3d
koo3646 小时前
pytorch深度学习笔记13
pytorch·笔记·深度学习
高洁016 小时前
CLIP 的双编码器架构是如何优化图文关联的?(3)
深度学习·算法·机器学习·transformer·知识图谱
山土成旧客7 小时前
【Python学习打卡-Day40】从“能跑就行”到“工程标准”:PyTorch训练与测试的规范化写法
pytorch·python·学习
lambo mercy7 小时前
无监督学习
人工智能·深度学习