Convolution operation and Grouped Convolution

filter is not the kernel,but the kernels.that's mean a filter include one or two or more kernels.that's depend the input feature map and the output feature maps. for example, if we have an image, the shape of image is (32,32), has 3 channels,that's RGB.so the input feature maps is (1,3,32,32).the format of input feature maps is (batch_size,in_channels,H_in,W_in),the output feature maps is(batch_size,out_channels,H_out,W_out),there is a formulation for out_H,out_W.

p is padding,default is 0. s is stride,default is 1.

so, we get the the Height and Width of output feature map,but how about the output channels?how do we get the output channels from the input channels.Or,In other words,what's the convolution operation?

first,i'll give the conclusion and explain it later.

so the weight size is (filters, kernels of filter,H_k,W_k),the format of weight vector is (C_out,C_in,H_k,W_k)

that's mean we have C_out filters, and each filter has C_in kernels.if you don't understand, look through this link,it will tell you the specific operations.

as we go deeper into the convolution this dimension of channels increases very rapidly thus increases complexity. The spatial dimensions(means height and weight) have some degree of effect on the complexity but in deeper layers, they are not really the cause of concern. Thus in bigger neural networks, the filter groups will dominate.so,the grouped convolution was proposed,you can access to this link for more details.

you can try this code for validation.

python 复制代码
import torch.nn as nn
import torch

# 假设输入特征图的大小为 (batch_size, in_channels, H, W)
batch_size = 1
in_channels = 4
out_channels = 2
H = 6
W = 6

# 定义1x1卷积层,输入通道数为in_channels,输出通道数为out_channels
conv = nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=1, padding=0)

# 对输入特征图进行1x1卷积操作
x = torch.randn(batch_size, in_channels, H, W)
y = conv(x)

# 输入特征图的大小为 (batch_size, in_channels, H, W)
print(x.shape)  # torch.Size([1, 4, 6, 6])
# 输出特征图的大小为 (batch_size, out_channels, H, W)
print(y.size())   # torch.Size([1, 2, 6, 6])
# 获取卷积核的尺寸 (out_channels, in_channels // groups, *kernel_size)
weight_size = conv.weight.size()
print('卷积核的尺寸为:', weight_size)  # torch.Size([2, 4, 1, 1])
相关推荐
青瓷程序设计30 分钟前
植物识别系统【最新版】Python+TensorFlow+Vue3+Django+人工智能+深度学习+卷积神经网络算法
人工智能·python·深度学习
AI即插即用1 小时前
即插即用系列 | CVPR 2025 WPFormer:用于表面缺陷检测的查询式Transformer
人工智能·深度学习·yolo·目标检测·cnn·视觉检测·transformer
T0uken2 小时前
【Python】UV:境内的深度学习环境搭建
人工智能·深度学习·uv
AI即插即用2 小时前
即插即用系列 | 2025 MambaNeXt-YOLO 炸裂登场!YOLO 激吻 Mamba,打造实时检测新霸主
人工智能·pytorch·深度学习·yolo·目标检测·计算机视觉·视觉检测
忘却的旋律dw4 小时前
使用LLM模型的tokenizer报错AttributeError: ‘dict‘ object has no attribute ‘model_type‘
人工智能·pytorch·python
studytosky5 小时前
深度学习理论与实战:MNIST 手写数字分类实战
人工智能·pytorch·python·深度学习·机器学习·分类·matplotlib
哥布林学者6 小时前
吴恩达深度学习课程三: 结构化机器学习项目 第一周:机器学习策略(二)数据集设置
深度学习·ai
【建模先锋】7 小时前
精品数据分享 | 锂电池数据集(四)PINN+锂离子电池退化稳定性建模和预测
深度学习·预测模型·pinn·锂电池剩余寿命预测·锂电池数据集·剩余寿命
九年义务漏网鲨鱼7 小时前
【大模型学习】现代大模型架构(二):旋转位置编码和SwiGLU
深度学习·学习·大模型·智能体
CoovallyAIHub7 小时前
破局红外小目标检测:异常感知Anomaly-Aware YOLO以“俭”驭“繁”
深度学习·算法·计算机视觉