每日Attention学习15——Cross-Model Grafting Module

模块出处

CVPR 22\] [\[link\]](https://openaccess.thecvf.com/content/CVPR2022/html/Xie_Pyramid_Grafting_Network_for_One-Stage_High_Resolution_Saliency_Detection_CVPR_2022_paper.html) [\[code\]](https://github.com/iCVTEAM/PGNet) Pyramid Grafting Network for One-Stage High Resolution Saliency Detection *** ** * ** *** ##### 模块名称 Cross-Model Grafting Module (CMGM) *** ** * ** *** ##### 模块作用 Transformer与CNN之间的特征融合 *** ** * ** *** ##### 模块结构 ![在这里插入图片描述](https://i-blog.csdnimg.cn/direct/e889d03727a149fd952730a5de5ae405.jpeg) *** ** * ** *** ##### 模块思想 Transformer在全局特征上更优,CNN在局部特征上更优,对这两者进行进行融合的最简单做法是直接相加或相乘。但是,相加或相乘本质上属于"局部"操作,如果某片区域两个特征的不确定性都较高,则会带来许多噪声。为此,本文提出了CMGM模块,通过交叉注意力的形式引入更为广泛的信息来增强融合效果。 *** ** * ** *** ##### 模块代码 ```python import torch.nn.functional as F import torch.nn as nn import torch class CMGM(nn.Module): def __init__(self, dim, num_heads=8, qkv_bias=True, qk_scale=None): super().__init__() self.num_heads = num_heads head_dim = dim // num_heads self.scale = qk_scale or head_dim ** -0.5 self.k = nn.Linear(dim, dim , bias=qkv_bias) self.qv = nn.Linear(dim, dim * 2, bias=qkv_bias) self.proj = nn.Linear(dim, dim) self.act = nn.ReLU(inplace=True) self.conv = nn.Conv2d(8,8,kernel_size=3, stride=1, padding=1) self.lnx = nn.LayerNorm(64) self.lny = nn.LayerNorm(64) self.bn = nn.BatchNorm2d(8) self.conv2 = nn.Sequential( nn.Conv2d(64,64,kernel_size=3, stride=1, padding=1), nn.BatchNorm2d(64), nn.ReLU(inplace=True), nn.Conv2d(64,64,kernel_size=3, stride=1, padding=1), nn.BatchNorm2d(64), nn.ReLU(inplace=True) ) def forward(self, x, y): batch_size = x.shape[0] chanel = x.shape[1] sc = x x = x.view(batch_size, chanel, -1).permute(0, 2, 1) sc1 = x x = self.lnx(x) y = y.view(batch_size, chanel, -1).permute(0, 2, 1) y = self.lny(y) B, N, C = x.shape y_k = self.k(y).reshape(B, N, 1, self.num_heads, C // self.num_heads).permute(2, 0, 3, 1, 4) x_qv= self.qv(x).reshape(B,N,2,self.num_heads, C // self.num_heads).permute(2, 0, 3, 1, 4) x_q, x_v = x_qv[0], x_qv[1] y_k = y_k[0] attn = (x_q @ y_k.transpose(-2, -1)) * self.scale attn = attn.softmax(dim=-1) x = (attn @ x_v).transpose(1, 2).reshape(B, N, C) x = self.proj(x) x = (x+sc1) x = x.permute(0,2,1) x = x.view(batch_size,chanel,*sc.size()[2:]) x = self.conv2(x)+x return x, self.act(self.bn(self.conv(attn+attn.transpose(-1,-2)))) if __name__ == '__main__': x = torch.randn([1, 64, 11, 11]) y = torch.randn([1, 64, 11, 11]) cmgm = CMGM(dim=64) out1, out2 = cmgm(x, y) print(out1.shape) # out feature 1, 64, 11, 11 print(out2.shape) # cross attention matrix 1, 8, 121, 121 ``` *** ** * ** ***

相关推荐
xx_xxxxx_3 小时前
多模态动态融合模型Predictive Dynamic Fusion论文阅读与代码分析运行1-信度概念与基础参数指标
论文阅读
数说星榆18112 小时前
好用的PC电脑流程图软件无需下载在线绘制流程图模板大全
大数据·论文阅读·电脑·流程图·论文笔记
檐下翻书17314 小时前
PC端免费在线流程图工具新手快速制作专业流程图教程
论文阅读·架构·毕业设计·流程图·论文笔记
有Li16 小时前
LoViT:用于手术阶段识别的长视频Transformer/文献速递-基于人工智能的医学影像技术
论文阅读·人工智能·深度学习·文献·医学生
程途拾光15817 小时前
中文用户常用在线流程图工具PC端高效制作各类业务流程图方法
大数据·论文阅读·人工智能·信息可视化·流程图·课程设计
DuHz1 天前
用于汽车应用的数字码调制(DCM)雷达白皮书精读
论文阅读·算法·自动驾驶·汽车·信息与通信·信号处理
@––––––2 天前
论文阅读笔记:The Bitter Lesson (苦涩的教训)
论文阅读·人工智能·笔记
张较瘦_2 天前
[论文阅读] AI + 软件工程 | 突破AAA游戏测试瓶颈!选择性插桩让代码覆盖“轻装上阵”
论文阅读·游戏·软件工程
STLearner2 天前
MM 2025 | 时间序列(Time Series)论文总结【预测,分类,异常检测,医疗时序】
论文阅读·人工智能·深度学习·神经网络·算法·机器学习·数据挖掘
心心喵2 天前
[论文笔记] Agent is all you need | AI智能体前沿进展总结
论文阅读