Viewpoint Integration and Registration with Vision Language Foundation Model for Image Change Understanding

Only the fused adapter image encoder, the viewpoint registration flow, the semantic emphasizing module, and the fully connected layer are trained; all other parameters are frozen.
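As a rough illustration of this training setup, the sketch below splits parameter names into trained and frozen groups by module prefix. The prefixes (`fused_adapter_image_encoder.`, `viewpoint_registration_flow.`, `semantic_emphasizing.`, `fc.`) and the helper names are hypothetical; the real names depend on the implementation, which these notes do not show.

```python
# Hypothetical module-name prefixes for the four trained components.
# These are assumptions for illustration, not names from the paper's code.
TRAINABLE_PREFIXES = (
    "fused_adapter_image_encoder.",
    "viewpoint_registration_flow.",
    "semantic_emphasizing.",
    "fc.",
)


def is_trainable(param_name, prefixes=TRAINABLE_PREFIXES):
    """Return True if the parameter belongs to one of the trained modules."""
    return param_name.startswith(tuple(prefixes))


def split_parameters(param_names, prefixes=TRAINABLE_PREFIXES):
    """Split parameter names into (trainable, frozen), preserving order."""
    trainable = [n for n in param_names if is_trainable(n, prefixes)]
    frozen = [n for n in param_names if not is_trainable(n, prefixes)]
    return trainable, frozen
```

In a PyTorch model, the same predicate could presumably be applied by looping over `model.named_parameters()` and setting `requires_grad` to `is_trainable(name)` for each parameter.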

Fused Adapter Image Encoder

adapter:



fused adapter:

Viewpoint Registration Flow and Semantic Emphasizing

Viewpoint Registration Flow:


conv1 is a 1x1 convolution and conv is a 3x3 convolution; bilinear interpolation is used for resampling.
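The original figure and formula are missing here, so the following is only a minimal sketch of bilinear interpolation, assuming it is used to resample a feature map at positions shifted by a predicted flow field. The functions `bilinear_sample` and `warp`, the `(dx, dy)` flow layout, and the border clamping are all assumptions for illustration, not the paper's implementation.

```python
import math


def bilinear_sample(feature, x, y):
    """Sample a 2D grid (list of rows) at fractional (x, y).

    Coordinates outside the grid are clamped to the border (an assumption).
    """
    h, w = len(feature), len(feature[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    wx, wy = x - x0, y - y0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = (1 - wx) * feature[y0][x0] + wx * feature[y0][x1]
    bottom = (1 - wx) * feature[y1][x0] + wx * feature[y1][x1]
    return (1 - wy) * top + wy * bottom


def warp(feature, flow):
    """Resample `feature` at (x + dx, y + dy), where flow[y][x] = (dx, dy)."""
    h, w = len(feature), len(feature[0])
    return [
        [
            bilinear_sample(feature, x + flow[y][x][0], y + flow[y][x][1])
            for x in range(w)
        ]
        for y in range(h)
    ]
```

With a zero flow field, `warp` returns the input unchanged; a fractional offset blends the four nearest grid values, which is what lets a registration step shift features by sub-pixel amounts.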

Semantic Emphasizing:

Results:
