CVPR 2026 | Low-Level Vision Paper Collection (if you find it helpful, likes and bookmarks are welcome)
- 1. Super-Resolution
- [AlignVAR: Towards Globally Consistent Visual Autoregression for Image Super-Resolution](#AlignVAR: Towards Globally Consistent Visual Autoregression for Image Super-Resolution)
- [Bridging Fidelity-Reality with Controllable One-Step Diffusion for Image Super-Resolution](#Bridging Fidelity-Reality with Controllable One-Step Diffusion for Image Super-Resolution)
- [CASR: A Robust Cyclic Framework for Arbitrary Large-Scale Super-Resolution with Distribution Alignment and Self-Similarity Awareness](#CASR: A Robust Cyclic Framework for Arbitrary Large-Scale Super-Resolution with Distribution Alignment and Self-Similarity Awareness)
- [Compressed-Domain-Aware Online Video Super-Resolution](#Compressed-Domain-Aware Online Video Super-Resolution)
- [Disentangled Textual Priors for Diffusion-based Image Super-Resolution](#Disentangled Textual Priors for Diffusion-based Image Super-Resolution)
- [DNF-SR: Dual-Input and Negative-Aware Feature Fine-Tuning for Real-World Image Super-Resolution](#DNF-SR: Dual-Input and Negative-Aware Feature Fine-Tuning for Real-World Image Super-Resolution)
- [DreamSR: Towards Ultra-High-Resolution Image Super-Resolution via a Receptive-Field Enhanced Diffusion Transformer](#DreamSR: Towards Ultra-High-Resolution Image Super-Resolution via a Receptive-Field Enhanced Diffusion Transformer)
- [DTG-Restore: Training-Free Diffusion Refinement for Generative Video Super-Resolution](#DTG-Restore: Training-Free Diffusion Refinement for Generative Video Super-Resolution)
- [Dual Graph Regularized Deep Unfolding Network for Guided Depth Map Super-resolution](#Dual Graph Regularized Deep Unfolding Network for Guided Depth Map Super-resolution)
- [DUO-VSR: Dual-Stream Distillation for One-Step Video Super-Resolution](#DUO-VSR: Dual-Stream Distillation for One-Step Video Super-Resolution)
- [DVAR: Dynamic Visual Autoregressive Modeling for Image Super-Resolution](#DVAR: Dynamic Visual Autoregressive Modeling for Image Super-Resolution)
- [Edge-aware Multimodal Residual Diffusion Model for Hyperspectral Image Super-resolution](#Edge-aware Multimodal Residual Diffusion Model for Hyperspectral Image Super-resolution)
- [Edge-Focused Super-Resolution for Omnidirectional Images with Spherical Geometric Augmentation](#Edge-Focused Super-Resolution for Omnidirectional Images with Spherical Geometric Augmentation)
- [Enhancing Unregistered Hyperspectral Image Super-Resolution via Unmixing-based Abundance Fusion Learning](#Enhancing Unregistered Hyperspectral Image Super-Resolution via Unmixing-based Abundance Fusion Learning)
- [FiDeSR: High-Fidelity and Detail-Preserving One-Step Diffusion Super-Resolution](#FiDeSR: High-Fidelity and Detail-Preserving One-Step Diffusion Super-Resolution)
- [FinPercep-RM: A Fine-grained Reward Model and Co-evolutionary Curriculum for RL-based Real-world Super-Resolution](#FinPercep-RM: A Fine-grained Reward Model and Co-evolutionary Curriculum for RL-based Real-world Super-Resolution)
- [FRAMER: Frequency-Aligned Self-Distillation with Adaptive Modulation Leveraging Diffusion Priors for Real-World Image Super-Resolution](#FRAMER: Frequency-Aligned Self-Distillation with Adaptive Modulation Leveraging Diffusion Priors for Real-World Image Super-Resolution)
- [GDPO-SR: Group Direct Preference Optimization for One-Step Generative Image Super-Resolution](#GDPO-SR: Group Direct Preference Optimization for One-Step Generative Image Super-Resolution)
- [Gradient Knows Best: Mixed-Precision Quantization via Gradient-Guided Bit Allocation for Super-Resolution](#Gradient Knows Best: Mixed-Precision Quantization via Gradient-Guided Bit Allocation for Super-Resolution)
- [HDW-SR: High-Frequency Guided Diffusion Model based on Wavelet Decomposition for Image Super-Resolution](#HDW-SR: High-Frequency Guided Diffusion Model based on Wavelet Decomposition for Image Super-Resolution)
- [IAFMNet: Information-Aware Feature Modulation for Efficient Super-Resolution](#IAFMNet: Information-Aware Feature Modulation for Efficient Super-Resolution)
- [IFCSR: Inference-Free Fidelity-Realism Control for One-Step Diffusion-based Real-World Image Super-Resolution](#IFCSR: Inference-Free Fidelity-Realism Control for One-Step Diffusion-based Real-World Image Super-Resolution)
- [Linear Recurrent Unit with Semantic Modulation for Image Super-Resolution](#Linear Recurrent Unit with Semantic Modulation for Image Super-Resolution)
- [LSGQuant: Layer-Sensitivity Guided Quantization for One-Step Diffusion Real-World Video Super-Resolution](#LSGQuant: Layer-Sensitivity Guided Quantization for One-Step Diffusion Real-World Video Super-Resolution)
- [One-Step Diffusion Transformer for Controllable Real-World Image Super-Resolution](#One-Step Diffusion Transformer for Controllable Real-World Image Super-Resolution)
- [Physics-Consistent Diffusion for Efficient Fluid Super-Resolution via Multiscale Residual Correction](#Physics-Consistent Diffusion for Efficient Fluid Super-Resolution via Multiscale Residual Correction)
- [Plug-and-Play Dynamic In-context Learning with Stochastic Regularization for Screen Content Image Super-Resolution](#Plug-and-Play Dynamic In-context Learning with Stochastic Regularization for Screen Content Image Super-Resolution)
- [PS-SR: Pseudo-Single-Step Video Super-Resolution via Speculative Diffusion](#PS-SR: Pseudo-Single-Step Video Super-Resolution via Speculative Diffusion)
- [QDM: Quadtree-Based Region-Adaptive Sparse Diffusion Models for Efficient Image Super-Resolution](#QDM: Quadtree-Based Region-Adaptive Sparse Diffusion Models for Efficient Image Super-Resolution)
- [RAW-Domain Degradation Models for Realistic Smartphone Super-Resolution](#RAW-Domain Degradation Models for Realistic Smartphone Super-Resolution)
- [Remote Sensing Image Super-Resolution for Imbalanced Textures: A Texture-Aware Diffusion Framework](#Remote Sensing Image Super-Resolution for Imbalanced Textures: A Texture-Aware Diffusion Framework)
- [Restore Text First, Enhance Image Later: Two-Stage Scene Text Image Super-Resolution with Glyph Structure Guidance](#Restore Text First, Enhance Image Later: Two-Stage Scene Text Image Super-Resolution with Glyph Structure Guidance)
- [Rethinking Diffusion Model-Based Video Super-Resolution: Leveraging Dense Guidance from Aligned Features](#Rethinking Diffusion Model-Based Video Super-Resolution: Leveraging Dense Guidance from Aligned Features)
- [SAT: Selective Aggregation Transformer for Image Super-Resolution](#SAT: Selective Aggregation Transformer for Image Super-Resolution)
- [Spectral Super-Resolution via Adversarial Unfolding and Data-Driven Spectrum Regularization](#Spectral Super-Resolution via Adversarial Unfolding and Data-Driven Spectrum Regularization)
- [SR3R: Rethinking Super-Resolution 3D Reconstruction With Feed-Forward Gaussian Splatting](#SR3R: Rethinking Super-Resolution 3D Reconstruction With Feed-Forward Gaussian Splatting)
- [STCDiT: Spatio-Temporally Consistent Diffusion Transformer for High-Quality Video Super-Resolution](#STCDiT: Spatio-Temporally Consistent Diffusion Transformer for High-Quality Video Super-Resolution)
- [TextOVSR: Text-Guided Real-World Opera Video Super-Resolution](#TextOVSR: Text-Guided Real-World Opera Video Super-Resolution)
- [Thermal Diffusion Matters: Infrared Spatial-Temporal Video Super-Resolution through Heat Conduction Priors](#Thermal Diffusion Matters: Infrared Spatial-Temporal Video Super-Resolution through Heat Conduction Priors)
- [Time-Aware One Step Diffusion Network for Real-World Image Super-Resolution](#Time-Aware One Step Diffusion Network for Real-World Image Super-Resolution)
- [Time Without Time: Pseudo-Temporal Representation for Space-Time Super-Resolution](#Time Without Time: Pseudo-Temporal Representation for Space-Time Super-Resolution)
- [TinySR: Shallow Diffusion Transformers for Real-World Image Super-Resolution](#TinySR: Shallow Diffusion Transformers for Real-World Image Super-Resolution)
- [Towards Real-Time Diffusion-Based Streaming Video Super-Resolution](#Towards Real-Time Diffusion-Based Streaming Video Super-Resolution)
- [Toward Real-world Infrared Image Super-Resolution: A Unified Autoregressive Framework and Benchmark Dataset](#Toward Real-world Infrared Image Super-Resolution: A Unified Autoregressive Framework and Benchmark Dataset)
- [TPTransformer: Tensor-Tensor Product Transformer for Hyperspectral Image Super-Resolution](#TPTransformer: Tensor-Tensor Product Transformer for Hyperspectral Image Super-Resolution)
- [TUDSR: Twice Upsampling-Diffusion for Higher Super-Resolution](#TUDSR: Twice Upsampling-Diffusion for Higher Super-Resolution)
- [UCAN: Unified Convolutional Attention Network for Expansive Receptive Fields in Lightweight Super-Resolution](#UCAN: Unified Convolutional Attention Network for Expansive Receptive Fields in Lightweight Super-Resolution)
- [VSRELL: A Simple Baseline for Video Super-Resolution and Enhancement in Low-Light environment](#VSRELL: A Simple Baseline for Video Super-Resolution and Enhancement in Low-Light environment)
- [VoDaSuRe: A Large-Scale Dataset Revealing Domain Shift in Volumetric Super-Resolution](#VoDaSuRe: A Large-Scale Dataset Revealing Domain Shift in Volumetric Super-Resolution)
- [VOSR: A Vision-Only Generative Model for Image Super-Resolution](#VOSR: A Vision-Only Generative Model for Image Super-Resolution)
- [2. Image Deraining](#2.图像去雨(Image Deraining))
- [UniRain: Unified Image Deraining with RAG-based Dataset Distillation and Multi-objective Reweighted Optimization](#UniRain: Unified Image Deraining with RAG-based Dataset Distillation and Multi-objective Reweighted Optimization)
- [Unpaired Image Deraining Using Reward-Guided Self-Reinforcement Strategy](#Unpaired Image Deraining Using Reward-Guided Self-Reinforcement Strategy)
- [3. Image Dehazing](#3.图像去雾(Image Dehazing))
- [Bilevel Layer-Positioning LoRA for Real Image Dehazing](#Bilevel Layer-Positioning LoRA for Real Image Dehazing)
- [Disentanglement-wise Image Dehazing through Cross-Domain Manifold Consensus](#Disentanglement-wise Image Dehazing through Cross-Domain Manifold Consensus)
- [From Events to Clarity: The Event-Guided Diffusion Framework for Dehazing](#From Events to Clarity: The Event-Guided Diffusion Framework for Dehazing)
- [HazeMatching: Dehazing Light Microscopy Images with Guided Conditional Flow Matching](#HazeMatching: Dehazing Light Microscopy Images with Guided Conditional Flow Matching)
- [Inf-Dehaze: Beyond GPU Memory Constraints for Ultra-High-Resolution Image Dehazing](#Inf-Dehaze: Beyond GPU Memory Constraints for Ultra-High-Resolution Image Dehazing)
- 4. Deblurring
- [BluRef: Unsupervised Image Deblurring with Dense-Matching References](#BluRef: Unsupervised Image Deblurring with Dense-Matching References)
- [Event-based Motion Deblurring with Unpaired Data](#Event-based Motion Deblurring with Unpaired Data)
- [Gyro-based Deep Video Deblurring](#Gyro-based Deep Video Deblurring)
- [Motion-Aware Animatable Gaussian Avatars Deblurring](#Motion-Aware Animatable Gaussian Avatars Deblurring)
- [MSCD-GS: Motion-Separated Cooperative Deblurring Dynamic Reconstruction via Gaussian Splatting](#MSCD-GS: Motion-Separated Cooperative Deblurring Dynamic Reconstruction via Gaussian Splatting)
- [MVSSM: Motion-aware Visual State Space Model for Efficient Video Deblurring](#MVSSM: Motion-aware Visual State Space Model for Efficient Video Deblurring)
- [OMoBlur: An Object Motion Blur Dataset and Benchmark for Real-World Local Motion Deblurring](#OMoBlur: An Object Motion Blur Dataset and Benchmark for Real-World Local Motion Deblurring)
- [SelfHVD: Self-Supervised Handheld Video Deblurring](#SelfHVD: Self-Supervised Handheld Video Deblurring)
- [Spatio-Temporal Difference Guided Motion Deblurring with the Complementary Vision Sensor](#Spatio-Temporal Difference Guided Motion Deblurring with the Complementary Vision Sensor)
- [Unblur-SLAM: Dense Neural SLAM for Blurry Inputs](#Unblur-SLAM: Dense Neural SLAM for Blurry Inputs)
- 5. Denoising
- [2-Shots in the Dark: Low-Light Denoising with Minimal Data Acquisition](#2-Shots in the Dark: Low-Light Denoising with Minimal Data Acquisition)
- [Back to Basics: Let Denoising Generative Models Denoise](#Back to Basics: Let Denoising Generative Models Denoise)
- [Convexity-Aware Noise Calibration: A Self-Supervised Framework for Noise-Level-Unknown Image Denoising](#Convexity-Aware Noise Calibration: A Self-Supervised Framework for Noise-Level-Unknown Image Denoising)
- [Diffusion-Based sRGB Real Noise Generation via Prompt-Driven Noise Representation Learning](#Diffusion-Based sRGB Real Noise Generation via Prompt-Driven Noise Representation Learning)
- [Efficient Real-Time Raw-to-Raw Denoising for Extreme Low-Light Ultra HD Video on Mobile Devices](#Efficient Real-Time Raw-to-Raw Denoising for Extreme Low-Light Ultra HD Video on Mobile Devices)
- [Learning to Translate Noise for Robust Image Denoising](#Learning to Translate Noise for Robust Image Denoising)
- [LF-BVN: Blind-View Network for Self-Supervised Light Field Denoising](#LF-BVN: Blind-View Network for Self-Supervised Light Field Denoising)
- [Next-Scale Prediction: A Self-Supervised Approach for Real-World Image Denoising](#Next-Scale Prediction: A Self-Supervised Approach for Real-World Image Denoising)
- [RelativeFlow: Taming Medical Image Denoising Learning with Noisy Reference](#RelativeFlow: Taming Medical Image Denoising Learning with Noisy Reference)
- [Routing on Demand: DSNet for Efficient Progressive Point Cloud Denoising](#Routing on Demand: DSNet for Efficient Progressive Point Cloud Denoising)
- [Statistical Characteristic-Guided Denoising for Rapid High-Resolution Transmission Electron Microscopy Imaging](#Statistical Characteristic-Guided Denoising for Rapid High-Resolution Transmission Electron Microscopy Imaging)
- [TM-BSN: Triangular-Masked Blind-Spot Network for Real-World Self-Supervised Denoising](#TM-BSN: Triangular-Masked Blind-Spot Network for Real-World Self-Supervised Denoising)
- [Zero-Shot Image Denoising via Hybrid Prior-Guided Pseudo Sample Generation](#Zero-Shot Image Denoising via Hybrid Prior-Guided Pseudo Sample Generation)
- [6. Image Restoration](#6.图像恢复(Image Restoration))
- [Benchmarking Endoscopic Surgical Image Restoration and Beyond](#Benchmarking Endoscopic Surgical Image Restoration and Beyond)
- [Beyond Ground-Truth: Leveraging Image Quality Priors for Real-World Image Restoration](#Beyond Ground-Truth: Leveraging Image Quality Priors for Real-World Image Restoration)
- [Beyond the Ground Truth: Enhanced Supervision for Image Restoration](#Beyond the Ground Truth: Enhanced Supervision for Image Restoration)
- [Beyond the Static-World: Lifelong Learning for All-in-One Medical Image Restoration](#Beyond the Static-World: Lifelong Learning for All-in-One Medical Image Restoration)
- [Blockwise Divide-and-Aggregate for Image Restoration using Diffusion Priors](#Blockwise Divide-and-Aggregate for Image Restoration using Diffusion Priors)
- [CARD: Correlation Aware Restoration with Diffusion](#CARD: Correlation Aware Restoration with Diffusion)
- [DEBIR: Dynamic Exposure Burst Image Restoration](#DEBIR: Dynamic Exposure Burst Image Restoration)
- [Degradation-Consistent Test-Time Adaptation for All-in-One Image Restoration](#Degradation-Consistent Test-Time Adaptation for All-in-One Image Restoration)
- [DRIFT: Deep Restoration, ISP Fusion, and Tone-mapping](#DRIFT: Deep Restoration, ISP Fusion, and Tone-mapping)
- [EpiAgent: An Agent-Centric System for Ancient Inscription Restoration](#EpiAgent: An Agent-Centric System for Ancient Inscription Restoration)
- [Evolutionary Multi-Agent Collaboration for Real-World Video Face Restoration](#Evolutionary Multi-Agent Collaboration for Real-World Video Face Restoration)
- [Face2Scene: Using Facial Degradation as an Oracle for Diffusion-Based Scene Restoration](#Face2Scene: Using Facial Degradation as an Oracle for Diffusion-Based Scene Restoration)
- [FAPE-IR: Frequency-Aware Planning and Execution Framework for All-in-One Image Restoration](#FAPE-IR: Frequency-Aware Planning and Execution Framework for All-in-One Image Restoration)
- [FlowSteer: Conditioning Flow Field for Consistent Image Restoration](#FlowSteer: Conditioning Flow Field for Consistent Image Restoration)
- [FoundIR-v2: Optimizing Pre-Training Data Mixtures for Image Restoration Foundation Model](#FoundIR-v2: Optimizing Pre-Training Data Mixtures for Image Restoration Foundation Model)
- [Gaussian Splatting-based Low-Rank Tensor Representation for Multi-Dimensional Image Recovery](#Gaussian Splatting-based Low-Rank Tensor Representation for Multi-Dimensional Image Recovery)
- [gQIR: Generative Quanta Image Reconstruction](#gQIR: Generative Quanta Image Reconstruction)
- [GSNR: Graph Smooth Null-Space Representation for Inverse Problems](#GSNR: Graph Smooth Null-Space Representation for Inverse Problems)
- [HalluGen: Synthesizing Realistic and Controllable Hallucinations for Evaluating Image Restoration](#HalluGen: Synthesizing Realistic and Controllable Hallucinations for Evaluating Image Restoration)
- [How far have we gone in Generative Image Restoration? A study on its capability, limitations and evaluation practices](#How far have we gone in Generative Image Restoration? A study on its capability, limitations and evaluation practices)
- [Hybrid Agents for Image Restoration](#Hybrid Agents for Image Restoration)
- [LWTformer: A Detail-Aware, Learnable Wavelet-Transformer for Ancient Chinese Character Image Restoration](#LWTformer: A Detail-Aware, Learnable Wavelet-Transformer for Ancient Chinese Character Image Restoration)
- [MMDIR: Multimodal Instruction-Driven Framework for Mixed-Degradation Document Image Restoration](#MMDIR: Multimodal Instruction-Driven Framework for Mixed-Degradation Document Image Restoration)
- [NanoSD: Edge Efficient Foundation Model for Real Time Image Restoration](#NanoSD: Edge Efficient Foundation Model for Real Time Image Restoration)
- [Optical Tolerance-Compensated Diffusion Model for Image Restoration](#Optical Tolerance-Compensated Diffusion Model for Image Restoration)
- [PGDR-BambooSlips: Physics-Guided Multistep Deformation Reversal for Ancient Bamboo Slip Restoration](#PGDR-BambooSlips: Physics-Guided Multistep Deformation Reversal for Ancient Bamboo Slip Restoration)
- [Q-MambaIR: Accurate Quantized Mamba for Efficient Image Restoration](#Q-MambaIR: Accurate Quantized Mamba for Efficient Image Restoration)
- [RDBM: Residual Diffusion Bridge Model for Image Restoration](#RDBM: Residual Diffusion Bridge Model for Image Restoration)
- [Residual Diffusion Bridge Model for Image Restoration](#Residual Diffusion Bridge Model for Image Restoration)
- [Restore-R1: Efficient Image Restoration Agents via Reinforcement Learning with Multimodal LLM Perceptual Feedback](#Restore-R1: Efficient Image Restoration Agents via Reinforcement Learning with Multimodal LLM Perceptual Feedback)
- [Restore, Assess, Repeat: A Unified Framework for Iterative Image Restoration](#Restore, Assess, Repeat: A Unified Framework for Iterative Image Restoration)
- [Retrieve-to-Restore: Efficient All-in-One Image Restoration with a Retrieval-Based Degradation Bank](#Retrieve-to-Restore: Efficient All-in-One Image Restoration with a Retrieval-Based Degradation Bank)
- [Scan Clusters, Not Pixels: A Cluster-Centric Paradigm for Efficient Ultra-high-definition Image Restoration](#Scan Clusters, Not Pixels: A Cluster-Centric Paradigm for Efficient Ultra-high-definition Image Restoration)
- [Self-supervised Dynamic Heterogeneous Degradation Modeling for Unified Zero-Shot Image Restoration](#Self-supervised Dynamic Heterogeneous Degradation Modeling for Unified Zero-Shot Image Restoration)
- [ShiftLUT: Spatial Shift Enhanced Look-Up Tables for Efficient Image Restoration](#ShiftLUT: Spatial Shift Enhanced Look-Up Tables for Efficient Image Restoration)
- [Surgical Image Restoration Benchmark](#Surgical Image Restoration Benchmark)
- [UARE: A Unified Vision-Language Model for Image Quality Assessment, Restoration, and Enhancement](#UARE: A Unified Vision-Language Model for Image Quality Assessment, Restoration, and Enhancement)
- [UCMNet: Uncertainty-Aware Context Memory Network for Under-Display Camera Image Restoration](#UCMNet: Uncertainty-Aware Context Memory Network for Under-Display Camera Image Restoration)
- [UnfoldIR: Rethinking Deep Unfolding Network in Illumination Degradation Image Restoration](#UnfoldIR: Rethinking Deep Unfolding Network in Illumination Degradation Image Restoration)
- [UniLDiff: Unlocking the Power of Diffusion Priors for All-in-One Image Restoration](#UniLDiff: Unlocking the Power of Diffusion Priors for All-in-One Image Restoration)
- [VideoFusion: A Spatio-Temporal Collaborative Network for Multi-modal Video Fusion and Restoration](#VideoFusion: A Spatio-Temporal Collaborative Network for Multi-modal Video Fusion and Restoration)
- [ZeroIDIR: Zero-Reference Illumination Degradation Image Restoration with Perturbed Consistency Diffusion Models](#ZeroIDIR: Zero-Reference Illumination Degradation Image Restoration with Perturbed Consistency Diffusion Models)
- [7. Image Enhancement](#7.图像增强(Image Enhancement))
- [Adapting Large VLMs with Iterative and Manual Instructions for Generative Low-light Enhancement](#Adapting Large VLMs with Iterative and Manual Instructions for Generative Low-light Enhancement)
- [Bi-Bridge: Bidirectional Diffusion Bridges for Low-Light Image Enhancement](#Bi-Bridge: Bidirectional Diffusion Bridges for Low-Light Image Enhancement)
- [BiEvLight: Bi-level Learning of Task-Aware Event Refinement for Low-Light Image Enhancement](#BiEvLight: Bi-level Learning of Task-Aware Event Refinement for Low-Light Image Enhancement)
- [CtrlISP: Rescuing Low-Light RAW Images via Controllable Neural ISP](#CtrlISP: Rescuing Low-Light RAW Images via Controllable Neural ISP)
- [EIC-LIE: Event-Illumination Collaborative Low-light Image Enhancement](#EIC-LIE: Event-Illumination Collaborative Low-light Image Enhancement)
- [Evaluating Low-Light Image Enhancement Across Multiple Intensity Levels](#Evaluating Low-Light Image Enhancement Across Multiple Intensity Levels)
- [Event-Illumination Collaborative Low-light Image Enhancement with a High-resolution Real-world Dataset](#Event-Illumination Collaborative Low-light Image Enhancement with a High-resolution Real-world Dataset)
- [IF-Bench: Benchmarking and Enhancing MLLMs for Infrared Images](#IF-Bench: Benchmarking and Enhancing MLLMs for Infrared Images)
- [Leveraging Multispectral Sensors for Color Correction in Mobile Cameras](#Leveraging Multispectral Sensors for Color Correction in Mobile Cameras)
- [MR. Illuminate: Zero-Shot Low-Light Image Enhancement with Diffusion Prior](#MR. Illuminate: Zero-Shot Low-Light Image Enhancement with Diffusion Prior)
- [Multinex: Lightweight Low-Light Image Enhancement via Multi-prior Retinex](#Multinex: Lightweight Low-Light Image Enhancement via Multi-prior Retinex)
- [NEC-Diff: Noise-Robust Event-RAW Complementary Diffusion for Seeing Motion in Extreme Darkness](#NEC-Diff: Noise-Robust Event-RAW Complementary Diffusion for Seeing Motion in Extreme Darkness)
- [PrismNet: Semantic-Aware Image Enhancement via Vision Transformer and Zero-Cost Gating](#PrismNet: Semantic-Aware Image Enhancement via Vision Transformer and Zero-Cost Gating)
- [RodNet: Visual Pathway-Inspired Adaptive Sparse Network for Efficient Low-Light Image Enhancement](#RodNet: Visual Pathway-Inspired Adaptive Sparse Network for Efficient Low-Light Image Enhancement)
- [SDUIE: Semi-Supervised Diffusion for Underwater Image Enhancement with Quant-Text Dual Control](#SDUIE: Semi-Supervised Diffusion for Underwater Image Enhancement with Quant-Text Dual Control)
- [Towards Generalized Representations for Low-Light Understanding: When Signal Constancy Meets Semantic Enrichment](#Towards Generalized Representations for Low-Light Understanding: When Signal Constancy Meets Semantic Enrichment)
- [VSRELL: A Simple Baseline for Video Super-Resolution and Enhancement in Low-Light environment](#VSRELL: A Simple Baseline for Video Super-Resolution and Enhancement in Low-Light environment)
- 8. Inpainting
- [Blend-Aware Latent Diffusion: Mitigating Stitched Seams in Image Inpainting](#Blend-Aware Latent Diffusion: Mitigating Stitched Seams in Image Inpainting)
- [DreamStereo: Towards Real-Time Stereo Inpainting for HD Videos](#DreamStereo: Towards Real-Time Stereo Inpainting for HD Videos)
- [EffectErase: Joint Video Object Removal and Insertion for High-Quality Effect Erasing](#EffectErase: Joint Video Object Removal and Insertion for High-Quality Effect Erasing)
- [From Inpainting to Layer Decomposition: Repurposing Generative Inpainting Models for Image Layer Decomposition](#From Inpainting to Layer Decomposition: Repurposing Generative Inpainting Models for Image Layer Decomposition)
- [GOR-IS: 3D Gaussian Object Removal In the Intrinsic Space](#GOR-IS: 3D Gaussian Object Removal In the Intrinsic Space)
- [HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images](#HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images)
- [InverFill: One-Step Inversion for Enhanced Few-Step Diffusion Inpainting](#InverFill: One-Step Inversion for Enhanced Few-Step Diffusion Inpainting)
- [LaRP: Efficient Multi-View Inpainting with Latent Reprojection Priors](#LaRP: Efficient Multi-View Inpainting with Latent Reprojection Priors)
- [MAGIC: Few-Shot Mask-Guided Anomaly Inpainting](#MAGIC: Few-Shot Mask-Guided Anomaly Inpainting)
- [Object-WIPER: Training-Free Object and Associated Effect Removal in Video](#Object-WIPER: Training-Free Object and Associated Effect Removal in Video)
- [PHAC: Promptable Human Amodal Completion](#PHAC: Promptable Human Amodal Completion)
- [Precise Object and Effect Removal with Adaptive Target-Aware Attention](#Precise Object and Effect Removal with Adaptive Target-Aware Attention)
- [YOSE: You Only Select Essential Tokens for Efficient DiT-based Video Object Removal](#YOSE: You Only Select Essential Tokens for Efficient DiT-based Video Object Removal)
- [9. HDR Imaging](#9.高动态范围成像(HDR Imaging))
- [Beyond8Bits: Full HDR UGC Dataset](#Beyond8Bits: Full HDR UGC Dataset)
- [ExpoCM: Exposure-Aware One-Step Generative Single-Image HDR Reconstruction](#ExpoCM: Exposure-Aware One-Step Generative Single-Image HDR Reconstruction)
- [F²HDR: Two-Stage HDR Video Reconstruction via Flow Adapter and Physical Motion Modeling](#F²HDR: Two-Stage HDR Video Reconstruction via Flow Adapter and Physical Motion Modeling)
- [LRHDR: Learning Representation-enhanced HDR Video Reconstruction](#LRHDR: Learning Representation-enhanced HDR Video Reconstruction)
- [Olbedo: An Albedo and Shading Aerial Dataset for Large-Scale Outdoor Environments](#Olbedo: An Albedo and Shading Aerial Dataset for Large-Scale Outdoor Environments)
- [Seeing through Light and Darkness: Sensor-Physics Grounded Deblurring HDR NeRF from Single-Exposure Images and Events](#Seeing through Light and Darkness: Sensor-Physics Grounded Deblurring HDR NeRF from Single-Exposure Images and Events)
- [10. Image Quality Assessment](#10.图像质量评价(Image Quality Assessment))
- [A^3: Towards Advertising Aesthetic Assessment](#A^3: Towards Advertising Aesthetic Assessment)
- [ArtiMuse: Fine-Grained Image Aesthetics Assessment with Joint Scoring and Expert-Level Understanding](#ArtiMuse: Fine-Grained Image Aesthetics Assessment with Joint Scoring and Expert-Level Understanding)
- [Bridging the Perception Gap in Image Super-Resolution Evaluation](#Bridging the Perception Gap in Image Super-Resolution Evaluation)
- [Fine-grained Image Aesthetic Assessment: Learning Discriminative Scores from Relative Ranks](#Fine-grained Image Aesthetic Assessment: Learning Discriminative Scores from Relative Ranks)
- [FluoCLIP: Stain-Aware Focus Quality Assessment in Fluorescence Microscopy](#FluoCLIP: Stain-Aware Focus Quality Assessment in Fluorescence Microscopy)
- [Generalizable Video Quality Assessment via Weak-to-Strong Learning](#Generalizable Video Quality Assessment via Weak-to-Strong Learning)
- [HDR-VLM: HDR-Domain Adaptation of VLMs and Preference-Aligned Quality Assessment for HDR Video Color Grading](#HDR-VLM: HDR-Domain Adaptation of VLMs and Preference-Aligned Quality Assessment for HDR Video Color Grading)
- [Learning Where to Look and How to Judge: Resolution-agnostic Image Quality Assessment with Quality-aware Saliency](#Learning Where to Look and How to Judge: Resolution-agnostic Image Quality Assessment with Quality-aware Saliency)
- [Life-IQA: Boosting Blind Image Quality Assessment through GCN-enhanced Layer Interaction and MoE-based Feature Decoupling](#Life-IQA: Boosting Blind Image Quality Assessment through GCN-enhanced Layer Interaction and MoE-based Feature Decoupling)
- [MDS-VQA: Model-Informed Data Selection for Video Quality Assessment](#MDS-VQA: Model-Informed Data Selection for Video Quality Assessment)
- [Pioneering Perceptual Video Fluency Assessment: A Novel Task with Benchmark Dataset and Baseline](#Pioneering Perceptual Video Fluency Assessment: A Novel Task with Benchmark Dataset and Baseline)
- [Probabilistic Prompt Adaptation for Unified Image Aesthetics and Quality Assessment](#Probabilistic Prompt Adaptation for Unified Image Aesthetics and Quality Assessment)
- [PR-IQA: Partial-Reference Image Quality Assessment for Diffusion Models](#PR-IQA: Partial-Reference Image Quality Assessment for Diffusion Models)
- [QD-PCQA: Quality-Aware Domain Adaptation for Point Cloud Quality Assessment](#QD-PCQA: Quality-Aware Domain Adaptation for Point Cloud Quality Assessment)
- [Rethinking Knowledge Transfer in Image Quality Assessment: A Perceptual Preference Structure Alignment Perspective](#Rethinking Knowledge Transfer in Image Quality Assessment: A Perceptual Preference Structure Alignment Perspective)
- [RL-ScanIQA: Reinforcement-Learned Scanpaths for Blind 360° Image Quality Assessment](#RL-ScanIQA: Reinforcement-Learned Scanpaths for Blind 360° Image Quality Assessment)
- [rPPG-VQA: A Video Quality Assessment Framework for Unsupervised rPPG Training](#rPPG-VQA: A Video Quality Assessment Framework for Unsupervised rPPG Training)
- [Seeing Beyond 8bits: Subjective and Objective Quality Assessment of HDR-UGC Videos](#Seeing Beyond 8bits: Subjective and Objective Quality Assessment of HDR-UGC Videos)
- [VITAL: Vision-Encoder-centered Pre-training for LMMs in Visual Quality Assessment](#VITAL: Vision-Encoder-centered Pre-training for LMMs in Visual Quality Assessment)
- [11. Frame Interpolation](#11.插帧(Frame Interpolation))
- [Anchoring and Rescaling Attention for Semantically Coherent Inbetweening](#Anchoring and Rescaling Attention for Semantically Coherent Inbetweening)
- [LDF-VFI: Towards Holistic Modeling for Video Frame Interpolation with Auto-regressive Diffusion Transformers](#LDF-VFI: Towards Holistic Modeling for Video Frame Interpolation with Auto-regressive Diffusion Transformers)
- [One-Shot Flow, Any-Time Frame: A Bidirectional Warping Framework for Event-Based Video Frame Interpolation](#One-Shot Flow, Any-Time Frame: A Bidirectional Warping Framework for Event-Based Video Frame Interpolation)
- [Towards Holistic Modeling for Video Frame Interpolation with Auto-regressive Diffusion Transformers](#Towards Holistic Modeling for Video Frame Interpolation with Auto-regressive Diffusion Transformers)
- [12. Video/Image Compression](#12.视频/图像压缩(Video/Image Compression))
- [Adaptive Learned Image Compression with Graph Neural Networks](#Adaptive Learned Image Compression with Graph Neural Networks)
- [Beyond Pixel Loss: Video-INRs Prefer Perceptual Optimization](#Beyond Pixel Loss: Video-INRs Prefer Perceptual Optimization)
- [Block-based Learned Image Compression without Blocking Artifacts](#Block-based Learned Image Compression without Blocking Artifacts)
- [CADC: Content Adaptive Diffusion-Based Generative Image Compression](#CADC: Content Adaptive Diffusion-Based Generative Image Compression)
- [CoD: A Diffusion Foundation Model for Image Compression](#CoD: A Diffusion Foundation Model for Image Compression)
- [Differentiable Vector Quantization for Rate-Distortion Optimization of Generative Image Compression](#Differentiable Vector Quantization for Rate-Distortion Optimization of Generative Image Compression)
- [Distributed Image Compression with Multimodal Side Information at Extremely Low Bitrates](#Distributed Image Compression with Multimodal Side Information at Extremely Low Bitrates)
- [DiT-IC: Aligned Diffusion Transformer for Efficient Image Compression](#DiT-IC: Aligned Diffusion Transformer for Efficient Image Compression)
- [FreqSIC: Frequency-aware Stereo Image Compression with Bi-directional Checkerboard Context Model](#FreqSIC: Frequency-aware Stereo Image Compression with Bi-directional Checkerboard Context Model)
- [Generative Neural Video Compression via Video Diffusion Prior](#Generative Neural Video Compression via Video Diffusion Prior)
- [Generative Video Compression with One-Dimensional Latent Representation](#Generative Video Compression with One-Dimensional Latent Representation)
- [Learned Image Compression via Sparse Attention and Adaptive Frequency](#Learned Image Compression via Sparse Attention and Adaptive Frequency)
- [Low-Bitrate Video Compression through Semantic-Conditioned Diffusion](#Low-Bitrate Video Compression through Semantic-Conditioned Diffusion)
- [MambaSIC: Mamba-based Stereo Image Compression with Bi-directional Multi-reference Entropy Model](#MambaSIC: Mamba-based Stereo Image Compression with Bi-directional Multi-reference Entropy Model)
- [OmniZip: Learning a Unified and Lightweight Lossless Compressor for Multi-Modal Data](#OmniZip: Learning a Unified and Lightweight Lossless Compressor for Multi-Modal Data)
- [Parallax to Align Them All: An OmniParallax Attention Mechanism for Distributed Multi-View Image Compression](#Parallax to Align Them All: An OmniParallax Attention Mechanism for Distributed Multi-View Image Compression)
- [Perceptual Neural Video Compression with Color Separation and Rank Chain](#Perceptual Neural Video Compression with Color Separation and Rank Chain)
- [Real-Time Neural Video Compression with Unified Intra and Inter Coding](#Real-Time Neural Video Compression with Unified Intra and Inter Coding)
- [Ultra-Fast Neural Video Compression](#Ultra-Fast Neural Video Compression)
- [Ultra-Low Bitrate Perceptual Image Compression with Shallow Encoder](#Ultra-Low Bitrate Perceptual Image Compression with Shallow Encoder)
- [VLIC: Vision-Language Models As Perceptual Judges for Human-Aligned Image Compression](#VLIC: Vision-Language Models As Perceptual Judges for Human-Aligned Image Compression)
- [What and Where to Adapt: Structure-Semantics Co-Tuning for Machine Vision Compression via Synergistic Adapters](#What and Where to Adapt: Structure-Semantics Co-Tuning for Machine Vision Compression via Synergistic Adapters)
- [What Matters in Practical Learned Image Compression](#What Matters in Practical Learned Image Compression)
- [13. Compressed Image/Video Quality Enhancement](#13.压缩图像/视频质量增强(Compressed Image/Video Quality Enhancement))
- [14. Image Reflection Removal](#14.图像去反光(Image Reflection Removal))
- [GenSIRR: Rectifying Latent Space for Generative Single-Image Reflection Removal](#GenSIRR: Rectifying Latent Space for Generative Single-Image Reflection Removal)
- [GFRRN: Explore the Gaps in Single Image Reflection Removal](#GFRRN: Explore the Gaps in Single Image Reflection Removal)
- [LightRR: A Lightweight Network for Single Image Reflection Removal](#LightRR: A Lightweight Network for Single Image Reflection Removal)
- [Polarization State Tracing for Reflection Removal and Color-Consistent Reconstruction](#Polarization State Tracing for Reflection Removal and Color-Consistent Reconstruction)
- [Rectifying Latent Space for Generative Single-Image Reflection Removal](#Rectifying Latent Space for Generative Single-Image Reflection Removal)
- [Reflection Separation from a Single Image via Joint Latent Diffusion](#Reflection Separation from a Single Image via Joint Latent Diffusion)
- [ReflexSplit: Single Image Reflection Separation via Layer Fusion-Separation](#ReflexSplit: Single Image Reflection Separation via Layer Fusion-Separation)
- [15. Image Shadow Removal](#15.图像去阴影(Image Shadow Removal))
- [PhaSR: Generalized Image Shadow Removal with Physically Aligned Priors](#PhaSR: Generalized Image Shadow Removal with Physically Aligned Priors)
- [16. Image Colorization](#16.图像上色(Image Colorization))
- [ColorFLUX: A Structure-Color Decoupling Framework for Old Photo Colorization](#ColorFLUX: A Structure-Color Decoupling Framework for Old Photo Colorization)
- [SketchDeco: Training-Free Latent Composition for Precise Sketch Colourisation](#SketchDeco: Training-Free Latent Composition for Precise Sketch Colourisation)
- [Towards High-resolution and Disentangled Reference-based Sketch Colorization](#Towards High-resolution and Disentangled Reference-based Sketch Colorization)
- [17. Image Harmonization](#17.图像和谐化(Image Harmonization))
- [HarmoniDiff-RS: Training-Free Diffusion Harmonization for Satellite Image Composition](#HarmoniDiff-RS: Training-Free Diffusion Harmonization for Satellite Image Composition)
- [HarmoVid: Relightful Video Portrait Harmonization](#HarmoVid: Relightful Video Portrait Harmonization)
- [18. Video Stabilization](#18.视频稳相(Video Stabilization))
- [LightStab: Unsupervised Online Video Stabilization with Classical Priors](#LightStab: Unsupervised Online Video Stabilization with Classical Priors)
- [No Labels, No Look-Ahead: Unsupervised Online Video Stabilization with Classical Priors](#No Labels, No Look-Ahead: Unsupervised Online Video Stabilization with Classical Priors)
- [StabiGS: Video Stabilization through Rendering-Aware Trajectory Optimization in 3DGS-Reconstructed Scenes](#StabiGS: Video Stabilization through Rendering-Aware Trajectory Optimization in 3DGS-Reconstructed Scenes)
- [19. Image Fusion](#19.图像融合(Image Fusion))
- [Beyond Strict Pairing: Arbitrarily Paired Training for High-Performance Infrared and Visible Image Fusion](#Beyond Strict Pairing: Arbitrarily Paired Training for High-Performance Infrared and Visible Image Fusion)
- [Bridging Human Evaluation to Infrared and Visible Image Fusion](#Bridging Human Evaluation to Infrared and Visible Image Fusion)
- [Customized Fusion: A Closed-Loop Dynamic Network for Adaptive Multi-Task-Aware Infrared-Visible Image Fusion](#Customized Fusion: A Closed-Loop Dynamic Network for Adaptive Multi-Task-Aware Infrared-Visible Image Fusion)
- [Degradation-Robust Fusion: An Efficient Degradation-Aware Diffusion Framework for Multimodal Image Fusion in Arbitrary Degradation Scenarios](#Degradation-Robust Fusion: An Efficient Degradation-Aware Diffusion Framework for Multimodal Image Fusion in Arbitrary Degradation Scenarios)
- [Fusion in Your Way: Aligning Image Fusion with Heterogeneous Demands via Direct Preference Optimization](#Fusion in Your Way: Aligning Image Fusion with Heterogeneous Demands via Direct Preference Optimization)
- [FusionRegister: Every Infrared and Visible Image Fusion Deserves Registration](#FusionRegister: Every Infrared and Visible Image Fusion Deserves Registration)
- [More Than Meets the Eye: A Unified Image Fusion Framework via Semantic-Pixel Entropy Trade-off for Zero-Shot Generalization](#More Than Meets the Eye: A Unified Image Fusion Framework via Semantic-Pixel Entropy Trade-off for Zero-Shot Generalization)
- [Multi-Modal Image Fusion via Intervention-Stable Feature Learning](#Multi-Modal Image Fusion via Intervention-Stable Feature Learning)
- [Missing No More: Dictionary-Guided Cross-Modal Image Fusion under Missing Infrared](#Missing No More: Dictionary-Guided Cross-Modal Image Fusion under Missing Infrared)
- [Neurodynamics-Driven Coupled Neural P Systems for Multi-Focus Image Fusion](#Neurodynamics-Driven Coupled Neural P Systems for Multi-Focus Image Fusion)
- [PhyFusion: Physics-Aware Infrared and Visible Image Fusion via Modality-Specific Physical Priors](#PhyFusion: Physics-Aware Infrared and Visible Image Fusion via Modality-Specific Physical Priors)
- [ReCoFuse: Ultra-Robust Image Fusion via Restorative Multi-Modal Diffusion Reciprocal Coupling](#ReCoFuse: Ultra-Robust Image Fusion via Restorative Multi-Modal Diffusion Reciprocal Coupling)
- [RegionFuse: Region-Adaptive Pixel Distribution Learning for Infrared and Visible Image Fusion](#RegionFuse: Region-Adaptive Pixel Distribution Learning for Infrared and Visible Image Fusion)
- [UniFusion: A Unified Image Fusion Framework with Robust Representation and Source-Aware Preservation](#UniFusion: A Unified Image Fusion Framework with Robust Representation and Source-Aware Preservation)
- [VideoFusion: A Spatio-Temporal Collaborative Network for Multi-modal Video Fusion and Restoration](#VideoFusion: A Spatio-Temporal Collaborative Network for Multi-modal Video Fusion and Restoration)
- 20. Others
- [3M-TI: High-Quality Mobile Thermal Imaging via Calibration-free Multi-Camera Cross-Modal Diffusion](#3M-TI: High-Quality Mobile Thermal Imaging via Calibration-free Multi-Camera Cross-Modal Diffusion)
- [AceTone: Bridging Words and Colors for Conditional Image Grading](#AceTone: Bridging Words and Colors for Conditional Image Grading)
- [Continuous Exposure-Time Modeling for Realistic Atmospheric Turbulence Synthesis](#Continuous Exposure-Time Modeling for Realistic Atmospheric Turbulence Synthesis)
- [Cross-Scale Pansharpening via ScaleFormer and the PanScale Benchmark](#Cross-Scale Pansharpening via ScaleFormer and the PanScale Benchmark)
- [CRFT: Consistent-Recurrent Feature Flow Transformer for Cross-Modal Image Registration](#CRFT: Consistent-Recurrent Feature Flow Transformer for Cross-Modal Image Registration)
- [D2Dewarp: Dual Dimensions Geometric Representation Learning Based Document Image Dewarping](#D2Dewarp: Dual Dimensions Geometric Representation Learning Based Document Image Dewarping)
- [Dark3R: Learning Structure from Motion in the Dark](#Dark3R: Learning Structure from Motion in the Dark)
- [Disentangle-then-Align: Non-Iterative Hybrid Multimodal Image Registration via Cross-Scale Feature Disentanglement](#Disentangle-then-Align: Non-Iterative Hybrid Multimodal Image Registration via Cross-Scale Feature Disentanglement)
- [Elucidating the Design Space of Arbitrary-Noise-Based Diffusion Models](#Elucidating the Design Space of Arbitrary-Noise-Based Diffusion Models)
- [Exploring Spatiotemporal Feature Propagation for Video-Level Compressive Spectral Reconstruction: Dataset, Model and Benchmark](#Exploring Spatiotemporal Feature Propagation for Video-Level Compressive Spectral Reconstruction: Dataset, Model and Benchmark)
- [FastGaMer: Efficient GainMap Learning for Practical Inverse Tone Mapping](#FastGaMer: Efficient GainMap Learning for Practical Inverse Tone Mapping)
- [Fast Kernel-Space Diffusion for Remote Sensing Pansharpening](#Fast Kernel-Space Diffusion for Remote Sensing Pansharpening)
- [HFR and HDR Video from Multi-Attenuated Spikes Using a Rapidly Rotating SpokeND Filter](#HFR and HDR Video from Multi-Attenuated Spikes Using a Rapidly Rotating SpokeND Filter)
- [High-Quality and Efficient Turbulence Mitigation with Events](#High-Quality and Efficient Turbulence Mitigation with Events)
- [InstantRetouch: Efficient and High-Fidelity Instruction-Guided Image Retouching with Bilateral Space](#InstantRetouch: Efficient and High-Fidelity Instruction-Guided Image Retouching with Bilateral Space)
- [It Takes Two: A Duet of Periodicity and Directionality for Burst Flicker Removal](#It Takes Two: A Duet of Periodicity and Directionality for Burst Flicker Removal)
- [JarvisEvo: Towards a Self-Evolving Photo Editing Agent with Synergistic Editor-Evaluator Optimization](#JarvisEvo: Towards a Self-Evolving Photo Editing Agent with Synergistic Editor-Evaluator Optimization)
- [Language-Guided One-Step Diffusion Model for Nighttime Flare Removal](#Language-Guided One-Step Diffusion Model for Nighttime Flare Removal)
- [Learning Latent Transmission and Glare Maps for Lens Veiling Glare Removal](#Learning Latent Transmission and Glare Maps for Lens Veiling Glare Removal)
- [LRDUN: A Low-Rank Deep Unfolding Network for Efficient Spectral Compressive Imaging](#LRDUN: A Low-Rank Deep Unfolding Network for Efficient Spectral Compressive Imaging)
- [MERIT: Multi-domain Efficient RAW Image Translation](#MERIT: Multi-domain Efficient RAW Image Translation)
- [MTRWKV: Multigrain-aware Semantic Prototype Scanning and Tri-Token Prompt Learning Embraced High-Order RWKV for Pan-Sharpening](#MTRWKV: Multigrain-aware Semantic Prototype Scanning and Tri-Token Prompt Learning Embraced High-Order RWKV for Pan-Sharpening)
- [Multigrain-aware Semantic Prototype Scanning and Tri-Token Prompt Learning Embraced High-Order RWKV for Pan-Sharpening](#Multigrain-aware Semantic Prototype Scanning and Tri-Token Prompt Learning Embraced High-Order RWKV for Pan-Sharpening)
- [POS-ISP: Pipeline Optimization at the Sequence Level for Task-aware ISP](#POS-ISP: Pipeline Optimization at the Sequence Level for Task-aware ISP)
- [PromptStereo: Zero-Shot Stereo Matching via Structure and Motion Prompts](#PromptStereo: Zero-Shot Stereo Matching via Structure and Motion Prompts)
- [Regulating Rather than Constraining: Adaptive Guidance for Complex Spectral Reconstruction in Pansharpening](#Regulating Rather than Constraining: Adaptive Guidance for Complex Spectral Reconstruction in Pansharpening)
- [RetouchIQ: MLLM Agents for Instruction-Based Image Retouching with Generalist Reward](#RetouchIQ: MLLM Agents for Instruction-Based Image Retouching with Generalist Reward)
- [Reparameterized Tensor Ring Functional Decomposition for Multi-Dimensional Data Recovery](#Reparameterized Tensor Ring Functional Decomposition for Multi-Dimensional Data Recovery)
- [Seeing Through Blur: Tackling Defocus in Spike-Based Imaging](#Seeing Through Blur: Tackling Defocus in Spike-Based Imaging)
- [Spatial-Spectral Residuals Informed Diffusion Neural Operator for Pan-sharpening](#Spatial-Spectral Residuals Informed Diffusion Neural Operator for Pan-sharpening)
- [SpiralDiff: Spiral Diffusion with LoRA for RGB-to-RAW Conversion Across Cameras](#SpiralDiff: Spiral Diffusion with LoRA for RGB-to-RAW Conversion Across Cameras)
- [Stability and Non-Local Modeling in Hybrid Convolution-Transformer Networks for Snapshot Hyperspectral Reconstruction](#Stability and Non-Local Modeling in Hybrid Convolution-Transformer Networks for Snapshot Hyperspectral Reconstruction)
- [Towards Photorealistic and Efficient Bokeh Rendering via Diffusion Framework](#Towards Photorealistic and Efficient Bokeh Rendering via Diffusion Framework)
- [Towards Universal Computational Aberration Correction in Photographic Cameras: A Comprehensive Benchmark Analysis](#Towards Universal Computational Aberration Correction in Photographic Cameras: A Comprehensive Benchmark Analysis)
- [UnReflectAnything: RGB-Only Highlight Removal by Rendering Synthetic Specular Supervision](#UnReflectAnything: RGB-Only Highlight Removal by Rendering Synthetic Specular Supervision)
- [White-Balance First, Adjust Later: Cross-Camera Color Constancy via Vision-Language Evaluation](#White-Balance First, Adjust Later: Cross-Camera Color Constancy via Vision-Language Evaluation)
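A curated collection of 2026 papers and code on low-level vision, including super-resolution, image deraining, image dehazing, deblurring, denoising, image restoration, image enhancement, image demoireing, image inpainting, image quality assessment, frame interpolation, and image/video compression, among other tasks. Details follow.
The latest revisions are posted to GitHub first; stars, forks, and PRs are welcome.
Anyone interested in low-level vision tasks is also welcome to help keep the list up to date.
GitHub: Awesome-CVPR2026-Low-Level-Vision
Zhihu: https://zhuanlan.zhihu.com/p/2011913167840764194
Please credit the source when citing or reposting.
CVPR2025 official site: https://cvpr.thecvf.com/Conferences/2025
CVPR accepted papers list: https://cvpr.thecvf.com/Conferences/2025/AcceptedPapers
CVPR full paper archive:
Conference dates: June 11-15, 2025
Paper acceptance announcement: February 27, 2025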
【Contents】
- 1. Super-Resolution
- [2. Image Deraining](#2.图像去雨(Image Deraining))
- [3. Image Dehazing](#3.图像去雾(Image Dehazing))
- 4. Deblurring
- 5. Denoising
- [6. Image Restoration](#6.图像恢复(Image Restoration))
- [7. Image Enhancement](#7.图像增强(Image Enhancement))
- 8. Inpainting
- [9. HDR Imaging](#9.高动态范围成像(HDR Imaging))
- [10. Image Quality Assessment](#10.图像质量评价(Image Quality Assessment))
- [11. Frame Interpolation](#11.插帧(Frame Interpolation))
- [12. Video/Image Compression](#12.视频/图像压缩(Video/Image Compression))
- [13. Compressed Image Quality Enhancement](#13.压缩图像质量增强(Compressed Image Quality Enhancement))
- [14. Image Reflection Removal](#14.图像去反光(Image Reflection Removal))
- [15. Image Shadow Removal](#15.图像去阴影(Image Shadow Removal))
- [16. Image Colorization](#16.图像上色(Image Colorization))
- [17. Image Harmonization](#17.图像和谐化(Image Harmonization))
- [18. Video Stabilization](#18.视频稳相(Video Stabilization))
- [19. Image Fusion](#19.图像融合(Image Fusion))
- 20. Others
1. Super-Resolution
AlignVAR: Towards Globally Consistent Visual Autoregression for Image Super-Resolution
- Paper: https://arxiv.org/abs/2603.00589
- Code:
Bridging Fidelity-Reality with Controllable One-Step Diffusion for Image Super-Resolution
CASR: A Robust Cyclic Framework for Arbitrary Large-Scale Super-Resolution with Distribution Alignment and Self-Similarity Awareness
- Paper: https://arxiv.org/abs/2602.22159
- Code:
Compressed-Domain-Aware Online Video Super-Resolution
Disentangled Textual Priors for Diffusion-based Image Super-Resolution
DNF-SR: Dual-Input and Negative-Aware Feature Fine-Tuning for Real-World Image Super-Resolution
- Paper: https://openaccess.thecvf.com/content/CVPR2026/html/Han_DNF-SR_Dual-Input_and_Negative-Aware_Feature_Fine-Tuning_for_Real-World_Image_Super-Resolution_CVPR_2026_paper.html
- Code: https://github.com/SHH-Han/DNF-SR
DreamSR: Towards Ultra-High-Resolution Image Super-Resolution via a Receptive-Field Enhanced Diffusion Transformer
DTG-Restore: Training-Free Diffusion Refinement for Generative Video Super-Resolution
Dual Graph Regularized Deep Unfolding Network for Guided Depth Map Super-resolution
- Paper: https://openaccess.thecvf.com/content/CVPR2026/html/Zhong_Dual_Graph_Regularized_Deep_Unfolding_Network_for_Guided_Depth_Map_CVPR_2026_paper.html
- Code: https://github.com/zhwzhong/LapNet
DUO-VSR: Dual-Stream Distillation for One-Step Video Super-Resolution
DVAR: Dynamic Visual Autoregressive Modeling for Image Super-Resolution
Edge-aware Multimodal Residual Diffusion Model for Hyperspectral Image Super-resolution
- Paper: https://openaccess.thecvf.com/content/CVPR2026/html/Zhang_EMR-Diff_Edge-aware_Multimodal_Residual_Diffusion_Model_for_Hyperspectral_Image_Super-resolution_CVPR_2026_paper.html
- Code: https://github.com/luocz55/EMR-Diff
Edge-Focused Super-Resolution for Omnidirectional Images with Spherical Geometric Augmentation
Enhancing Unregistered Hyperspectral Image Super-Resolution via Unmixing-based Abundance Fusion Learning
- Paper: https://arxiv.org/abs/2603.07918
- Code:
FiDeSR: High-Fidelity and Detail-Preserving One-Step Diffusion Super-Resolution
FinPercep-RM: A Fine-grained Reward Model and Co-evolutionary Curriculum for RL-based Real-world Super-Resolution
FRAMER: Frequency-Aligned Self-Distillation with Adaptive Modulation Leveraging Diffusion Priors for Real-World Image Super-Resolution
GDPO-SR: Group Direct Preference Optimization for One-Step Generative Image Super-Resolution
Gradient Knows Best: Mixed-Precision Quantization via Gradient-Guided Bit Allocation for Super-Resolution
HDW-SR: High-Frequency Guided Diffusion Model based on Wavelet Decomposition for Image Super-Resolution
- Paper: https://arxiv.org/abs/2511.13175
- Code:
IAFMNet: Information-Aware Feature Modulation for Efficient Super-Resolution
IFCSR: Inference-Free Fidelity-Realism Control for One-Step Diffusion-based Real-World Image Super-Resolution
Linear Recurrent Unit with Semantic Modulation for Image Super-Resolution
- Paper: https://openaccess.thecvf.com/content/CVPR2026F/html/Choi_Linear_Recurrent_Unit_with_Semantic_Modulation_for_Image_Super-Resolution_CVPRF_2026_paper.html
- Code: https://github.com/MingyuChoi-run/LSM
LSGQuant: Layer-Sensitivity Guided Quantization for One-Step Diffusion Real-World Video Super-Resolution
One-Step Diffusion Transformer for Controllable Real-World Image Super-Resolution
Physics-Consistent Diffusion for Efficient Fluid Super-Resolution via Multiscale Residual Correction
Plug-and-Play Dynamic In-context Learning with Stochastic Regularization for Screen Content Image Super-Resolution
PS-SR: Pseudo-Single-Step Video Super-Resolution via Speculative Diffusion
QDM: Quadtree-Based Region-Adaptive Sparse Diffusion Models for Efficient Image Super-Resolution
- Paper: https://openaccess.thecvf.com/content/CVPR2026F/html/Yang_QDM_Quadtree-Based_Region-Adaptive_Sparse_Diffusion_Models_for_Efficient_Image_Super-Resolution_CVPRF_2026_paper.html
- Code: https://github.com/linYDTHU/QDM
RAW-Domain Degradation Models for Realistic Smartphone Super-Resolution
- Paper: https://arxiv.org/abs/2603.12493
- Code:
Remote Sensing Image Super-Resolution for Imbalanced Textures: A Texture-Aware Diffusion Framework
- Paper: https://arxiv.org/abs/2604.13994
- Code:
Restore Text First, Enhance Image Later: Two-Stage Scene Text Image Super-Resolution with Glyph Structure Guidance
- Paper: https://arxiv.org/abs/2510.21590
- Code:
Rethinking Diffusion Model-Based Video Super-Resolution: Leveraging Dense Guidance from Aligned Features
SAT: Selective Aggregation Transformer for Image Super-Resolution
- Paper: https://arxiv.org/abs/2604.07994
- Code:
Spectral Super-Resolution via Adversarial Unfolding and Data-Driven Spectrum Regularization
- Paper: https://arxiv.org/abs/2603.00920
- Code:
SR3R: Rethinking Super-Resolution 3D Reconstruction With Feed-Forward Gaussian Splatting
- Paper: https://arxiv.org/abs/2602.24020
- Code:
STCDiT: Spatio-Temporally Consistent Diffusion Transformer for High-Quality Video Super-Resolution
TextOVSR: Text-Guided Real-World Opera Video Super-Resolution
Thermal Diffusion Matters: Infrared Spatial-Temporal Video Super-Resolution through Heat Conduction Priors
Time-Aware One Step Diffusion Network for Real-World Image Super-Resolution
Time Without Time: Pseudo-Temporal Representation for Space-Time Super-Resolution
TinySR: Shallow Diffusion Transformers for Real-World Image Super-Resolution
- Paper: https://openaccess.thecvf.com/content/CVPR2026F/html/Dong_TinySR_Shallow_Diffusion_Transformers_for_Real-World_Image_Super-Resolution_CVPRF_2026_paper.html
- Code: https://github.com/Microtreei/TinySR
Towards Real-Time Diffusion-Based Streaming Video Super-Resolution
Toward Real-world Infrared Image Super-Resolution: A Unified Autoregressive Framework and Benchmark Dataset
TPTransformer: Tensor-Tensor Product Transformer for Hyperspectral Image Super-Resolution
TUDSR: Twice Upsampling-Diffusion for Higher Super-Resolution
UCAN: Unified Convolutional Attention Network for Expansive Receptive Fields in Lightweight Super-Resolution
VSRELL: A Simple Baseline for Video Super-Resolution and Enhancement in Low-Light environment
VoDaSuRe: A Large-Scale Dataset Revealing Domain Shift in Volumetric Super-Resolution
VOSR: A Vision-Only Generative Model for Image Super-Resolution
2. Image Deraining
UniRain: Unified Image Deraining with RAG-based Dataset Distillation and Multi-objective Reweighted Optimization
Unpaired Image Deraining Using Reward-Guided Self-Reinforcement Strategy
- Paper: https://arxiv.org/abs/2605.00719
- Code:
3. Image Dehazing
Bilevel Layer-Positioning LoRA for Real Image Dehazing
- Paper: https://openaccess.thecvf.com/content/CVPR2026/html/Zhang_Bilevel_Layer-Positioning_LoRA_for_Real_Image_Dehazing_CVPR_2026_paper.html
- Code: https://github.com/YanZhang-zy/BiLaLoRA
Disentanglement-wise Image Dehazing through Cross-Domain Manifold Consensus
From Events to Clarity: The Event-Guided Diffusion Framework for Dehazing
- Paper: https://arxiv.org/abs/2511.11944
- Code:
HazeMatching: Dehazing Light Microscopy Images with Guided Conditional Flow Matching
Inf-Dehaze: Beyond GPU Memory Constraints for Ultra-High-Resolution Image Dehazing
- Paper: https://openaccess.thecvf.com/content/CVPR2026F/html/Yan_Inf-Dehaze_Beyond_GPU_Memory_Constraints_for_Ultra-High-Resolution_Image_Dehazing_CVPRF_2026_paper.html
- Code: https://github.com/fengyanzi/Inf-Dehaze
4. Deblurring
BluRef: Unsupervised Image Deblurring with Dense-Matching References
- Paper: https://arxiv.org/abs/2603.14176
- Code:
Event-based Motion Deblurring with Unpaired Data
Gyro-based Deep Video Deblurring
- Paper: https://openaccess.thecvf.com/content/CVPR2026/html/Rim_Gyro-based_Deep_Video_Deblurring_CVPR_2026_paper.html
- Code:
Motion-Aware Animatable Gaussian Avatars Deblurring
MSCD-GS: Motion-Separated Cooperative Deblurring Dynamic Reconstruction via Gaussian Splatting
MVSSM: Motion-aware Visual State Space Model for Efficient Video Deblurring
- Paper: https://openaccess.thecvf.com/content/CVPR2026F/html/Zhou_MVSSM_Motion-aware_Visual_State_Space_Model_for_Efficient_Video_Deblurring_CVPRF_2026_paper.html
- Code: https://github.com/Frank-Zhou-01/MVSSM
OMoBlur: An Object Motion Blur Dataset and Benchmark for Real-World Local Motion Deblurring
- Paper: https://openaccess.thecvf.com/content/CVPR2026/html/Yu_OMoBlur_An_Object_Motion_Blur_Dataset_and_Benchmark_for_Real-World_CVPR_2026_paper.html
- Code: https://github.com/yudingchuan/OMDNet
SelfHVD: Self-Supervised Handheld Video Deblurring
Spatio-Temporal Difference Guided Motion Deblurring with the Complementary Vision Sensor
- Paper: https://arxiv.org/abs/2604.10554
- Code:
Unblur-SLAM: Dense Neural SLAM for Blurry Inputs
5. Denoising
2-Shots in the Dark: Low-Light Denoising with Minimal Data Acquisition
- Paper: https://arxiv.org/abs/2512.03245
- Code:
Back to Basics: Let Denoising Generative Models Denoise
- Paper: https://arxiv.org/abs/2511.13720
- Code:
Convexity-Aware Noise Calibration: A Self-Supervised Framework for Noise-Level-Unknown Image Denoising
Diffusion-Based sRGB Real Noise Generation via Prompt-Driven Noise Representation Learning
- Paper: https://arxiv.org/abs/2603.04870
- Code:
Efficient Real-Time Raw-to-Raw Denoising for Extreme Low-Light Ultra HD Video on Mobile Devices
Learning to Translate Noise for Robust Image Denoising
- Paper: https://arxiv.org/abs/2412.04727
- Code: https://github.com/dhryougit/learning-to-translate-noise
LF-BVN: Blind-View Network for Self-Supervised Light Field Denoising
Next-Scale Prediction: A Self-Supervised Approach for Real-World Image Denoising
- Paper: https://arxiv.org/abs/2512.21038
- Code:
RelativeFlow: Taming Medical Image Denoising Learning with Noisy Reference
- Paper: https://arxiv.org/abs/2604.15459
- Code:
Routing on Demand: DSNet for Efficient Progressive Point Cloud Denoising
Statistical Characteristic-Guided Denoising for Rapid High-Resolution Transmission Electron Microscopy Imaging
TM-BSN: Triangular-Masked Blind-Spot Network for Real-World Self-Supervised Denoising
Zero-Shot Image Denoising via Hybrid Prior-Guided Pseudo Sample Generation
6. Image Restoration
Benchmarking Endoscopic Surgical Image Restoration and Beyond
- Paper: https://arxiv.org/abs/2505.19161
- Code:
Beyond Ground-Truth: Leveraging Image Quality Priors for Real-World Image Restoration
- Paper: https://arxiv.org/abs/2603.29773
- Code:
Beyond the Ground Truth: Enhanced Supervision for Image Restoration
Beyond the Static-World: Lifelong Learning for All-in-One Medical Image Restoration
Blockwise Divide-and-Aggregate for Image Restoration using Diffusion Priors
CARD: Correlation Aware Restoration with Diffusion
- Paper: https://arxiv.org/abs/2512.05268
- Code:
DEBIR: Dynamic Exposure Burst Image Restoration
Degradation-Consistent Test-Time Adaptation for All-in-One Image Restoration
DRIFT: Deep Restoration, ISP Fusion, and Tone-mapping
- Paper: https://arxiv.org/abs/2604.03402
- Code:
EpiAgent: An Agent-Centric System for Ancient Inscription Restoration
Evolutionary Multi-Agent Collaboration for Real-World Video Face Restoration
Face2Scene: Using Facial Degradation as an Oracle for Diffusion-Based Scene Restoration
- Paper: https://arxiv.org/abs/2603.16570
- Code:
FAPE-IR: Frequency-Aware Planning and Execution Framework for All-in-One Image Restoration
FlowSteer: Conditioning Flow Field for Consistent Image Restoration
FoundIR-v2: Optimizing Pre-Training Data Mixtures for Image Restoration Foundation Model
Gaussian Splatting-based Low-Rank Tensor Representation for Multi-Dimensional Image Recovery
gQIR: Generative Quanta Image Reconstruction
GSNR: Graph Smooth Null-Space Representation for Inverse Problems
- Paper: https://arxiv.org/abs/2602.20328
- Code:
HalluGen: Synthesizing Realistic and Controllable Hallucinations for Evaluating Image Restoration
- Paper: https://arxiv.org/abs/2512.03345
- Code:
How far have we gone in Generative Image Restoration? A study on its capability, limitations and evaluation practices
Hybrid Agents for Image Restoration
- Paper: https://arxiv.org/abs/2503.10120
- Code:
LWTformer: A Detail-Aware, Learnable Wavelet-Transformer for Ancient Chinese Character Image Restoration
- Paper: https://openaccess.thecvf.com/content/CVPR2026F/html/Ruan_LWTformer_A_Detail-Aware_Learnable_Wavelet-Transformer_for_Ancient_Chinese_Character_Image_CVPRF_2026_paper.html
- Code: https://github.com/INWLY/LWTformer
MMDIR: Multimodal Instruction-Driven Framework for Mixed-Degradation Document Image Restoration
NanoSD: Edge Efficient Foundation Model for Real Time Image Restoration
- Paper: https://arxiv.org/abs/2601.09823
- Code:
Optical Tolerance-Compensated Diffusion Model for Image Restoration
PGDR-BambooSlips: Physics-Guided Multistep Deformation Reversal for Ancient Bamboo Slip Restoration
- Paper: https://openaccess.thecvf.com/content/CVPR2026/html/Tang_Physics-Guided_Multistep_Deformation_Reversal_for_Ancient_Bamboo_Slip_Restoration_CVPR_2026_paper.html
- Code: https://github.com/VillanelleQQ/PGDR-BambooSlips
Q-MambaIR: Accurate Quantized Mamba for Efficient Image Restoration
- Paper: https://openaccess.thecvf.com/content/CVPR2026F/html/Chen_Q-MambaIR_Accurate_Quantized_Mamba_for_Efficient_Image_Restoration_CVPRF_2026_paper.html
- Code: https://github.com/Areache/Q-MambaIR
RDBM: Residual Diffusion Bridge Model for Image Restoration
Residual Diffusion Bridge Model for Image Restoration
- Paper: https://openaccess.thecvf.com/content/CVPR2026/html/Wang_Residual_Diffusion_Bridge_Model_for_Image_Restoration_CVPR_2026_paper.html
- Code: https://github.com/MiliLab/RDBM
Restore-R1: Efficient Image Restoration Agents via Reinforcement Learning with Multimodal LLM Perceptual Feedback
Restore, Assess, Repeat: A Unified Framework for Iterative Image Restoration
Retrieve-to-Restore: Efficient All-in-One Image Restoration with a Retrieval-Based Degradation Bank
Scan Clusters, Not Pixels: A Cluster-Centric Paradigm for Efficient Ultra-high-definition Image Restoration
Self-supervised Dynamic Heterogeneous Degradation Modeling for Unified Zero-Shot Image Restoration
ShiftLUT: Spatial Shift Enhanced Look-Up Tables for Efficient Image Restoration
Surgical Image Restoration Benchmark
- Paper: https://arxiv.org/abs/2505.19161
- Code: https://github.com/PJLallen/Surgical-Image-Restoration
UARE: A Unified Vision-Language Model for Image Quality Assessment, Restoration, and Enhancement
- Paper: https://arxiv.org/abs/2512.06750
- Code:
UCMNet: Uncertainty-Aware Context Memory Network for Under-Display Camera Image Restoration
- Paper: https://openaccess.thecvf.com/content/CVPR2026/html/Kim_UCMNet_Uncertainty-Aware_Context_Memory_Network_for_Under-Display_Camera_Image_Restoration_CVPR_2026_paper.html
- Code: https://github.com/kdhRick2222/UCMNet
UnfoldIR: Rethinking Deep Unfolding Network in Illumination Degradation Image Restoration
UniLDiff: Unlocking the Power of Diffusion Priors for All-in-One Image Restoration
- Paper: https://arxiv.org/abs/2507.23685
- Code:
VideoFusion: A Spatio-Temporal Collaborative Network for Multi-modal Video Fusion and Restoration
ZeroIDIR: Zero-Reference Illumination Degradation Image Restoration with Perturbed Consistency Diffusion Models
- Paper: https://arxiv.org/abs/2605.11435
- Code:
7. Image Enhancement
Adapting Large VLMs with Iterative and Manual Instructions for Generative Low-light Enhancement
Bi-Bridge: Bidirectional Diffusion Bridges for Low-Light Image Enhancement
BiEvLight: Bi-level Learning of Task-Aware Event Refinement for Low-Light Image Enhancement
- Paper: https://arxiv.org/abs/2603.04975
- Code:
CtrlISP: Rescuing Low-Light RAW Images via Controllable Neural ISP
EIC-LIE: Event-Illumination Collaborative Low-light Image Enhancement
- Paper:
- Code: https://github.com/QUEAHREN/EIC-LIE
Evaluating Low-Light Image Enhancement Across Multiple Intensity Levels
Event-Illumination Collaborative Low-light Image Enhancement with a High-resolution Real-world Dataset
IF-Bench: Benchmarking and Enhancing MLLMs for Infrared Images
Leveraging Multispectral Sensors for Color Correction in Mobile Cameras
MR. Illuminate: Zero-Shot Low-Light Image Enhancement with Diffusion Prior
Multinex: Lightweight Low-Light Image Enhancement via Multi-prior Retinex
NEC-Diff: Noise-Robust Event-RAW Complementary Diffusion for Seeing Motion in Extreme Darkness
PrismNet: Semantic-Aware Image Enhancement via Vision Transformer and Zero-Cost Gating
- Paper: https://openaccess.thecvf.com/content/CVPR2026F/html/Zhang_PrismNet_Semantic-Aware_Image_Enhancement_via_Vision_Transformer_and_Zero-Cost_Gating_CVPRF_2026_paper.html
- Code: https://github.com/kuixu/PrismNet
RodNet: Visual Pathway-Inspired Adaptive Sparse Network for Efficient Low-Light Image Enhancement
SDUIE: Semi-Supervised Diffusion for Underwater Image Enhancement with Quant-Text Dual Control
Towards Generalized Representations for Low-Light Understanding: When Signal Constancy Meets Semantic Enrichment
VSRELL: A Simple Baseline for Video Super-Resolution and Enhancement in Low-Light environment
8. Inpainting
Blend-Aware Latent Diffusion: Mitigating Stitched Seams in Image Inpainting
DreamStereo: Towards Real-Time Stereo Inpainting for HD Videos
- Paper: https://arxiv.org/abs/2604.12270
- Code:
EffectErase: Joint Video Object Removal and Insertion for High-Quality Effect Erasing
From Inpainting to Layer Decomposition: Repurposing Generative Inpainting Models for Image Layer Decomposition
- Paper: https://arxiv.org/abs/2511.20996
- Code:
GOR-IS: 3D Gaussian Object Removal In the Intrinsic Space
HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images
InverFill: One-Step Inversion for Enhanced Few-Step Diffusion Inpainting
- Paper: https://arxiv.org/abs/2603.23463
- Code:
LaRP: Efficient Multi-View Inpainting with Latent Reprojection Priors
MAGIC: Few-Shot Mask-Guided Anomaly Inpainting
- Paper: https://arxiv.org/abs/2507.02314
- Code: https://github.com/SpatialAILab/MAGIC-Anomaly-generation
Object-WIPER: Training-Free Object and Associated Effect Removal in Video
- Paper: https://arxiv.org/abs/2601.06391
- Code:
PHAC: Promptable Human Amodal Completion
- Paper: https://arxiv.org/abs/2603.14741
- Code:
Precise Object and Effect Removal with Adaptive Target-Aware Attention
YOSE: You Only Select Essential Tokens for Efficient DiT-based Video Object Removal
9. HDR Imaging
Beyond8Bits: Full HDR UGC Dataset
- Paper:
- Code: https://github.com/shreshthsaini/Beyond8Bits
ExpoCM: Exposure-Aware One-Step Generative Single-Image HDR Reconstruction
- Paper: https://arxiv.org/abs/2605.02464
- Code:
F²HDR: Two-Stage HDR Video Reconstruction via Flow Adapter and Physical Motion Modeling
LRHDR: Learning Representation-enhanced HDR Video Reconstruction
Olbedo: An Albedo and Shading Aerial Dataset for Large-Scale Outdoor Environments
Seeing through Light and Darkness: Sensor-Physics Grounded Deblurring HDR NeRF from Single-Exposure Images and Events
10. Image Quality Assessment
A^3: Towards Advertising Aesthetic Assessment
ArtiMuse: Fine-Grained Image Aesthetics Assessment with Joint Scoring and Expert-Level Understanding
Bridging the Perception Gap in Image Super-Resolution Evaluation
- Paper: https://arxiv.org/abs/2503.13074
- Code:
Fine-grained Image Aesthetic Assessment: Learning Discriminative Scores from Relative Ranks
FluoCLIP: Stain-Aware Focus Quality Assessment in Fluorescence Microscopy
Generalizable Video Quality Assessment via Weak-to-Strong Learning
- Paper: https://arxiv.org/abs/2505.03631
- Code:
HDR-VLM: HDR-Domain Adaptation of VLMs and Preference-Aligned Quality Assessment for HDR Video Color Grading
Learning Where to Look and How to Judge: Resolution-agnostic Image Quality Assessment with Quality-aware Saliency
Life-IQA: Boosting Blind Image Quality Assessment through GCN-enhanced Layer Interaction and MoE-based Feature Decoupling
- Paper: https://arxiv.org/abs/2511.19024
- Code:
MDS-VQA: Model-Informed Data Selection for Video Quality Assessment
- Paper: https://arxiv.org/abs/2603.11525
- Code:
Pioneering Perceptual Video Fluency Assessment: A Novel Task with Benchmark Dataset and Baseline
- Paper: https://arxiv.org/abs/2603.26055
- Code:
Probabilistic Prompt Adaptation for Unified Image Aesthetics and Quality Assessment
PR-IQA: Partial-Reference Image Quality Assessment for Diffusion Models
QD-PCQA: Quality-Aware Domain Adaptation for Point Cloud Quality Assessment
Rethinking Knowledge Transfer in Image Quality Assessment: A Perceptual Preference Structure Alignment Perspective
RL-ScanIQA: Reinforcement-Learned Scanpaths for Blind 360° Image Quality Assessment
rPPG-VQA: A Video Quality Assessment Framework for Unsupervised rPPG Training
Seeing Beyond 8bits: Subjective and Objective Quality Assessment of HDR-UGC Videos
- Paper: https://arxiv.org/abs/2603.00938
- Code:
VITAL: Vision-Encoder-centered Pre-training for LMMs in Visual Quality Assessment
- Paper: https://arxiv.org/abs/2511.17962
- Code:
11. Frame Interpolation
Anchoring and Rescaling Attention for Semantically Coherent Inbetweening
LDF-VFI: Towards Holistic Modeling for Video Frame Interpolation with Auto-regressive Diffusion Transformers
One-Shot Flow, Any-Time Frame: A Bidirectional Warping Framework for Event-Based Video Frame Interpolation
12. Video/Image Compression
Adaptive Learned Image Compression with Graph Neural Networks
Beyond Pixel Loss: Video-INRs Prefer Perceptual Optimization
- Paper:
- Code:
Block-based Learned Image Compression without Blocking Artifacts
CADC: Content Adaptive Diffusion-Based Generative Image Compression
- Paper: https://arxiv.org/abs/2602.21591
- Code:
CoD: A Diffusion Foundation Model for Image Compression
Differentiable Vector Quantization for Rate-Distortion Optimization of Generative Image Compression
Distributed Image Compression with Multimodal Side Information at Extremely Low Bitrates
DiT-IC: Aligned Diffusion Transformer for Efficient Image Compression
- Paper: https://openaccess.thecvf.com/content/CVPR2026/html/Shi_DiT-IC_Aligned_Diffusion_Transformer_for_Efficient_Image_Compression_CVPR_2026_paper.html
- Code: https://github.com/NJUVISION/DiT-IC
FreqSIC: Frequency-aware Stereo Image Compression with Bi-directional Checkerboard Context Model
Generative Neural Video Compression via Video Diffusion Prior
- Paper: https://arxiv.org/abs/2512.05016
- Code:
Generative Video Compression with One-Dimensional Latent Representation
- Paper: https://arxiv.org/abs/2603.15302
- Code:
Learned Image Compression via Sparse Attention and Adaptive Frequency
Low-Bitrate Video Compression through Semantic-Conditioned Diffusion
MambaSIC: Mamba-based Stereo Image Compression with Bi-directional Multi-reference Entropy Model
OmniZip: Learning a Unified and Lightweight Lossless Compressor for Multi-Modal Data
Parallax to Align Them All: An OmniParallax Attention Mechanism for Distributed Multi-View Image Compression
- Paper: https://arxiv.org/abs/2603.03615
- Code:
Perceptual Neural Video Compression with Color Separation and Rank Chain
Real-Time Neural Video Compression with Unified Intra and Inter Coding
- Paper: https://arxiv.org/abs/2510.14431
- Code:
Ultra-Fast Neural Video Compression
- Paper: https://openaccess.thecvf.com/content/CVPR2026/html/Li_Ultra-Fast_Neural_Video_Compression_CVPR_2026_paper.html
- Code:
Ultra-Low Bitrate Perceptual Image Compression with Shallow Encoder
VLIC: Vision-Language Models As Perceptual Judges for Human-Aligned Image Compression
- Paper: https://arxiv.org/abs/2512.15701
- Code:
What and Where to Adapt: Structure-Semantics Co-Tuning for Machine Vision Compression via Synergistic Adapters
What Matters in Practical Learned Image Compression
- Paper: https://arxiv.org/abs/2605.05148
- Code:
13. Compressed Image/Video Quality Enhancement
14. Image Reflection Removal
GenSIRR: Rectifying Latent Space for Generative Single-Image Reflection Removal
- Paper:
- Code: https://github.com/lime-j/GenSIRR
GFRRN: Explore the Gaps in Single Image Reflection Removal
- Paper: https://arxiv.org/abs/2602.22695
- Code:
LightRR: A Lightweight Network for Single Image Reflection Removal
Polarization State Tracing for Reflection Removal and Color-Consistent Reconstruction
Reflection Separation from a Single Image via Joint Latent Diffusion
ReflexSplit: Single Image Reflection Separation via Layer Fusion-Separation
15. Image Shadow Removal
PhaSR: Generalized Image Shadow Removal with Physically Aligned Priors
16. Image Colorization
ColorFLUX: A Structure-Color Decoupling Framework for Old Photo Colorization
- Paper: https://arxiv.org/abs/2603.28162
- Code:
SketchDeco: Training-Free Latent Composition for Precise Sketch Colourisation
Towards High-resolution and Disentangled Reference-based Sketch Colorization
- Paper: https://arxiv.org/abs/2603.05971
- Code: https://github.com/tellurion-kanata/ColorizeDiffusionXL
17. Image Harmonization
HarmoniDiff-RS: Training-Free Diffusion Harmonization for Satellite Image Composition
HarmoVid: Relightful Video Portrait Harmonization
- Paper: https://openaccess.thecvf.com/content/CVPR2026/html/Choi_Relightful_Video_Portrait_Harmonization_CVPR_2026_paper.html
- Code:
18. Video Stabilization
LightStab: Unsupervised Online Video Stabilization with Classical Priors
- Paper: https://openaccess.thecvf.com/content/CVPR2026/html/Liu_No_Labels_No_Look-Ahead_Unsupervised_Online_Video_Stabilization_with_Classical_CVPR_2026_paper.html
- Code: https://github.com/liutao23/LightStab
No Labels, No Look-Ahead: Unsupervised Online Video Stabilization with Classical Priors
- Paper: https://arxiv.org/abs/2602.23141
- Code:
StabiGS: Video Stabilization through Rendering-Aware Trajectory Optimization in 3DGS-Reconstructed Scenes
19. Image Fusion
Beyond Strict Pairing: Arbitrarily Paired Training for High-Performance Infrared and Visible Image Fusion
Bridging Human Evaluation to Infrared and Visible Image Fusion
- Paper: https://arxiv.org/abs/2603.03871
- Code:
Customized Fusion: A Closed-Loop Dynamic Network for Adaptive Multi-Task-Aware Infrared-Visible Image Fusion
Degradation-Robust Fusion: An Efficient Degradation-Aware Diffusion Framework for Multimodal Image Fusion in Arbitrary Degradation Scenarios
- Paper: https://arxiv.org/abs/2604.08922
- Code:
Fusion in Your Way: Aligning Image Fusion with Heterogeneous Demands via Direct Preference Optimization
- Paper: https://arxiv.org/abs/2605.06049
- Code:
FusionRegister: Every Infrared and Visible Image Fusion Deserves Registration
- Paper: https://openaccess.thecvf.com/content/CVPR2026/html/Bian_FusionRegister_Every_Infrared_and_Visible_Image_Fusion_Deserves_Registration_CVPR_2026_paper.html
- Code: https://github.com/bociic/FusionRegister
More Than Meets the Eye: A Unified Image Fusion Framework via Semantic-Pixel Entropy Trade-off for Zero-Shot Generalization
Multi-Modal Image Fusion via Intervention-Stable Feature Learning
- Paper: https://arxiv.org/abs/2603.23272
- Code:
Missing No More: Dictionary-Guided Cross-Modal Image Fusion under Missing Infrared
Neurodynamics-Driven Coupled Neural P Systems for Multi-Focus Image Fusion
- Paper: https://arxiv.org/abs/2509.17704
- Code:
PhyFusion: Physics-Aware Infrared and Visible Image Fusion via Modality-Specific Physical Priors
ReCoFuse: Ultra-Robust Image Fusion via Restorative Multi-Modal Diffusion Reciprocal Coupling
RegionFuse: Region-Adaptive Pixel Distribution Learning for Infrared and Visible Image Fusion
- Paper: https://openaccess.thecvf.com/content/CVPR2026/html/Xia_RegionFuse_Region-Adaptive_Pixel_Distribution_Learning_for_Infrared_and_Visible_Image_CVPR_2026_paper.html
- Code: https://github.com/DarkIceField/RegionFuse
UniFusion: A Unified Image Fusion Framework with Robust Representation and Source-Aware Preservation
- Paper: https://arxiv.org/abs/2603.14214
- Code:
VideoFusion: A Spatio-Temporal Collaborative Network for Multi-modal Video Fusion and Restoration
20. Others
3M-TI: High-Quality Mobile Thermal Imaging via Calibration-free Multi-Camera Cross-Modal Diffusion
AceTone: Bridging Words and Colors for Conditional Image Grading
Continuous Exposure-Time Modeling for Realistic Atmospheric Turbulence Synthesis
Cross-Scale Pansharpening via ScaleFormer and the PanScale Benchmark
CRFT: Consistent-Recurrent Feature Flow Transformer for Cross-Modal Image Registration
D2Dewarp: Dual Dimensions Geometric Representation Learning Based Document Image Dewarping
Dark3R: Learning Structure from Motion in the Dark
Disentangle-then-Align: Non-Iterative Hybrid Multimodal Image Registration via Cross-Scale Feature Disentanglement
- Paper: https://openaccess.thecvf.com/content/CVPR2026/html/Zhang_Disentangle-then-Align_Non-Iterative_Hybrid_Multimodal_Image_Registration_via_Cross-Scale_Feature_Disentanglement_CVPR_2026_paper.html
- Code: https://github.com/Chunlei0913/HRNet
Elucidating the Design Space of Arbitrary-Noise-Based Diffusion Models
Exploring Spatiotemporal Feature Propagation for Video-Level Compressive Spectral Reconstruction: Dataset, Model and Benchmark
FastGaMer: Efficient GainMap Learning for Practical Inverse Tone Mapping
Fast Kernel-Space Diffusion for Remote Sensing Pansharpening
HFR and HDR Video from Multi-Attenuated Spikes Using a Rapidly Rotating SpokeND Filter
High-Quality and Efficient Turbulence Mitigation with Events
InstantRetouch: Efficient and High-Fidelity Instruction-Guided Image Retouching with Bilateral Space
- Paper: https://openaccess.thecvf.com/content/CVPR2026/html/Wu_InstantRetouch_Efficient_and_High-Fidelity_Instruction-Guided_Image_Retouching_with_Bilateral_Space_CVPR_2026_paper.html
- Code: https://github.com/OpenImagingLab/InstantRetouch
It Takes Two: A Duet of Periodicity and Directionality for Burst Flicker Removal
JarvisEvo: Towards a Self-Evolving Photo Editing Agent with Synergistic Editor-Evaluator Optimization
Language-Guided One-Step Diffusion Model for Nighttime Flare Removal
Learning Latent Transmission and Glare Maps for Lens Veiling Glare Removal
LRDUN: A Low-Rank Deep Unfolding Network for Efficient Spectral Compressive Imaging
MERIT: Multi-domain Efficient RAW Image Translation
- Paper: https://arxiv.org/abs/2603.20836
- Code:
MTRWKV: Multigrain-aware Semantic Prototype Scanning and Tri-Token Prompt Learning Embraced High-Order RWKV for Pan-Sharpening
- Paper: https://openaccess.thecvf.com/content/CVPR2026/html/Li_Multigrain-aware_Semantic_Prototype_Scanning_and_Tri-Token_Prompt_Learning_Embraced_High-Order_CVPR_2026_paper.html
- Code: https://github.com/L1junfeng/MTRWKV
POS-ISP: Pipeline Optimization at the Sequence Level for Task-aware ISP
PromptStereo: Zero-Shot Stereo Matching via Structure and Motion Prompts
- Paper: https://arxiv.org/abs/2603.01650
- Code:
Regulating Rather than Constraining: Adaptive Guidance for Complex Spectral Reconstruction in Pansharpening
RetouchIQ: MLLM Agents for Instruction-Based Image Retouching with Generalist Reward
- Paper: https://arxiv.org/abs/2602.17558
- Code:
Reparameterized Tensor Ring Functional Decomposition for Multi-Dimensional Data Recovery
Seeing Through Blur: Tackling Defocus in Spike-Based Imaging
Spatial-Spectral Residuals Informed Diffusion Neural Operator for Pan-sharpening
SpiralDiff: Spiral Diffusion with LoRA for RGB-to-RAW Conversion Across Cameras
Stability and Non-Local Modeling in Hybrid Convolution-Transformer Networks for Snapshot Hyperspectral Reconstruction
Towards Photorealistic and Efficient Bokeh Rendering via Diffusion Framework
- Paper: https://arxiv.org/abs/2605.07429
- Code:
Towards Universal Computational Aberration Correction in Photographic Cameras: A Comprehensive Benchmark Analysis
UnReflectAnything: RGB-Only Highlight Removal by Rendering Synthetic Specular Supervision
White-Balance First, Adjust Later: Cross-Camera Color Constancy via Vision-Language Evaluation
- Paper: https://arxiv.org/abs/2605.19613v1
- Code:
Continuously updated.