Deep Learning Paper Sharing (6): Simple Baselines for Image Restoration

- Preface
- Abstract
- [1 Introduction](#1 Introduction)
- [2 Related Works](#2 Related Works)
  - [2.1 Image Restoration](#2.1 Image Restoration)
  - [2.2 Gated Linear Units](#2.2 Gated Linear Units)
- [3 Build A Simple Baseline](#3 Build A Simple Baseline)
  - [3.1 Architecture](#3.1 Architecture)
  - [3.2 A Plain Block](#3.2 A Plain Block)
  - [3.3 Normalization](#3.3 Normalization)
  - [3.4 Activation](#3.4 Activation)
  - [3.5 Attention](#3.5 Attention)
  - [3.6 Summary](#3.6 Summary)
- [4 Nonlinear Activation Free Network](#4 Nonlinear Activation Free Network)
- [5 Experiments](#5 Experiments)
  - [5.1 Ablations](#5.1 Ablations)
  - [5.2 Applications](#5.2 Applications)
- [6 Conclusions](#6 Conclusions)
- Appendix
  - [A Other Details](#A Other Details)
    - [A.1 Inverted Bottleneck](#A.1 Inverted Bottleneck)
    - [A.2 Channel Attention and Simplified Channel Attention](#A.2 Channel Attention and Simplified Channel Attention)
    - [A.3 Feature Fusion](#A.3 Feature Fusion)
    - [A.4 Downsample/Upsample Layer](#A.4 Downsample/Upsample Layer)
  - [B More Visualization Results](#B More Visualization Results)
- References

Preface

Paper code: https://github.com/megvii-research/NAFNet

Title: Simple Baselines for Image Restoration

Authors: Liangyu Chen⋆, Xiaojie Chu⋆, Xiangyu Zhang, and Jian Sun

MEGVII Technology, Beijing, CN

This post is only a translation of the paper.

Abstract

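Although there have been significant advances in image restoration in recent years, the system complexity of state-of-the-art (SOTA) methods is increasing as well, which may hinder convenient analysis and comparison of methods. In this paper, we propose a simple baseline that exceeds the SOTA methods and is computationally efficient. To further simplify the baseline, we reveal that nonlinear activation functions, e.g. Sigmoid, ReLU, GELU, and Softmax, are not necessary: they can be replaced by multiplication or removed. Thus, we derive a Nonlinear Activation Free Network, namely NAFNet, from the baseline. SOTA results are achieved on various challenging benchmarks, e.g. 33.69 dB PSNR on GoPro (for image deblurring), exceeding the previous SOTA by 0.38 dB with only 8.4% of its computational cost; and 40.30 dB PSNR on SIDD (for image denoising), exceeding the previous SOTA by 0.28 dB with less than half of its computational cost. The code and pre-trained models are released at github.com/megvii-research/NAFNet.

Keywords: image restoration, image denoising, image deblurring

Figure 1: PSNR vs. computational cost on the image deblurring (left) and image denoising (right) tasks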

1 Introduction

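With the development of deep learning, the performance of image restoration methods has improved significantly. Deep learning based methods [5,37,39,36,6,7,32,8,25] have achieved tremendous success. For example, [39] and [8] achieve 40.02/33.31 dB of PSNR on SIDD [1]/GoPro [26] for image denoising/deblurring respectively.

Despite their good performance, these methods suffer from high system complexity. For clarity of discussion, we decompose the system complexity into two parts: inter-block complexity and intra-block complexity. The first is inter-block complexity, as shown in Figure 2. [7,25] introduce connections between feature maps of different sizes; [5,37] are multi-stage networks in which a later stage refines the results of the previous one. The second is intra-block complexity, i.e. the various design choices inside a block. Examples are the Multi-Dconv Head Transposed Attention Module and the Gated Dconv Feed-Forward Network in [39] (as shown in Figure 3a), the Swin Transformer Block in [22], and the HINBlock in [5]. It is impractical to evaluate these design choices one by one.

Figure 2: Comparison of the architectures of image restoration models. Dashed lines distinguish features of different sizes. (a) The multi-stage architecture [5,37] stacks UNet architectures serially. (b) The multi-scale fusion architecture [25,7] fuses features at different scales. (c) The UNet architecture, which is adopted by some SOTA methods [39,36]. We use it as our architecture. For simplicity, some details are deliberately omitted, e.g. downsample/upsample layers, feature fusion modules, and input/output shortcuts.

Based on the above facts, a natural question arises: is it possible for a network with low inter-block and low intra-block complexity to achieve SOTA performance? To satisfy the first condition (low inter-block complexity), this paper adopts the single-stage UNet as the architecture (following some SOTA methods [39,36]) and focuses on the second condition. To this end, we start with a plain block containing the most common components, i.e. convolution, ReLU, and shortcut [14]. Starting from the plain block, we add or replace components of SOTA methods and verify how much performance gain these components bring. Through extensive ablation studies, we propose a simple baseline, as shown in Figure 3c, that exceeds the SOTA methods and is computationally efficient. It has the potential to inspire new ideas and make their verification easier.

The baseline, which contains GELU [15] and the Channel Attention Module [16] (CA), can be further simplified: we find that the GELU in the baseline can be regarded as a special case of the Gated Linear Unit (GLU) [10], and from this we show empirically that it can be replaced by a simple gate, i.e. the element-wise product of feature maps. In addition, we reveal the similarity in form between CA and GLU, and the nonlinear activation functions in CA can be removed as well. In conclusion, the simple baseline can be further simplified into a Nonlinear Activation Free Network, called NAFNet.

We mainly conduct image denoising experiments on SIDD [1] and image deblurring experiments on GoPro [26], following [5,39,37]. The main results are shown in Figure 1: our proposed baseline and NAFNet achieve SOTA results while being computationally efficient. On GoPro they reach 33.40/33.69 dB, exceeding the previous SOTA [8] by 0.09/0.38 dB respectively with 8.4% of its computational cost; on SIDD they reach 40.30 dB, exceeding [39] by 0.28 dB with less than half of its computational cost. Extensive quantitative and qualitative experiments are conducted to illustrate the effectiveness of the proposed baselines.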

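The contributions of this paper are summarized as follows:

- By decomposing the SOTA methods and extracting their essential components, we form a baseline with lower system complexity (Figure 3c), which exceeds the previous SOTA methods at a lower computational cost, as shown in Figure 1. It may help researchers come up with new ideas and evaluate them conveniently.
- By revealing the connections between GELU, Channel Attention, and the Gated Linear Unit, we further simplify the baseline by removing or replacing the nonlinear activation functions (e.g. Sigmoid, ReLU, and GELU), and propose a nonlinear activation free network, namely NAFNet. Although simplified, it matches or surpasses the baseline. To the best of our knowledge, this is the first work to demonstrate that nonlinear activation functions may not be necessary for SOTA computer vision methods. This work may have the potential to expand the design space of SOTA computer vision methods.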
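The claim that GELU can be viewed as a special case of a gated unit can be checked numerically. The sketch below assumes the generic gate has the form f(x) · σ(g(x)) and works on scalars only; the function names are illustrative and are not taken from the NAFNet repository.

```python
import math


def phi(x: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu(x: float) -> float:
    """GELU(x) = x * Phi(x)."""
    return x * phi(x)


def gate(x: float, f, g, sigma) -> float:
    """Assumed generic gated unit on a scalar: f(x) * sigma(g(x))."""
    return f(x) * sigma(g(x))


def identity(v: float) -> float:
    return v


def gelu_as_gate(x: float) -> float:
    """GELU recovered from the gate by taking f = g = identity and sigma = Phi."""
    return gate(x, identity, identity, phi)


def simple_gate(x: float, y: float) -> float:
    """The simple gate: a plain product with no activation function."""
    return x * y
```

For any input, `gelu(x)` and `gelu_as_gate(x)` agree, which is the sense in which GELU fits the gated form; the simple gate drops Φ and keeps only the product.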

2 Related Works

2.1 Image Restoration

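Image restoration tasks aim to restore a degraded image (e.g. noisy or blurry) to a clean one. Recently, deep learning based methods [5,37,39,36,6,7,32,8,25] have achieved SOTA results on these tasks, and most of them can be viewed as variants of a classical solution, UNet [29]. It stacks blocks into a U-shaped architecture with skip connections. The variants bring performance gains but also system complexity, which we broadly categorize as inter-block complexity and intra-block complexity.

Inter-block Complexity: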

5.1 Ablations

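The ablation studies are conducted on the image denoising (SIDD [1]) and deblurring (GoPro [26]) tasks. Unless otherwise specified, we follow the experimental settings of [5], e.g. a computational budget of 16 GMACs, gradient clipping, and the PSNR loss. We train models with the Adam [19] optimizer (β1 = 0.9, β2 = 0.9, weight decay 0) for 200K iterations in total, with an initial learning rate of 1e−3 that is gradually reduced to 1e−6 by a cosine annealing schedule [24]. The training patch size is 256 × 256 and the batch size is 32. Training on patches and testing on full images causes performance degradation [8]; we address this by adopting TLC [8], following MPRNet-local [8]. The effect of TLC on GoPro is shown in Table 4. We mainly compare TLC with the "test by patches" strategy adopted by [5], [25], etc. TLC brings performance gains and avoids the artifacts introduced by patches. Moreover, we apply skip-init [11] to stabilize training, following [23]. The default width and number of blocks are 32 and 36 respectively. If the number of blocks changes, we adjust the width to keep the computational budget unchanged. We report Peak Signal-to-Noise Ratio (PSNR) and Structural SIMilarity (SSIM) in our experiments. Speed, memory, and computational complexity are evaluated with an input size of 256 × 256 on an NVIDIA 2080Ti GPU.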
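The learning-rate schedule described above can be sketched as follows. This is a minimal illustration of cosine annealing without restarts, assuming one update per iteration; the function name and signature are hypothetical and do not come from the NAFNet training code.

```python
import math


def cosine_lr(step: int,
              total_steps: int = 200_000,
              lr_max: float = 1e-3,
              lr_min: float = 1e-6) -> float:
    """Learning rate at a given iteration under cosine annealing.

    Decays smoothly from lr_max at step 0 to lr_min at total_steps.
    """
    # Clamp so that out-of-range steps do not wrap around the cosine.
    step = min(max(step, 0), total_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
    return lr_min + (lr_max - lr_min) * cosine
```

The decay is slow at the start and end of training and fastest in the middle, where the rate sits halfway between the two endpoints.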

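From PlainNet to the simple baseline: PlainNet is defined in Section 3, and its block is shown in Figure 3b. We find that training PlainNet is unstable under the default settings. As an alternative, we reduce the learning rate (lr) by a factor of 10 to make the model trainable. Introducing Layer Normalization (LN) solves this issue: the learning rate can be increased from 1e−4 to 1e−3, and the training process is more stable. In terms of PSNR, LN brings gains of 0.46 dB and 3.39 dB on SIDD and GoPro respectively. In addition, GELU and Channel Attention (CA) also demonstrate their effectiveness in Table 1.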
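To make the role of Layer Normalization concrete, here is a small sketch of the operation at a single spatial position. It assumes that the statistics are computed over the channel axis of a feature map, which is an assumption about how LN is applied here rather than something stated in the text; the function name and the `eps` value are illustrative.

```python
import math


def layer_norm(channels, gamma=None, beta=None, eps=1e-6):
    """Normalize the channel values at one spatial position.

    channels: list of floats, one value per channel.
    gamma, beta: optional per-channel scale and shift (default 1 and 0).
    """
    n = len(channels)
    mean = sum(channels) / n
    var = sum((v - mean) ** 2 for v in channels) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    gamma = gamma if gamma is not None else [1.0] * n
    beta = beta if beta is not None else [0.0] * n
    return [g * (v - mean) * inv_std + b
            for v, g, b in zip(channels, gamma, beta)]
```

Because the output has roughly zero mean and unit variance regardless of the input scale, activations stay in a stable range, which is consistent with the observation that a larger learning rate becomes usable.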

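From the simple baseline to NAFNet: As described in Section 3, NAFNet can be obtained by simplifying the baseline. In Table 2, we show that this simplification causes no performance loss. On the contrary, PSNR improves by 0.11 dB and 0.50 dB on SIDD and GoPro respectively. For a fair comparison, the computational complexity is kept consistent; see the supplementary material for details. The speedup of each modification relative to the Baseline is provided. In addition, there is no significant extra memory consumption at inference time compared with the Baseline.

Table 1: Building a simple baseline from PlainNet. The effectiveness of Layer Normalization (LN), GELU, and Channel Attention (CA) is verified. * indicates that training is unstable due to the large learning rate (lr).

Table 2: NAFNet is derived by simplifying the baseline, i.e. replacing GELU with SimpleGate (SG) and replacing Channel Attention (CA) with Simplified Channel Attention (SCA).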

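Number of blocks: We verify the effect of the number of blocks on NAFNet in Table 3. We mainly consider the latency at a spatial size of 720 × 1280, since this is the size of a full GoPro image. As the number of blocks increases to 36, the performance of the model improves greatly while the latency does not increase significantly (+14.5% compared with 9 blocks). When the number of blocks is further increased to 72, the performance gain is not obvious, but the latency increases significantly (+30.0% compared with 36 blocks). Because 36 blocks achieve a better performance/latency balance, we use it as the default option.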

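Variants of σ in SimpleGate: The vanilla Gated Linear Unit (GLU) contains a nonlinear activation function σ, as shown in Eqn. 1. Our proposed SimpleGate, as shown in Eqn. 4 and Figure 4c, removes it. In other words, σ in SimpleGate is set to the identity function. In Table 5, we vary σ from the identity function to different nonlinear activation functions to judge the importance of nonlinearity in σ. With a nonlinear σ, the PSNR on SIDD is basically unaffected (it fluctuates between 39.96 dB and 39.99 dB), while the PSNR on GoPro drops significantly (by 0.11 dB to 0.35 dB), which indicates that σ in SimpleGate may not be necessary in NAFNet.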
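The gate X ⊙ σ(Y) and its variants can be sketched in a few lines. The sketch assumes that X and Y are the first and second halves of the channel values at one spatial position; that split, and all names below, are illustrative assumptions rather than the repository's implementation.

```python
import math


def identity(v: float) -> float:
    return v


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def relu(v: float) -> float:
    return max(0.0, v)


def simple_gate(features, sigma=identity):
    """Gate a list of 2c channel values down to c values.

    The list is split into halves X and Y (an assumed layout) and the
    result is X * sigma(Y) element-wise. With sigma = identity this is
    the activation-free SimpleGate.
    """
    if len(features) % 2 != 0:
        raise ValueError("channel count must be even")
    half = len(features) // 2
    x, y = features[:half], features[half:]
    return [a * sigma(b) for a, b in zip(x, y)]
```

Passing `sigmoid` or `relu` as `sigma` gives the kind of nonlinear variant that the ablation compares against the identity setting. In every case the output has half as many channels as the input.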

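Table 3: Effect of the number of blocks. The width is adjusted to keep the computational budget unchanged. Latency-256 and Latency-720 are measured with input sizes of 256 × 256 and 720 × 1280 respectively, in milliseconds.

Table 4: Effectiveness of TLC [8] on GoPro [26]

Table 5: Variants of σ in SimpleGate(X, Y) = X ⊙ σ(Y)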

5.2 Applications

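We apply NAFNet to various image restoration tasks. Unless otherwise specified, we follow the training settings of the ablation studies, except that the width is enlarged from 32 to 64. The batch size and total number of training iterations are 64 and 400K respectively, following [5]. Random crop augmentation is applied. We report the mean of three experimental runs. The baseline is enlarged to achieve better results; see the Appendix for details.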

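RGB Image Denoising: We compare the RGB image denoising results with other SOTA methods on SIDD, as shown in Table 6. The Baseline and its simplified version, NAFNet, exceed the previous best result, Restormer, by 0.28 dB with only a fraction of its computational cost, as shown in Figure 1. Qualitative results are shown in Figure 5. Compared with other methods, our proposed baselines restore finer details. Moreover, we achieve a SOTA result of 40.15 dB on the online benchmark, exceeding the previous top-ranked method by 0.23 dB.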
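To put the decibel figures in context, the following sketch computes PSNR from the mean squared error and converts a PSNR gain into the corresponding MSE ratio. It assumes pixel values in [0, 1] given as flat lists; the helper names are hypothetical and this is not the evaluation code used for the reported numbers.

```python
import math


def psnr(reference, restored, max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized signals."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db: float) -> float:
    """Factor by which the MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (-gain_db / 10.0)
```

Since PSNR is logarithmic in the error, a gain of 0.28 dB corresponds to an MSE ratio of roughly 0.94, as `mse_ratio_for_gain(0.28)` shows.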

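Image Deblurring: We compare the deblurring results of SOTA methods on the GoPro [26] dataset, with flip and rotate augmentations adopted. As shown in Table 7 and Figure 1, our baseline and NAFNet exceed the previous best method, MPRNet-local [8], by 0.09 dB and 0.38 dB in PSNR respectively, with only 8.4% of its computational cost. Visualization results are shown in Figure 6: compared with other methods, our baselines restore sharper results.

Figure 6: Qualitative comparison of image deblurring methods on GoPro [26]

Table 6: Image denoising results on SIDD [1]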

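Raw Image Denoising: We apply NAFNet to the raw image denoising task. The training and testing settings follow PMRID [35], and for simplicity we denote the test set as 4Scenes (as the dataset contains 39 raw images of 4 different scenes under various lighting conditions). In addition, for a fair comparison we change the width and number of blocks of NAFNet from 32 to 16 and from 36 to 7 respectively, so that the computational cost is lower than that of PMRID. The results in Table 8 and Figure 7 show that NAFNet surpasses PMRID both quantitatively and qualitatively. This experiment also shows that NAFNet can be scaled flexibly (from 1.1 GMACs to 65 GMACs).

Image Deblurring with JPEG artifacts: We conduct experiments on the REDS [27] dataset. The training setting follows [5,32], and we evaluate the results on 300 images from the validation set (denoted as REDS-val-300), following [5,32]. As shown in Table 9, our method outperforms other competing methods, including HINet, the previous winning solution on the REDS dataset in the NTIRE 2021 Image Deblurring Challenge Track 2 (JPEG artifacts) [27].

Table 7: Image deblurring results on GoPro [26]

Figure 7: Qualitative comparison of the denoising results of PMRID [35] and our NAFNet. Zoom in to see details.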

6 Conclusions

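By decomposing the SOTA methods, we extract the essential components and apply them to a plain PlainNet. The resulting baseline reaches SOTA performance on image denoising and image deblurring tasks. By analyzing the baseline, we find that it can be further simplified: the nonlinear activation functions in it can be completely replaced or removed. From this, we propose a nonlinear activation free network, NAFNet. Although simplified, its performance is equal to or better than that of the baseline. Our proposed baselines may help researchers evaluate their ideas. In addition, this work has the potential to influence future computer vision model design, as we demonstrate that nonlinear activation functions are not necessary to achieve SOTA performance.

Acknowledgment: This research was supported by the National Key R&D Program of China (No. 2017YFA0700800) and the Beijing Academy of Artificial Intelligence (BAAI).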

Appendix

A Other Details

A.1 Inverted Bottleneck

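Following [23], we adopt the inverted bottleneck design in the baseline and NAFNet. We first discuss the setting of the ablation studies. In the baseline, the channel width within the first skip connection is always consistent with the input, and its computational cost can be approximated by:

H × W × c × c + H × W × c × k × k + H × W × c × c    (1)

where H and W are the spatial size of the feature map, c is the input dimension, and k is the kernel size of the depthwise convolution (3 in our experiments). In practice, c is much larger than k, so Eqn. (1) ≈ 2 × H × W × c × c. The hidden dimension within the second skip connection is twice the input dimension, and its computational cost is:

H × W × c × 2c + H × W × 2c × c

with notation following Eqn. (1). As a result, the overall computational cost of a baseline block ≈ 6 × H × W × c × c.

As for NAFNet's block, the SimpleGate module shrinks the channel width by half. We double the hidden dimension in the first skip connection, and its computational cost can be approximated by:

H × W × c × 2c + H × W × 2c × k × k + H × W × c × c

with notation following Eqn. (1). The hidden dimension in the second skip connection follows the baseline, and its computational cost is:

H × W × c × 2c + H × W × c × c

As a result, the overall computational cost of a NAFNet block ≈ 6 × H × W × c × c, which is consistent with the baseline's block. The advantage of this is that the baseline and NAFNet can share hyperparameters, such as the number of blocks and the learning rate.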
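The cost accounting for the two blocks can be checked with a short calculation. The sketch below assumes each skip connection consists of pointwise convolutions plus, in the first one, a depthwise k × k convolution, and that SimpleGate halves the channel width before the last pointwise convolution; these per-layer terms are an interpretation of the description above, and the function names are illustrative.

```python
def baseline_block_macs(h: int, w: int, c: int, k: int = 3) -> int:
    """Approximate multiply-accumulates of one baseline block."""
    # First skip connection: 1x1 conv (c -> c), depthwise kxk, 1x1 conv (c -> c).
    first = h * w * c * c + h * w * c * k * k + h * w * c * c
    # Second skip connection: 1x1 conv (c -> 2c), then 1x1 conv (2c -> c).
    second = h * w * c * 2 * c + h * w * 2 * c * c
    return first + second


def nafnet_block_macs(h: int, w: int, c: int, k: int = 3) -> int:
    """Approximate multiply-accumulates of one NAFNet block."""
    # First skip connection: 1x1 conv (c -> 2c), depthwise kxk on 2c channels,
    # SimpleGate (2c -> c, no MACs counted), 1x1 conv (c -> c).
    first = h * w * c * 2 * c + h * w * 2 * c * k * k + h * w * c * c
    # Second skip connection: 1x1 conv (c -> 2c), SimpleGate, 1x1 conv (c -> c).
    second = h * w * c * 2 * c + h * w * c * c
    return first + second


def approx_block_macs(h: int, w: int, c: int) -> int:
    """The shared approximation 6 * H * W * c * c."""
    return 6 * h * w * c * c
```

With c = 32 and k = 3, the per-pixel counts are 6432 for the baseline block and 6720 for the NAFNet block, against an approximation of 6144. The gap comes only from the depthwise term and shrinks as c grows.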

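For the applications, the hidden dimension of the baseline's first skip connection is expanded to achieve better results. Note also that the discussion above omits the computation of some modules, e.g. layer normalization, GELU, and channel attention, because their computational cost is negligible compared with convolution.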

A.2 Channel Attention and Simplified Channel Attention

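For a feature map of width c, the channel attention module shrinks it by a factor of r and then projects it back to c (through fully-connected layers). The computational cost can be approximated by c × c/r + c/r × c. For the simplified channel attention module, the computational cost is c × c. For a fair comparison, we choose r = 2 so that their computational costs are consistent in the experiments.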
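The choice of r = 2 follows directly from the two cost expressions, as this small check illustrates. The function names are hypothetical, and the sketch assumes c is divisible by r.

```python
def channel_attention_cost(c: int, r: int) -> int:
    """Cost of the two fully-connected layers: c -> c/r and c/r -> c."""
    if r <= 0 or c % r != 0:
        raise ValueError("c must be divisible by a positive r")
    hidden = c // r
    return c * hidden + hidden * c


def simplified_channel_attention_cost(c: int) -> int:
    """Cost of a single c -> c projection."""
    return c * c
```

The channel attention cost is 2c²/r, so it equals c² exactly when r = 2; a larger r would make the original module cheaper than the simplified one.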

A.3 Feature Fusion

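There are skip connections from the encoder blocks to the decoder blocks, and there are several ways to fuse the encoder and decoder features. In [5], the encoder features are transformed by a convolution and then concatenated with the decoder features. In [39], the features are concatenated first and then transformed by a convolution. In contrast, we simply add the encoder and decoder features element-wise as the feature fusion approach.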
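Element-wise fusion is simple enough to show directly. The sketch below represents feature maps as nested Python lists of identical shape, which is purely an illustrative choice; the function name is hypothetical.

```python
def fuse_add(encoder_feat, decoder_feat):
    """Add two nested-list feature maps of identical shape element-wise."""
    enc_is_list = isinstance(encoder_feat, list)
    dec_is_list = isinstance(decoder_feat, list)
    if enc_is_list != dec_is_list:
        raise ValueError("shape mismatch")
    if enc_is_list:
        if len(encoder_feat) != len(decoder_feat):
            raise ValueError("shape mismatch")
        return [fuse_add(a, b) for a, b in zip(encoder_feat, decoder_feat)]
    # Leaf values: plain numeric addition.
    return encoder_feat + decoder_feat
```

Unlike concatenation, which doubles the channel count and needs a convolution to bring it back, addition keeps the shape unchanged and introduces no parameters.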

A.4 Downsample/Upsample Layer

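For the downsample layer, we use a convolution with a kernel size of 2 and a stride of 2. This design choice is inspired by [2]. For the upsample layer, we first double the channel width with a pointwise convolution and then apply a pixel shuffle module [31].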
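The pixel shuffle step of the upsample layer rearranges channels into space. The sketch below implements that rearrangement on nested lists, assuming an upscale factor of 2 (chosen here to mirror the stride-2 downsample) and the common layout in which input channel c·r² + i·r + j feeds output offset (i, j); the function name is illustrative.

```python
def pixel_shuffle(x, r: int = 2):
    """Rearrange a [C*r*r][H][W] nested list into [C][H*r][W*r]."""
    channels_in = len(x)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r * r")
    h, w = len(x[0]), len(x[0][0])
    c_out = channels_in // (r * r)
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out
```

Under these assumptions, doubling the width from c to 2c and then shuffling with r = 2 yields c/2 channels at twice the spatial resolution, so the upsample layer halves the channel width while the downsample side can double it.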

B More Visualization Results

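We provide additional visualization results for the raw image denoising, image deblurring, and RGB image denoising tasks, as shown in Figures 1, 2, and 3. Compared with other methods, our baselines restore finer details. It is recommended to zoom in and compare the details in the red boxes.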

References

1. Abdelhamed, A., Lin, S., Brown, M.S.: A high-quality denoising dataset for smartphone cameras. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (June 2018)
2. Alsallakh, B., Kokhlikyan, N., Miglani, V., Yuan, J., Reblitz-Richardson, O.: Mind the pad--cnns can develop blind spots. arXiv preprint arXiv:2010.02178 (2020)
3. Ba, J.L., Kiros, J.R., Hinton, G.E.: Layer normalization. arXiv preprint arXiv:1607.06450 (2016)
4. Chen, H., Wang, Y., Guo, T., Xu, C., Deng, Y., Liu, Z., Ma, S., Xu, C., Xu, C., Gao, W.: Pre-trained image processing transformer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 12299--12310 (2021)
5. Chen, L., Lu, X., Zhang, J., Chu, X., Chen, C.: Hinet: Half instance normalization network for image restoration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 182--192 (2021)
6. Cheng, S., Wang, Y., Huang, H., Liu, D., Fan, H., Liu, S.: Nbnet: Noise basis learning for image denoising with subspace projection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 4896--4906 (2021)
7. Cho, S.J., Ji, S.W., Hong, J.P., Jung, S.W., Ko, S.J.: Rethinking coarse-to-fine approach in single image deblurring. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 4641--4650 (2021)
8. Chu, X., Chen, L., Chen, C., Lu, X.: Improving image restoration by revisiting global information aggregation. arXiv preprint arXiv:2112.04491 (2021)
9. Dai, Z., Yang, Z., Yang, Y., Carbonell, J., Le, Q.V., Salakhutdinov, R.: Transformer-xl: Attentive language models beyond a fixed-length context. arXiv preprint arXiv:1901.02860 (2019)
10. Dauphin, Y.N., Fan, A., Auli, M., Grangier, D.: Language modeling with gated convolutional networks. In: International conference on machine learning. pp. 933-- PMLR (2017)
11. De, S., Smith, S.: Batch normalization biases residual blocks towards the identity function in deep networks. Advances in Neural Information Processing Systems 33, 19964--19975 (2020)
12. Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., et al.: An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020)
13. Han, Q., Fan, Z., Dai, Q., Sun, L., Cheng, M.M., Liu, J., Wang, J.: Demystifying local vision transformer: Sparse connectivity, weight sharing, and dynamic weight. arXiv preprint arXiv:2106.04263 (2021)
14. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 770--778 (2016)
15. Hendrycks, D., Gimpel, K.: Gaussian error linear units (gelus). arXiv preprint arXiv:1606.08415 (2016)
16. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 7132--7141 (2018)
17. Hua, W., Dai, Z., Liu, H., Le, Q.V.: Transformer quality in linear time. arXiv preprint arXiv:2202.10447 (2022)
18. Ioffe, S., Szegedy, C.: Batch normalization: Accelerating deep network training by reducing internal covariate shift. In: International conference on machine learning. pp. 448--456. PMLR (2015)
19. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
20. Liang, J., Cao, J., Fan, Y., Zhang, K., Ranjan, R., Li, Y., Timofte, R., Van Gool, L.: Vrt: A video restoration transformer. arXiv preprint arXiv:2201.12288 (2022)
21. Liang, J., Cao, J., Sun, G., Zhang, K., Van Gool, L., Timofte, R.: Swinir: Image restoration using swin transformer. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 1833--1844 (2021)
22. Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., Guo, B.: Swin transformer: Hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 10012--10022 (2021)
23. Liu, Z., Mao, H., Wu, C.Y., Feichtenhofer, C., Darrell, T., Xie, S.: A convnet for the 2020s. arXiv preprint arXiv:2201.03545 (2022)
24. Loshchilov, I., Hutter, F.: Sgdr: Stochastic gradient descent with warm restarts. arXiv preprint arXiv:1608.03983 (2016)
25. Mao, X., Liu, Y., Shen, W., Li, Q., Wang, Y.: Deep residual fourier transformation for single image deblurring. arXiv preprint arXiv:2111.11745 (2021)
26. Nah, S., Hyun Kim, T., Mu Lee, K.: Deep multi-scale convolutional neural network for dynamic scene deblurring. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 3883--3891 (2017)
27. Nah, S., Son, S., Lee, S., Timofte, R., Lee, K.M.: Ntire 2021 challenge on image deblurring. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 149--165 (2021)
28. Nair, V., Hinton, G.E.: Rectified linear units improve restricted boltzmann machines. In: Icml (2010)
29. Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical image computing and computer-assisted intervention. pp. 234--241. Springer (2015)
30. Shazeer, N.: Glu variants improve transformer. arXiv preprint arXiv:2002.05202 (2020)
31. Shi, W., Caballero, J., Huszár, F., Totz, J., Aitken, A.P., Bishop, R., Rueckert, D., Wang, Z.: Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 1874--1883 (2016)
32. Tu, Z., Talebi, H., Zhang, H., Yang, F., Milanfar, P., Bovik, A., Li, Y.: Maxim: Multi-axis mlp for image processing. arXiv preprint arXiv:2201.02973 (2022)
33. Ulyanov, D., Vedaldi, A., Lempitsky, V.: Instance normalization: The missing ingredient for fast stylization. arXiv preprint arXiv:1607.08022 (2016)
34. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
35. Wang, Y., Huang, H., Xu, Q., Liu, J., Liu, Y., Wang, J.: Practical deep raw image denoising on mobile devices. In: European Conference on Computer Vision. pp. 1--16. Springer (2020)
36. Wang, Z., Cun, X., Bao, J., Liu, J.: Uformer: A general u-shaped transformer for image restoration. arXiv preprint arXiv:2106.03106 (2021)
37. Waqas Zamir, S., Arora, A., Khan, S., Hayat, M., Shahbaz Khan, F., Yang, M.H., Shao, L.: Multi-stage progressive image restoration. arXiv e-prints pp. arXiv--2102 (2021)
38. Yan, J., Wan, R., Zhang, X., Zhang, W., Wei, Y., Sun, J.: Towards stabilizing batch statistics in backward propagation of batch normalization. arXiv preprint arXiv:2001.06838 (2020)
39. Zamir, S.W., Arora, A., Khan, S., Hayat, M., Khan, F.S., Yang, M.H.: Restormer: Efficient transformer for high-resolution image restoration. arXiv preprint arXiv:2111.09881 (2021)
40. Zamir, S.W., Arora, A., Khan, S., Hayat, M., Khan, F.S., Yang, M.H., Shao, L.: Learning enriched features for real image restoration and enhancement. In: European Conference on Computer Vision. pp. 492--511. Springer (2020)
