Convolutional Neural Networks (CNN)

This article covers only the theory behind CNNs and gives code for the AlexNet, VGG, and ResNet architectures.

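1. Components

A CNN consists of an input layer, convolutional layers, pooling layers, and fully connected layers.

  • Input layer: receives the input data
  • Convolutional layer: extracts image features
  • Pooling layer: compresses (downsamples) the features
  • Fully connected layer: prepares the output; it behaves like an ordinary one-dimensional neural network and is not described separately below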

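2. Neural networks vs. CNNs

On the left is an ordinary neural network, on the right a convolutional neural network; both are shown with fairly simple structures. A CNN extends the basic neural network from one-dimensional layers to three-dimensional volumes, which makes it well suited to operating on images.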

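3. Convolutional layer

Suppose we feed an image of size $32 \times 32 \times 3$ into a CNN. The convolutional layer extracts features from it: the input is the image and the output is a feature map. We first need to set the following parameters of the convolutional layer:

  • Stride: how far the kernel moves at each step. A stride of 1 is typical, i.e. the kernel shifts one pixel to the right after each convolution operation.
  • Kernel size: sets the region covered by each convolution operation. Odd sizes such as $3 \times 3$ or $11 \times 11$ are common.
  • Padding: to control the size of the resulting feature map, zero-valued pixels are sometimes added around the original image before convolving.
  • Number of kernels: how many kernels the layer uses; each kernel produces one feature map.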
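
To make the padding parameter concrete, here is a minimal pure-Python sketch. The helper name `zero_pad` and the sample values are assumptions made for illustration; they do not come from the article.

```python
def zero_pad(image, p):
    """Surround a 2-D image (a list of rows) with p rows and columns of zeros."""
    width = len(image[0]) + 2 * p
    padded = [[0] * width for _ in range(p)]           # zero rows on top
    for row in image:
        padded.append([0] * p + list(row) + [0] * p)   # zeros left and right
    padded.extend([0] * width for _ in range(p))       # zero rows at the bottom
    return padded


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
padded = zero_pad(image, 1)
for row in padded:
    print(row)
```

With a padding of 1 the $3 \times 3$ image becomes $5 \times 5$, so a $3 \times 3$ kernel moving with stride 1 again yields a $3 \times 3$ feature map.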

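Convolution operation

Within the region covered by the kernel, each pixel of the original image is multiplied by the corresponding kernel entry and the products are summed. Taking the red box, i.e. the first convolution operation, as an example, the result is: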
$$0 \times 1 + 2 \times 0 + 4 \times 1 + 1 \times 0 + 3 \times 1 + 5 \times 0 + 30 \times 1 + 12 \times 0 + 32 \times 1 = 69$$

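The figure shows convolution on a single channel. Since our input is a three-channel RGB image, we need three 2-D kernels, one per channel (together they form a single kernel with a depth of 3). Each is convolved with its own channel, and the three results are summed element-wise to give one feature map.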
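
The sliding-window computation and the per-channel summation can be sketched in plain Python as follows. The function names and all pixel and kernel values are made up for illustration, and the bias term that real layers usually add is left out.

```python
def conv2d_single(channel, kernel, stride=1):
    """Slide `kernel` over one 2-D channel (no padding) and return the result."""
    k = len(kernel)
    out_h = (len(channel) - k) // stride + 1
    out_w = (len(channel[0]) - k) // stride + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            # Multiply each pixel in the window by the matching kernel entry.
            for di in range(k):
                for dj in range(k):
                    total += channel[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(total)
        out.append(row)
    return out


def conv2d_multi(channels, kernels, stride=1):
    """Convolve each channel with its own 2-D kernel, then sum element-wise."""
    per_channel = [conv2d_single(c, k, stride) for c, k in zip(channels, kernels)]
    h, w = len(per_channel[0]), len(per_channel[0][0])
    return [[sum(m[i][j] for m in per_channel) for j in range(w)] for i in range(h)]


# Hypothetical 4x4 "RGB" input and one 3x3 kernel per channel.
r = [[1, 0, 2, 1], [0, 1, 0, 2], [2, 0, 1, 0], [1, 2, 0, 1]]
g = [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]
b = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
k_r = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
k_g = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
k_b = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

feature_map = conv2d_multi([r, g, b], [k_r, k_g, k_b])
print(feature_map)  # [[12, 7], [7, 12]]
```

Three input channels go in and a single $2 \times 2$ feature map comes out; a layer with several kernels simply repeats this once per kernel.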

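Feature map size

The size of the convolution output can be computed with the formulas below, where $H$ and $W$ are the height and width (subscript 1 for the input, 2 for the output), $F_H$ and $F_W$ are the kernel height and width, $P$ is the padding, and $S$ is the stride:
$$H_2 = \frac{H_1 - F_H + 2P}{S} + 1$$
$$W_2 = \frac{W_1 - F_W + 2P}{S} + 1$$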
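
As a quick check of the output-size formula, the helper below evaluates it along one axis. The name `conv_output_size` is an assumption for illustration. It uses integer division, which rounds down when the stride does not divide evenly.

```python
def conv_output_size(n, f, p=0, s=1):
    """Output length along one axis: floor((n - f + 2p) / s) + 1."""
    return (n - f + 2 * p) // s + 1


print(conv_output_size(32, 3, p=1, s=1))   # 32: padding of 1 keeps the size
print(conv_output_size(32, 3, p=0, s=1))   # 30: no padding shrinks it by 2
print(conv_output_size(32, 11, p=0, s=1))  # 22
```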

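4. Pooling layer

The pooling layer downsamples (compresses) the feature map. There are several variants, including Max Pooling, Min Pooling, and Average Pooling. Only Max Pooling and Average Pooling are described here; the others follow by analogy:

Max Pooling: takes the largest value within each sampling window.

Average Pooling: computes the mean of the values within each window and takes that value.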
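
A minimal sketch of both pooling variants, assuming a $2 \times 2$ window with stride 2. The function name `pool2d` and the sample feature map are illustrative only.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Downsample a 2-D feature map by reducing each window to one value."""
    reducers = {
        "max": max,
        "min": min,
        "avg": lambda window: sum(window) / len(window),
    }
    reduce_window = reducers[mode]
    out = []
    for i in range(0, len(feature_map) - size + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, stride):
            window = [feature_map[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            row.append(reduce_window(window))
        out.append(row)
    return out


fm = [[1, 3, 2, 4],
      [5, 6, 1, 2],
      [7, 2, 9, 0],
      [1, 4, 3, 8]]

print(pool2d(fm, mode="max"))  # [[6, 4], [7, 9]]
print(pool2d(fm, mode="avg"))  # [[3.75, 2.25], [3.5, 5.0]]
```

Either way the $4 \times 4$ map shrinks to $2 \times 2$; only the rule for choosing the representative value differs.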

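5. Building the network

When assembling a CNN, an activation function is added after each convolutional layer; deep networks generally use ReLU. Each convolutional layer (conv) or fully connected layer (fc) counts as one layer of the network, while activation and pooling steps are not counted. Below we take a four-layer network as an example: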
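
The sketch below shows what ReLU does to a feature map and how layers are counted. The layer sequence is a hypothetical four-layer arrangement chosen for illustration, not one taken from the article's figure.

```python
def relu(feature_map):
    """Replace every negative value with zero, leaving the rest unchanged."""
    return [[max(0, v) for v in row] for row in feature_map]


# Hypothetical network: only "conv" and "fc" entries count as layers.
layers = ["conv", "relu", "pool", "conv", "relu", "pool", "fc", "relu", "fc"]
depth = sum(1 for name in layers if name in ("conv", "fc"))

print(relu([[-2, 3], [0.5, -1]]))  # [[0, 3], [0.5, 0]]
print(depth)                       # 4
```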

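6. Common convolutional neural networks

We build these networks with the Keras API in TensorFlow. Only the code is shown here; a follow-up post will explain it. The code defines the network structures only, so if you are familiar with the TensorFlow training workflow you can try training with the networks below. The author has tried training each of them on datasets that ship with TensorFlow.

AlexNet

AlexNet is an early and influential deep convolutional network. It has eight layers, five convolutional and three fully connected, and its first convolutional layer uses an $11 \times 11$ kernel with a padding of 0. The code below instead uses $3 \times 3$ kernels in every convolutional layer and adds batch normalization after the first two.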

import tensorflow as tf


class AlexNet8(tf.keras.Model):
    def __init__(self):
        super(AlexNet8, self).__init__()
        self.conv1 = tf.keras.layers.Conv2D(filters=96, kernel_size=(3, 3),
                                            padding='valid', strides=1)
        self.bn1 = tf.keras.layers.BatchNormalization()
        self.activation1 = tf.keras.layers.Activation('relu')
        self.pool1 = tf.keras.layers.MaxPooling2D(pool_size=(3, 3), strides=2)

        self.conv2 = tf.keras.layers.Conv2D(filters=256, kernel_size=(3, 3),
                                            padding='valid', strides=1)
        self.bn2 = tf.keras.layers.BatchNormalization()
        self.activation2 = tf.keras.layers.Activation('relu')
        self.pool2 = tf.keras.layers.MaxPooling2D(pool_size=(3, 3), strides=2)

        self.conv3 = tf.keras.layers.Conv2D(filters=384, kernel_size=(3, 3),
                                            padding='same', activation='relu',
                                            strides=1)

        self.conv4 = tf.keras.layers.Conv2D(filters=384, kernel_size=(3, 3),
                                            padding='same', activation='relu',
                                            strides=1)

        self.conv5 = tf.keras.layers.Conv2D(filters=256, kernel_size=(3, 3),
                                            padding='same', activation='relu',
                                            strides=1)
        self.pool3 = tf.keras.layers.MaxPooling2D(pool_size=(3, 3), strides=2)

        self.flatten = tf.keras.layers.Flatten()
        self.dense1 = tf.keras.layers.Dense(2048, activation='relu')
        self.dropout1 = tf.keras.layers.Dropout(0.5)
        self.dense2 = tf.keras.layers.Dense(2048, activation='relu')
        self.dropout2 = tf.keras.layers.Dropout(0.5)
        self.dense3 = tf.keras.layers.Dense(10, activation='softmax')

    def call(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.activation1(x)
        x = self.pool1(x)

        x = self.conv2(x)
        x = self.bn2(x)
        x = self.activation2(x)
        x = self.pool2(x)

        x = self.conv3(x)
        x = self.conv4(x)
        x = self.conv5(x)
        x = self.pool3(x)

        x = self.flatten(x)
        x = self.dense1(x)
        x = self.dropout1(x)
        x = self.dense2(x)
        x = self.dropout2(x)
        y = self.dense3(x)
        return y
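
To see how the feature map shrinks inside `AlexNet8`, the sketch below applies the output-size formula to each stage. The $32 \times 32$ input is an assumption (the size used as an example earlier in the article). `'valid'` is modelled as a padding of 0 and `'same'` with a $3 \times 3$ kernel and stride 1 as a padding of 1.

```python
def out_size(n, f, p, s):
    """floor((n - f + 2p) / s) + 1 along one axis."""
    return (n - f + 2 * p) // s + 1


# (name, kernel, padding, stride, channels), mirroring AlexNet8 above.
stages = [
    ("conv1", 3, 0, 1, 96),
    ("pool1", 3, 0, 2, 96),
    ("conv2", 3, 0, 1, 256),
    ("pool2", 3, 0, 2, 256),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool3", 3, 0, 2, 256),
]


def trace(size):
    """Return (name, height, width, channels) after every stage."""
    shapes = []
    for name, f, p, s, channels in stages:
        size = out_size(size, f, p, s)
        shapes.append((name, size, size, channels))
    return shapes


shapes = trace(32)  # assumed 32x32 input
for shape in shapes:
    print(shape)

_, h, w, c = shapes[-1]
flat = h * w * c
print(flat)  # 1024 values enter the first Dense layer
```

Under this assumption the spatial size goes 32, 30, 14, 12, 5, 5, 5, 5, 2, so the network still has a valid (if small) feature map by the time it reaches `Flatten`.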

VGG

The structure shown in the figure below is VGG16. It has 16 layers, 13 convolutional and three fully connected, and every kernel is $3 \times 3$.

import tensorflow as tf


class VGGNet(tf.keras.Model):
    def __init__(self):
        super(VGGNet, self).__init__()
        self.conv1 = tf.keras.layers.Conv2D(filters=64, kernel_size=3, padding='same', strides=1)
        self.bn1 = tf.keras.layers.BatchNormalization()
        self.activation1 = tf.keras.layers.Activation('relu')

        self.conv2 = tf.keras.layers.Conv2D(filters=64, kernel_size=3, padding='same', strides=1)
        self.bn2 = tf.keras.layers.BatchNormalization()
        self.activation2 = tf.keras.layers.Activation('relu')
        self.pool1 = tf.keras.layers.MaxPool2D(pool_size=2, strides=2)
        self.dropout1 = tf.keras.layers.Dropout(0.2)

        self.conv3 = tf.keras.layers.Conv2D(filters=128, kernel_size=3, padding='same', strides=1)
        self.bn3 = tf.keras.layers.BatchNormalization()
        self.activation3 = tf.keras.layers.Activation('relu')

        self.conv4 = tf.keras.layers.Conv2D(filters=128, kernel_size=3, padding='same', strides=1)
        self.bn4 = tf.keras.layers.BatchNormalization()
        self.activation4 = tf.keras.layers.Activation('relu')
        self.pool2 = tf.keras.layers.MaxPool2D(pool_size=2, strides=2)
        self.dropout2 = tf.keras.layers.Dropout(0.2)

        self.conv5 = tf.keras.layers.Conv2D(filters=256, kernel_size=3, padding='same', strides=1)
        self.bn5 = tf.keras.layers.BatchNormalization()
        self.activation5 = tf.keras.layers.Activation('relu')

        self.conv6 = tf.keras.layers.Conv2D(filters=256, kernel_size=3, padding='same', strides=1)
        self.bn6 = tf.keras.layers.BatchNormalization()
        self.activation6 = tf.keras.layers.Activation('relu')

        self.conv7 = tf.keras.layers.Conv2D(filters=256, kernel_size=3, padding='same', strides=1)
        self.bn7 = tf.keras.layers.BatchNormalization()
        self.activation7 = tf.keras.layers.Activation('relu')
        self.pool3 = tf.keras.layers.MaxPool2D(pool_size=2, strides=2)
        self.dropout3 = tf.keras.layers.Dropout(0.2)

        self.conv8 = tf.keras.layers.Conv2D(filters=512, kernel_size=3, padding='same', strides=1)
        self.bn8 = tf.keras.layers.BatchNormalization()
        self.activation8 = tf.keras.layers.Activation('relu')

        self.conv9 = tf.keras.layers.Conv2D(filters=512, kernel_size=3, padding='same')
        self.bn9 = tf.keras.layers.BatchNormalization()
        self.activation9 = tf.keras.layers.Activation('relu')

        self.conv10 = tf.keras.layers.Conv2D(filters=512, kernel_size=3, padding='same', strides=1)
        self.bn10 = tf.keras.layers.BatchNormalization()
        self.activation10 = tf.keras.layers.Activation('relu')
        self.pool4 = tf.keras.layers.MaxPool2D(pool_size=2, strides=2)
        self.dropout4 = tf.keras.layers.Dropout(0.2)

        self.conv11 = tf.keras.layers.Conv2D(filters=512, kernel_size=3, padding='same', strides=1)
        self.bn11 = tf.keras.layers.BatchNormalization()
        self.activation11 = tf.keras.layers.Activation('relu')

        self.conv12 = tf.keras.layers.Conv2D(filters=512, kernel_size=3, padding='same', strides=1)
        self.bn12 = tf.keras.layers.BatchNormalization()
        self.activation12 = tf.keras.layers.Activation('relu')

        self.conv13 = tf.keras.layers.Conv2D(filters=512, kernel_size=3, padding='same', strides=1)
        self.bn13 = tf.keras.layers.BatchNormalization()
        self.activation13 = tf.keras.layers.Activation('relu')
        self.pool5 = tf.keras.layers.MaxPool2D(pool_size=2, strides=2)
        self.dropout5 = tf.keras.layers.Dropout(0.2)

        self.flatten = tf.keras.layers.Flatten()
        self.dense1 = tf.keras.layers.Dense(512, activation='relu')
        self.dropout6 = tf.keras.layers.Dropout(0.2)
        self.dense2 = tf.keras.layers.Dense(512, activation='relu')
        self.dropout7 = tf.keras.layers.Dropout(0.2)
        self.dense3 = tf.keras.layers.Dense(10, activation='softmax')

    def call(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.activation1(x)

        x = self.conv2(x)
        x = self.bn2(x)
        x = self.activation2(x)
        x = self.pool1(x)
        x = self.dropout1(x)

        x = self.conv3(x)
        x = self.bn3(x)
        x = self.activation3(x)

        x = self.conv4(x)
        x = self.bn4(x)
        x = self.activation4(x)
        x = self.pool2(x)
        x = self.dropout2(x)

        x = self.conv5(x)
        x = self.bn5(x)
        x = self.activation5(x)

        x = self.conv6(x)
        x = self.bn6(x)
        x = self.activation6(x)

        x = self.conv7(x)
        x = self.bn7(x)
        x = self.activation7(x)
        x = self.pool3(x)
        x = self.dropout3(x)

        x = self.conv8(x)
        x = self.bn8(x)
        x = self.activation8(x)

        x = self.conv9(x)
        x = self.bn9(x)
        x = self.activation9(x)

        x = self.conv10(x)
        x = self.bn10(x)
        x = self.activation10(x)
        x = self.pool4(x)
        x = self.dropout4(x)

        x = self.conv11(x)
        x = self.bn11(x)
        x = self.activation11(x)

        x = self.conv12(x)
        x = self.bn12(x)
        x = self.activation12(x)

        x = self.conv13(x)
        x = self.bn13(x)
        x = self.activation13(x)
        x = self.pool5(x)
        x = self.dropout5(x)

        x = self.flatten(x)
        x = self.dense1(x)
        x = self.dropout6(x)
        x = self.dense2(x)
        x = self.dropout7(x)
        y = self.dense3(x)
        return y
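
The long layer list in `VGGNet` follows a regular pattern that can be written down compactly. The sketch below is one way to summarize it. The list name `VGG16_CONFIG`, the `"P"` marker, and the $32 \times 32$ input size are assumptions for illustration.

```python
# Filter count of each conv layer in VGGNet above; "P" marks a 2x2 max pool.
VGG16_CONFIG = [64, 64, "P",
                128, 128, "P",
                256, 256, 256, "P",
                512, 512, 512, "P",
                512, 512, 512, "P"]

num_conv = sum(1 for item in VGG16_CONFIG if item != "P")
num_pool = VGG16_CONFIG.count("P")
num_dense = 3  # dense1, dense2, dense3

size = 32  # assumed input height and width
for item in VGG16_CONFIG:
    if item == "P":
        size //= 2  # each pooling step halves the spatial size

print(num_conv, num_pool, num_conv + num_dense)  # 13 5 16
print(size)                                      # 1
```

Because every convolution uses `'same'` padding, only the five pooling steps change the spatial size, and a $32 \times 32$ input ends up as a $1 \times 1 \times 512$ volume before `Flatten`.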

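ResNet

When a network is made deeper (beyond roughly 20 layers), accuracy can actually drop, so plain networks with more than 20 layers fail to perform better. ResNet addresses this with shortcut (residual) connections: the input of a block is added to the output of that block's convolutional layers, and the sum is passed on to the next layer. If the extra layers do not help, they can learn an output close to zero, so the block passes its input through almost unchanged.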
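
In the `ResnetBlock` code below, the shortcut is combined with the convolution output by addition (`y + residual`) before the final ReLU. The toy sketch here shows that idea on plain lists. The function names and numbers are illustrative assumptions, and `f` stands in for the block's stacked layers.

```python
def relu(values):
    return [max(0.0, v) for v in values]


def residual_block(x, f):
    """Return relu(f(x) + x): the layers' output plus the unchanged input."""
    fx = f(x)
    return relu([a + b for a, b in zip(fx, x)])


x = [1.0, 2.0, 3.0]

# If the stacked layers contribute nothing, the block passes x through.
passthrough = residual_block(x, lambda v: [0.0 for _ in v])

# Otherwise the layers only need to supply a correction on top of x.
adjusted = residual_block(x, lambda v: [0.5 * a for a in v])

print(passthrough)  # [1.0, 2.0, 3.0]
print(adjusted)     # [1.5, 3.0, 4.5]
```

The pass-through case holds for non-negative inputs, which is what a block receives when the previous block ends in ReLU.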

import tensorflow as tf


class ResnetBlock(tf.keras.Model):

    def __init__(self, filters, strides=1, residual_path=False):
        super(ResnetBlock, self).__init__()
        self.filters = filters
        self.strides = strides
        self.residual_path = residual_path

        self.c1 = tf.keras.layers.Conv2D(filters, (3, 3), strides=strides, padding='same', use_bias=False)
        self.b1 = tf.keras.layers.BatchNormalization()
        self.a1 = tf.keras.layers.Activation('relu')

        self.c2 = tf.keras.layers.Conv2D(filters, (3, 3), strides=1, padding='same', use_bias=False)
        self.b2 = tf.keras.layers.BatchNormalization()

        if residual_path:
            self.down_c1 = tf.keras.layers.Conv2D(filters, (1, 1), strides=strides, padding='same', use_bias=False)
            self.down_b1 = tf.keras.layers.BatchNormalization()

        self.a2 = tf.keras.layers.Activation('relu')

    def call(self, inputs):
        residual = inputs
        x = self.c1(inputs)
        x = self.b1(x)
        x = self.a1(x)

        x = self.c2(x)
        y = self.b2(x)

        if self.residual_path:
            residual = self.down_c1(inputs)
            residual = self.down_b1(residual)

        out = self.a2(y + residual)
        return out


class ResNet18(tf.keras.Model):

    def __init__(self, block_list, initial_filters=64):
        super(ResNet18, self).__init__()
        self.num_blocks = len(block_list)
        self.block_list = block_list
        self.out_filters = initial_filters
        self.c1 = tf.keras.layers.Conv2D(self.out_filters, (3, 3), strides=1, padding='same', use_bias=False)
        self.b1 = tf.keras.layers.BatchNormalization()
        self.a1 = tf.keras.layers.Activation('relu')
        self.blocks = tf.keras.models.Sequential()

        # Build the ResNet structure block by block
        for block_id in range(len(block_list)):
            for layer_id in range(block_list[block_id]):

                if block_id != 0 and layer_id == 0:
                    block = ResnetBlock(self.out_filters, strides=2, residual_path=True)
                else:
                    block = ResnetBlock(self.out_filters, residual_path=False)
                self.blocks.add(block)
            self.out_filters *= 2
        self.p1 = tf.keras.layers.GlobalAveragePooling2D()
        self.f1 = tf.keras.layers.Dense(10, activation='softmax', kernel_regularizer=tf.keras.regularizers.l2())

    def call(self, inputs):
        x = self.c1(inputs)
        x = self.b1(x)
        x = self.a1(x)
        x = self.blocks(x)
        x = self.p1(x)
        y = self.f1(x)
        return y
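
`ResNet18` takes a `block_list` argument, and the article does not say which list the author used. The sketch below assumes `[2, 2, 2, 2]`, the arrangement commonly associated with ResNet-18, and replays the nested loop from `__init__` without TensorFlow to show which settings each `ResnetBlock` would receive. The helper name `plan_blocks` is hypothetical.

```python
def plan_blocks(block_list, initial_filters=64):
    """Mirror the loop in ResNet18.__init__ and list each block's settings."""
    plan = []
    filters = initial_filters
    for block_id, num_layers in enumerate(block_list):
        for layer_id in range(num_layers):
            # The first block of every stage except the first downsamples.
            downsample = block_id != 0 and layer_id == 0
            plan.append({
                "filters": filters,
                "strides": 2 if downsample else 1,
                "residual_path": downsample,
            })
        filters *= 2  # each stage doubles the number of filters
    return plan


plan = plan_blocks([2, 2, 2, 2])  # assumed block list
for block in plan:
    print(block)

# 1 stem conv + 2 convs per block + 1 dense layer
depth = 1 + 2 * len(plan) + 1
print(depth)  # 18
```

Under that assumption the model would be created with `ResNet18([2, 2, 2, 2])`. The $1 \times 1$ convolutions on the shortcut path are not included in the count of 18.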