Machine Learning -- VGG

VGG

We have already seen LeNet and AlexNet: one has 3 layers, the other 7. So what do we do if we want to build a network with even more layers? Do we still write out every layer by hand?

Obviously we can't be that primitive, so a thought occurs: why not try a for loop? qwq

That is the essence of VGG.

Although its name says "very deep network", by today's standards it only has a handful of layers and is actually pretty shallow, haha qwq

vgg_block

A vgg_block is a small net of its own: several convolution layers followed by one max pooling layer, like this:

```python
from mxnet.gluon import nn

def vgg_block(num_convs, channels):
    # num_convs 3x3 conv layers (padding 1 keeps H and W), then one 2x2 max pool
    out = nn.Sequential()
    with out.name_scope():
        for _ in range(num_convs):
            out.add(nn.Conv2D(channels=channels, kernel_size=3, padding=1, activation='relu'))
        out.add(nn.MaxPool2D(pool_size=2, strides=2))
    return out
```

Then we can build one block and try it out:

```python
from mxnet import nd

blk = vgg_block(2, 128)
blk.initialize()
x = nd.random.uniform(shape=(2, 3, 16, 16))    # (batch_size, channels, height, width)
print(blk(x).shape)
```

The output shape is as expected: each 3x3 conv with padding 1 preserves the 16x16 spatial size, and the 2x2 max pool with stride 2 halves it to 8x8:

```
(2, 128, 8, 8)
```

vgg_stack

First we define an architecture, which consists of several tuples:

```python
architecture = ((1, 64), (1, 128), (2, 256), (2, 512), (2, 512))
```

Each tuple describes one vgg_block as (num_convs, channels), and we stack the blocks like this:

```python
def vgg_stack(architecture):
    # chain one vgg_block per (num_convs, channels) tuple
    out = nn.Sequential()
    with out.name_scope():
        for (num_convs, channels) in architecture:
            out.add(vgg_block(num_convs, channels))
    return out
```
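As a quick sanity check (a sketch reusing the imports and the architecture tuple above), we can push a dummy batch through the stack; each of the five blocks halves the height and width, so a 96x96 input comes out 3x3:

```python
stack = vgg_stack(architecture)
stack.initialize()
x = nd.random.uniform(shape=(2, 3, 96, 96))    # (batch_size, channels, height, width)
print(stack(x).shape)                          # (2, 512, 3, 3): 96 halved five times is 3
```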

vgg net

Finally, our net looks like this:

```python
net = nn.Sequential()
with net.name_scope():
    net.add(vgg_stack(architecture))             # convolutional part
    net.add(nn.Flatten())
    net.add(nn.Dense(4096, activation='relu'))   # classifier part
    net.add(nn.Dropout(.5))
    net.add(nn.Dense(4096, activation='relu'))
    net.add(nn.Dropout(.5))
    net.add(nn.Dense(10))

print(net)
```

Output:

```
Sequential(
  (0): Sequential(
    (0): Sequential(
      (0): Conv2D(None -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
      (1): MaxPool2D(size=(2, 2), stride=(2, 2), padding=(0, 0), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
    )
    (1): Sequential(
      (0): Conv2D(None -> 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
      (1): MaxPool2D(size=(2, 2), stride=(2, 2), padding=(0, 0), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
    )
    (2): Sequential(
      (0): Conv2D(None -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
      (1): Conv2D(None -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
      (2): MaxPool2D(size=(2, 2), stride=(2, 2), padding=(0, 0), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
    )
    (3): Sequential(
      (0): Conv2D(None -> 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
      (1): Conv2D(None -> 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
      (2): MaxPool2D(size=(2, 2), stride=(2, 2), padding=(0, 0), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
    )
    (4): Sequential(
      (0): Conv2D(None -> 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
      (1): Conv2D(None -> 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), Activation(relu))
      (2): MaxPool2D(size=(2, 2), stride=(2, 2), padding=(0, 0), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
    )
  )
  (1): Flatten
  (2): Dense(None -> 4096, Activation(relu))
  (3): Dropout(p = 0.5, axes=())
  (4): Dense(None -> 4096, Activation(relu))
  (5): Dropout(p = 0.5, axes=())
  (6): Dense(None -> 10, linear)
)
```
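Before training, we can sanity-check the end-to-end shapes. Here is a minimal sketch using a throwaway copy (check_net is a name introduced here just for illustration, so that the real net is still initialized by the training code below) fed a fake 96x96 single-channel batch, matching the resized Fashion-MNIST input used later:

```python
# Throwaway copy of the same architecture, used only for a shape check.
check_net = nn.Sequential()
with check_net.name_scope():
    check_net.add(vgg_stack(architecture))
    check_net.add(nn.Flatten())
    check_net.add(nn.Dense(4096, activation='relu'))
    check_net.add(nn.Dropout(.5))
    check_net.add(nn.Dense(4096, activation='relu'))
    check_net.add(nn.Dropout(.5))
    check_net.add(nn.Dense(10))
check_net.initialize()                         # shapes are inferred on first forward pass
x = nd.random.uniform(shape=(4, 1, 96, 96))    # fake 96x96 grayscale batch
print(check_net(x).shape)                      # expect (4, 10)
```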

Other VGG variants

Strictly speaking, what we just built should be called VGG-11, since there are 8 convolution layers plus 3 fully connected layers. The more commonly used variants, though, are VGG-16 and VGG-19.

The configuration table in the VGG paper shows how the other variants are defined; you can try building them yourself qwq
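For example, sticking to the same (num_convs, channels) tuple format, the convolutional parts of VGG-16 (configuration D in the paper) and VGG-19 (configuration E) would look like the sketch below; the variable names here are just for illustration:

```python
# Conv configurations from the VGG paper, in the same tuple format as above.
# 13 conv layers + 3 fully connected layers = VGG-16:
vgg16_architecture = ((2, 64), (2, 128), (3, 256), (3, 512), (3, 512))
# 16 conv layers + 3 fully connected layers = VGG-19:
vgg19_architecture = ((2, 64), (2, 128), (4, 256), (4, 512), (4, 512))

net16 = vgg_stack(vgg16_architecture)   # drop-in replacement for vgg_stack(architecture)
```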

Training

Training works the same as for AlexNet and LeNet. Because it is so slow here, the resize target is changed from 224 down to 96:

```python
import time

from mxnet import autograd, gluon, init, nd
import d2lzh as d2l    # assumed helper package providing load_data_fashion_mnist and try_gpu

batch_size, num_epoch, lr = 128, 5, 0.01
train_data, test_data = d2l.load_data_fashion_mnist(batch_size, resize=96)    # resize from 28 * 28 to 96 * 96

ctx = d2l.try_gpu()
net.initialize(ctx=ctx, init=init.Xavier())

softmax_cross_entropy = gluon.loss.SoftmaxCrossEntropyLoss()
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': lr})

def accuracy(output, label):
    return nd.mean(output.argmax(axis=1) == label.astype('float32')).asscalar()

def evaluate_accuracy(data_iterator, net, context):
    acc = 0.
    for data, label in data_iterator:
        data = data.as_in_context(context)     # move the batch to the same device as the net
        label = label.as_in_context(context)
        output = net(data)
        acc += accuracy(output, label)
    return acc / len(data_iterator)

for epoch in range(num_epoch):
    start_time = time.time()
    train_loss, train_acc = 0., 0.
    for data, label in train_data:
        label = label.as_in_context(ctx)
        data = data.as_in_context(ctx)
        with autograd.record():
            out = net(data)
            loss = softmax_cross_entropy(out, label)
        loss.backward()
        trainer.step(batch_size)
        train_loss += nd.mean(loss).asscalar()
        train_acc += accuracy(out, label)
    test_acc = evaluate_accuracy(test_data, net, ctx)
    end_time = time.time()
    print("Epoch %d. Loss : %f, Train acc : %f, Test acc : %f, Used time : %f min" %
                (epoch, train_loss / len(train_data), train_acc / len(train_data), test_acc, (end_time - start_time) / 60))
    if epoch != num_epoch - 1:
        time.sleep(300)                        # let the machine cool down between epochs
        print("CPU has rested 5 min")
```

Output (a single epoch takes about 10 minutes, unbelievable):

```
Epoch 0. Loss : 2.139392, Train acc : 0.270794, Test acc : 0.585443, Used time : 11.752733 min
CPU has rested 5 min
Epoch 1. Loss : 0.861443, Train acc : 0.669210, Test acc : 0.757021, Used time : 10.107111 min
CPU has rested 5 min
Epoch 2. Loss : 0.650519, Train acc : 0.758362, Test acc : 0.796974, Used time : 10.013898 min
CPU has rested 5 min
Epoch 3. Loss : 0.552376, Train acc : 0.795109, Test acc : 0.832081, Used time : 9.947225 min
CPU has rested 5 min
Epoch 4. Loss : 0.488786, Train acc : 0.819946, Test acc : 0.847310, Used time : 9.911732 min
```
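Once training finishes, here is a minimal prediction sketch, assuming test_data, net, and ctx from the training code above are still in scope:

```python
# Classify the first test batch with the trained net.
for data, label in test_data:
    preds = net(data.as_in_context(ctx)).argmax(axis=1)
    print(preds[:10].asnumpy())    # predicted Fashion-MNIST class ids
    print(label[:10].asnumpy())    # ground-truth class ids
    break
```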