Dive into Deep Learning: fixing d2l.Animator not displaying animated plots in PyCharm

from d2l import torch as d2l

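1. Problem description

Running a d2l training function only prints the following lines to the console; the animated plot (the training monitor) is never displayed: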

<Figure size 350x250 with 1 Axes>
<Figure size 350x250 with 1 Axes>
<Figure size 350x250 with 1 Axes>
<Figure size 350x250 with 1 Axes>
<Figure size 350x250 with 1 Axes>
<Figure size 350x250 with 1 Axes>
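
These lines are the plain-text repr of the figure: outside an IPython frontend, IPython's display.display() falls back to printing its argument. A minimal sketch for checking which environment a script is running in is shown below. The helper name running_in_notebook is hypothetical, and testing for the ZMQInteractiveShell class name is a common heuristic rather than an official API.

```python
def running_in_notebook():
    """Heuristically report whether the code runs inside a Jupyter kernel."""
    try:
        from IPython import get_ipython
    except ImportError:
        # IPython is not installed, so this cannot be a notebook
        return False
    shell = get_ipython()  # None when no IPython shell is active
    # Assumption: Jupyter kernels use a shell class named ZMQInteractiveShell
    return shell is not None and shell.__class__.__name__ == "ZMQInteractiveShell"


print(running_in_notebook())
```

When a script is launched from PyCharm's normal Run configuration, this would be expected to print False, which is the situation the fix in the next section targets.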

2. Solution

Modify the add method of d2l.Animator (d2l.Animator.add). The code before and after the change is shown below, in that order.

Original code:
def add(self, x, y):
    # Add multiple data points into the figure
    if not hasattr(y, "__len__"):
        y = [y]
    n = len(y)
    if not hasattr(x, "__len__"):
        x = [x] * n
    if not self.X:
        self.X = [[] for _ in range(n)]
    if not self.Y:
        self.Y = [[] for _ in range(n)]
    for i, (a, b) in enumerate(zip(x, y)):
        if a is not None and b is not None:
            self.X[i].append(a)
            self.Y[i].append(b)
    self.axes[0].cla()
    for x, y, fmt in zip(self.X, self.Y, self.fmts):
        self.axes[0].plot(x, y, fmt)
    self.config_axes()
    display.display(self.fig)
    display.clear_output(wait=True)

Modified code:

def add(self, x, y):
    # Add multiple data points into the figure
    if not hasattr(y, "__len__"):
        y = [y]
    n = len(y)
    if not hasattr(x, "__len__"):
        x = [x] * n
    if not self.X:
        self.X = [[] for _ in range(n)]
    if not self.Y:
        self.Y = [[] for _ in range(n)]
    for i, (a, b) in enumerate(zip(x, y)):
        if a is not None and b is not None:
            self.X[i].append(a)
            self.Y[i].append(b)
    self.axes[0].cla()
    for x, y, fmt in zip(self.X, self.Y, self.fmts):
        self.axes[0].plot(x, y, fmt)
    self.config_axes()
    display.display(self.fig)
    # The next two lines make the animated plot render in PyCharm
    plt.draw()
    plt.pause(interval=0.001)
    display.clear_output(wait=True)
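
The same draw-then-pause pattern can be tried in isolation, without d2l. The following is a minimal sketch that assumes only matplotlib is installed. The function name live_plot and its parameters are hypothetical and are not part of d2l.

```python
import matplotlib.pyplot as plt


def live_plot(values, pause=0.001):
    """Redraw a line plot after each new value; return the figure and points."""
    fig, ax = plt.subplots(figsize=(3.5, 2.5))
    xs, ys = [], []
    for step, value in enumerate(values, start=1):
        xs.append(step)
        ys.append(value)
        ax.cla()               # clear the axes, as Animator.add does
        ax.plot(xs, ys, "-")   # replot everything collected so far
        ax.set_xlabel("step")
        plt.draw()             # request a redraw of the current figure
        plt.pause(pause)       # briefly run the GUI event loop so the window updates
    return fig, list(zip(xs, ys))
```

Calling live_plot([1.0, 0.6, 0.4, 0.3]) and then plt.show() from a script should open one window whose curve grows point by point. Without the plt.pause call, the GUI event loop gets no chance to run between updates, so the window typically does not refresh until the loop ends.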

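In addition, add a call to d2l.plt.show() after the training function that uses the animator, so that the final figure remains displayed once training finishes:
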
d2l.train_ch13(net, train_iter, test_iter, loss, trainer, num_epochs, devices)
d2l.plt.show()

3. Result

Appendix: printing the training metrics to the console instead of drawing an animated plot

Rewrite the training function. Taking d2l.train_ch13 as an example, the code before and after the change is shown below, in that order.

Original code:
def train_ch13(net, train_iter, test_iter, loss, trainer, num_epochs,
               devices=d2l.try_all_gpus()):
    """Train a model with multiple GPUs (defined in Chapter 13).

    Defined in :numref:`sec_image_augmentation`"""
    timer, num_batches = d2l.Timer(), len(train_iter)
    animator = d2l.Animator(xlabel='epoch', xlim=[1, num_epochs], ylim=[0, 1],
                            legend=['train loss', 'train acc', 'test acc'])
    net = nn.DataParallel(net, device_ids=devices).to(devices[0])
    for epoch in range(num_epochs):
        # Sum of training loss, sum of training accuracy, no. of examples,
        # no. of predictions
        metric = d2l.Accumulator(4)
        for i, (features, labels) in enumerate(train_iter):
            timer.start()
            l, acc = train_batch_ch13(
                net, features, labels, loss, trainer, devices)
            metric.add(l, acc, labels.shape[0], labels.numel())
            timer.stop()
            if (i + 1) % (num_batches // 5) == 0 or i == num_batches - 1:
                animator.add(epoch + (i + 1) / num_batches,
                             (metric[0] / metric[2], metric[1] / metric[3],
                              None))
        test_acc = d2l.evaluate_accuracy_gpu(net, test_iter)
        animator.add(epoch + 1, (None, None, test_acc))
    print(f'loss {metric[0] / metric[2]:.3f}, train acc '
          f'{metric[1] / metric[3]:.3f}, test acc {test_acc:.3f}')
    print(f'{metric[2] * num_epochs / timer.sum():.1f} examples/sec on '
          f'{str(devices)}')

Modified code:

from torch import nn
from d2l import torch as d2l


def train_ch13(net, train_iter, test_iter, loss, trainer, num_epochs,
               devices=d2l.try_all_gpus()):
    """Train a model with multiple GPUs, logging metrics to the console."""
    timer, num_batches = d2l.Timer(), len(train_iter)
    # Log about five times per epoch; max() avoids a zero interval (and a
    # ZeroDivisionError) when an epoch has fewer than five batches
    log_interval = max(1, num_batches // 5)
    net = nn.DataParallel(net, device_ids=devices).to(devices[0])
    for epoch in range(num_epochs):
        # Sum of training loss, sum of training accuracy, no. of examples
        metric = d2l.Accumulator(3)
        for i, (features, labels) in enumerate(train_iter):
            timer.start()
            l, acc = d2l.train_batch_ch13(
                net, features, labels, loss, trainer, devices)
            # For 1-D class labels, labels.shape[0] == labels.numel()
            metric.add(l, acc, labels.shape[0])
            timer.stop()
            # The last batch is skipped here; the epoch summary below covers it
            if (i + 1) % log_interval == 0 and i + 1 != num_batches:
                print(f'epoch {epoch + 1}, iter {i + 1}: '
                      f'train loss {metric[0] / metric[2]:.3f}, '
                      f'train acc {metric[1] / metric[2]:.3f}')
        test_acc = d2l.evaluate_accuracy_gpu(net, test_iter)
        print(f'epoch {epoch + 1}: '
              f'train loss {metric[0] / metric[2]:.3f}, '
              f'train acc {metric[1] / metric[2]:.3f}, '
              f'test acc {test_acc:.3f}')
    print(f'{metric[2] * num_epochs / timer.sum():.1f} examples/sec on '
          f'{str(devices)}')
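
The logging condition in the inner loop can be checked on its own. The sketch below isolates it in a hypothetical helper, should_log, and evaluates it for an assumed epoch of 32 batches. The max(1, ...) guard is an assumption added here so that the interval never becomes zero when an epoch has fewer than five batches.

```python
def should_log(i, num_batches, logs_per_epoch=5):
    """Return True when 0-based batch index i is a mid-epoch logging point."""
    # Guard against a zero interval when num_batches < logs_per_epoch
    interval = max(1, num_batches // logs_per_epoch)
    is_last = (i + 1) == num_batches
    # The last batch is excluded because the epoch summary reports it
    return (i + 1) % interval == 0 and not is_last


logged = [i + 1 for i in range(32) if should_log(i, 32)]
print(logged)  # [6, 12, 18, 24, 30]
```

With 32 batches the interval is 32 // 5 = 6, so a line is printed at iterations 6, 12, 18, 24 and 30, and the per-epoch summary covers the remainder.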

The modified training code produces output like the following:

epoch 1, iter 6: train loss 2.580, train acc 0.484
epoch 1, iter 12: train loss 1.871, train acc 0.560
epoch 1, iter 18: train loss 1.390, train acc 0.653
epoch 1, iter 24: train loss 1.111, train acc 0.709
epoch 1, iter 30: train loss 0.936, train acc 0.748
epoch 1: train loss 0.909, train acc 0.754, test acc 0.894
epoch 2, iter 6: train loss 0.257, train acc 0.909
epoch 2, iter 12: train loss 0.266, train acc 0.898
epoch 2, iter 18: train loss 0.255, train acc 0.902
epoch 2, iter 24: train loss 0.257, train acc 0.906
epoch 2, iter 30: train loss 0.252, train acc 0.908
epoch 2: train loss 0.247, train acc 0.910, test acc 0.911
epoch 3, iter 6: train loss 0.207, train acc 0.922
epoch 3, iter 12: train loss 0.191, train acc 0.923
epoch 3, iter 18: train loss 0.204, train acc 0.914
epoch 3, iter 24: train loss 0.212, train acc 0.912
epoch 3, iter 30: train loss 0.211, train acc 0.914
epoch 3: train loss 0.209, train acc 0.915, test acc 0.901
epoch 4, iter 6: train loss 0.192, train acc 0.930
epoch 4, iter 12: train loss 0.213, train acc 0.924
epoch 4, iter 18: train loss 0.222, train acc 0.918
epoch 4, iter 24: train loss 0.212, train acc 0.921
epoch 4, iter 30: train loss 0.215, train acc 0.920
epoch 4: train loss 0.214, train acc 0.919, test acc 0.914
epoch 5, iter 6: train loss 0.191, train acc 0.938
epoch 5, iter 12: train loss 0.188, train acc 0.935
epoch 5, iter 18: train loss 0.193, train acc 0.931
epoch 5, iter 24: train loss 0.191, train acc 0.930
epoch 5, iter 30: train loss 0.196, train acc 0.925
epoch 5: train loss 0.197, train acc 0.924, test acc 0.936
176.0 examples/sec on [device(type='cuda', index=0)]