动手学深度学习d2l.Animator无法在PyCharm中显示动态图片的解决方案

py 复制代码
from d2l import torch as d2l

一、问题描述

运行d2l的训练函数,仅在控制台输出以下内容,无法显示动态图片(训练监控)

py 复制代码
<Figure size 350x250 with 1 Axes>
<Figure size 350x250 with 1 Axes>
<Figure size 350x250 with 1 Axes>
<Figure size 350x250 with 1 Axes>
<Figure size 350x250 with 1 Axes>
<Figure size 350x250 with 1 Axes>

二、解决方案

修改d2l.Animatoradd函数,以下分别是修改前的代码及修改后的代码:

py 复制代码
def add(self, x, y):
    # Add multiple data points into the figure
    if not hasattr(y, "__len__"):
        y = [y]
    n = len(y)
    if not hasattr(x, "__len__"):
        x = [x] * n
    if not self.X:
        self.X = [[] for _ in range(n)]
    if not self.Y:
        self.Y = [[] for _ in range(n)]
    for i, (a, b) in enumerate(zip(x, y)):
        if a is not None and b is not None:
            self.X[i].append(a)
            self.Y[i].append(b)
    self.axes[0].cla()
    for x, y, fmt in zip(self.X, self.Y, self.fmts):
        self.axes[0].plot(x, y, fmt)
    self.config_axes()
    display.display(self.fig)
    display.clear_output(wait=True)
py 复制代码
def add(self, x, y):
    # Add multiple data points into the figure
    if not hasattr(y, "__len__"):
        y = [y]
    n = len(y)
    if not hasattr(x, "__len__"):
        x = [x] * n
    if not self.X:
        self.X = [[] for _ in range(n)]
    if not self.Y:
        self.Y = [[] for _ in range(n)]
    for i, (a, b) in enumerate(zip(x, y)):
        if a is not None and b is not None:
            self.X[i].append(a)
            self.Y[i].append(b)
    self.axes[0].cla()
    for x, y, fmt in zip(self.X, self.Y, self.fmts):
        self.axes[0].plot(x, y, fmt)
    self.config_axes()
    display.display(self.fig)
    # 通过以下两行代码实现了在PyCharm中显示动图
    plt.draw()
    plt.pause(interval=0.001)
    display.clear_output(wait=True)

同时,在使用相关函数时,添加如下一行代码d2l.plt.show(),如下:

py 复制代码
d2l.train_ch13(net, train_iter, test_iter, loss, trainer, num_epochs, devices)
d2l.plt.show()

三、实现效果

附:使用控制台打印输出训练监控信息,而不通过动态图的方式

重写训练函数,以d2l.train_ch13为例,以下分别是修改前的代码及修改后的代码:

py 复制代码
def train_ch13(net, train_iter, test_iter, loss, trainer, num_epochs,
               devices=d2l.try_all_gpus()):
    """Train a model with multiple GPUs (defined in Chapter 13).

    Defined in :numref:`sec_image_augmentation`"""
    timer, num_batches = d2l.Timer(), len(train_iter)
    animator = d2l.Animator(xlabel='epoch', xlim=[1, num_epochs], ylim=[0, 1],
                            legend=['train loss', 'train acc', 'test acc'])
    net = nn.DataParallel(net, device_ids=devices).to(devices[0])
    for epoch in range(num_epochs):
        # Sum of training loss, sum of training accuracy, no. of examples,
        # no. of predictions
        metric = d2l.Accumulator(4)
        for i, (features, labels) in enumerate(train_iter):
            timer.start()
            l, acc = train_batch_ch13(
                net, features, labels, loss, trainer, devices)
            metric.add(l, acc, labels.shape[0], labels.numel())
            timer.stop()
            if (i + 1) % (num_batches // 5) == 0 or i == num_batches - 1:
                animator.add(epoch + (i + 1) / num_batches,
                             (metric[0] / metric[2], metric[1] / metric[3],
                              None))
        test_acc = d2l.evaluate_accuracy_gpu(net, test_iter)
        animator.add(epoch + 1, (None, None, test_acc))
    print(f'loss {metric[0] / metric[2]:.3f}, train acc '
          f'{metric[1] / metric[3]:.3f}, test acc {test_acc:.3f}')
    print(f'{metric[2] * num_epochs / timer.sum():.1f} examples/sec on '
          f'{str(devices)}')
py 复制代码
def train_ch13(net, train_iter, test_iter, loss, trainer, num_epochs, devices=d2l.try_all_gpus()):
    """使用多GPU训练模型"""
    timer, num_batches = d2l.Timer(), len(train_iter)
    net = nn.DataParallel(net, device_ids=devices).to(devices[0])
    for epoch in range(num_epochs):
        # 训练损失、训练准确度、实例数
        metric = d2l.Accumulator(3)
        for i, (features, labels) in enumerate(train_iter):
            timer.start()
            l, acc = train_batch_ch13(net, features, labels, loss, trainer, devices)
            metric.add(l, acc, labels.shape[0])  # labels.shape[0] == labels.numel()
            timer.stop()
            if (i + 1) % (num_batches // 5) == 0 and not (i + 1) == num_batches:
                print(
                    f'epoch {epoch + 1}, iter {i + 1}: train loss {metric[0] / metric[2]:.3f}, train acc {metric[1] / metric[2]:.3f}')
        test_acc = d2l.evaluate_accuracy_gpu(net, test_iter)
        print(
            f'epoch {epoch + 1}: train loss {metric[0] / metric[2]:.3f}, train acc {metric[1] / metric[2]:.3f}, test acc {test_acc:.3f}')
    print(f'{metric[2] * num_epochs / timer.sum():.1f} examples/sec on {str(devices)}')

修改后的训练代码运行效果如下图所示:

py 复制代码
epoch 1, iter 6: train loss 2.580, train acc 0.484
epoch 1, iter 12: train loss 1.871, train acc 0.560
epoch 1, iter 18: train loss 1.390, train acc 0.653
epoch 1, iter 24: train loss 1.111, train acc 0.709
epoch 1, iter 30: train loss 0.936, train acc 0.748
epoch 1: train loss 0.909, train acc 0.754, test acc 0.894
epoch 2, iter 6: train loss 0.257, train acc 0.909
epoch 2, iter 12: train loss 0.266, train acc 0.898
epoch 2, iter 18: train loss 0.255, train acc 0.902
epoch 2, iter 24: train loss 0.257, train acc 0.906
epoch 2, iter 30: train loss 0.252, train acc 0.908
epoch 2: train loss 0.247, train acc 0.910, test acc 0.911
epoch 3, iter 6: train loss 0.207, train acc 0.922
epoch 3, iter 12: train loss 0.191, train acc 0.923
epoch 3, iter 18: train loss 0.204, train acc 0.914
epoch 3, iter 24: train loss 0.212, train acc 0.912
epoch 3, iter 30: train loss 0.211, train acc 0.914
epoch 3: train loss 0.209, train acc 0.915, test acc 0.901
epoch 4, iter 6: train loss 0.192, train acc 0.930
epoch 4, iter 12: train loss 0.213, train acc 0.924
epoch 4, iter 18: train loss 0.222, train acc 0.918
epoch 4, iter 24: train loss 0.212, train acc 0.921
epoch 4, iter 30: train loss 0.215, train acc 0.920
epoch 4: train loss 0.214, train acc 0.919, test acc 0.914
epoch 5, iter 6: train loss 0.191, train acc 0.938
epoch 5, iter 12: train loss 0.188, train acc 0.935
epoch 5, iter 18: train loss 0.193, train acc 0.931
epoch 5, iter 24: train loss 0.191, train acc 0.930
epoch 5, iter 30: train loss 0.196, train acc 0.925
epoch 5: train loss 0.197, train acc 0.924, test acc 0.936
176.0 examples/sec on [device(type='cuda', index=0)]
相关推荐
lijianhua_97125 小时前
国内某顶级大学内部用的ai自动生成论文的提示词
人工智能
EDPJ5 小时前
当图像与文本 “各说各话” —— CLIP 中的模态鸿沟与对象偏向
深度学习·计算机视觉
蔡俊锋5 小时前
用AI实现乐高式大型可插拔系统的技术方案
人工智能·ai工程·ai原子能力·ai乐高工程
自然语5 小时前
人工智能之数字生命 认知架构白皮书 第7章
人工智能·架构
大熊背5 小时前
利用ISP离线模式进行分块LSC校正的方法
人工智能·算法·机器学习
eastyuxiao6 小时前
如何在不同的机器上运行多个OpenClaw实例?
人工智能·git·架构·github·php
诸葛务农6 小时前
AGI 主要技术路径及核心技术:归一融合及未来之路5
大数据·人工智能
光影少年6 小时前
AI Agent智能体开发
人工智能·aigc·ai编程
ai生成式引擎优化技术6 小时前
TSPR-WEB-LLM-HIC (TWLH四元结构)AI生成式引擎(GEO)技术白皮书
人工智能
帐篷Li6 小时前
9Router:开源AI路由网关的架构设计与技术实现深度解析
人工智能