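Literature Digest: Deep Learning for Liver Tumor Diagnosis. Automatic Liver Tumor Segmentation on Dynamic Contrast Enhanced MRI Using 4D Information: Deep Learning Model Based on 3D Convolution and Convolutional LSTM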

Title

Automatic Liver Tumor Segmentation on Dynamic Contrast Enhanced MRI Using 4D Information: Deep Learning Model Based on 3D Convolution and Convolutional LSTM

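Introduction

Liver cancer is one of the most common causes of cancer-related death. Hepatocellular carcinoma (HCC), the most common type of primary liver cancer, is the fifth most common malignancy and the third leading cause of cancer-related death worldwide. Early diagnosis and treatment of HCC are critical for successful tumor resection. Accurate tumor segmentation makes it possible to derive volume-based quantitative information, such as texture features, which can support liver therapy planning and provide more reliable treatment response classification, liver tumor classification, and patient survival prediction.

At present, liver tumor segmentation still relies heavily on manual delineation, a process that is tedious, time-consuming, and subject to inter- and intra-operator variability. A variety of computer-aided methods based on traditional image processing algorithms, such as thresholding, spatial regularization techniques, supervised classification, and unsupervised clustering, have been proposed for segmenting the liver and its lesions. However, the high variability in tumor shape, appearance, and location, together with indistinct edges and the additional noise introduced by contrast agents, makes automatic segmentation difficult.

The main drawback of these methods is that they exploit only limited information, for example intensity alone, which leads to boundary leakage at blurred tumor boundaries.

In recent years, deep learning has greatly advanced medical image analysis. Deep convolutional neural networks (CNNs), for example, have been applied successfully to brain tumor segmentation and prostate cancer detection. In liver imaging, deep learning has also been used for healthy liver segmentation, liver fibrosis staging, classification of hepatic fatty infiltration, liver tumor diagnosis, and differentiation of liver masses. Deep learning is a family of data-driven algorithms that automatically capture high-level features from images and improve performance on these tasks. It has likewise achieved notable results in liver tumor segmentation: the best-performing methods in the 2017 Liver Tumor Segmentation (LiTS) challenge were all based on deep learning.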

Abstract

Objective: Accurate segmentation of liver tumors, which could help physicians make appropriate treatment decisions and assess the effectiveness of surgical treatment, is crucial for the clinical diagnosis of liver cancer. In this study, we propose a 4-dimensional (4D) deep learning model based on 3D convolution and convolutional long short-term memory (C-LSTM) for hepatocellular carcinoma (HCC) lesion segmentation. Methods: The proposed deep learning model utilizes 4D information on dynamic contrast enhanced (DCE) magnetic resonance imaging (MRI) images to assist liver tumor segmentation. Specifically, a shallow U-net based 3D CNN module was designed to extract 3D spatial domain features from each DCE phase, followed by a 4-layer C-LSTM network module for time domain information exploitation. The combined information of multi-phase DCE images and the manner by which tissue imaging features change on multi-contrast images allow the network to learn the characteristics of HCC more effectively, resulting in better segmentation performance. Results: The proposed model achieved a Dice score (value not reproduced in this digest) and a volume similarity of 0.891 ± 0.080 for liver tumor segmentation, which outperformed the 3D U-net model, the RA-UNet model, and the other models in the ablation study on both the internal and external test sets. Moreover, the performance of the proposed model is comparable to that of the nnU-Net model, which has shown state-of-the-art performance in many segmentation tasks, with significantly reduced prediction time. Conclusion: The proposed 3D convolution and C-LSTM based model can achieve accurate segmentation of HCC lesions.

Index Terms: 4D information, deep learning, 3D convolution, convolutional LSTM, tumor segmentation.
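The abstract reports a Dice score and a volume similarity, but this digest does not give their formulas. The sketch below uses the commonly cited definitions, Dice = 2|P∩T| / (|P| + |T|) and VS = 1 − ||P| − |T|| / (|P| + |T|), which are an assumption here and may differ in detail from what the paper used. The function names and the toy masks are hypothetical.

```python
def dice_score(pred, truth):
    """Dice = 2|P ∩ T| / (|P| + |T|) for two flat binary masks of equal length."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention (assumed): two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * inter / total


def volume_similarity(pred, truth):
    """VS = 1 - ||P| - |T|| / (|P| + |T|); compares volumes only, not overlap."""
    vol_pred = sum(1 for p in pred if p)
    vol_truth = sum(1 for t in truth if t)
    if vol_pred + vol_truth == 0:
        return 1.0
    return 1.0 - abs(vol_pred - vol_truth) / (vol_pred + vol_truth)


# Toy example: same volume (3 voxels each) but shifted by one voxel.
pred = [0, 1, 1, 1, 0, 0]
truth = [0, 0, 1, 1, 1, 0]
print(dice_score(pred, truth), volume_similarity(pred, truth))
```

In the toy example the two masks have equal volume but only partial overlap, so volume similarity is perfect while Dice is not. This is why the two metrics are usually read together.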

Conclusions

In this paper, we developed a 4D deep learning model for automatic liver tumor segmentation, which offers better performance than some other networks. Our approach utilizes a 3D CNN module for 3D spatial context extraction and a C-LSTM network module for exploiting time domain information, forming 4D information for assisting segmentation. The experimental results demonstrated that the proposed model has improved tumor segmentation performance in ablation experiments and is comparable to some of the existing state-of-the-art models, while significantly reducing the prediction time. The accurate segmentation of liver tumors is an important prerequisite for subsequent quantitative analysis and could greatly help doctors in clinical diagnosis and treatment.

Method

A. Dataset and MRI Protocol

This study was approved by the research-ethics committee of First Affiliated Hospital of Zhejiang University. The clinical, radiological, and histopathological data were collected from medical charts. The retrospective study included 190 pathologically confirmed primary HCC patients who underwent liver MRI scanning before surgery between January 2017 and March 2020. A fat-suppressed 3D T1-weighted GRE sequence was performed on a 3.0 T clinical scanner (GE Signa HDx; GE Healthcare). Gadopentetate dimeglumine (Magnevist; Bayer Healthcare, Germany, 0.1 mmol/kg) was injected at a rate of 2.5 mL/s followed by saline flush with a maximum dose of 18 mL. Images in the hepatic arterial, portal venous, and delayed phases were obtained at 25-35 s, 55-75 s, and 180-240 s after contrast medium injection, respectively. The scanning parameters were as follows: echo time (TE): 1.5 ms; repetition time (TR): 3.2 ms; in-plane resolution: 0.8 × 0.8 mm²; slice thickness: 2.5 mm; matrix size: 320 × 256; number of slices: 84; and field of view (FOV): 400 × 400 × 210 mm³. These data were randomly split into a training set (110 cases), a validation set (40 cases), and an internal test set (40 cases).

In addition, we included 60 HCC DCE cases from Fudan University Affiliated Zhongshan Hospital as an external test set for this study. MR scanning was performed on a 3.0 T Siemens scanner (Magnetom Verio 3T MRI, Siemens Healthineers) with a 3D gradient-echo VIBE sequence. The contrast agent, injection rate, and image acquisition timepoints were the same as for the internal dataset. The scanning parameters were as follows: TE: 1.4 ms; TR: 4.1 ms; in-plane resolution: 1.1 × 1.1 mm²; slice thickness: 3.0 mm; matrix size: 352 × 260; number of slices: 72; and FOV: 269 × 380 × 180 mm³.

The target livers and HCC lesions were outlined by an experienced radiologist (with 15 years of experience in abdominal imaging) using ITK-SNAP (v3.6.0) in the delayed phase, with reference to the pre-contrast, arterial, and portal venous phases. In addition, another radiologist (with 15 years of experience in abdominal imaging) checked and adjusted the outlined labels, and if there was no agreement on a particular area, a third radiologist (with 30 years of experience in liver imaging) would make the final decision.
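The random 110/40/40 split of the 190 internal cases could be reproduced along the following lines. This is a minimal sketch: the function name `split_cases`, the fixed seed, and the integer case IDs are hypothetical, and the digest does not say how the authors actually drew the split.

```python
import random


def split_cases(case_ids, n_train=110, n_val=40, n_test=40, seed=0):
    """Shuffle case IDs once and cut them into train/validation/test lists."""
    case_ids = list(case_ids)
    if n_train + n_val + n_test != len(case_ids):
        raise ValueError("split sizes must add up to the number of cases")
    # A dedicated Random instance keeps the split reproducible (seed is assumed).
    random.Random(seed).shuffle(case_ids)
    train = case_ids[:n_train]
    val = case_ids[n_train:n_train + n_val]
    test = case_ids[n_train + n_val:]
    return train, val, test


# 190 internal HCC cases, identified here by placeholder integers.
train, val, test = split_cases(range(190))
print(len(train), len(val), len(test))
```

Splitting by case rather than by slice keeps all slices of one patient in a single subset, which avoids leakage between training and test data.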

Figure

Fig. 1. Overall framework of the proposed 4D deep learning model for HCC segmentation, including a 3D CNN module (pink block) and a C-LSTM network module (green block). A shallow 3D U-net based Basic module was used for spatial domain information extraction in the pre-contrast, arterial, portal venous, and delayed phases, separately. A 4-layer Conv-LSTM network was designed for time domain information exploitation across multiple DCE phases. In the Conv-LSTM network block, m refers to the number of layers of the C-LSTM network, and here m = 4.
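For readers unfamiliar with the C-LSTM module, the update equations of a convolutional LSTM cell in its widely used form (without peephole connections) are shown below. This is the generic formulation, not necessarily the exact variant used in the paper. Here * denotes convolution, ∘ the element-wise product, X_t the feature map for DCE phase t, and H_t, C_t the hidden and cell states. Feeding the four phases in acquisition order (t = 1, ..., 4) is an assumption.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + b_f\right) \\
o_t &= \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + b_o\right) \\
g_t &= \tanh\left(W_{xg} * X_t + W_{hg} * H_{t-1} + b_g\right) \\
C_t &= f_t \circ C_{t-1} + i_t \circ g_t \\
H_t &= o_t \circ \tanh\left(C_t\right)
\end{aligned}
```

Because the gates are computed by convolutions rather than matrix products, the states H_t and C_t keep their spatial layout, which is what lets the module track how tissue enhancement changes from phase to phase at each location.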

Fig. 2. 4D information in the proposed deep learning model: 3D spatial context extracted from seven consecutive image slices for each DCE phase (by 3D convolution), with time domain information extracted across the four-phase DCE images (by C-LSTM). Information from a total of 28 image slices was used to predict the tumor mask at the target slice.
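The count of 28 slices follows from 7 slices per phase times 4 phases. The sketch below lists which (phase, slice) pairs would feed the prediction at a target slice z; the function name `input_slices` and the example index 40 are hypothetical.

```python
# The four DCE phases, in acquisition order.
PHASES = ("pre-contrast", "arterial", "portal venous", "delayed")


def input_slices(z, half_width=3):
    """Return the (phase, slice_index) pairs used to predict the mask at slice z.

    Each phase contributes the window z-3 .. z+3, i.e. seven consecutive slices.
    """
    return [
        (phase, z + dz)
        for phase in PHASES
        for dz in range(-half_width, half_width + 1)
    ]


pairs = input_slices(40)
print(len(pairs))  # 7 slices x 4 phases = 28
```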

Fig. 3. Network architecture of the 3D U-net based liver segmentation model.

Fig. 4. Training and testing strategies. (a) Liver segmentation. Training: image patches of size 16 × 256 × 256, with 8 slices of overlap; testing: image patches of size 16 × 256 × 256, with 8 slices of overlap, retaining the prediction results of the middle 8 slices. (b) Tumor segmentation. Training: image patches of size 7 × 224 × 256, generating training candidates z for all tumor slices and for one out of every three non-tumor slices, and cropping seven consecutive slices centered on each candidate (from z−3 to z+3); testing: image patches of size 7 × 224 × 256, generating testing candidates for all image slices, and predicting the middle slice of each image patch.
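The two patching strategies in Fig. 4 can be sketched along the slice axis as follows. Several details are assumptions not stated in the caption: that the last liver window is shifted back to fit the volume, that "one out of every three" non-tumor slices is chosen deterministically rather than at random, and that centre slices whose 7-slice window would leave the volume are skipped. Both function names are hypothetical.

```python
def liver_windows(n_slices, size=16, stride=8):
    """Start indices of 16-slice windows with 8 slices of overlap (Fig. 4a)."""
    if n_slices < size:
        raise ValueError("volume has fewer slices than one window")
    starts = list(range(0, n_slices - size + 1, stride))
    if starts[-1] + size < n_slices:
        # Assumption: shift a final window back so the last slices are covered.
        starts.append(n_slices - size)
    return starts


def tumor_candidates(has_tumor, every=3, half_width=3):
    """Training centre slices z (Fig. 4b).

    Keeps every tumor slice and one in `every` non-tumor slices.
    """
    n = len(has_tumor)
    centres = []
    background_seen = 0
    for z, is_tumor in enumerate(has_tumor):
        if z - half_width < 0 or z + half_width >= n:
            continue  # Assumption: skip windows z-3..z+3 that leave the volume.
        if is_tumor:
            centres.append(z)
        else:
            if background_seen % every == 0:
                centres.append(z)
            background_seen += 1
    return centres


# An 84-slice volume, matching the internal dataset's slice count.
print(liver_windows(84))
# Toy per-slice tumor flags: 5 background, 3 tumor, 6 background slices.
flags = [False] * 5 + [True] * 3 + [False] * 6
print(tumor_candidates(flags))
```

At test time the caption says only the middle 8 slices of each liver window are retained, so with a stride of 8 the retained blocks tile the volume without gaps except at its two ends.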

Fig. 5. Different combinations of network inputs and Basic modules. (a) Separate Basic module for each DCE phase in the proposed model, Basic+C-LSTM model; (b) Basic module with shared weights, Basicshare+C-LSTM model; (c) Basic module with multi-channel inputs, Basicstack+C-LSTM model. I indicates image, F indicates feature map, and DYN1-DYN4 indicate the pre-contrast, arterial, portal venous, and delayed phases, respectively.

Fig. 6. Ablation experiments of the proposed model. (a) Models with single-phase DCE input, BasicDYN1-BasicDYN4 models; (b) model without C-LSTM structure, Basic model; (c) model with information interaction truncation C-LSTM, information early fusion, Basic+CNNEF model; (d) model with information interaction truncation C-LSTM, information late fusion, Basic+CNNLF model. The dashed box indicates the original C-LSTM network module, replaced by the 2D CNN structure here.

Fig. 7. Liver tumor segmentation results based on the proposed model and other models in the ablation experiment. (a) Pat #1 and (b) Pat #2 cases from the internal test set; (c) Pat #3 and (d) Pat #4 cases from the external test set. From left to right: segmentation results shown in the pre-contrast phase, arterial phase, portal venous phase, and delayed phase. Contour in green: manually labeled mask (ground truth); blue: mask predicted by the proposed model; yellow: mask predicted by the Basic model; red: mask predicted by the Basic+CNNLF model.

Fig. 8. Liver tumor segmentation results based on the proposed model and some external baseline models. (a) Pat #1 and (b) Pat #2 cases from the internal test set; (c) Pat #3 and (d) Pat #4 cases from the external test set. From left to right: segmentation results shown in the pre-contrast phase, arterial phase, portal venous phase, and delayed phase. Contour in green: manually labeled mask (ground truth); blue: mask predicted by the proposed model; yellow: mask predicted by the nnU-Net model; red: mask predicted by the RA-UNet model. (The comparison with the 3D U-net is not displayed because it performed significantly worse.)

Fig. 9. Feature map analysis for cases of large tumors with internal inhomogeneity. Dashed box in red: input image; green: feature maps extracted after the 3D CNN module; blue: feature maps extracted after the C-LSTM network module; purple: output probability map. The image on the far right is the manually labeled ground truth.

Fig. 10. Feature map analysis for cases of small tumors with significant enhancement in the arterial phase. Dashed box in red: input image; green: feature maps extracted after the 3D CNN module; blue: feature maps extracted after the C-LSTM network module; purple: output probability map. The image on the far right is the manually labeled ground truth.

Table

TABLE I. Network framework of the proposed model for tumor segmentation

TABLE II. Quantitative results of various models in ablation experiments

TABLE III. Quantitative results of various models in the 2.5D CNN scenario

TABLE IV. Performance comparison with external baseline models