High-Impact Panel Reproduction Series | Disease-Association Composite Heatmap: From Data to Top Bar Charts + Dual Heatmaps + Side Summary Bars

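A composite figure of the form "top bar charts + main heatmap + side summary bars" is well suited to showing multi-dimensional association results. It answers three questions at once: which diseases have more associations, which cell types have more associations, and how each HERV's associations are distributed across diseases and cell types.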


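Figure source

Item          Content
Article       Single-cell eQTL mapping of human endogenous retroviruses reveals cell type-specific genetic regulation in autoimmune diseases
Journal/Year  Nature Communications, 2025
Figure        Fig. 5c

The article builds an atlas of HERV expression and genetic regulation from PBMC single-cell data, then combines GWAS and SMR analyses to explore genetic associations between HERVs and autoimmune diseases. Fig. 5c shows how 100 autoimmune-disease-associated HERVs are distributed across diseases and immune cell types.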


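Reading the figure

This is a typical multi-module composite heatmap.

It can be broken into four parts:

  1. Top-left bar chart: the number of significantly associated SNPs, or of HERV associations, for each autoimmune disease.
  2. Left main heatmap: the association count of each HERV in each disease.
  3. Narrow middle heatmap: the association count of each HERV in each immune cell type.
  4. Right horizontal bar chart: the total number of disease or cell-type combinations each HERV is associated with.

When reading the figure:

  • A taller top bar means more associations were detected for that disease;
  • A darker heatmap cell means a higher association count for that HERV in that disease or cell type;
  • A longer horizontal bar on the right means the HERV has broader associations across diseases or cell types;
  • The left and right modules share one row order, so each row represents the same HERV throughout.

The core of this figure is not any single heatmap but the combination of column summaries, row summaries, and the matrix body. That makes it a good fit for multi-dimensional association results such as disease × molecule × cell type.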


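Input data

A composite figure like this is usually managed as several input tables, one per layer.

File                 Key columns                 Meaning
disease_summary.csv  disease, snp_count          Top-left bar chart for diseases
cell_summary.csv     cell_type, herv_count       Top bar chart for cell types (above the cell-type heatmap)
disease_heatmap.csv  herv_id, disease, count     Left disease-HERV heatmap
cell_heatmap.csv     herv_id, cell_type, count   Middle cell-type-HERV heatmap
herv_summary.csv     herv_id, total_count        Right-hand total association count per HERV

The data structure looks like this:

herv_id   disease  count
HERV_001  T1D      5
HERV_001  CEL      2
HERV_002  CD       3

and:

herv_id   cell_type  count
HERV_001  CD4-T      5
HERV_001  CD8-T      2
HERV_002  NK         3

The most important column is count, which sets the color depth of each heatmap tile. The top and right bar charts are summaries over the disease, cell-type, or HERV dimension.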
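To make the relationship between the long-format tables and the summary tables concrete, here is a minimal sketch that derives two summaries from the three example rows shown above. The aggregation rules are assumptions made for illustration (a sum of counts per disease, and the number of diseases with a non-zero count per HERV); the source does not state how the original summary files were computed.

```r
library(dplyr)

# Toy long-format table built from the three example rows above
disease_heatmap <- tibble(
  herv_id = c("HERV_001", "HERV_001", "HERV_002"),
  disease = c("T1D", "CEL", "CD"),
  count   = c(5, 2, 3)
)

# Assumption: a disease's bar height is the sum of its counts
disease_summary <- disease_heatmap |>
  group_by(disease) |>
  summarise(snp_count = sum(count), .groups = "drop")

# Assumption: a HERV's total is the number of diseases with count > 0
herv_summary <- disease_heatmap |>
  filter(count > 0) |>
  count(herv_id, name = "total_count")

print(disease_summary)
print(herv_summary)
```

If your summary tables come from a different rule (for example, counting distinct SNPs from an upstream table), only the two summarising steps need to change; the column names expected by the plotting code stay the same.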

R code:
library(tidyverse)

disease_summary <- read_csv("disease_summary.csv", show_col_types = FALSE)
cell_summary <- read_csv("cell_summary.csv", show_col_types = FALSE)
disease_heatmap <- read_csv("disease_heatmap.csv", show_col_types = FALSE)
cell_heatmap <- read_csv("cell_heatmap.csv", show_col_types = FALSE)
herv_summary <- read_csv("herv_summary.csv", show_col_types = FALSE)

Once your data matches this structure, the code below can be copied and run as is.


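Step 1: Read the data and fix the display order

Alignment is the most important part of a composite heatmap. The top bar charts, the heatmaps, and the right horizontal bar chart must all use the same disease order, cell-type order, and HERV order.

R code: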
library(tidyverse)
library(patchwork)

disease_summary <- read_csv("disease_summary.csv", show_col_types = FALSE)
cell_summary <- read_csv("cell_summary.csv", show_col_types = FALSE)
disease_heatmap <- read_csv("disease_heatmap.csv", show_col_types = FALSE)
cell_heatmap <- read_csv("cell_heatmap.csv", show_col_types = FALSE)
herv_summary <- read_csv("herv_summary.csv", show_col_types = FALSE)

disease_order <- disease_summary$disease
cell_order <- cell_summary$cell_type

herv_order <- herv_summary |>
  arrange(desc(total_count), herv_id) |>
  pull(herv_id)

disease_summary <- disease_summary |>
  mutate(disease = factor(disease, levels = disease_order))

cell_summary <- cell_summary |>
  mutate(cell_type = factor(cell_type, levels = cell_order))

disease_heatmap <- disease_heatmap |>
  mutate(
    disease = factor(disease, levels = disease_order),
    herv_id = factor(herv_id, levels = rev(herv_order))
  )

cell_heatmap <- cell_heatmap |>
  mutate(
    cell_type = factor(cell_type, levels = cell_order),
    herv_id = factor(herv_id, levels = rev(herv_order))
  )

herv_summary <- herv_summary |>
  mutate(herv_id = factor(herv_id, levels = rev(herv_order)))

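rev(herv_order) is used here because ggplot2 places the first level of a discrete y axis at the bottom and works upward. After reversing, the top-to-bottom order in the figure matches the HERV order we defined.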
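The effect of the reversal can be checked without drawing anything. The sketch below uses three hypothetical HERV IDs and relies on the fact that a discrete axis positions each value at the integer code of its factor level, with code 1 at the bottom.

```r
# Desired top-to-bottom order (hypothetical IDs for illustration)
herv_order <- c("HERV_003", "HERV_001", "HERV_002")

# Same construction as in the step above: levels are the reversed order
herv_id <- factor(
  c("HERV_001", "HERV_002", "HERV_003"),
  levels = rev(herv_order)
)

# Integer codes of the levels; on a discrete y axis, 1 is the bottom row
y_pos <- setNames(as.integer(herv_id), as.character(herv_id))
print(y_pos)

# Reading the rows from the highest position down recovers herv_order
top_to_bottom <- names(sort(y_pos, decreasing = TRUE))
print(top_to_bottom)
```

The first entry of herv_order receives the highest position, so it ends up in the top row of the heatmaps and the side bar chart.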


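Step 2: Draw the top bar charts

The top row consists of two bar charts: the left one for diseases and the right one for cell types. To match the original figure's style, the disease bars are blue and the cell-type bars are cyan.

R code: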
p_top_disease <- ggplot(disease_summary, aes(disease, snp_count)) +
  geom_col(fill = "#2b97c8", width = 0.82) +
  scale_y_continuous(
    limits = c(0, 52),
    breaks = seq(0, 50, 10),
    expand = c(0, 0)
  ) +
  labs(x = NULL, y = NULL, title = "SNP") +
  theme_classic(base_size = 9) +
  theme(
    plot.title = element_text(hjust = 0.08, size = 10),
    axis.text.x = element_text(
      angle = 90,
      hjust = 1,
      vjust = 0.5,
      color = "black",
      size = 7
    ),
    axis.text.y = element_text(color = "#1b6d8f", size = 7),
    axis.ticks.x = element_blank(),
    axis.line.x = element_blank(),
    axis.line.y = element_line(color = "#1b6d8f", linewidth = 0.4),
    plot.margin = margin(2, 1, 0, 1)
  )

p_top_cell <- ggplot(cell_summary, aes(cell_type, herv_count)) +
  geom_col(fill = "#2bbbd3", width = 0.82) +
  scale_y_continuous(
    limits = c(0, 120),
    breaks = c(0, 40, 80, 120),
    position = "right",
    expand = c(0, 0)
  ) +
  labs(x = NULL, y = NULL, title = "GWAS Trait") +
  theme_classic(base_size = 9) +
  theme(
    plot.title = element_text(hjust = 0.08, size = 10),
    axis.text.x = element_text(
      angle = 90,
      hjust = 1,
      vjust = 0.5,
      color = "black",
      size = 7
    ),
    axis.text.y = element_text(color = "#0b8a9b", size = 7),
    axis.ticks.x = element_blank(),
    axis.line.x = element_blank(),
    axis.line.y = element_line(color = "#0b8a9b", linewidth = 0.4),
    plot.margin = margin(2, 1, 0, 1)
  )

p_top_blank <- ggplot() + theme_void()

Here position = "right" moves the y-axis ticks of the right-hand bar chart to the right side, which is closer to the original figure.
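One thing to watch in the scale_y_continuous() calls above is that the limits are hard-coded for the example data. With ggplot2's default out-of-bounds handling, a bar whose value exceeds the upper limit is dropped with a warning rather than clipped. A minimal sketch of a data-driven alternative follows; nice_upper() is a hypothetical helper and the values are made up for illustration.

```r
library(ggplot2)

# Hypothetical helper: round the maximum up to the next multiple of `step`
nice_upper <- function(x, step = 10) {
  ceiling(max(x, na.rm = TRUE) / step) * step
}

# Toy summary table (made-up values, one of them above the hard-coded 52)
disease_summary <- data.frame(
  disease   = factor(c("T1D", "CEL", "CD"), levels = c("T1D", "CEL", "CD")),
  snp_count = c(67, 12, 30)
)

upper <- nice_upper(disease_summary$snp_count)

p_top_disease_auto <- ggplot(disease_summary, aes(disease, snp_count)) +
  geom_col(fill = "#2b97c8", width = 0.82) +
  scale_y_continuous(
    limits = c(0, upper),             # limits follow the data
    breaks = seq(0, upper, by = 10),
    expand = c(0, 0)
  ) +
  theme_classic(base_size = 9)
```

The fixed limits are still the better choice when the goal is to match the tick marks of the published panel exactly; the computed version is mainly useful when reusing the template on new data.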


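Step 3: Draw the main disease-HERV heatmap

The left main heatmap shows the association count of each HERV in each autoimmune disease. A darker color means a higher count.

R code: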
p_disease_heat <- ggplot(disease_heatmap, aes(disease, herv_id, fill = count)) +
  geom_tile(color = "white", linewidth = 0.18) +
  scale_fill_gradientn(
    colors = c("#edf2f3", "#b7d0df", "#6ea6cc", "#1f6fa8"),
    limits = c(0, 5),
    breaks = 0:5,
    name = "Count"
  ) +
  labs(x = NULL, y = NULL) +
  theme_minimal(base_size = 8) +
  theme(
    axis.text = element_blank(),
    axis.ticks = element_blank(),
    panel.grid = element_blank(),
    panel.border = element_rect(color = "black", fill = NA, linewidth = 0.5),
    legend.position = "none",
    plot.margin = margin(0, 1, 2, 1)
  )

geom_tile() draws the heatmap; each tile represents one HERV × disease combination.
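Because geom_tile() only draws tiles for rows that exist in the data, a long-format table that omits zero-count combinations leaves empty holes instead of pale tiles. If your table is sparse, one option is to fill in the missing combinations first. This is a sketch using tidyr::complete() on the three example rows; whether the original input files already contain explicit zeros is not stated in the source.

```r
library(tidyr)

# Sparse long-format table: only observed HERV x disease pairs are listed
disease_heatmap <- data.frame(
  herv_id = c("HERV_001", "HERV_001", "HERV_002"),
  disease = c("T1D", "CEL", "CD"),
  count   = c(5, 2, 3)
)

# Add every missing HERV x disease pair with an explicit count of 0
disease_heatmap_full <- complete(
  disease_heatmap,
  herv_id, disease,
  fill = list(count = 0)
)

print(disease_heatmap_full)
```

When herv_id and disease are already factors, complete() expands over all factor levels, so running it after the factor step also covers HERVs or diseases that never appear in this particular table.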


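Step 4: Draw the cell-type-HERV heatmap and the right-hand summary bars

The narrow middle heatmap shows the association count of each HERV in each immune cell type; the right horizontal bar chart shows the total number of associations for each HERV.

R code: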
p_cell_heat <- ggplot(cell_heatmap, aes(cell_type, herv_id, fill = count)) +
  geom_tile(color = "white", linewidth = 0.18) +
  scale_fill_gradientn(
    colors = c("#f5f8ee", "#bfe3bd", "#6fcbb1", "#21b7d3", "#005d9c"),
    limits = c(0, 5),
    breaks = 0:5,
    name = "Count"
  ) +
  labs(x = NULL, y = NULL) +
  theme_minimal(base_size = 8) +
  theme(
    axis.text = element_blank(),
    axis.ticks = element_blank(),
    panel.grid = element_blank(),
    panel.border = element_rect(color = "black", fill = NA, linewidth = 0.5),
    legend.position = "none",
    plot.margin = margin(0, 1, 2, 1)
  )

p_herv_bar <- ggplot(herv_summary, aes(total_count, herv_id)) +
  geom_col(fill = "#b9d7c2", width = 0.85) +
  scale_x_continuous(
    limits = c(0, 8),
    breaks = c(0, 2, 4, 6, 8),
    position = "top",
    expand = c(0, 0)
  ) +
  labs(x = NULL, y = NULL) +
  theme_classic(base_size = 8) +
  theme(
    axis.text.x = element_text(angle = 90, vjust = 0.5, color = "black", size = 7),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.line.y = element_blank(),
    axis.line.x = element_line(color = "black", linewidth = 0.4),
    plot.margin = margin(0, 1, 2, 1)
  )

The key point in this step is that p_cell_heat and p_herv_bar both use the same herv_id ordering as p_disease_heat, so that every row lines up exactly.
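A quick pre-plot check can confirm that the row axis really is shared. The sketch below uses hypothetical stand-ins for the three tables and tests two things: that the factor levels are identical, and that no ID was silently turned into NA because it was missing from herv_order.

```r
# Hypothetical stand-ins for the three tables that share the row axis
herv_order <- c("HERV_002", "HERV_001", "HERV_003")
as_row_factor <- function(x) factor(x, levels = rev(herv_order))

disease_heatmap <- data.frame(herv_id = as_row_factor(c("HERV_001", "HERV_002")))
cell_heatmap    <- data.frame(herv_id = as_row_factor(c("HERV_001", "HERV_004")))
herv_summary    <- data.frame(herv_id = as_row_factor(herv_order))

# All three tables must carry exactly the same levels in the same order
same_levels <-
  identical(levels(disease_heatmap$herv_id), levels(cell_heatmap$herv_id)) &&
  identical(levels(cell_heatmap$herv_id), levels(herv_summary$herv_id))

# IDs absent from herv_order become NA under factor(); HERV_004 is one here
n_unmatched <- sum(is.na(cell_heatmap$herv_id))

print(same_levels)
print(n_unmatched)
```

Identical levels are necessary but may not be sufficient: discrete scales in ggplot2 drop unused levels by default, so a HERV that has no rows in one table would be missing from that panel. Adding scale_y_discrete(drop = FALSE) to each of the three plots is one way to keep every row in place.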


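Step 5: Assemble all modules with patchwork

Finally, patchwork combines the two rows. The upper row holds the bar charts; the lower row holds the heatmaps and the right-hand summary bars.

R code: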
top_row <- p_top_disease + p_top_cell + p_top_blank +
  plot_layout(widths = c(16, 5, 3))

bottom_row <- p_disease_heat + p_cell_heat + p_herv_bar +
  plot_layout(widths = c(16, 5, 3))

fig <- top_row / bottom_row +
  plot_layout(heights = c(1.25, 8.8))

ggsave(
  "herv_disease_composite_heatmap.png",
  fig,
  width = 7.4,
  height = 10.8,
  dpi = 320,
  bg = "white"
)

ggsave(
  "herv_disease_composite_heatmap.pdf",
  fig,
  width = 7.4,
  height = 10.8,
  bg = "white"
)

The panel label from the paper is not drawn into the output figure. The figure number is kept only in the "Figure source" table.
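A caveat on the layout above: top_row and bottom_row are each their own patchwork, and to my understanding patchwork aligns panels within a single layout level rather than across nested ones. Since the top bar charts carry y-axis text and the heatmaps do not, the columns of the two rows may end up slightly offset. If that happens, a flat 2 × 3 grid is worth trying. The sketch below uses toy panels with made-up values; the object names are placeholders, not the plots defined earlier.

```r
library(ggplot2)
library(patchwork)

# Toy stand-ins for the five panels (made-up values for illustration)
bars  <- data.frame(x = c("a", "b", "c"), n = c(3, 1, 2))
tiles <- expand.grid(x = c("a", "b", "c"), y = c("r1", "r2"))
tiles$count <- c(0, 1, 2, 3, 4, 5)
side  <- data.frame(y = c("r1", "r2"), n = c(2, 3))

p_top_left  <- ggplot(bars, aes(x, n)) + geom_col()
p_top_mid   <- ggplot(bars, aes(x, n)) + geom_col() +
  scale_y_continuous(position = "right")
p_heat_left <- ggplot(tiles, aes(x, y, fill = count)) + geom_tile() +
  theme(legend.position = "none")
p_heat_mid  <- p_heat_left
p_side_bar  <- ggplot(side, aes(n, y)) + geom_col()

# One flat grid, filled row by row; plot_spacer() takes the empty top-right cell
fig_flat <- wrap_plots(
  p_top_left,  p_top_mid,  plot_spacer(),
  p_heat_left, p_heat_mid, p_side_bar,
  ncol    = 3,
  widths  = c(16, 5, 3),
  heights = c(1.25, 8.8)
)
```

With the real panels, the same call would take p_top_disease, p_top_cell, plot_spacer(), p_disease_heat, p_cell_heat, and p_herv_bar in that order, and fig_flat can be passed to ggsave() exactly like fig.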


Complete code

R code:
library(tidyverse)
library(patchwork)

disease_summary <- read_csv("disease_summary.csv", show_col_types = FALSE)
cell_summary <- read_csv("cell_summary.csv", show_col_types = FALSE)
disease_heatmap <- read_csv("disease_heatmap.csv", show_col_types = FALSE)
cell_heatmap <- read_csv("cell_heatmap.csv", show_col_types = FALSE)
herv_summary <- read_csv("herv_summary.csv", show_col_types = FALSE)

disease_order <- disease_summary$disease
cell_order <- cell_summary$cell_type

herv_order <- herv_summary |>
  arrange(desc(total_count), herv_id) |>
  pull(herv_id)

disease_summary <- disease_summary |>
  mutate(disease = factor(disease, levels = disease_order))

cell_summary <- cell_summary |>
  mutate(cell_type = factor(cell_type, levels = cell_order))

disease_heatmap <- disease_heatmap |>
  mutate(
    disease = factor(disease, levels = disease_order),
    herv_id = factor(herv_id, levels = rev(herv_order))
  )

cell_heatmap <- cell_heatmap |>
  mutate(
    cell_type = factor(cell_type, levels = cell_order),
    herv_id = factor(herv_id, levels = rev(herv_order))
  )

herv_summary <- herv_summary |>
  mutate(herv_id = factor(herv_id, levels = rev(herv_order)))

p_top_disease <- ggplot(disease_summary, aes(disease, snp_count)) +
  geom_col(fill = "#2b97c8", width = 0.82) +
  scale_y_continuous(
    limits = c(0, 52),
    breaks = seq(0, 50, 10),
    expand = c(0, 0)
  ) +
  labs(x = NULL, y = NULL, title = "SNP") +
  theme_classic(base_size = 9) +
  theme(
    plot.title = element_text(hjust = 0.08, size = 10),
    axis.text.x = element_text(
      angle = 90,
      hjust = 1,
      vjust = 0.5,
      color = "black",
      size = 7
    ),
    axis.text.y = element_text(color = "#1b6d8f", size = 7),
    axis.ticks.x = element_blank(),
    axis.line.x = element_blank(),
    axis.line.y = element_line(color = "#1b6d8f", linewidth = 0.4),
    plot.margin = margin(2, 1, 0, 1)
  )

p_top_cell <- ggplot(cell_summary, aes(cell_type, herv_count)) +
  geom_col(fill = "#2bbbd3", width = 0.82) +
  scale_y_continuous(
    limits = c(0, 120),
    breaks = c(0, 40, 80, 120),
    position = "right",
    expand = c(0, 0)
  ) +
  labs(x = NULL, y = NULL, title = "GWAS Trait") +
  theme_classic(base_size = 9) +
  theme(
    plot.title = element_text(hjust = 0.08, size = 10),
    axis.text.x = element_text(
      angle = 90,
      hjust = 1,
      vjust = 0.5,
      color = "black",
      size = 7
    ),
    axis.text.y = element_text(color = "#0b8a9b", size = 7),
    axis.ticks.x = element_blank(),
    axis.line.x = element_blank(),
    axis.line.y = element_line(color = "#0b8a9b", linewidth = 0.4),
    plot.margin = margin(2, 1, 0, 1)
  )

p_top_blank <- ggplot() + theme_void()

p_disease_heat <- ggplot(disease_heatmap, aes(disease, herv_id, fill = count)) +
  geom_tile(color = "white", linewidth = 0.18) +
  scale_fill_gradientn(
    colors = c("#edf2f3", "#b7d0df", "#6ea6cc", "#1f6fa8"),
    limits = c(0, 5),
    breaks = 0:5,
    name = "Count"
  ) +
  labs(x = NULL, y = NULL) +
  theme_minimal(base_size = 8) +
  theme(
    axis.text = element_blank(),
    axis.ticks = element_blank(),
    panel.grid = element_blank(),
    panel.border = element_rect(color = "black", fill = NA, linewidth = 0.5),
    legend.position = "none",
    plot.margin = margin(0, 1, 2, 1)
  )

p_cell_heat <- ggplot(cell_heatmap, aes(cell_type, herv_id, fill = count)) +
  geom_tile(color = "white", linewidth = 0.18) +
  scale_fill_gradientn(
    colors = c("#f5f8ee", "#bfe3bd", "#6fcbb1", "#21b7d3", "#005d9c"),
    limits = c(0, 5),
    breaks = 0:5,
    name = "Count"
  ) +
  labs(x = NULL, y = NULL) +
  theme_minimal(base_size = 8) +
  theme(
    axis.text = element_blank(),
    axis.ticks = element_blank(),
    panel.grid = element_blank(),
    panel.border = element_rect(color = "black", fill = NA, linewidth = 0.5),
    legend.position = "none",
    plot.margin = margin(0, 1, 2, 1)
  )

p_herv_bar <- ggplot(herv_summary, aes(total_count, herv_id)) +
  geom_col(fill = "#b9d7c2", width = 0.85) +
  scale_x_continuous(
    limits = c(0, 8),
    breaks = c(0, 2, 4, 6, 8),
    position = "top",
    expand = c(0, 0)
  ) +
  labs(x = NULL, y = NULL) +
  theme_classic(base_size = 8) +
  theme(
    axis.text.x = element_text(angle = 90, vjust = 0.5, color = "black", size = 7),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.line.y = element_blank(),
    axis.line.x = element_line(color = "black", linewidth = 0.4),
    plot.margin = margin(0, 1, 2, 1)
  )

top_row <- p_top_disease + p_top_cell + p_top_blank +
  plot_layout(widths = c(16, 5, 3))

bottom_row <- p_disease_heat + p_cell_heat + p_herv_bar +
  plot_layout(widths = c(16, 5, 3))

fig <- top_row / bottom_row +
  plot_layout(heights = c(1.25, 8.8))

ggsave(
  "herv_disease_composite_heatmap.png",
  fig,
  width = 7.4,
  height = 10.8,
  dpi = 320,
  bg = "white"
)

ggsave(
  "herv_disease_composite_heatmap.pdf",
  fig,
  width = 7.4,
  height = 10.8,
  bg = "white"
)

Reproduced result

[Figure: the composite heatmap generated by the complete code above]