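Advanced Multi-Omics Visualization: Building an OmicsDashboard and Generating Interactive Reports (Hands-On with R Shiny and Python Dash)

Multi-omics studies (transcriptome, proteome, metabolome, methylome, and so on) have become a core approach for dissecting complex biological regulation, yet visualization remains a bottleneck in turning multi-omics results into something people can use. Traditional static figures, such as heatmaps and volcano plots drawn with ggplot2, do not let non-bioinformaticians explore the data interactively. After integration, a multi-omics dataset can contain thousands to tens of thousands of features, and static figures struggle to show global relationships. In manuscripts and project reviews, a static report does not let reviewers or team members pick the molecules they care about or adjust plot parameters, so the value of the work is not fully conveyed.

An OmicsDashboard (a multi-omics interactive dashboard) is a visualization platform built with R Shiny or Python Dash that turns integrated multi-omics data into a dynamic report that is interactive, explorable, and exportable. It addresses three pain points: static figures cannot be linked, non-specialists cannot operate them, and multi-omics data are hard to integrate. Starting from the core requirements of multi-omics visualization, this article builds an OmicsDashboard once with R Shiny and once with Python Dash, covering data integration, interactive filtering, multiple plot types, report generation, and deployment and sharing. The code templates are meant to be adapted directly so that bioinformaticians can set up interactive multi-omics analysis quickly.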

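I. Core Pain Points of Multi-Omics Visualization and the Value of OmicsDashboard

1. Core characteristics of multi-omics data

Multi-omics data have four defining characteristics: high dimensionality, multiple data types, strong cross-layer association, and heterogeneity.

High dimensionality: transcriptome and proteome datasets contain anywhere from a few thousand to tens of thousands of molecules;
Multiple data types: count data (RNA-seq counts), continuous data (protein abundance), categorical data (sample groups), and annotation data (GO/KEGG pathways);
Strong association: the same molecule is linked across omics layers through regulation (for example lncRNA → mRNA → protein);
Heterogeneity: units and distributions differ widely between layers (transcriptome data are counts, metabolome data are peak areas).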
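The heterogeneity point has a practical consequence: layers measured on different scales cannot share an axis or a color scale until they are brought onto a comparable one. The sketch below shows one common option, a log2(x + 1) transform followed by a per-feature z-score across samples. It is an illustration only; the function names and the toy values are assumptions, not part of the dashboards built later.

```python
import math
from statistics import mean, stdev

def log2p1(values):
    """log2(x + 1) transform, often applied to raw counts or peak areas."""
    return [math.log2(v + 1) for v in values]

def zscore(values):
    """Center and scale one feature across samples (sample SD, like R's scale())."""
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        # A constant feature carries no contrast; map it to all zeros
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]

# Toy values (hypothetical): one gene as raw counts, one metabolite as peak areas
rna_counts = [1200, 1350, 800, 750]
peak_areas = [2.1e6, 2.4e6, 0.9e6, 1.1e6]

rna_z = zscore(log2p1(rna_counts))
metab_z = zscore(log2p1(peak_areas))
print([round(v, 2) for v in rna_z])
print([round(v, 2) for v in metab_z])
```

After this step both features are unitless and centered at zero, so they can be drawn in the same heatmap.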

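2. Pain points of traditional static visualization

Pain point	What it looks like	OmicsDashboard solution
No interactivity	Specific molecules or samples cannot be filtered, and plot parameters (such as the heatmap clustering method) cannot be adjusted	Interactive filter widgets (dropdowns, sliders, checkboxes) that update the plots in real time
Hard to integrate data	Transcriptome and proteome figures are shown separately, so one molecule's multi-omics profile cannot be viewed together	Linked cross-omics queries that show a target molecule's expression at every layer in one step
Poor reusability	Visualization code has to be rewritten for every new dataset	A modular dashboard that is reused by swapping the data files
Inflexible reports	Static PDF reports cannot support individual exploration	Export of interactive HTML reports, static PNG/PDF, and raw data
High barrier for non-specialists	Adjusting analysis parameters requires help from a bioinformatician	Point-and-click interface, no coding needed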
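The "linked cross-omics query" in the table comes down to looking up one identifier in every layer and returning whatever is found. A minimal sketch of that idea follows; the nested-dictionary layout, the function name, and the numbers are hypothetical, and a real dashboard would more likely hold each layer in a data frame.

```python
def lookup_across_layers(layers, feature_id):
    """Return {layer_name: {sample: value}} for every layer that contains feature_id."""
    hits = {}
    for layer_name, table in layers.items():
        if feature_id in table:
            hits[layer_name] = table[feature_id]
    return hits

# Toy layers (hypothetical values), keyed by gene symbol
layers = {
    "transcriptome": {
        "EGFR": {"Sample1": 850, "Sample3": 1500},
        "TP53": {"Sample1": 1200, "Sample3": 800},
    },
    "proteome": {
        "EGFR": {"Sample1": 3.2, "Sample3": 5.9},
    },
}

egfr = lookup_across_layers(layers, "EGFR")
for layer_name, values in egfr.items():
    print(layer_name, values)
```

Layers that lack the molecule are simply absent from the result, which lets the interface show "not measured" instead of failing.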

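3. Core value of OmicsDashboard

Interactivity: molecule and sample filtering, parameter adjustment, and linked plots let users explore the data themselves;
Integration: all omics layers are shown in one place, making regulatory links between molecules visible;
Ease of use: a point-and-click interface lowers the barrier for non-bioinformaticians;
Reusability: the modular structure adapts to different multi-omics datasets;
Shareability: local or cloud deployment supports team collaboration and presentation of results.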

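II. Choosing a Stack and Setting Up the Environment

1. R Shiny vs Python Dash (fit for bioinformatics work)

Feature	R Shiny	Python Dash	Recommendation for bioinformatics
Bioinformatics ecosystem	Rich (ggplot2, pheatmap, clusterProfiler)	Thinner (combine with Plotly, scikit-learn)	Prefer Shiny for transcriptome and epigenome analysis
Plotting syntax	Familiar to bioinformaticians (ggplot2)	Plotly-based, stronger interactivity	Prefer Dash when highly interactive plots are needed
Learning curve	Low for R users	Low for Python users	Follow the team's existing stack
Deployment	Simple (Shiny Server, shinyapps.io)	Somewhat more involved (Gunicorn + Nginx)	Shiny for quick deployment, Dash for large-scale deployment
Performance (rule of thumb)	Works well on small data (< 100,000 rows)	More efficient on large data (> 100,000 rows)	Choose Dash above roughly 100,000 rows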

2. Environment setup

(1) R Shiny environment

# Install the core packages
install.packages(c("shiny", "shinydashboard", "ggplot2", "pheatmap", "dplyr", "tidyr", "plotly", "DT", "shinyWidgets", "rmarkdown"))
# Bioinformatics packages
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(c("clusterProfiler", "org.Hs.eg.db", "ComplexHeatmap"))

(2) Python Dash environment

# Create a virtual environment
conda create -n dash_omics python=3.9 -y
conda activate dash_omics

# Install the core packages
pip install dash dash-bootstrap-components plotly pandas numpy scipy scikit-learn seaborn openpyxl reportlab
# Bioinformatics packages (optional)
pip install pysam pybedtools biom-format
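Once the installation finishes, a quick import check can confirm that the environment is usable before any dashboard code is written. The snippet below is a small sketch using only the standard library; the helper name is made up here, and the list contains the import names (for example sklearn for scikit-learn) of the core packages installed above.

```python
from importlib import util

# Import names of the core packages from the pip command above
REQUIRED = ["dash", "dash_bootstrap_components", "plotly",
            "pandas", "numpy", "scipy", "sklearn"]

def missing_modules(module_names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in module_names if util.find_spec(name) is None]

missing = missing_modules(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All core packages are importable.")
```

Run it inside the activated dash_omics environment; anything listed as missing can be installed with pip before continuing.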

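III. Hands-On 1: A Multi-Omics Dashboard with R Shiny (Transcriptome + Metabolome)

Using an integrated lung cancer transcriptome and metabolome analysis as the example, this section builds a complete dashboard with a data overview, interactive filtering, multi-omics visualization, functional enrichment, and report export.

1. Dashboard architecture

The app uses the classic shinydashboard layout:

Sidebar: navigation (data overview, gene filtering, metabolome analysis, enrichment analysis, report export) plus filter widgets (gene name, sample group, expression threshold);
Body: plots and tables that update in real time;
Header: project information, data version, and help (the code below sets only the title).

2. Core implementation (complete app)

Step 1: Prepare the multi-omics test data

Create a data/ directory containing the following files:

rna_seq_counts.csv: transcriptome count matrix (rows: genes; columns: samples; last column: gene annotation, named Annotation);
metabolomics_data.csv: metabolome data (rows: metabolites; columns: samples; last column: metabolic pathway, named Pathway);
sample_info.csv: sample metadata (first column: sample name; then a Group column with Tumor/Normal, plus sex and age).

Example format (rna_seq_counts.csv):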

GeneID	Sample1	Sample2	Sample3	Sample4	Annotation
TP53	1200	1350	800	750	Tumor suppressor
EGFR	850	920	1500	1600	Oncogene
GAPDH	5000	5200	4800	4900	Housekeeping
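If no real data are at hand, small synthetic files in the same layout are enough to try the dashboard. The generator below is a sketch using only the standard library. The gene list mirrors the example table, the counts are random placeholders with no biological meaning, and the Sample, Sex, and Age column names are assumptions (app.R only relies on the first column holding sample names and on a Group column). The metabolomics file is left out for brevity and would follow the same pattern with a Pathway column.

```python
import csv
import os
import random

def write_test_data(out_dir="data", n_samples=6, seed=0):
    """Write synthetic rna_seq_counts.csv and sample_info.csv; return both paths."""
    rng = random.Random(seed)
    os.makedirs(out_dir, exist_ok=True)
    samples = [f"Sample{i + 1}" for i in range(n_samples)]
    # First half of the samples are Tumor, the rest Normal
    groups = ["Tumor" if i < n_samples // 2 else "Normal" for i in range(n_samples)]

    genes = {"TP53": "Tumor suppressor", "EGFR": "Oncogene", "GAPDH": "Housekeeping"}
    rna_path = os.path.join(out_dir, "rna_seq_counts.csv")
    with open(rna_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["GeneID", *samples, "Annotation"])
        for gene, annotation in genes.items():
            counts = [rng.randint(500, 5000) for _ in samples]
            writer.writerow([gene, *counts, annotation])

    info_path = os.path.join(out_dir, "sample_info.csv")
    with open(info_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["Sample", "Group", "Sex", "Age"])
        for sample, group in zip(samples, groups):
            writer.writerow([sample, group, rng.choice(["F", "M"]), rng.randint(40, 80)])
    return rna_path, info_path

if __name__ == "__main__":
    print(write_test_data())
```

Fixing the seed keeps the files reproducible, which helps when comparing screenshots or debugging a plot.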

Step 2: Write the Shiny app (`app.R`)

# Load packages
library(shiny)
library(shinydashboard)
library(ggplot2)
library(pheatmap)
library(dplyr)
library(tidyr)
library(tibble)  # provides rownames_to_column(), used in the server code
library(plotly)
library(DT)
library(shinyWidgets)
library(clusterProfiler)
library(org.Hs.eg.db)

# ====================== 1. Load data ======================
# Transcriptome data
rna_data <- read.csv("data/rna_seq_counts.csv", row.names = 1, stringsAsFactors = FALSE)
# Metabolome data
metab_data <- read.csv("data/metabolomics_data.csv", row.names = 1, stringsAsFactors = FALSE)
# Sample information (row names are sample names; a Group column is required)
sample_info <- read.csv("data/sample_info.csv", row.names = 1, stringsAsFactors = FALSE)

# Extract the expression matrices (drop the annotation columns)
rna_expr <- rna_data[, !colnames(rna_data) %in% "Annotation"]
metab_expr <- metab_data[, !colnames(metab_data) %in% "Pathway"]

# ====================== 2. 定义UI ======================
ui <- dashboardPage(
  # 顶部导航栏
  dashboardHeader(title = "肺癌多组学分析Dashboard", titleWidth = 300),
  
  # Sidebar
  dashboardSidebar(
    width = 300,
    # id = "tabs" exposes the selected tab as input$tabs,
    # which the conditionalPanel() conditions below depend on
    sidebarMenu(
      id = "tabs",
      menuItem("数据概览", tabName = "overview", icon = icon("dashboard")),
      menuItem("基因筛选与可视化", tabName = "gene_vis", icon = icon("dna")),
      menuItem("代谢组分析", tabName = "metab_vis", icon = icon("flask")),
      menuItem("功能富集分析", tabName = "enrich", icon = icon("chart-pie")),
      menuItem("报告导出", tabName = "report", icon = icon("file-export"))
    ),
    
    # Shared filter (sample group)
    hr(),
    h4("样本筛选"),
    pickerInput(
      inputId = "sample_group",
      label = "选择样本分组",
      choices = unique(sample_info$Group),
      selected = unique(sample_info$Group),
      multiple = TRUE,
      options = list(`actions-box` = TRUE)
    ),
    
    # Gene filters (shown only on the gene visualization tab)
    conditionalPanel(
      condition = "input.tabs == 'gene_vis'",
      hr(),
      h4("基因筛选"),
      textInput("gene_name", "输入基因名（支持模糊匹配）", placeholder = "TP53/EGFR"),
      sliderInput("expr_threshold", "基因表达量阈值", min = 0, max = max(rna_expr), value = 500),
      pickerInput(
        inputId = "plot_type",
        label = "选择可视化类型",
        choices = c("箱线图", "热图", "火山图"),
        selected = "箱线图"
      )
    ),
    
    # Metabolite filters (shown only on the metabolome tab)
    conditionalPanel(
      condition = "input.tabs == 'metab_vis'",
      hr(),
      h4("代谢物筛选"),
      textInput("metab_name", "输入代谢物名（支持模糊匹配）", placeholder = "Glucose/Lactate"),
      pickerInput(
        inputId = "metab_plot",
        label = "选择可视化类型",
        choices = c("散点图", "热图", "通路富集图"),
        selected = "散点图"
      )
    )
  ),
  
  # 主面板
  dashboardBody(
    tabItems(
      # ====================== 标签页1：数据概览 ======================
      tabItem(
        tabName = "overview",
        fluidRow(
          # 数据基本信息卡片
          box(title = "数据概览", width = 12,
              column(6,
                     h4("转录组数据"),
                     p(paste("基因数量：", nrow(rna_expr))),
                     p(paste("样本数量：", ncol(rna_expr))),
                     p(paste("肿瘤样本数：", sum(sample_info$Group == "Tumor"))),
                     p(paste("正常样本数：", sum(sample_info$Group == "Normal")))
              ),
              column(6,
                     h4("代谢组数据"),
                     p(paste("代谢物数量：", nrow(metab_expr))),
                     p(paste("样本数量：", ncol(metab_expr))),
                     p(paste("代谢通路数：", length(unique(metab_data$Pathway)))),
                     p(paste("数据量纲：峰面积（归一化后）"))
              )
          ),
          
          # 样本分组分布饼图
          box(title = "样本分组分布", width = 6,
              plotOutput("sample_pie", height = 300)
          ),
          
          # 转录组表达量分布直方图
          box(title = "转录组基因表达量分布", width = 6,
              plotOutput("rna_expr_dist", height = 300)
          ),
          
          # 样本相关性热图
          box(title = "样本相关性分析（转录组）", width = 12,
              plotOutput("sample_corr_heatmap", height = 400)
          )
        )
      ),
      
      # ====================== 标签页2：基因筛选与可视化 ======================
      tabItem(
        tabName = "gene_vis",
        fluidRow(
          # 筛选结果表格
          box(title = "筛选基因列表", width = 12,
              DTOutput("gene_table")
          ),
          
          # 可视化图表
          box(title = "基因表达可视化", width = 12,
              conditionalPanel(
                condition = "input.plot_type == '箱线图'",
                plotlyOutput("gene_boxplot", height = 400)
              ),
              conditionalPanel(
                condition = "input.plot_type == '热图'",
                plotOutput("gene_heatmap", height = 400)
              ),
              conditionalPanel(
                condition = "input.plot_type == '火山图'",
                plotlyOutput("gene_volcano", height = 400)
              )
          )
        )
      ),
      
      # ====================== 标签页3：代谢组分析 ======================
      tabItem(
        tabName = "metab_vis",
        fluidRow(
          # 代谢物筛选表格
          box(title = "筛选代谢物列表", width = 12,
              DTOutput("metab_table")
          ),
          
          # 代谢物可视化
          box(title = "代谢物表达可视化", width = 12,
              conditionalPanel(
                condition = "input.metab_plot == '散点图'",
                plotlyOutput("metab_scatter", height = 400)
              ),
              conditionalPanel(
                condition = "input.metab_plot == '热图'",
                plotOutput("metab_heatmap", height = 400)
              ),
              conditionalPanel(
                condition = "input.metab_plot == '通路富集图'",
                plotOutput("metab_pathway", height = 400)
              )
          )
        )
      ),
      
      # ====================== 标签页4：功能富集分析 ======================
      tabItem(
        tabName = "enrich",
        fluidRow(
          box(title = "富集分析参数", width = 4,
              textAreaInput("gene_list", "输入富集基因列表（换行分隔）", rows = 10, placeholder = "TP53\nEGFR\nMYC"),
              pickerInput(
                inputId = "enrich_type",
                label = "富集类型",
                choices = c("GO-BP", "GO-CC", "GO-MF", "KEGG"),
                selected = "GO-BP"
              ),
              actionButton("run_enrich", "运行富集分析", class = "btn-primary")
          ),
          
          box(title = "富集结果表格", width = 8,
              DTOutput("enrich_table")
          ),
          
          box(title = "富集气泡图", width = 12,
              plotOutput("enrich_bubble", height = 400)
          )
        )
      ),
      
      # ====================== 标签页5：报告导出 ======================
      tabItem(
        tabName = "report",
        fluidRow(
          box(title = "报告导出设置", width = 6,
              textInput("report_title", "报告标题", value = "肺癌多组学分析报告"),
              selectInput("report_format", "报告格式", choices = c("HTML", "PDF", "Word"), selected = "HTML"),
              checkboxGroupInput("include_plots", "包含图表", 
                                 choices = c("数据概览", "基因表达图", "代谢物图", "富集分析图"),
                                 selected = c("数据概览", "基因表达图")),
              actionButton("generate_report", "生成报告", class = "btn-success")
          ),
          
          box(title = "导出状态", width = 6,
              verbatimTextOutput("report_status"),
              downloadButton("download_report", "下载报告", class = "btn-info")
          )
        )
      )
    )
  )
)

# ====================== 3. 定义Server ======================
server <- function(input, output, session) {
  # ---------------------- 数据概览页面 ----------------------
  # 样本分组饼图
  output$sample_pie <- renderPlot({
    group_counts <- table(sample_info$Group)
    ggplot(data.frame(Group = names(group_counts), Count = as.vector(group_counts)),
           aes(x = "", y = Count, fill = Group)) +
      geom_bar(stat = "identity", width = 1) +
      coord_polar("y", start = 0) +
      theme_void() +
      labs(title = "样本分组分布", fill = "分组") +
      scale_fill_manual(values = c("Tumor" = "red", "Normal" = "green"))
  })
  
  # 转录组表达量分布
  output$rna_expr_dist <- renderPlot({
    rna_expr_long <- pivot_longer(as.data.frame(rna_expr), cols = everything(), names_to = "Sample", values_to = "Expression")
    ggplot(rna_expr_long, aes(x = Expression)) +
      geom_histogram(bins = 50, fill = "steelblue", alpha = 0.7) +
      theme_bw() +
      labs(title = "基因表达量分布", x = "表达量（count）", y = "基因数量") +
      scale_x_log10()  # 对数转换，更易观察分布
  })
  
  # 样本相关性热图
  output$sample_corr_heatmap <- renderPlot({
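    # Sample-to-sample correlation: cor() correlates columns, and samples are
    # already the columns of rna_expr, so no transpose is needed
    sample_corr <- cor(rna_expr)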
    # 绘制热图
    pheatmap(sample_corr, 
             annotation_col = sample_info[, "Group", drop = FALSE],
             color = colorRampPalette(c("blue", "white", "red"))(100),
             main = "样本相关性热图（转录组）",
             show_rownames = TRUE,
             show_colnames = TRUE)
  })
  
  # ---------------------- 基因筛选与可视化页面 ----------------------
  # 筛选基因数据（响应式）
  filtered_genes <- reactive({
    # 筛选样本
    selected_samples <- rownames(sample_info)[sample_info$Group %in% input$sample_group]
    rna_filtered <- rna_expr[, selected_samples, drop = FALSE]
    
    # 筛选基因（表达量阈值 + 基因名模糊匹配）
    if (input$gene_name != "") {
      gene_match <- grepl(input$gene_name, rownames(rna_filtered), ignore.case = TRUE)
      rna_filtered <- rna_filtered[gene_match, , drop = FALSE]
    }
    rna_filtered <- rna_filtered[rowMeans(rna_filtered) >= input$expr_threshold, , drop = FALSE]
    
    # 合并注释信息
    rna_filtered <- cbind(rna_filtered, Annotation = rna_data[rownames(rna_filtered), "Annotation"])
    return(rna_filtered)
  })
  
  # 筛选基因表格
  output$gene_table <- renderDT({
    datatable(filtered_genes(), 
              options = list(pageLength = 10, scrollX = TRUE),
              caption = "筛选后的基因表达数据") %>%
      formatRound(columns = 1:(ncol(filtered_genes())-1), digits = 0)
  })
  
  # 基因箱线图
  output$gene_boxplot <- renderPlotly({
    if (nrow(filtered_genes()) == 0) {
      return(plotly_empty() %>% layout(title = "无符合条件的基因"))
    }
    
    # 转换为长格式
    gene_long <- filtered_genes() %>%
      rownames_to_column("GeneID") %>%
      pivot_longer(cols = -c(GeneID, Annotation), names_to = "Sample", values_to = "Expression") %>%
      left_join(sample_info %>% rownames_to_column("Sample"), by = "Sample")
    
    # 绘制箱线图
    p <- ggplot(gene_long, aes(x = Group, y = Expression, fill = Group)) +
      geom_boxplot(alpha = 0.7) +
      facet_wrap(~GeneID, scales = "free_y") +
      theme_bw() +
      scale_fill_manual(values = c("Tumor" = "red", "Normal" = "green")) +
      labs(x = "样本分组", y = "基因表达量（count）", title = "基因表达箱线图")
    
    ggplotly(p)
  })
  
  # 基因热图
  output$gene_heatmap <- renderPlot({
    if (nrow(filtered_genes()) == 0) {
      return(ggplot() + geom_text(aes(x = 1, y = 1, label = "无符合条件的基因")) + theme_void())
    }
    
    # 提取表达矩阵（去除注释列）
    expr_mat <- filtered_genes()[, !colnames(filtered_genes()) %in% "Annotation"]
    # 标准化（行Z-score）
    expr_mat <- t(scale(t(expr_mat)))
    
    pheatmap(expr_mat,
             annotation_col = sample_info[colnames(expr_mat), "Group", drop = FALSE],
             color = colorRampPalette(c("blue", "white", "red"))(100),
             main = "基因表达热图（Z-score标准化）",
             show_rownames = TRUE,
             show_colnames = TRUE)
  })
  
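  # Gene volcano plot (simplified: Tumor vs Normal mean ratio with a per-gene t-test)
  output$gene_volcano <- renderPlotly({
    if (nrow(filtered_genes()) == 0) {
      return(plotly_empty() %>% layout(title = "无符合条件的基因"))
    }
    
    # Numeric expression matrix (drop the annotation column)
    expr_mat <- as.matrix(filtered_genes()[, colnames(filtered_genes()) != "Annotation", drop = FALSE])
    # Keep only the samples that passed the sample-group filter
    tumor_samples <- intersect(rownames(sample_info)[sample_info$Group == "Tumor"], colnames(expr_mat))
    normal_samples <- intersect(rownames(sample_info)[sample_info$Group == "Normal"], colnames(expr_mat))
    if (length(tumor_samples) < 2 || length(normal_samples) < 2) {
      return(plotly_empty() %>% layout(title = "Need at least 2 samples per group"))
    }
    tumor_mean <- rowMeans(expr_mat[, tumor_samples, drop = FALSE])
    normal_mean <- rowMeans(expr_mat[, normal_samples, drop = FALSE])
    
    # log2 fold change and p-value (simplified: t-test on the raw values)
    volcano_data <- data.frame(
      GeneID = rownames(expr_mat),
      log2FC = log2((tumor_mean + 1)/(normal_mean + 1)),
      pvalue = sapply(rownames(expr_mat), function(g) {
        # t.test() fails on constant data; report NA in that case
        tryCatch(t.test(expr_mat[g, tumor_samples], expr_mat[g, normal_samples])$p.value,
                 error = function(e) NA_real_)
      })
    ) %>%
      # "-log10p" is not a valid column name, so use neg_log10p
      mutate(neg_log10p = -log10(pvalue),
             Significance = ifelse(!is.na(pvalue) & pvalue < 0.05 & abs(log2FC) > 1, "Significant", "Not significant"))
    
    # Draw the volcano plot
    p <- ggplot(volcano_data, aes(x = log2FC, y = neg_log10p, color = Significance)) +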
      geom_point(alpha = 0.7, size = 2) +
      theme_bw() +
      scale_color_manual(values = c("Significant" = "red", "Not significant" = "gray")) +
      geom_vline(xintercept = c(-1, 1), linetype = "dashed", color = "black") +
      geom_hline(yintercept = -log10(0.05), linetype = "dashed", color = "black") +
      labs(x = "log2(Fold Change)", y = "-log10(P-value)", title = "基因差异表达火山图")
    
    ggplotly(p) %>% layout(hovermode = "closest")
  })
  
  # ---------------------- 代谢组分析页面 ----------------------
  # 筛选代谢物数据
  filtered_metabs <- reactive({
    selected_samples <- rownames(sample_info)[sample_info$Group %in% input$sample_group]
    metab_filtered <- metab_expr[, selected_samples, drop = FALSE]
    
    if (input$metab_name != "") {
      metab_match <- grepl(input$metab_name, rownames(metab_filtered), ignore.case = TRUE)
      metab_filtered <- metab_filtered[metab_match, , drop = FALSE]
    }
    
    metab_filtered <- cbind(metab_filtered, Pathway = metab_data[rownames(metab_filtered), "Pathway"])
    return(metab_filtered)
  })
  
  # 代谢物表格
  output$metab_table <- renderDT({
    datatable(filtered_metabs(), 
              options = list(pageLength = 10, scrollX = TRUE),
              caption = "筛选后的代谢物表达数据") %>%
      formatRound(columns = 1:(ncol(filtered_metabs())-1), digits = 2)
  })
  
  # 代谢物散点图
  output$metab_scatter <- renderPlotly({
    if (nrow(filtered_metabs()) == 0) {
      return(plotly_empty() %>% layout(title = "无符合条件的代谢物"))
    }
    
    metab_long <- filtered_metabs() %>%
      rownames_to_column("Metabolite") %>%
      pivot_longer(cols = -c(Metabolite, Pathway), names_to = "Sample", values_to = "Expression") %>%
      left_join(sample_info %>% rownames_to_column("Sample"), by = "Sample")
    
    p <- ggplot(metab_long, aes(x = Sample, y = Expression, color = Group)) +
      geom_point(size = 3, alpha = 0.7) +
      facet_wrap(~Metabolite, scales = "free_y") +
      theme_bw() +
      scale_color_manual(values = c("Tumor" = "red", "Normal" = "green")) +
      labs(x = "样本", y = "代谢物表达量（峰面积）", title = "代谢物表达散点图") +
      theme(axis.text.x = element_text(angle = 45, hjust = 1))
    
    ggplotly(p)
  })
  
  # 代谢物热图
  output$metab_heatmap <- renderPlot({
    if (nrow(filtered_metabs()) == 0) {
      return(ggplot() + geom_text(aes(x = 1, y = 1, label = "无符合条件的代谢物")) + theme_void())
    }
    
    expr_mat <- filtered_metabs()[, !colnames(filtered_metabs()) %in% "Pathway"]
    expr_mat <- t(scale(t(expr_mat)))
    
    pheatmap(expr_mat,
             annotation_col = sample_info[colnames(expr_mat), "Group", drop = FALSE],
             color = colorRampPalette(c("blue", "white", "red"))(100),
             main = "代谢物表达热图（Z-score标准化）",
             show_rownames = TRUE,
             show_colnames = TRUE)
  })
  
  # 代谢物通路富集图（简化版）
  output$metab_pathway <- renderPlot({
    if (nrow(filtered_metabs()) == 0) {
      return(ggplot() + geom_text(aes(x = 1, y = 1, label = "无符合条件的代谢物")) + theme_void())
    }
    
    # 统计各通路的代谢物数量
    pathway_counts <- table(filtered_metabs()$Pathway)
    pathway_df <- data.frame(Pathway = names(pathway_counts), Count = as.vector(pathway_counts)) %>%
      arrange(desc(Count)) %>%
      slice_head(n = 10)  # 展示前10个通路
    
    ggplot(pathway_df, aes(x = reorder(Pathway, Count), y = Count)) +
      geom_bar(stat = "identity", fill = "orange", alpha = 0.7) +
      theme_bw() +
      labs(x = "代谢通路", y = "代谢物数量", title = "代谢通路富集TOP10") +
      theme(axis.text.x = element_text(angle = 45, hjust = 1))
  })
  
  # ---------------------- 功能富集分析页面 ----------------------
  # 富集分析结果（响应式）
  enrich_result <- eventReactive(input$run_enrich, {
    if (input$gene_list == "") {
      return(NULL)
    }
    
    # 提取基因列表
    genes <- unlist(strsplit(input$gene_list, "\n"))
    genes <- genes[genes != ""]
    
    # 转换为Entrez ID（富集分析需要）
    gene_entrez <- bitr(genes, fromType = "SYMBOL", toType = "ENTREZID", OrgDb = org.Hs.eg.db)$ENTREZID
    
    if (length(gene_entrez) == 0) {
      return(NULL)
    }
    
    # 运行富集分析
    if (input$enrich_type == "GO-BP") {
      enrich <- enrichGO(gene = gene_entrez, OrgDb = org.Hs.eg.db, ont = "BP", pAdjustMethod = "fdr", qvalueCutoff = 0.05)
    } else if (input$enrich_type == "GO-CC") {
      enrich <- enrichGO(gene = gene_entrez, OrgDb = org.Hs.eg.db, ont = "CC", pAdjustMethod = "fdr", qvalueCutoff = 0.05)
    } else if (input$enrich_type == "GO-MF") {
      enrich <- enrichGO(gene = gene_entrez, OrgDb = org.Hs.eg.db, ont = "MF", pAdjustMethod = "fdr", qvalueCutoff = 0.05)
    } else if (input$enrich_type == "KEGG") {
      enrich <- enrichKEGG(gene = gene_entrez, organism = "hsa", pvalueCutoff = 0.05)
    }
    
    return(enrich)
  })
  
  # 富集结果表格
  output$enrich_table <- renderDT({
    if (is.null(enrich_result())) {
      return(datatable(data.frame(Message = "请输入有效基因列表并运行富集分析")))
    }
    
    datatable(as.data.frame(enrich_result()), 
              options = list(pageLength = 10, scrollX = TRUE),
              caption = paste(input$enrich_type, "富集分析结果"))
  })
  
  # 富集气泡图
  output$enrich_bubble <- renderPlot({
    if (is.null(enrich_result())) {
      return(ggplot() + geom_text(aes(x = 1, y = 1, label = "无富集结果")) + theme_void())
    }
    
    dotplot(enrich_result(), showCategory = 15, title = paste(input$enrich_type, "富集气泡图")) +
      theme(axis.text.y = element_text(size = 10))
  })
  
  # ---------------------- 报告导出页面 ----------------------
# 生成报告
report_path <- reactiveVal(NULL)

# 辅助函数：保存图表为图片
save_plot <- function(plot_obj, filename, width = 10, height = 6, dpi = 300) {
  if (is.null(plot_obj)) return(NULL)
  plot_path <- file.path(tempdir(), filename)
  # 区分ggplot/plotly对象
  if (inherits(plot_obj, "ggplot")) {
    ggsave(plot_path, plot_obj, width = width, height = height, dpi = dpi)
  } else if (inherits(plot_obj, "plotly")) {
    plotly::export(plot_obj, file = plot_path)
  } else {
    png(plot_path, width = width, height = height, units = "in", res = dpi)
    print(plot_obj)
    dev.off()
  }
  return(plot_path)
}

observeEvent(input$generate_report, {
  # 1. 生成并保存核心图表（临时目录）
  ## 1.1 样本分布饼图
  sample_pie_plot <- ggplot(data.frame(Group = names(table(sample_info$Group)), 
                                       Count = as.vector(table(sample_info$Group))),
                            aes(x = "", y = Count, fill = Group)) +
    geom_bar(stat = "identity", width = 1) +
    coord_polar("y", start = 0) +
    theme_void() +
    labs(title = "样本分组分布", fill = "分组") +
    scale_fill_manual(values = c("Tumor" = "red", "Normal" = "green"))
  sample_pie_path <- save_plot(sample_pie_plot, "sample_pie.png")
  
  ## 1.2 基因表达图表（根据选择的可视化类型）
  gene_plot <- if (input$plot_type == "箱线图") {
    # 重新生成箱线图（非交互式版本，适配Rmd）
    selected_samples <- rownames(sample_info)[sample_info$Group %in% input$sample_group]
    rna_filtered <- rna_expr[, selected_samples, drop = FALSE]
    if (input$gene_name != "") {
      gene_match <- grepl(input$gene_name, rownames(rna_filtered), ignore.case = TRUE)
      rna_filtered <- rna_filtered[gene_match, , drop = FALSE]
    }
    rna_filtered <- rna_filtered[rowMeans(rna_filtered) >= input$expr_threshold, , drop = FALSE]
    
    if (nrow(rna_filtered) == 0) {
      ggplot() + geom_text(aes(x = 1, y = 1, label = "无符合条件的基因")) + theme_void()
    } else {
      gene_long <- rna_filtered %>%
        rownames_to_column("GeneID") %>%
        pivot_longer(cols = -GeneID, names_to = "Sample", values_to = "Expression") %>%
        left_join(sample_info %>% rownames_to_column("Sample"), by = "Sample")
      ggplot(gene_long, aes(x = Group, y = Expression, fill = Group)) +
        geom_boxplot(alpha = 0.7) +
        facet_wrap(~GeneID, scales = "free_y") +
        theme_bw() +
        scale_fill_manual(values = c("Tumor" = "red", "Normal" = "green")) +
        labs(x = "样本分组", y = "基因表达量（count）", title = "基因表达箱线图")
    }
  } else if (input$plot_type == "热图") {
    # 重新生成热图
    selected_samples <- rownames(sample_info)[sample_info$Group %in% input$sample_group]
    rna_filtered <- rna_expr[, selected_samples, drop = FALSE]
    if (input$gene_name != "") {
      gene_match <- grepl(input$gene_name, rownames(rna_filtered), ignore.case = TRUE)
      rna_filtered <- rna_filtered[gene_match, , drop = FALSE]
    }
    rna_filtered <- rna_filtered[rowMeans(rna_filtered) >= input$expr_threshold, , drop = FALSE]
    
    if (nrow(rna_filtered) == 0) {
      ggplot() + geom_text(aes(x = 1, y = 1, label = "无符合条件的基因")) + theme_void()
    } else {
      expr_mat <- t(scale(t(rna_filtered)))
      pheatmap(expr_mat,
               annotation_col = sample_info[colnames(expr_mat), "Group", drop = FALSE],
               color = colorRampPalette(c("blue", "white", "red"))(100),
               main = "基因表达热图（Z-score标准化）",
               show_rownames = TRUE,
               show_colnames = TRUE)
    }
  } else {
    # 火山图
    selected_samples <- rownames(sample_info)[sample_info$Group %in% input$sample_group]
    rna_filtered <- rna_expr[, selected_samples, drop = FALSE]
    if (input$gene_name != "") {
      gene_match <- grepl(input$gene_name, rownames(rna_filtered), ignore.case = TRUE)
      rna_filtered <- rna_filtered[gene_match, , drop = FALSE]
    }
    rna_filtered <- rna_filtered[rowMeans(rna_filtered) >= input$expr_threshold, , drop = FALSE]
    
    if (nrow(rna_filtered) == 0) {
      ggplot() + geom_text(aes(x = 1, y = 1, label = "无符合条件的基因")) + theme_void()
    } else {
      # Numeric matrix, restricted to the samples that passed the group filter
      expr_mat <- as.matrix(rna_filtered)
      tumor_samples <- intersect(rownames(sample_info)[sample_info$Group == "Tumor"], colnames(expr_mat))
      normal_samples <- intersect(rownames(sample_info)[sample_info$Group == "Normal"], colnames(expr_mat))
      tumor_mean <- rowMeans(expr_mat[, tumor_samples, drop = FALSE])
      normal_mean <- rowMeans(expr_mat[, normal_samples, drop = FALSE])
      volcano_data <- data.frame(
        GeneID = rownames(expr_mat),
        log2FC = log2((tumor_mean + 1)/(normal_mean + 1)),
        pvalue = sapply(rownames(expr_mat), function(g) {
          # t.test() fails with too few or constant values; report NA in that case
          tryCatch(t.test(expr_mat[g, tumor_samples], expr_mat[g, normal_samples])$p.value,
                   error = function(e) NA_real_)
        })
      ) %>%
        # "-log10p" is not a valid column name, so use neg_log10p
        mutate(neg_log10p = -log10(pvalue),
               Significance = ifelse(!is.na(pvalue) & pvalue < 0.05 & abs(log2FC) > 1, "Significant", "Not significant"))
      ggplot(volcano_data, aes(x = log2FC, y = neg_log10p, color = Significance)) +
        geom_point(alpha = 0.7, size = 2) +
        theme_bw() +
        scale_color_manual(values = c("Significant" = "red", "Not significant" = "gray")) +
        geom_vline(xintercept = c(-1, 1), linetype = "dashed", color = "black") +
        geom_hline(yintercept = -log10(0.05), linetype = "dashed", color = "black") +
        labs(x = "log2(Fold Change)", y = "-log10(P-value)", title = "基因差异表达火山图")
    }
  }
  gene_plot_path <- save_plot(gene_plot, "gene_expression.png")
  
  ## 1.3 代谢组分析图表
  metab_plot <- if (input$metab_plot == "散点图") {
    selected_samples <- rownames(sample_info)[sample_info$Group %in% input$sample_group]
    metab_filtered <- metab_expr[, selected_samples, drop = FALSE]
    if (input$metab_name != "") {
      metab_match <- grepl(input$metab_name, rownames(metab_filtered), ignore.case = TRUE)
      metab_filtered <- metab_filtered[metab_match, , drop = FALSE]
    }
    
    if (nrow(metab_filtered) == 0) {
      ggplot() + geom_text(aes(x = 1, y = 1, label = "无符合条件的代谢物")) + theme_void()
    } else {
      metab_long <- metab_filtered %>%
        rownames_to_column("Metabolite") %>%
        pivot_longer(cols = -Metabolite, names_to = "Sample", values_to = "Expression") %>%
        left_join(sample_info %>% rownames_to_column("Sample"), by = "Sample")
      ggplot(metab_long, aes(x = Sample, y = Expression, color = Group)) +
        geom_point(size = 3, alpha = 0.7) +
        facet_wrap(~Metabolite, scales = "free_y") +
        theme_bw() +
        scale_color_manual(values = c("Tumor" = "red", "Normal" = "green")) +
        labs(x = "样本", y = "代谢物表达量（峰面积）", title = "代谢物表达散点图") +
        theme(axis.text.x = element_text(angle = 45, hjust = 1))
    }
  } else if (input$metab_plot == "热图") {
    selected_samples <- rownames(sample_info)[sample_info$Group %in% input$sample_group]
    metab_filtered <- metab_expr[, selected_samples, drop = FALSE]
    if (input$metab_name != "") {
      metab_match <- grepl(input$metab_name, rownames(metab_filtered), ignore.case = TRUE)
      metab_filtered <- metab_filtered[metab_match, , drop = FALSE]
    }
    
    if (nrow(metab_filtered) == 0) {
      ggplot() + geom_text(aes(x = 1, y = 1, label = "无符合条件的代谢物")) + theme_void()
    } else {
      expr_mat <- t(scale(t(metab_filtered)))
      pheatmap(expr_mat,
               annotation_col = sample_info[colnames(expr_mat), "Group", drop = FALSE],
               color = colorRampPalette(c("blue", "white", "red"))(100),
               main = "代谢物表达热图（Z-score标准化）",
               show_rownames = TRUE,
               show_colnames = TRUE)
    }
  } else {
    # 通路富集图
    selected_samples <- rownames(sample_info)[sample_info$Group %in% input$sample_group]
    metab_filtered <- metab_expr[, selected_samples, drop = FALSE]
    if (input$metab_name != "") {
      metab_match <- grepl(input$metab_name, rownames(metab_filtered), ignore.case = TRUE)
      metab_filtered <- metab_filtered[metab_match, , drop = FALSE]
    }
    metab_filtered <- cbind(metab_filtered, Pathway = metab_data[rownames(metab_filtered), "Pathway"])
    
    if (nrow(metab_filtered) == 0) {
      ggplot() + geom_text(aes(x = 1, y = 1, label = "无符合条件的代谢物")) + theme_void()
    } else {
      pathway_counts <- table(metab_filtered$Pathway)
      pathway_df <- data.frame(Pathway = names(pathway_counts), Count = as.vector(pathway_counts)) %>%
        arrange(desc(Count)) %>%
        slice_head(n = 10)
      ggplot(pathway_df, aes(x = reorder(Pathway, Count), y = Count)) +
        geom_bar(stat = "identity", fill = "orange", alpha = 0.7) +
        theme_bw() +
        labs(x = "代谢通路", y = "代谢物数量", title = "代谢通路富集TOP10") +
        theme(axis.text.x = element_text(angle = 45, hjust = 1))
    }
  }
  metab_plot_path <- save_plot(metab_plot, "metabolomics_analysis.png")
  
  ## 1.4 功能富集分析图表
  enrich_plot <- if (!is.null(enrich_result())) {
    dotplot(enrich_result(), showCategory = 15, title = paste(input$enrich_type, "富集气泡图")) +
      theme(axis.text.y = element_text(size = 10))
  } else {
    ggplot() + geom_text(aes(x = 1, y = 1, label = "未运行富集分析或无富集结果")) + theme_void()
  }
  enrich_plot_path <- save_plot(enrich_plot, "enrichment_analysis.png")

  # 2. Build the R Markdown source of the report
  report_title <- input$report_title
  report_format <- switch(input$report_format, HTML = "html", PDF = "pdf", Word = "word")
  # Helper: an R Markdown chunk that embeds one of the images saved in step 1
  fence <- strrep("`", 3)
  fig_chunk <- function(path) {
    c(paste0(fence, "{r echo=FALSE, fig.width=8, fig.height=4, warning=FALSE}"),
      sprintf("knitr::include_graphics('%s')", normalizePath(path, winslash = "/")),
      fence, "")
  }
  # Sections 2 to 4 embed the gene, metabolite, and enrichment figures saved above.
  # PDF output with Chinese text needs a LaTeX engine with CJK support.
  report_lines <- c(
    "---",
    sprintf("title: '%s'", report_title),
    "output:",
    sprintf("  %s_document:", report_format),
    "    toc: true",
    "    toc_depth: 3",
    "    number_sections: true",
    "---",
    "",
    "# 多组学分析报告",
    "## 1. 数据概览",
    "### 1.1 基本信息",
    sprintf("- 转录组基因数量：%d", nrow(rna_expr)),
    sprintf("- 代谢组代谢物数量：%d", nrow(metab_expr)),
    sprintf("- 样本总数：%d（肿瘤：%d，正常：%d）", nrow(sample_info),
            sum(sample_info$Group == "Tumor"), sum(sample_info$Group == "Normal")),
    "",
    "### 1.2 样本分布",
    fig_chunk(sample_pie_path),
    "## 2. 基因表达图",
    fig_chunk(gene_plot_path),
    "## 3. 代谢物图",
    fig_chunk(metab_plot_path),
    "## 4. 富集分析图",
    fig_chunk(enrich_plot_path),
    "# 附录：数据文件说明",
    "",
    "- 转录组数据：rna_seq_counts.csv",
    "- 代谢组数据：metabolomics_data.csv",
    "- 样本信息：sample_info.csv"
  )

  # Save the report source to a temporary .Rmd file
  temp_report <- tempfile(fileext = ".Rmd")
  writeLines(report_lines, temp_report)

  # Render the report; render() returns the full path of the output file
  file_ext <- switch(input$report_format, HTML = "html", PDF = "pdf", Word = "docx")
  output_file <- paste0("OmicsReport_", Sys.Date(), ".", file_ext)
  rendered <- rmarkdown::render(temp_report, output_file = output_file,
                                output_dir = tempdir(), envir = new.env(), quiet = TRUE)

  # Update the report path and the status message
  report_path(rendered)
  output$report_status <- renderPrint({
    cat(paste("报告生成成功！路径：", rendered))
  })
})

# Download the report (the content type is inferred from the file extension)
output$download_report <- downloadHandler(
  filename = function() {
    ext <- switch(input$report_format, HTML = "html", PDF = "pdf", Word = "docx")
    paste0("OmicsReport_", Sys.Date(), ".", ext)
  },
  content = function(file) {
    req(report_path())  # nothing to download until a report has been generated
    file.copy(report_path(), file, overwrite = TRUE)
  }
)
}

# ====================== 4. Run the app ======================
shinyApp(ui, server)

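3. Running locally and deploying

(1) Running locally

Put `app.R` and the `data/` directory in the same folder, open `app.R` in RStudio, and click "Run App" to start the dashboard.

(2) Server deployment (Shiny Server)

Install Shiny Server (Ubuntu example). A non-interactive install.packages() call needs an explicit CRAN mirror, and every package that app.R loads must also be installed system-wide so that the Shiny Server user can find it:

   sudo apt-get install r-base
   sudo su - -c "R -e \"install.packages('shiny', repos='https://cran.rstudio.com/')\""
   wget https://download3.rstudio.org/ubuntu-18.04/x86_64/shiny-server-1.5.20.1002-amd64.deb
   sudo dpkg -i shiny-server-1.5.20.1002-amd64.deb

Copy the project folder to /srv/shiny-server/omics_dashboard/;
Start Shiny Server: sudo systemctl start shiny-server;
Open the dashboard at http://<server-IP>:3838/omics_dashboard/.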

(3) Cloud deployment (Shinyapps.io)

1. Register a Shinyapps.io account (https://www.shinyapps.io/).
2. Install the rsconnect package: install.packages("rsconnect").
3. Configure the account: rsconnect::setAccountInfo(name='your-account', token='your-token', secret='your-secret').
4. Deploy the app:

library(rsconnect)
deployApp(appDir = "path/to/omics_dashboard", appName = "lung_cancer_omics")

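IV. Hands-on 2: Building a multi-omics dashboard with Python Dash (proteomics + methylome)

Using an integrated analysis of liver cancer proteomics and methylation data as the example, this section builds a highly interactive Dash dashboard, focusing on cross-omics linking, dynamic updates, and handling of large data sets.

1. Dashboard architecture

Dash follows a "components + callbacks" architecture:

Layout components: dash-bootstrap-components builds the responsive layout (sidebar + main panel);
Callbacks: implement the interaction between components (filter, then update the plots);
Visualization: plotly draws interactive figures with zoom, hover, and download support;
Data processing: pandas integrates the multi-omics data and scipy handles the statistics.

2. Core implementation (complete and runnable)

Step 1: Prepare the test data

Create a data/ directory containing proteomics_data.csv (proteomics), methylation_data.csv (methylome), and sample_info.csv (sample information). The code below expects molecule IDs in the first column of the two data files, one column per sample whose name matches the first column of sample_info.csv, and a Group column in sample_info.csv.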
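If no real data are at hand, the three files can be generated synthetically. The following is a minimal sketch that writes random values in the layout the app reads; the function name `make_test_data`, the sample IDs (`S01`...), the molecule IDs (`PROT1`, `cg0001`...), and the annotation values are all assumptions made for illustration, and the numbers carry no biological meaning.

```python
import os

import numpy as np
import pandas as pd


def make_test_data(out_dir="data", n_samples=12, n_proteins=50, n_cpgs=80, seed=0):
    """Write small synthetic CSVs in the layout the app reads (illustrative values only)."""
    rng = np.random.default_rng(seed)
    os.makedirs(out_dir, exist_ok=True)

    # Half tumor, half normal; sample IDs form the index of sample_info.csv
    samples = [f"S{i + 1:02d}" for i in range(n_samples)]
    groups = ["Tumor"] * (n_samples // 2) + ["Normal"] * (n_samples - n_samples // 2)
    sample_info = pd.DataFrame({"Group": groups}, index=pd.Index(samples, name="Sample"))

    # Proteomics: rows are proteins, columns are samples, plus an Annotation column
    prot = pd.DataFrame(
        rng.lognormal(mean=5.0, sigma=1.0, size=(n_proteins, n_samples)),
        index=pd.Index([f"PROT{i + 1}" for i in range(n_proteins)], name="Protein"),
        columns=samples,
    )
    prot["Annotation"] = "hypothetical annotation"

    # Methylation: beta values in [0, 1], plus a CpG_Island column
    meth = pd.DataFrame(
        rng.beta(2.0, 2.0, size=(n_cpgs, n_samples)),
        index=pd.Index([f"cg{i + 1:04d}" for i in range(n_cpgs)], name="CpG"),
        columns=samples,
    )
    meth["CpG_Island"] = rng.choice(["Island", "Shore", "OpenSea"], size=n_cpgs)

    sample_info.to_csv(os.path.join(out_dir, "sample_info.csv"))
    prot.to_csv(os.path.join(out_dir, "proteomics_data.csv"))
    meth.to_csv(os.path.join(out_dir, "methylation_data.csv"))
    return sample_info, prot, meth
```

Calling `make_test_data("data")` once before starting the app creates the directory and the three files.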

Step 2: Write the Dash app code (`app.py`)

import dash
from dash import dcc, html, Input, Output, State, callback
import dash_bootstrap_components as dbc
import plotly.express as px
import plotly.graph_objects as go
import pandas as pd
import numpy as np
from scipy import stats
import os

# ====================== 1. Load data ======================
# Data directory
DATA_DIR = "data"
proteomics_df = pd.read_csv(os.path.join(DATA_DIR, "proteomics_data.csv"), index_col=0)
methylation_df = pd.read_csv(os.path.join(DATA_DIR, "methylation_data.csv"), index_col=0)
sample_info = pd.read_csv(os.path.join(DATA_DIR, "sample_info.csv"), index_col=0)

# Extract the expression matrices (drop the annotation columns)
proteomics_expr = proteomics_df.drop(columns=["Annotation"], errors="ignore")
methylation_expr = methylation_df.drop(columns=["CpG_Island"], errors="ignore")

# ====================== 2. Initialize the app ======================
app = dash.Dash(__name__, external_stylesheets=[dbc.themes.BOOTSTRAP], suppress_callback_exceptions=True)
server = app.server  # WSGI entry point used for deployment

# ====================== 3. Define the layout ======================
app.layout = dbc.Container([
    # Title
    dbc.Row([
        dbc.Col(html.H1("Liver Cancer Multi-omics Dashboard", className="text-center mb-4"), width=12)
    ]),

    # Main layout (sidebar + main panel)
    dbc.Row([
        # Sidebar (filter controls)
        dbc.Col([
            html.Div([
                html.H4("Filters", className="mb-3"),

                # Sample group filter
                html.Label("Sample group"),
                dcc.Dropdown(
                    id="sample_group",
                    options=[{"label": g, "value": g} for g in sample_info["Group"].unique()],
                    value=sample_info["Group"].unique().tolist(),
                    multi=True,
                    className="mb-3"
                ),

                # Omics type selector
                html.Label("Omics type"),
                dcc.RadioItems(
                    id="omics_type",
                    options=[
                        {"label": "Proteomics", "value": "proteomics"},
                        {"label": "Methylation", "value": "methylation"},
                        {"label": "Cross-omics", "value": "cross_omics"}
                    ],
                    value="proteomics",
                    className="mb-3"
                ),
                
                # Proteomics filters (shown conditionally)
                html.Div(id="proteomics_filters", children=[
                    html.Label("Protein name (substring match)"),
                    dcc.Input(id="protein_name", type="text", placeholder="p53/AKT", className="mb-3 form-control"),
                    html.Label("Expression threshold"),
                    dcc.Slider(
                        id="protein_expr_thresh",
                        min=0, max=proteomics_expr.values.max(),
                        value=100, step=10,
                        marks={i: str(i) for i in range(0, int(proteomics_expr.values.max())+1, 500)},
                        className="mb-3"
                    )
                ], className="mb-4"),

                # Methylation filters (shown conditionally)
                html.Div(id="methylation_filters", children=[
                    html.Label("CpG site (substring match)"),
                    dcc.Input(id="cpg_name", type="text", placeholder="cg0001/cg1234", className="mb-3 form-control"),
                    html.Label("Methylation threshold (beta value)"),
                    dcc.Slider(
                        id="methylation_thresh",
                        min=0, max=1, value=0.5, step=0.1,
                        marks={i/10: str(i/10) for i in range(0, 11, 2)},
                        className="mb-3"
                    )
                ], className="mb-4"),

                # Cross-omics filters (shown conditionally)
                html.Div(id="cross_omics_filters", children=[
                    html.Label("Target gene/protein name"),
                    dcc.Input(id="cross_gene_name", type="text", placeholder="TP53", className="mb-3 form-control"),
                    html.Label("Correlation method"),
                    dcc.RadioItems(
                        id="corr_type",
                        options=[
                            {"label": "Pearson", "value": "pearson"},
                            {"label": "Spearman", "value": "spearman"}
                        ],
                        value="pearson",
                        className="mb-3"
                    )
                ], className="mb-4"),
                
                # Plot type selector
                html.Label("Plot type"),
                dcc.Dropdown(
                    id="plot_type",
                    options=[
                        {"label": "Box plot", "value": "box"},
                        {"label": "Heatmap", "value": "heatmap"},
                        {"label": "Scatter plot (correlation)", "value": "scatter"},
                        {"label": "Volcano plot", "value": "volcano"}
                    ],
                    value="box",
                    className="mb-4"
                ),

                # Report export button
                dbc.Button("Export interactive report", id="export_report", color="success", className="mt-4")
            ], className="bg-light p-4 rounded-3")
        ], width=3),
        
        # Main panel (plot + table)
        dbc.Col([
            # Plot area
            dcc.Loading([
                dcc.Graph(id="main_plot", figure={}, className="mb-4")
            ]),

            # Data table area
            html.H4("Filtered results", className="mb-2"),
            # DataTable has no scroll_x property; horizontal scrolling comes from style_table
            dash.dash_table.DataTable(
                id="omics_table",
                columns=[],
                data=[],
                page_size=10,
                style_table={"overflowX": "auto"},
                style_cell={"textAlign": "center"},
                style_header={"backgroundColor": "lightgray", "fontWeight": "bold"}
            ),

            # Report download link
            html.Div(id="report_link", className="mt-4")
        ], width=9)
    ]),

    # Hidden component (stores intermediate data)
    dcc.Store(id="filtered_data")
], fluid=True)

# ====================== 4. Callbacks ======================
# Callback 1: show or hide the filter blocks according to the omics type
@callback(
    Output("proteomics_filters", "style"),
    Output("methylation_filters", "style"),
    Output("cross_omics_filters", "style"),
    Input("omics_type", "value")
)
def toggle_filters(omics_type):
    # Only the block that matches the selected omics type is visible
    hidden = {"display": "none"}
    visible = {"display": "block"}

    if omics_type == "proteomics":
        return visible, hidden, hidden
    elif omics_type == "methylation":
        return hidden, visible, hidden
    elif omics_type == "cross_omics":
        return hidden, hidden, visible
    else:
        return hidden, hidden, hidden

# Callback 2: filter the data and store the result
@callback(
    Output("filtered_data", "data"),
    Input("omics_type", "value"),
    Input("sample_group", "value"),
    # Proteomics filter parameters
    Input("protein_name", "value"),
    Input("protein_expr_thresh", "value"),
    # Methylation filter parameters
    Input("cpg_name", "value"),
    Input("methylation_thresh", "value"),
    # Cross-omics filter parameters
    Input("cross_gene_name", "value"),
    Input("corr_type", "value")
)
def filter_data(omics_type, sample_group, protein_name, protein_thresh, cpg_name, methylation_thresh, cross_gene_name, corr_type):
    # Select the samples of the chosen groups (a cleared dropdown yields None or [])
    selected_samples = sample_info[sample_info["Group"].isin(sample_group or [])].index.tolist()
    if not selected_samples:
        return []

    if omics_type == "proteomics":
        df = proteomics_expr[selected_samples].copy()
        # Substring match on the protein name (regex=False treats the input literally)
        if protein_name:
            df = df[df.index.str.contains(protein_name, case=False, regex=False)]
        # Keep proteins whose mean expression reaches the threshold
        df = df[df.mean(axis=1) >= (protein_thresh or 0)].copy()
        # Attach the annotation column
        if "Annotation" in proteomics_df.columns:
            df["Annotation"] = proteomics_df.loc[df.index, "Annotation"]
        # to_dict("records") drops the index, so keep the molecule ID as an "index" column
        return df.rename_axis("index").reset_index().to_dict("records")

    elif omics_type == "methylation":
        df = methylation_expr[selected_samples].copy()
        # Substring match on the CpG site ID
        if cpg_name:
            df = df[df.index.str.contains(cpg_name, case=False, regex=False)]
        # Keep sites whose mean beta value reaches the threshold
        df = df[df.mean(axis=1) >= (methylation_thresh or 0)].copy()
        # Attach the annotation column
        if "CpG_Island" in methylation_df.columns:
            df["CpG_Island"] = methylation_df.loc[df.index, "CpG_Island"]
        return df.rename_axis("index").reset_index().to_dict("records")

    elif omics_type == "cross_omics":
        # Cross-omics linking: match the target gene in both data sets
        # (this relies on the feature IDs of both tables containing the gene name)
        if not cross_gene_name:
            return []

        protein_data = proteomics_expr[selected_samples]
        methylation_data = methylation_expr[selected_samples]

        protein_match = protein_data.index.str.contains(cross_gene_name, case=False, regex=False)
        methylation_match = methylation_data.index.str.contains(cross_gene_name, case=False, regex=False)

        # A correlation needs matches in both layers and at least two samples
        if not protein_match.any() or not methylation_match.any() or len(selected_samples) < 2:
            return []

        # Average the matched features per sample, then correlate across samples
        protein_vals = protein_data[protein_match].mean(axis=0)
        methylation_vals = methylation_data[methylation_match].mean(axis=0)

        corr_func = stats.pearsonr if corr_type == "pearson" else stats.spearmanr
        corr, pval = corr_func(protein_vals, methylation_vals)

        # Build the result table (the correlation and p-value are repeated on every row)
        cross_df = pd.DataFrame({
            "Sample": selected_samples,
            "Protein_Expression": protein_vals.values,
            "Methylation_Level": methylation_vals.values,
            "Group": sample_info.loc[selected_samples, "Group"].values,
            "Correlation": corr,
            "P_Value": pval
        })
        return cross_df.to_dict("records")

    else:
        return []

# Callback 3: update the table
@callback(
    Output("omics_table", "columns"),
    Output("omics_table", "data"),
    Input("filtered_data", "data"),
    Input("omics_type", "value")
)
def update_table(filtered_data, omics_type):
    if not filtered_data:
        return [], []

    df = pd.DataFrame(filtered_data)
    # Define the table columns
    columns = [{"name": col, "id": col} for col in df.columns]
    # Round numeric values; four decimals keep beta values and p-values readable
    for col in df.select_dtypes(include=[np.number]).columns:
        df[col] = df[col].round(4)

    return columns, df.to_dict("records")

# Callback 4: update the main plot
@callback(
    Output("main_plot", "figure"),
    Input("filtered_data", "data"),
    Input("omics_type", "value"),
    Input("plot_type", "value"),
    Input("sample_group", "value"),
    Input("corr_type", "value")
)
def update_plot(filtered_data, omics_type, plot_type, sample_group, corr_type):
    if not filtered_data:
        return go.Figure().update_layout(title="No data match the current filters")

    df = pd.DataFrame(filtered_data)
    groups = [] if sample_group is None else list(sample_group)
    selected_samples = sample_info[sample_info["Group"].isin(groups)].index.tolist()

    # Proteomics / methylation plots
    if omics_type in ["proteomics", "methylation"]:
        # Molecule IDs are expected in an "index" column; fall back to row numbers if it is missing
        if "index" not in df.columns:
            df = df.rename_axis("index").reset_index()
        value_cols = [col for col in df.columns if col in selected_samples]
        if not value_cols:
            return go.Figure().update_layout(title="No sample columns in the filtered data")
        label = "expression" if omics_type == "proteomics" else "methylation"

        if plot_type == "box":
            # Convert to long format and attach the sample group
            df_long = df.melt(
                id_vars=[col for col in df.columns if col not in value_cols],
                value_vars=value_cols,
                var_name="Sample",
                value_name="Value"
            )
            df_long["Group"] = df_long["Sample"].map(sample_info["Group"])
            # One panel per molecule; only the first 12 are drawn so large selections stay readable
            shown_ids = df["index"].head(12)
            fig = px.box(
                df_long[df_long["index"].isin(shown_ids)],
                x="Group",
                y="Value",
                color="Group",
                facet_col="index",
                facet_col_wrap=3,
                title=f"{omics_type.capitalize()} {label} box plot",
                labels={"Value": "Protein expression" if omics_type == "proteomics" else "Methylation level (beta value)"},
                color_discrete_map={"Tumor": "red", "Normal": "green"}
            )
            fig.update_layout(height=600)

        elif plot_type == "heatmap":
            # Heatmap: samples as rows, molecules as columns
            expr_mat = df.set_index("index")[value_cols].T
            fig = px.imshow(
                expr_mat.values,
                labels=dict(x="Molecule", y="Sample", color="Value"),
                x=[str(c) for c in expr_mat.columns],
                y=[str(r) for r in expr_mat.index],
                color_continuous_scale="RdBu_r",
                aspect="auto",
                title=f"{omics_type.capitalize()} {label} heatmap"
            )
            # Annotate the number of samples per group
            fig.add_annotation(
                text="Sample groups: " + ", ".join([f"{g}: {sum(sample_info['Group']==g)}" for g in groups]),
                x=0.5, y=-0.1, showarrow=False, xref="paper", yref="paper"
            )

        elif plot_type == "volcano":
            # Volcano plot (Tumor vs Normal), restricted to the samples present in the filtered data
            group_of = sample_info["Group"]
            tumor_samples = [s for s in value_cols if group_of.get(s) == "Tumor"]
            normal_samples = [s for s in value_cols if group_of.get(s) == "Normal"]

            if len(tumor_samples) < 2 or len(normal_samples) < 2:
                return go.Figure().update_layout(title="Select at least two Tumor and two Normal samples")

            tumor = df[tumor_samples].to_numpy(dtype=float)
            normal = df[normal_samples].to_numpy(dtype=float)

            # log2 fold change of the group means (pseudocount of 1)
            log2fc = np.log2((tumor.mean(axis=1) + 1) / (normal.mean(axis=1) + 1))
            # Row-wise two-sample t-test; these p-values are not adjusted for multiple testing
            _, pvals = stats.ttest_ind(tumor, normal, axis=1)

            volcano_df = pd.DataFrame({
                "ID": df["index"],
                "log2FC": log2fc,
                "-log10p": -np.log10(pvals),
                "Significant": np.where((pvals < 0.05) & (np.abs(log2fc) > 1), "Significant", "Not significant")
            })

            fig = px.scatter(
                volcano_df,
                x="log2FC",
                y="-log10p",
                color="Significant",
                hover_name="ID",
                title=f"{omics_type.capitalize()} differential volcano plot",
                color_discrete_map={"Significant": "red", "Not significant": "gray"}
            )
            fig.add_vline(x=-1, line_dash="dash", line_color="black")
            fig.add_vline(x=1, line_dash="dash", line_color="black")
            fig.add_hline(y=-np.log10(0.05), line_dash="dash", line_color="black")

        else:
            fig = go.Figure().update_layout(title="Unsupported plot type")

    # Cross-omics plots
    elif omics_type == "cross_omics":
        if plot_type == "scatter" and "Protein_Expression" in df.columns:
            # Correlation scatter plot; trendline="ols" requires the statsmodels package
            fig = px.scatter(
                df,
                x="Protein_Expression",
                y="Methylation_Level",
                color="Group",
                hover_name="Sample",
                title=f"Protein expression vs methylation level ({corr_type}, r={df['Correlation'].iloc[0]:.2f}, p={df['P_Value'].iloc[0]:.4f})",
                trendline="ols",
                color_discrete_map={"Tumor": "red", "Normal": "green"}
            )
        else:
            fig = go.Figure().update_layout(title="Cross-omics mode supports the scatter plot only")

    else:
        fig = go.Figure().update_layout(title="Invalid omics type")

    return fig

# Callback 5: export the interactive report
@callback(
    Output("report_link", "children"),
    Input("export_report", "n_clicks"),
    State("filtered_data", "data"),
    State("omics_type", "value"),
    State("plot_type", "value"),
    State("sample_group", "value"),
    State("corr_type", "value")
)
def export_report(n_clicks, filtered_data, omics_type, plot_type, sample_group, corr_type):
    if not n_clicks or not filtered_data:
        return ""

    from urllib.parse import quote  # local import keeps this callback self-contained

    now = pd.Timestamp.now()
    report_filename = f"Omics_Dashboard_Report_{now.strftime('%Y%m%d_%H%M%S')}.html"

    # Rebuild the current figure and embed it as an HTML fragment (plotly.js is loaded from the CDN)
    fig = update_plot(filtered_data, omics_type, plot_type, sample_group, corr_type)
    plot_html = fig.to_html(full_html=False, include_plotlyjs="cdn")
    table_html = pd.DataFrame(filtered_data).to_html(classes="table table-striped", index=False)

    # Simple HTML template
    html_content = f"""<!DOCTYPE html>
<html>
<head>
    <title>Multi-omics analysis report</title>
    <meta charset="utf-8">
    <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/bootstrap@5.2.3/dist/css/bootstrap.min.css">
</head>
<body class="container mt-4">
    <h1>Liver cancer multi-omics analysis report</h1>
    <p>Generated: {now.strftime('%Y-%m-%d %H:%M:%S')}</p>
    <h2>1. Filters</h2>
    <p>Omics type: {omics_type}</p>
    <p>Plot type: {plot_type}</p>
    <h2>2. Figure</h2>
    <div>{plot_html}</div>
    <h2>3. Data table</h2>
    <div>{table_html}</div>
</body>
</html>
"""

    # Keep a copy of the report on the server
    with open(report_filename, "w", encoding="utf-8") as f:
        f.write(html_content)

    # Dash does not serve files from the working directory, so the link carries
    # the report itself as a data URI (suitable for reports of moderate size)
    return html.A(
        f"Download report ({report_filename})",
        href="data:text/html;charset=utf-8," + quote(html_content),
        download=report_filename,
        className="btn btn-info"
    )

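# ====================== 5. Run the app ======================
if __name__ == "__main__":
    # Recent Dash releases use app.run(); older releases use app.run_server() instead.
    # debug=True is meant for local development only.
    app.run(debug=True, host="0.0.0.0", port=8050)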
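
The volcano plot in the code above marks molecules as significant from unadjusted t-test p-values. With thousands of molecules tested at once, a multiple-testing correction is commonly applied before thresholding. The following is a minimal NumPy sketch of the Benjamini-Hochberg adjustment; the helper name `bh_adjust` is an assumption and is not part of the app code.

```python
import numpy as np


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values (NaNs are passed through)."""
    p = np.asarray(pvals, dtype=float)
    adjusted = np.full(p.shape, np.nan)
    ok = ~np.isnan(p)
    m = int(ok.sum())
    if m == 0:
        return adjusted

    order = np.argsort(p[ok])
    # p_(i) * m / i for the i-th smallest p-value
    ranked = p[ok][order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, starting from the largest p-value
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]

    out = np.empty(m)
    out[order] = np.clip(ranked, 0.0, 1.0)
    adjusted[ok] = out
    return adjusted
```

One way to use it would be to pass the vector of t-test p-values through `bh_adjust` and compare the adjusted values with 0.05 when building the `Significant` column.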

3. Running locally and deploying

(1) Running locally

python app.py

Then open http://localhost:8050 in a browser to use the dashboard.

(2) Server deployment (Gunicorn + Nginx)

1. Install Gunicorn: pip install gunicorn.
2. Start Gunicorn: gunicorn app:server -w 4 -b 0.0.0.0:8050.
3. Configure Nginx as a reverse proxy:

server {
    listen 80;
    server_name your_domain.com;
    
    location / {
        proxy_pass http://127.0.0.1:8050;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
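
Gunicorn can also read its options from a Python configuration file instead of command-line flags. The sketch below shows a possible `gunicorn.conf.py`; the setting names (`bind`, `workers`, `timeout`, `accesslog`, `errorlog`) are standard Gunicorn settings, while the values are assumptions to adjust for your own server.

```python
# gunicorn.conf.py: illustrative values, tune them for your server
import multiprocessing

# Listen on localhost only; Nginx forwards external traffic to this address
bind = "127.0.0.1:8050"

# A common starting point is (2 x CPU cores) + 1 worker processes
workers = multiprocessing.cpu_count() * 2 + 1

# Workers silent for longer than this many seconds are killed and restarted
timeout = 120

# "-" sends the access log to stdout and the error log to stderr
accesslog = "-"
errorlog = "-"
```

It would be started with `gunicorn -c gunicorn.conf.py app:server`. Each worker process loads its own copy of the data files, so the worker count should be weighed against the available memory.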

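V. Advanced techniques for interactive report generation

1. Automated report templates

R Shiny: combine the app with an R Markdown template to generate parameterized reports (for example, adapting the content to the user's current filters);
Python Dash: export interactive figures with plotly.io.write_html and assemble the complete HTML report with a jinja2 template.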
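
The template idea can be sketched without extra dependencies. The example below uses `string.Template` from the standard library as a stand-in for jinja2 (a deliberate substitution, not the jinja2 API); the template text, the function name `render_report`, and its parameters are assumptions for illustration.

```python
import html
from datetime import datetime
from string import Template

REPORT_TEMPLATE = Template("""<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>$title</title></head>
<body>
<h1>$title</h1>
<p>Generated: $generated</p>
<h2>Filters</h2>
<ul>
$filter_items
</ul>
<h2>Figure</h2>
$figure_html
</body>
</html>
""")


def render_report(title, filters, figure_html):
    """Fill the template; filter values are escaped, figure_html is inserted as-is."""
    items = "\n".join(
        f"<li>{html.escape(str(key))}: {html.escape(str(value))}</li>"
        for key, value in filters.items()
    )
    return REPORT_TEMPLATE.substitute(
        title=html.escape(title),
        generated=datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
        filter_items=items,
        figure_html=figure_html,
    )
```

In a Dash app, `figure_html` could come from `fig.to_html(full_html=False, include_plotlyjs="cdn")`, so the report keeps zoom and hover behaviour.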

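2. Custom visualization components

Heatmap refinement: add row/column clustering, annotation bars, and color scale adjustment;
Network plots: integrate ceRNA networks with node dragging and zooming;
Correlation analysis: add confidence intervals and per-group fitted lines;
Cross-omics linking: clicking a molecule in one omics layer highlights the corresponding molecules in the other layers.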
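
Row and column clustering for a heatmap can be done before plotting by reordering the matrix. The following is a minimal sketch using SciPy's hierarchical clustering; the function names `cluster_order` and `clustered` are assumptions, and average linkage with Euclidean distance is one choice among several.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage


def cluster_order(matrix, method="average", metric="euclidean"):
    """Return the row order given by hierarchical clustering of a 2-D array."""
    matrix = np.asarray(matrix, dtype=float)
    if matrix.shape[0] < 2:
        return np.arange(matrix.shape[0])
    return leaves_list(linkage(matrix, method=method, metric=metric))


def clustered(matrix):
    """Reorder rows and columns so that similar profiles sit next to each other."""
    matrix = np.asarray(matrix, dtype=float)
    rows = cluster_order(matrix)
    cols = cluster_order(matrix.T)
    return matrix[np.ix_(rows, cols)], rows, cols
```

The reordered matrix, together with the row and column labels permuted by `rows` and `cols`, can then be passed to `px.imshow`.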

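3. Performance optimization (large data sets)

Data caching: use shiny::bindCache() (R) or dcc.Store (Python) to keep the filtered data and avoid recomputation;
Lazy loading: load only the data needed for the current page, and page or chunk large data sets;
Asynchronous processing: use the future and promises packages (R) or celery (Python) for time-consuming steps such as enrichment analysis;
Plot optimization: reduce the number of points in a figure (for example, show only the top 100 molecules in a heatmap) and use WebGL rendering.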
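
Two of these ideas, caching and limiting a heatmap to the most variable molecules, can be sketched with the standard library and pandas. In the example below, `EXPR` is random demo data and the names `top_variable` and `heatmap_matrix` are assumptions; `functools.lru_cache` stands in for whatever server-side cache a deployment actually uses.

```python
from functools import lru_cache

import numpy as np
import pandas as pd

# Demo matrix: 500 molecules x 6 samples (random values, for illustration only)
_rng = np.random.default_rng(0)
EXPR = pd.DataFrame(_rng.normal(size=(500, 6)), columns=[f"S{i}" for i in range(1, 7)])


def top_variable(df, n=100):
    """Keep the n rows (molecules) with the highest variance across samples."""
    variances = df.var(axis=1)
    return df.loc[variances.nlargest(n).index]


@lru_cache(maxsize=64)
def heatmap_matrix(samples, n_top=100):
    """Cached subset for plotting; `samples` must be a tuple so that it is hashable."""
    return top_variable(EXPR[list(samples)], n_top)
```

Repeated requests with the same sample tuple and `n_top` return the cached frame, so callers should treat the result as read-only.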

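4. Data export and sharing

Formats: export CSV (raw data), PNG/PDF (static figures), HTML (interactive report), and JSON (data for APIs);
Cloud sharing: integrate OneDrive/Google Drive to upload reports in one click;
Access control: add a user login module (Shiny Server Pro/Dash Enterprise) to restrict data access.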
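
For the tabular formats, the serialization itself is short. The sketch below converts a filtered table to CSV or JSON text; the function name `export_table` and the choice of a records-style JSON layout are assumptions.

```python
import pandas as pd


def export_table(df, fmt="csv"):
    """Serialize a molecule-by-sample table to CSV or JSON text."""
    if fmt == "csv":
        return df.to_csv(index=True)
    if fmt == "json":
        # One JSON object per molecule; force_ascii=False keeps non-ASCII labels readable
        return df.reset_index().to_json(orient="records", force_ascii=False)
    raise ValueError(f"Unsupported format: {fmt}")
```

In a Dash app, the returned text could be handed to a `dcc.Download` component (for example through `dcc.send_string`) so the browser saves it as a file.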

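VI. Common problems and solutions

1. Slow data loading

Cause: the multi-omics data set is large and is loaded in full at once;
Solution: load the data in chunks, store it in feather/parquet format (usually much faster to read than CSV), and compress the data.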
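
Chunked loading can be sketched with the `chunksize` option of `pandas.read_csv`, which yields the file piece by piece so that only rows passing the filters stay in memory. The function name `load_filtered` and its parameters are assumptions for illustration.

```python
import pandas as pd


def load_filtered(path, keep_ids=None, min_mean=None, chunksize=5000):
    """Read a wide CSV in chunks and keep only the rows that pass the filters."""
    kept = []
    for chunk in pd.read_csv(path, index_col=0, chunksize=chunksize):
        if keep_ids is not None:
            chunk = chunk[chunk.index.isin(list(keep_ids))]
        if min_mean is not None:
            # Average over numeric (sample) columns only, ignoring annotation columns
            numeric = chunk.select_dtypes(include="number")
            chunk = chunk[numeric.mean(axis=1) >= min_mean]
        kept.append(chunk)
    return pd.concat(kept) if kept else pd.DataFrame()
```

A call such as `load_filtered("data/proteomics_data.csv", min_mean=100)` would return only the proteins whose mean expression reaches 100.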

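2. Sluggish plot rendering

Cause: the figure contains too many data points (for example, a heatmap with tens of thousands of molecules);
Solution: limit the number of molecules shown (for example, the top 200), simplify the data with dimensionality reduction (PCA/UMAP), and switch off interactive features that are not needed.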
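
As an illustration of the dimensionality-reduction route, the sketch below computes PCA scores with a singular value decomposition in NumPy, so that each sample becomes one point instead of thousands of heatmap cells. The function name `pca_scores` is an assumption, and the input is expected as samples in rows and molecules in columns.

```python
import numpy as np


def pca_scores(matrix, n_components=2):
    """Project samples (rows) onto the top principal components via SVD."""
    x = np.asarray(matrix, dtype=float)
    x = x - x.mean(axis=0)  # center each feature
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)  # fraction of variance per component
    return scores, explained[:n_components]
```

For a molecule-by-sample table such as the ones used in this article, the transposed matrix would be passed in, and the two score columns plotted as a scatter colored by sample group.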

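3. Conflicts when integrating data across omics layers

Cause: sample IDs do not match between omics layers, or the scales differ widely;
Solution: agree on one sample ID naming convention, normalize or standardize the data, and add a sample ID mapping table.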
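
Both steps can be sketched in pandas. The example below aligns two molecule-by-sample tables on their shared sample IDs, optionally through a mapping table, and scales each molecule to a z-score; the function names `harmonize` and `zscore_rows` and the upper-casing rule are assumptions for illustration.

```python
import pandas as pd


def harmonize(omics_a, omics_b, id_map=None):
    """Align two molecule-by-sample tables on their shared sample IDs."""
    if id_map:
        # Translate layer-specific sample names to the common ID
        omics_b = omics_b.rename(columns=id_map)
    # Remove formatting differences such as case and surrounding whitespace
    omics_a = omics_a.rename(columns=lambda c: str(c).strip().upper())
    omics_b = omics_b.rename(columns=lambda c: str(c).strip().upper())
    in_b = set(omics_b.columns)
    shared = [c for c in omics_a.columns if c in in_b]
    return omics_a[shared], omics_b[shared]


def zscore_rows(df):
    """Scale each molecule to mean 0 and standard deviation 1 across samples."""
    sd = df.std(axis=1, ddof=0).replace(0, 1.0)  # constant rows stay at 0
    return df.sub(df.mean(axis=1), axis=0).div(sd, axis=0)
```

After `harmonize`, both tables have the same sample columns in the same order, so per-sample comparisons such as the cross-omics correlation no longer depend on matching raw names.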

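4. Dashboard unreachable after deployment

Cause: the port is not open, permissions are insufficient, or dependencies are missing;
Solution: open the firewall port (for example 8050/3838), run the service as a non-root user, and generate a dependency manifest (requirements.txt/renv.lock).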
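
A quick way to tell a closed port from an application error is to test the TCP connection directly. The following standard-library sketch does that; the function name `port_is_open` is an assumption, and the host and port passed to it are whatever your deployment uses.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False
```

Running `port_is_open("127.0.0.1", 8050)` on the server checks whether the app is listening at all, while running it from another machine against the server address also exercises the firewall.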

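5. Interactive report export fails

Cause: the export path is not writable, or there are problems with the encoding of Chinese text;
Solution: export to a path that is known to be writable, use UTF-8 encoding, and add exception handling.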
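
The three measures can be combined in one small helper. The sketch below writes the report as UTF-8 and falls back to the system temp directory when the requested directory cannot be used; the function name `write_report` and the default directory name `reports` are assumptions.

```python
import os
import tempfile


def write_report(html_text, filename, out_dir="reports"):
    """Write the report as UTF-8; fall back to the temp directory if out_dir is not writable."""
    for directory in (out_dir, tempfile.gettempdir()):
        try:
            os.makedirs(directory, exist_ok=True)
            path = os.path.join(directory, filename)
            with open(path, "w", encoding="utf-8") as handle:
                handle.write(html_text)
            return path
        except OSError as exc:
            # Permission problems, missing parents, read-only file systems, etc.
            print(f"Could not write to {directory}: {exc}")
    raise RuntimeError("No writable location found for the report")
```

The returned path tells the caller where the file actually ended up, which matters when the fallback directory was used.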

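VII. Summary and outlook

OmicsDashboard changes how multi-omics data are presented, moving from static figures to an interactive exploration platform. This improves the efficiency of bioinformatics analysis and lowers the barrier for non-specialists. R Shiny suits quick construction of dashboards built on the R bioinformatics ecosystem, while Python Dash suits highly interactive scenarios with large data volumes; choose according to the team's technology stack and the project's needs.

Future directions for multi-omics visualization include:

AI-assisted exploration: integrate large language models to generate figures from natural-language queries;
3D visualization: use WebGL for 3D networks of multi-omics data and for spatial transcriptomics;
Mobile support: refine the responsive layout so the dashboard works on phones and tablets;
Platform integration: connect to databases (TCGA/GTEx) to import public multi-omics data in one click.

Knowing how to build an OmicsDashboard makes multi-omics results more intuitive, easier to reuse, and more convincing. Whether for journal submission, project reporting, or team collaboration, interactive visualization brings the data to life and shows the value of the research.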
