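[TIDE DIARY 7] Upgrading the Clinical-Guideline-to-Public-Version System, in Detail

Upgrading the Clinical-Guideline-to-Public-Version System: Feature Enhancements and Experience Improvements

Project Overview

We recently shipped a series of significant upgrades to the clinical-guideline-to-public-version system (CPG2PVG-AI), covering a refined classification of Category C guidelines, PDF import, and UI interaction improvements, among other areas. This article walks through what each improvement does and how it is implemented.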

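I. Refining the Category C Guideline Classification

1.1 Expanded Classification

We expanded the Category C classification from its original basic categories into a complete system of 6 chapters and 20 subtopics:

python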
"C": {
    # I. Prevention-Oriented Recommendations
    "1. Primary Prevention Recommendations (Measures for Non-Diseased Populations)":
        "Are there preventive suggestions for individuals without the disease? Including primary prevention measures such as vaccination, lifestyle interventions, and screening timing.",
    "2. Secondary Prevention Recommendations (Risk Reduction for High-Risk Groups)":
        "Are there risk reduction suggestions for high-risk groups (e.g., those with family history, underlying diseases)? Including secondary prevention measures such as regular check-ups, pharmacological prevention, and behavioral adjustments.",

    # II. Diagnosis-Oriented Recommendations
    "3. Diagnostic Criteria Recommendations (Confirmation Basis)":
        "Are the diagnostic criteria for the disease clearly defined? Including core diagnostic basis such as symptom combinations, laboratory test thresholds, and imaging features.",
    "4. Diagnostic Process Recommendations (Examination Sequence)":
        "Is there a recommended diagnostic process? Including which examinations to perform first, conditions for subsequent further examinations, and examination choices corresponding to different symptoms.",
    "5. Differential Diagnosis Recommendations (Distinguishing Similar Diseases)":
        "Are there suggestions for differential diagnosis? Including how to distinguish the disease from other diseases with similar symptoms, key differential indicators or examinations.",

    # III. Treatment-Oriented Recommendations (Split by Treatment Method)
    "6. Pharmacological Treatment Recommendations (Medication Regimens)":
        "Are there specific suggestions for pharmacological treatment? Including recommended drug types, dosages, courses of treatment, applicable populations, contraindications, and precautions.",
    "7. Surgical Treatment Recommendations (Surgical Indications and Methods)":
        "Are there suggestions for surgical treatment? Including applicable populations for surgery, surgical method selection, surgical timing, and pre- and postoperative precautions.",
    "8. Interventional Treatment Recommendations (Minimally Invasive Interventions)":
        "Are there suggestions for interventional treatment? Including indications for interventional treatment, specific methods, applicable scenarios, and risk prompts.",
    "9. Physical Therapy Recommendations (Non-Pharmacological, Non-Surgical Interventions)":
        "Are there suggestions for physical therapy? Including rehabilitation training, physical therapy methods, applicable stages, and training frequency and intensity.",
    "10. Traditional Chinese Medicine/Alternative Therapy Recommendations (Complementary Treatment Regimens)":
        "Are there recommendations for traditional Chinese medicine or alternative therapies? Including applicable scenarios and precautions for complementary treatment measures such as acupuncture, Chinese herbal medicine, and dietary therapy.",

    # IV. Prognosis and Management-Oriented Recommendations
    "11. Disease Monitoring Recommendations (Condition Tracking)":
        "Are there suggestions for disease monitoring? Including monitoring indicators, monitoring frequency, handling of abnormal situations, and long-term follow-up plans.",
    "12. Complication Management Recommendations (Prevention and Treatment of Complications)":
        "Are there suggestions for the prevention and treatment of complications? Including early warning signs, intervention measures, and treatment regimens for common complications.",
    "13. Recurrence Prevention Recommendations (Postoperative/Cure Recurrence Prevention)":
        "Are there suggestions for preventing disease recurrence? Including postoperative care, lifestyle adjustments after cure, and regular review items.",

    # V. Special Population Recommendations
    "14. Pediatric/Adolescent-Specific Recommendations":
        "Are there exclusive recommendations for pediatric or adolescent patients? Including dosage adjustments, treatment method selection, special precautions, and age-specific care plans.",
    "15. Geriatric-Specific Recommendations":
        "Are there exclusive recommendations for elderly patients? Including medication safety, surgical tolerance assessment, comorbidity management, and age-related physiological adaptation measures.",
    "16. Pregnant/Lactating Women-Specific Recommendations":
        "Are there exclusive recommendations for pregnant or lactating women? Including medication safety, treatment method selection, impacts on fetus/infant, and prenatal/postpartum care adjustments.",
    "17. Immunocompromised Patients-Specific Recommendations":
        "Are there exclusive recommendations for immunocompromised patients (e.g., transplant recipients, HIV patients)? Including infection prevention, treatment efficacy adjustments, and risk mitigation strategies.",

    # VI. Decision-Making and Communication-Oriented Recommendations
    "18. Benefit-Risk Assessment Recommendations (Treatment Selection Basis)":
        "Are there suggestions for assessing treatment benefits and risks? Including how to weigh pros and cons of different regimens to support patients in making informed decisions.",
    "19. Doctor-Patient Communication Focus Recommendations (Key Discussion Points)":
        "Are there recommendations for key doctor-patient communication content? Including core information to discuss to ensure shared decision-making and patient understanding.",
    "20. Informed Consent Guidance Recommendations (Treatment Consent Basis)":
        "Are there suggestions for informed consent? Including key information patients must understand before treatment (e.g., risks, benefits, alternatives) to give valid consent."
},
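
Because every key starts with its subtype number, the chapter a subtype belongs to can be recovered from the key alone. The following is a minimal sketch under that assumption; `CHAPTER_RANGES` and `chapter_of` are hypothetical names used for illustration and are not part of CPG2PVG-AI:

```python
import re

# Hypothetical lookup table: chapter name -> inclusive range of subtype numbers,
# mirroring the six chapter comments in the "C" dictionary above.
CHAPTER_RANGES = {
    "Prevention": (1, 2),
    "Diagnosis": (3, 5),
    "Treatment": (6, 10),
    "Prognosis & Management": (11, 13),
    "Special Populations": (14, 17),
    "Decision-Making & Communication": (18, 20),
}


def chapter_of(subtype_key: str) -> str:
    """Return the chapter for a key such as '6. Pharmacological Treatment ...'."""
    match = re.match(r"\s*(\d+)\.", subtype_key)
    if match is None:
        raise ValueError(f"Key has no leading number: {subtype_key!r}")
    number = int(match.group(1))
    for chapter, (first, last) in CHAPTER_RANGES.items():
        if first <= number <= last:
            return chapter
    raise ValueError(f"Subtype number out of range: {number}")
```

A lookup like this could help group extracted recommendations by chapter before they are passed to the generation prompt.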

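1.2 Refined Public-Version Generation Rules

We also rewrote the Category C public-version generation prompt so that the output strictly follows the six-chapter structure:

python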
"C": """
        You are a **clinical-to-public medical guidelines transformation assistant**.  
        You will receive recommendations extracted from **clinical medical guidelines** (C Category, 20 refined subtypes).  
        Your task is to convert them into a **Public Version Guideline (PVG)** for the general public, following these strict rules:

        ## 1. Core Principle
        Retain ALL recommendations and their original intent. Adjust only technicality and readability---never omit key information. Prioritize patient-centric language over physician-oriented wording.

        ## 2. Structure of the PVG
        Organize recommendations in the following fixed sequence for clarity:
        1. Prevention (Primary → Secondary)
        2. Diagnosis (Criteria → Process → Differential Diagnosis)
        3. Treatment (Pharmacological → Surgical → Interventional → Physical → TCM/Alternative)
        4. Prognosis & Management (Monitoring → Complication Management → Recurrence Prevention)
        5. Special Populations (Pediatric/Adolescent → Geriatric → Pregnant/Lactating Women → Immunocompromised Patients)
        6. Decision-Making & Communication (Benefit-Risk Assessment → Doctor-Patient Communication → Informed Consent Guidance)

        ## 3. Subtype-Specific Rewriting Rules
        - **Prevention**: Use "You can..." or "It is recommended to..." for actionable steps. Explain briefly why each measure works (e.g., "Vaccination helps your body build immunity against the disease").
        - **Diagnosis**: Avoid medical jargon (e.g., replace "serum creatinine" with "a blood test for kidney function"). Use Mermaid flowcharts for diagnostic processes to visualize steps.
        - **Treatment**: Clarify "who is eligible" and "what to expect" for each treatment. Highlight key precautions (e.g., "Do not take this medication if you are allergic to its ingredients") and avoid technical terms (e.g., replace "contraindications" with "who should not use this").
        - **Prognosis & Management**: Use simple timelines (e.g., "Get a follow-up check every 3 months for the first year") and bullet points for monitoring items. Explain abnormal results in plain language (e.g., "If your test result is higher than X, contact your doctor immediately").
        - **Special Populations**: Clearly label each group (e.g., "For Children Under 12:" or "For Pregnant Women:"). Adjust language to be age-appropriate or family-friendly (e.g., for kids: "Your doctor may adjust the medicine dose based on your age and weight").
        - **Decision-Making**: Frame benefit-risk trade-offs as "Pros: ... Cons: ..." to simplify understanding. Emphasize "discuss with your doctor" for personalized choices (e.g., "Talk to your doctor to see which treatment is best for you").

        ## 4. Language & Format
        - Use plain English with short sentences (max 20 words per sentence).
        - Use headings/subheadings for each subtype (e.g., "1. Primary Prevention Recommendations") to improve readability.
        - Use bullet points for lists and Mermaid charts/flowcharts for complex information (diagnostic processes, risk factors, progression).
        - Maintain a supportive, empowering tone (e.g., "You and your doctor can work together to choose the best care plan").

        ## 5. Output Requirements
        Deliver a well-structured PVG directly---no process explanations. Ensure each of the 20 subtypes is addressed (omit only if no relevant content exists). Preserve all key details while simplifying technical language. Do not add external information beyond the original guideline.
        """,

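A prompt can only request the fixed sequence, so it may be worth checking the generated text afterwards. The sketch below assumes the model emits Markdown headings of the form `## 1. Primary Prevention Recommendations`, as the prompt's formatting rules suggest; `subtype_numbers` and `follows_fixed_sequence` are hypothetical helpers, not part of the project:

```python
import re
from typing import List

# Matches Markdown headings that begin with a subtype number, e.g. "## 3. ...".
HEADING_NUMBER = re.compile(r"^#{1,6}\s*(\d+)\.", re.MULTILINE)


def subtype_numbers(pvg_markdown: str) -> List[int]:
    """Collect the leading subtype numbers of all numbered headings, in order."""
    return [int(n) for n in HEADING_NUMBER.findall(pvg_markdown)]


def follows_fixed_sequence(pvg_markdown: str) -> bool:
    """True if numbered headings are within 1..20 and strictly increasing.

    Omitted subtypes are allowed, since the prompt permits skipping a subtype
    when the guideline has no relevant content.
    """
    numbers = subtype_numbers(pvg_markdown)
    in_range = all(1 <= n <= 20 for n in numbers)
    increasing = all(a < b for a, b in zip(numbers, numbers[1:]))
    return in_range and increasing
```

A failed check could be surfaced as a warning in the UI rather than blocking the download, since heading formats may vary between models.
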
II. PDF Import

2.1 Dependencies and Configuration

The system gains PDF parsing, using the pypdf library for text extraction:

python
import io
import os
import streamlit as st
from pypdf import PdfReader
from streamlit_extras.bottom_container import bottom

# Ensure upload_files directory path is relative to current script
file_path = os.path.join(os.path.dirname(__file__), "upload_files") + os.sep
os.makedirs(file_path, exist_ok=True)  # Ensure directory exists

2.2 PDF Text Extraction

New extraction and upload logic pulls the text out of a PDF and automatically writes it to a temporary Markdown file:

python
def extract_pdf_text(file) -> str:
    """Extract text content from PDF file"""
    try:
        pdf_reader = PdfReader(file)
        text_content = []
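        for page in pdf_reader.pages:
            # Image-only pages may yield no text; fall back to an empty string
            text_content.append(page.extract_text() or "")
        # strip() so a scanned PDF with no text layer returns "" instead of blank lines
        return "\n\n".join(text_content).strip()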
    except Exception as e:
        st.error(f"PDF parsing failed: {str(e)}")
        return ""

def upload_file(file):
    """Upload file and return file path"""
    file_extension = os.path.splitext(file.name)[1].lower()
    os.makedirs(file_path, exist_ok=True)
    
    if file_extension == ".pdf":
        # PDF file: extract text content, save as temporary MD file
        file.seek(0)  # Reset file pointer
        text_content = extract_pdf_text(file)
        if text_content:
            # Save PDF text content as MD file
            md_filename = os.path.splitext(file.name)[0] + "_extracted.md"
            md_path = file_path + md_filename
            with open(md_path, "w", encoding="utf-8") as f:
                f.write(text_content)
            # Also save original PDF file (backup)
            pdf_path = file_path + file.name
            file.seek(0)
            with open(pdf_path, "wb") as f:
                f.write(file.getbuffer())
            return md_path
        else:
            return None
    elif file_extension == ".md":
        # Markdown file: save and return file path
        file_path_full = file_path + file.name
        with open(file_path_full, "wb") as f:
            f.write(file.getbuffer())
        return file_path_full
    else:
        st.error(f"Unsupported file format: {file_extension}")
        return None
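
The upload logic builds paths by concatenating `file_path` with the uploaded name, so it can be prudent to strip any directory components from that name first, whether or not the upload widget already does so. A minimal sketch, where `extracted_md_name` is a hypothetical helper and not part of the original code:

```python
import os


def extracted_md_name(uploaded_name: str) -> str:
    """Build '<stem>_extracted.md' from an uploaded name, dropping any directories."""
    # Normalise Windows separators so basename() also strips them on POSIX.
    base = os.path.basename(uploaded_name.replace("\\", "/"))
    stem = os.path.splitext(base)[0]
    if not stem:
        raise ValueError(f"Cannot derive a file name from {uploaded_name!r}")
    return stem + "_extracted.md"
```

With this, a name such as `../../etc/guide.PDF` would map to `guide_extracted.md` inside the upload folder instead of escaping it.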

2.3 Extending the Front-End Upload Component

The upload widget now accepts PDF as well, so the front-end entry point offers both .pdf and .md:

python
uploaded_file = st.file_uploader("upload files", type=["md", "pdf"], accept_multiple_files=False, label_visibility="collapsed")

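III. UI Interaction Improvements

3.1 One-Click Conversion

The original chatbot-style interaction is replaced with a simpler one-click conversion mode:

python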
# Add conversion button
convert_button = st.button("Convert to Public Version", type="primary", use_container_width=True)

# If conversion button is clicked, set question to trigger conversion process
if convert_button:
    st.session_state["trigger_conversion"] = True

question = st.session_state.get("trigger_conversion", None)

if question:
    # Clear trigger flag
    st.session_state["trigger_conversion"] = False
    
    # Check if there is an uploaded file
    if not uploaded_file:
        st.warning("Please upload a file (PDF or MD) before clicking the convert button")
    else:
        # Use correct directory path (remove trailing separator)
        upload_folder = file_path.rstrip(os.sep)
        delete_markdown_files(folder_path=upload_folder, recursive=False)
        
        with st.chat_message("user"):
            st.markdown("Converting to public version")

        # Get AI response (using fixed conversion instruction)
        conversion_prompt = "Convert to public version"
        with st.chat_message("assistant"):
            answer = fetch_response(state, conversion_prompt)
            st.markdown(answer)

3.2 User-Friendly Output Naming

Downloads are now consistently named PVG_<original filename>.md, and the button label shows the character count:

python
# Download button logic
if st.session_state.settings.get("history"):
    assistant_outputs = [item["assistant"] for item in st.session_state.settings["history"]]

    # Use join to improve concatenation performance
    md_content = "\n\n---\n\n".join(assistant_outputs)

    # Generate download filename: PVG_ + original filename
    original_filename = st.session_state.settings.get("original_filename", "public_version")
    download_filename = f"PVG_{original_filename}.md"

    st.download_button(
        label=f"Download Public Version (Markdown) | Characters:{str(len(md_content))}",
        data=md_content.encode("utf-8"),
        file_name=download_filename,
        mime="text/markdown"
    )
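
The snippet above does not show whether `original_filename` is stored with or without its extension. If it can include one, a name like `PVG_guide.pdf.md` would result. The sketch below assumes that case and strips a trailing `.pdf` or `.md`; `build_download_name` is a hypothetical helper:

```python
import os


def build_download_name(original_filename: str = "public_version") -> str:
    """Return 'PVG_<name>.md', dropping a trailing .pdf/.md from the input."""
    stem, ext = os.path.splitext(original_filename)
    if ext.lower() in (".pdf", ".md"):
        original_filename = stem
    return f"PVG_{original_filename}.md"
```

Only the two supported upload extensions are stripped, so a dotted name such as `notes.v2` is kept whole.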

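IV. Code Quality and Stability

4.1 Path Handling

The upload directory is now resolved relative to the script's location, which keeps it portable across platforms:

python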
import os
from dotenv import load_dotenv

file_path = os.path.join(os.path.dirname(__file__), "upload_files") + os.sep
os.makedirs(file_path, exist_ok=True)  # Ensure directory exists
load_dotenv(verbose=True)  # Load variables from a local .env file, if present

4.2 Safer File Cleanup

The Markdown cleanup function now refuses to run against a root directory and tolerates a missing folder:

python
from pathlib import Path

def delete_markdown_files(folder_path: str, recursive: bool = False) -> None:
    """
    Delete all .md files in specified folder
    :param folder_path: target folder path
    :param recursive: whether to recursively delete subdirectory files
    """
    folder = Path(folder_path).resolve()

    # Security check: a filesystem root is its own parent, whatever the drive letter
    if folder.parent == folder:
        raise ValueError(f"Dangerous operation: deletion not allowed in root directory {folder}")

    if not folder.exists() or not folder.is_dir():
        # Return directly if directory doesn't exist, no need to delete
        return

    # Traverse files
    pattern = "**/*.md" if recursive else "*.md"
    md_files = list(folder.glob(pattern))

    if not md_files:
        print("No Markdown files found")
        return

    print(f"Found {len(md_files)} Markdown files")

    for f in md_files:
        try:
            f.unlink()
            print(f"Deleted: {f}")
        except Exception as e:
            print(f"Failed to delete {f}: {e}")

4.3 Model Configuration

The model configuration now reads API keys from environment variables first and reports a missing key:

python
MODEL_CONFIGS: Dict[str,Dict[str,str]] = {
    "deepseek":{
        "base_url": "https://api.deepseek.com/v1",
        # Priority read environment variable DEEPSEEK_API_KEY
        "api_key": os.getenv("DEEPSEEK_API_KEY", "<your api key>"),
        "model": "deepseek-chat"
    },
    "qwen-sf":{
        "base_url":"https://api.siliconflow.cn/v1",
        "api_key": os.getenv("SILICONFLOW_API_KEY", "<your api key>"),
        "model": "Qwen/Qwen2.5-72B-Instruct"
    },
}

# Key parameter check and prompt
if not config.get("api_key"):
    raise ValueError(f"API Key for {llm_type} not found, please set the corresponding KEY in environment variables and try again")

# If still using example/hardcoded Key, give warning
if llm_type == "deepseek" and os.getenv("DEEPSEEK_API_KEY") is None:
    logger.warning("DEEPSEEK_API_KEY not set, using default Key in code (may not be available)")
if llm_type == "qwen-sf" and os.getenv("SILICONFLOW_API_KEY") is None:
    logger.warning("SILICONFLOW_API_KEY not set, using default Key in code (may not be available)")
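
One thing to note about the snippet above: `os.getenv(name, "<your api key>")` falls back to a non-empty placeholder, so the `if not config.get("api_key")` check does not fire when the variable is simply unset. A stricter variant might treat the placeholder as missing. This is a sketch under that assumption; `resolve_api_key` and `PLACEHOLDER` are hypothetical names:

```python
import os
from typing import Mapping, Optional

PLACEHOLDER = "<your api key>"  # the fallback string used in MODEL_CONFIGS


def resolve_api_key(env_var: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the key from the environment, or raise if unset or a placeholder."""
    source = os.environ if env is None else env
    key = source.get(env_var, "").strip()
    if not key or key == PLACEHOLDER:
        raise ValueError(f"{env_var} is not set; export it before starting the app")
    return key
```

Passing `env` explicitly makes the function easy to exercise in tests without touching the real environment.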

4.4 Tracing

LangSmith collection is turned off by default to avoid depending on external network access:

python
langsmith_used = False 

if langsmith_used:
    os.environ["LANGCHAIN_TRACING_V2"] = "true"
    os.environ["LANGCHAIN_API_KEY"] = os.getenv("LANGCHAIN_API_KEY", "")  

logging.basicConfig(level=logging.INFO,
                   format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
                   encoding="utf-8")
logger = logging.getLogger(__name__)

4.5 More Model Choices

The sidebar adds a SiliconFlow Qwen option so that the front-end and back-end model configurations stay in sync:

python
model_options = {"deepseek":"deepseek",
                "gpt-4o-mini":"gpt-4o-mini",
                "gpt-5-mini":"gpt-5-mini",
                "claude-small":"claude-small",
                "gpt-4.1":"gpt-4.1",
                "gpt-5":"gpt-5",
                "qwen-long":"qwen-long",
                "qwen-sf (SiliconFlow)":"qwen-sf"}

with st.sidebar:
    st.header("System Settings")
    chosen_model_key = st.selectbox(
        "Choose Model",
        options=list(model_options.keys()),
        index=list(model_options.values()).index(st.session_state.settings["model_name"])
    )
    st.session_state.settings["model_name"] = model_options[chosen_model_key]
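
The `index=` expression above raises `ValueError` if the stored `model_name` is no longer among the options, for example after an option is renamed. A defensive lookup could fall back to the first entry instead. The helper name `safe_index` is hypothetical:

```python
from typing import Sequence


def safe_index(values: Sequence[str], current: str, default: int = 0) -> int:
    """Position of `current` in `values`, or `default` if it is not present."""
    try:
        return list(values).index(current)
    except ValueError:
        return default
```

It could then be passed to the select box as `index=safe_index(list(model_options.values()), st.session_state.settings["model_name"])`.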