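AIGC Multimodal Industrialization in Practice: A Stable Diffusion/Flux/ComfyUI Production Pipeline

CSDN 2026 ranking: AIGC content production has entered an industrial phase, and Stable Diffusion, Flux, and ComfyUI have become standard tools for designers. This article walks through an industrial multimodal AI pipeline end to end: prompt engineering, RAG-augmented generation, ComfyUI workflow orchestration, and compliance and provenance for AI-generated content.

1. The multimodal AI industrialization landscape in 2026

1.1 Evolution of multimodal technology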

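Multimodal AI timeline:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
2023: Single-modality era
  → LLMs mostly handle text only; image generation evolves separately

2024: First year of multimodal convergence
  → Multimodal understanding with GPT-4V / Gemini Pro
  → Image generation with DALL-E 3 / Stable Diffusion 3
  → Whisper speech recognition + TTS speech synthesis

2025: Agents + multimodal
  → Multimodal agents that see, hear, speak, and act in one loop
  → End-to-end video generation (Runway Gen-3 / Sora)

2026: Industrialized production ⭐
  → AIGC joins enterprise content production pipelines
  → ComfyUI becomes the de facto workflow standard
  → Open-weight Flux (FLUX.1) models in wide use; Midjourney Pro leads the high-end market
  → Compliance and provenance for AI-generated content become hard requirements
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━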

1.2 Industrial pipeline architecture

┌──────────────────────────────────────────────────────────────────────┐
│                 AIGC industrial production pipeline                  │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  Demand layer                                                        │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  │
│  │Product shots│  │Mktg posters │  │Social posts │  │Video scripts│  │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  │
│         └────────────────┴────────────────┴────────────────┘         │
│                                  ↓                                   │
│  Prompt layer (LLM-augmented)                                        │
│  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐    │
│  │ Structured prompt│  │ Prompt RAG       │  │ Prompt template  │    │
│  │ templates        │  │ retrieval tuning │  │ library          │    │
│  └─────────┬────────┘  └─────────┬────────┘  └─────────┬────────┘    │
│            └─────────────────────┴─────────────────────┘             │
│                                  ↓                                   │
│  Generation layer                                                    │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  │
│  │ Text        │  │ Image       │  │ Audio       │  │ Video       │  │
│  │ GPT-4o      │  │ Flux/SD3    │  │ ElevenLabs  │  │ Sora/Gen-3  │  │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  │
│         └────────────────┴────────────────┴────────────────┘         │
│                                  ↓                                   │
│  Post-processing layer                                               │
│  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐    │
│  │ Image enhancement│  │ Batch processing │  │ Auto-tagging     │    │
│  │ upscale          │  │ ImageMagick      │  │ CLIP             │    │
│  └─────────┬────────┘  └─────────┬────────┘  └─────────┬────────┘    │
│            └─────────────────────┴─────────────────────┘             │
│                                  ↓                                   │
│  Compliance layer                                                    │
│  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐    │
│  │ C2PA provenance  │  │ NSFW filtering   │  │ Copyright check  │    │
│  │ credentials      │  │ safety review    │  │ similarity       │    │
│  └──────────────────┘  └──────────────────┘  └──────────────────┘    │
│                                                                      │
└──────────────────────────────────────────────────────────────────────┘
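
The layered diagram above can be read as a chain of stages that each take a job and hand an enriched job to the next layer. The following is a minimal sketch of that idea, not the article's implementation: the `Pipeline` class, the four stage functions, and the job dictionary keys are all hypothetical names chosen for illustration, and the generation stage is a placeholder that does not call any model.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# A job is a plain dict that each layer reads from and adds keys to.
Job = Dict[str, Any]
Stage = Callable[[Job], Job]


@dataclass
class Pipeline:
    """Runs stages in order and records which ones handled the job."""
    stages: List[Stage] = field(default_factory=list)

    def run(self, job: Job) -> Job:
        for stage in self.stages:
            job = stage(dict(job))  # shallow copy so the caller's dict is untouched
            job["trace"] = job.get("trace", []) + [stage.__name__]
        return job


def prompt_layer(job: Job) -> Job:
    job["prompt"] = f"{job['product']}, product photography, studio lighting"
    return job


def generation_layer(job: Job) -> Job:
    # Placeholder: a real stage would call an image model here.
    job["image"] = f"<image for: {job['prompt']}>"
    return job


def postprocess_layer(job: Job) -> Job:
    # Placeholder for upscaling, batch conversion, and tagging.
    job["upscaled"] = True
    return job


def compliance_layer(job: Job) -> Job:
    # Placeholder check standing in for NSFW filtering and provenance tagging.
    job["approved"] = "watermark" not in job["prompt"]
    return job


pipeline = Pipeline([prompt_layer, generation_layer, postprocess_layer, compliance_layer])
result = pipeline.run({"product": "wireless earbuds"})
print(result["trace"])
```

Keeping every layer behind the same `Job -> Job` signature is one way to let a team swap a generation backend or add a compliance check without touching the other layers.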

2. Advanced prompt engineering

2.1 Structured prompt framework

python
from dataclasses import dataclass, field
from typing import List, Literal, Optional, Tuple

from pydantic import BaseModel, Field

# ===== Structured prompt template =====

class PromptContext(BaseModel):
    """Business context shared by every prompt."""
    brand: str = Field(description="Brand name")
    product: str = Field(description="Product name")
    target_audience: str = Field(description="Target audience")
    tone: str = Field(description="Tone of voice")
    language: str = Field(default="zh-CN", description="Language")

class ImageStyle(BaseModel):
    """Image style. The Chinese literals are data values and are kept as-is."""
    # realistic / illustration / 3D render / anime / pixel art
    type: Literal["写实", "插画", "3D渲染", "动漫", "像素"] = "写实"
    # energetic / calm / high-tech / natural / luxurious
    mood: Literal["活力", "安静", "科技感", "自然", "奢华"] = "科技感"
    # blue, white
    color_palette: List[str] = Field(default_factory=lambda: ["蓝色", "白色"])
    # natural light / soft light / high contrast / low-key
    # The default has to be one of the allowed lighting literals.
    lighting: Literal["自然光", "柔光", "强对比", "暗调"] = "自然光"

class ContentSpec(BaseModel):
    """Output specification."""
    format: Literal["1:1", "16:9", "9:16", "4:3", "3:2"] = "16:9"
    resolution: Tuple[int, int] = (1920, 1080)
    # Xiaohongshu / Douyin / WeChat / official website / e-commerce
    platform: Literal["小红书", "抖音", "微信", "官网", "电商"] = "官网"

@dataclass
class StructuredPrompt:
    """
    Structured prompt template.
    Builds an optimized prompt from configuration objects.
    """
    context: PromptContext
    style: ImageStyle
    spec: ContentSpec
    extra_requirements: List[str] = field(default_factory=list)

    def generate(self) -> str:
        """Build the full prompt."""
        parts = []

        # 1. Scene description
        parts.append(f"Brand: {self.context.brand}")
        parts.append(f"Product: {self.context.product}")
        parts.append(f"Target audience: {self.context.target_audience}")

        # 2. Style directives
        style_parts = [
            f"Style: {self.style.type}",
            f"Mood: {self.style.mood}",
            f"Palette: {', '.join(self.style.color_palette)}",
            f"Lighting: {self.style.lighting}",
        ]
        parts.append(" | ".join(style_parts))

        # 3. Technical specification
        spec_parts = [
            f"Aspect ratio: {self.spec.format}",
            f"Resolution: {self.spec.resolution[0]}x{self.spec.resolution[1]}",
            f"Platform: {self.spec.platform}",
        ]
        parts.append(" | ".join(spec_parts))

        # 4. Extra requirements
        if self.extra_requirements:
            parts.append("Extra requirements: " + "; ".join(self.extra_requirements))

        # 5. Quality directives
        quality = "4K, high detail, professional photography, no watermark"
        parts.append(f"Quality: {quality}")

        return "\n".join(parts)

    def to_image_prompt(self) -> str:
        """
        Build a comma-separated keyword prompt for image models.
        Field values are inserted verbatim, so pass English values
        when the target model expects an English prompt.
        """
        parts = [
            f"{self.context.product}",
            "product photography",
            f"{self.style.mood} mood",
            f"{self.style.type} style",
            f"{', '.join(self.style.color_palette)} color scheme",
            f"{self.style.lighting} lighting",
            "professional, 4K, high detail",
            f"{self.spec.format} aspect ratio",
        ]
        return ", ".join(parts)

    def to_negative_prompt(self) -> str:
        """Build the negative prompt for image generation."""
        return (
            "blurry, low quality, watermark, text, logo, "
            "distorted, deformed, ugly, bad anatomy, "
            "extra fingers, mutated hands, poorly drawn, "
            "noise, compression artifacts, jpeg artifacts, "
            "cropped, worst quality"
        )


# ===== Prompt RAG retrieval =====

class PromptRetriever:
    """
    RAG retrieval over prompt templates.
    Retrieves and reuses prompts that performed well in the past.

    vector_store is expected to expose add(text, metadata) and
    search(query, filter, top_k); each search hit is a dict with
    "text", "score", and "metadata" keys.
    """
    def __init__(self, vector_store):
        self.store = vector_store
        self.templates = {}

    def store_template(self, template_id: str,
                      template: str,
                      category: str,
                      success_metrics: dict):
        """Store a successful prompt template."""
        self.store.add(
            text=template,
            metadata={
                "id": template_id,
                "category": category,
                "success_rate": success_metrics.get("rate", 0),
                "usage_count": success_metrics.get("count", 0),
            }
        )
        self.templates[template_id] = {
            "template": template,
            "category": category,
            "metrics": success_metrics,
        }

    def retrieve(self, query: str, category: Optional[str] = None,
                top_k: int = 5) -> List[dict]:
        """Retrieve relevant templates."""
        results = self.store.search(
            query=query,
            filter={"category": category} if category else None,
            top_k=top_k
        )

        # Re-rank: equal weight on similarity and historical success rate
        scored = []
        for r in results:
            # The template id lives in the metadata written by store_template
            template = self.templates.get(r["metadata"]["id"], {})
            score = r["score"] * 0.5 + template.get("metrics", {}).get("rate", 0) * 0.5
            scored.append({
                "template": template.get("template", r["text"]),
                "category": r["metadata"]["category"],
                "score": score,
                "usage_count": template.get("metrics", {}).get("count", 0),
            })

        return sorted(scored, key=lambda x: x["score"], reverse=True)

    def generate_from_template(self, base_template: str,
                               context: PromptContext,
                               style: ImageStyle) -> str:
        """Fill a template with context values."""
        prompt = base_template

        # Plain placeholder substitution
        replacements = {
            "{brand}": context.brand,
            "{product}": context.product,
            "{audience}": context.target_audience,
            "{tone}": context.tone,
            "{style_type}": style.type,
            "{mood}": style.mood,
        }

        for key, value in replacements.items():
            prompt = prompt.replace(key, str(value))

        return prompt


# Usage example (the Chinese strings are sample input data)
context = PromptContext(
    brand="华为",
    product="MateBook X Pro",
    target_audience="25-35岁职场精英",
    tone="科技感、高端",
)

style = ImageStyle(
    type="3D渲染",
    mood="科技感",
    color_palette=["深空灰", "流光金"],
    lighting="暗调",
)

spec = ContentSpec(
    format="16:9",
    resolution=(2560, 1440),
    platform="官网",
)

prompt_obj = StructuredPrompt(context, style, spec)
print("Full prompt:")
print(prompt_obj.generate())
print("\nImage prompt:")
print(prompt_obj.to_image_prompt())
print("\nNegative prompt:")
print(prompt_obj.to_negative_prompt())
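
The re-ranking step in `PromptRetriever.retrieve` blends vector similarity with a template's historical success rate at equal weight. The sketch below isolates that arithmetic so the effect of the weighting is easy to see. It assumes both inputs are already normalized to the range 0 to 1, which a given vector store may not guarantee; `blend_score`, `rank`, the `alpha` parameter, and the sample candidates are hypothetical names and values introduced for illustration.

```python
from typing import Dict, List


def blend_score(similarity: float, success_rate: float, alpha: float = 0.5) -> float:
    """Weighted mix of vector similarity and historical success rate."""
    return alpha * similarity + (1 - alpha) * success_rate


def rank(candidates: List[Dict], alpha: float = 0.5) -> List[Dict]:
    """Attach a blended score to each candidate and sort best-first."""
    scored = [
        {**c, "score": blend_score(c["similarity"], c["success_rate"], alpha)}
        for c in candidates
    ]
    return sorted(scored, key=lambda c: c["score"], reverse=True)


# Made-up candidates: tpl_a matches the query better, tpl_b has the better track record.
candidates = [
    {"id": "tpl_a", "similarity": 0.9, "success_rate": 0.4},
    {"id": "tpl_b", "similarity": 0.7, "success_rate": 0.9},
]

for c in rank(candidates):
    print(c["id"], round(c["score"], 2))
```

With equal weights, `tpl_b` (0.35 + 0.45 = 0.80) outranks `tpl_a` (0.45 + 0.20 = 0.65) even though `tpl_a` is the closer semantic match. Setting `alpha=1.0` falls back to pure similarity ranking.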

2.2 Multimodal prompt templates

python
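from typing import Any, Dict, List

# ===== Prompts for multimodal content generation =====

class MultimodalPromptGenerator:
    """
    Prompt generator for multimodal content.
    Supports coordinated generation of text + image + video + audio.
    """

    # Built-in template library
    TEMPLATES = {
        "product_launch": {
            "description": "Product launch content",
            "text_prompt": """
You are a product marketing expert at {brand}. Write marketing content for the launch of {product}:

## Content requirements
- Audience: {target_audience}
- Tone: {tone}
- Length: {length} characters

## Output format
1. Headline (eye-catching, at most 20 characters)
2. Body copy (emotional resonance + product selling points)
3. Call to action (CTA that drives conversion)
4. Suggested hashtags (starting with #, 3-5 of them)

## Constraints
- No exaggerated claims
- No mention of competitors
- Comply with the Advertising Law
""",
            "image_prompt": """
Product: {product}
Brand: {brand}
Style: {style_type}, {mood} mood
Color palette: {color_palette}
Lighting: {lighting}
Platform: {platform}
Quality: 4K, professional, high detail
Aspect ratio: {aspect_ratio}
Avoid: text, watermark, logo
""",
            "video_script": """
[Opening, X s] {mood} style, show the exterior of {product}
[Key selling point, 3-5 s] highlight {highlight}
[Scenario demo, 5-8 s] demonstrate {use_case}
[Ending, 2 s] brand logo + CTA
Background music: {music_style}
Subtitles: white sans-serif font
"""
        },

        "social_media": {
            "description": "Social media content",
            "text_prompt": """
Create {content_type} content suited to the {platform} platform:

- Brand: {brand}
- Product: {product}
- Goal: {goal}
- Style: {style}

Requirements:
- {platform} platform characteristics: {platform_tip}
- Interactive element: {interaction_element}
- Posting time: {posting_time}

The output must include: headline / copy / hashtags / image description
""",
            "image_prompt": """
Social media image, {content_type} type
{product} as the central element
{brand} visual style: {visual_style}
Mood: {mood}
Color scheme: {color_palette}
Composition: {composition}
Text: leave room for a headline
Platform: {platform}
Avoid: {avoid_elements}
"""
        },

        "ecommerce": {
            "description": "E-commerce hero image / detail page",
            "text_prompt": """
Write sales copy for {product} on the e-commerce platform {platform}:

## Product information
- Brand: {brand}
- Selling points: {key_features}
- Price range: {price_range}
- Target customers: {target_audience}

## Platform rules
- {platform} title rules: {title_rule}
- Detail page guidelines: {spec_rule}

## Output
1. Main title (within 30 characters, includes the core selling point)
2. Subtitle (within 20 characters, reinforces the reason to buy)
3. Detail page selling points (3-5 items, each within 15 characters)
4. SKU description template
""",
            "image_prompt": """
E-commerce hero image design spec:
- Product: {product}, centered, {angle} angle
- Background: solid or gradient {background}
- Text: {title_text}, white/black, sans-serif
- Style: {style}, {mood}
- Aspect ratio: {aspect_ratio} (fits {platform})
- Quality: 4K, uncompressed, no watermark
- Differentiation: {differentiation}
"""
        }
    }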
    
    def __init__(self):
        self.current_template = None

    def select_template(self, content_type: str) -> "MultimodalPromptGenerator":
        """Select a content template; returns self so calls can be chained."""
        if content_type not in self.TEMPLATES:
            raise ValueError(f"Unknown template: {content_type}")
        self.current_template = content_type
        return self

    def fill(self, **kwargs) -> Dict[str, str]:
        """Fill the selected template. Placeholders without a value are left untouched."""
        if not self.current_template:
            raise ValueError("No template selected")

        template = self.TEMPLATES[self.current_template]
        result = {}

        for key, tmpl in template.items():
            if key == "description":
                continue

            # Plain placeholder substitution
            filled = tmpl
            for k, v in kwargs.items():
                placeholder = f"{{{k}}}"
                filled = filled.replace(placeholder, str(v))

            result[key] = filled.strip()

        return result

    def batch_generate(self, templates: List[str],
                     context: Dict) -> Dict[str, Dict[str, str]]:
        """Generate several content sets from one context."""
        results = {}
        for tpl in templates:
            # select_template validates the name before filling
            results[tpl] = self.select_template(tpl).fill(**context)
        return results


# ===== Prompt quality scoring =====

class PromptEvaluator:
    """
    Heuristic prompt quality evaluation.
    Scores a prompt along several keyword-based dimensions.
    """

    def __init__(self):
        self.weights = {
            "clarity": 0.25,       # clarity
            "specificity": 0.25,   # specificity
            "constraint": 0.20,    # explicit constraints
            "safety": 0.20,        # safety
            "completeness": 0.10,  # completeness
        }

    def evaluate(self, prompt: str, modality: str = "image") -> Dict[str, Any]:
        """Evaluate prompt quality; every score is in the range 0 to 1."""

        # 1. Clarity
        clarity_score = self._evaluate_clarity(prompt)

        # 2. Specificity (does it contain concrete descriptors?)
        specificity_score = self._evaluate_specificity(prompt, modality)

        # 3. Explicit constraints
        constraint_score = self._evaluate_constraints(prompt)

        # 4. Safety
        safety_score = self._evaluate_safety(prompt)

        # 5. Completeness
        completeness_score = self._evaluate_completeness(prompt, modality)

        # Weighted total
        total = sum([
            clarity_score * self.weights["clarity"],
            specificity_score * self.weights["specificity"],
            constraint_score * self.weights["constraint"],
            safety_score * self.weights["safety"],
            completeness_score * self.weights["completeness"],
        ])

        return {
            "total_score": round(total, 2),
            "clarity": round(clarity_score, 2),
            "specificity": round(specificity_score, 2),
            "constraint": round(constraint_score, 2),
            "safety": round(safety_score, 2),
            "completeness": round(completeness_score, 2),
            "suggestions": self._generate_suggestions(prompt, modality),
        }

    def _evaluate_clarity(self, prompt: str) -> float:
        """Score clarity."""
        if len(prompt) < 10:
            return 0.0
        if len(prompt) > 1000:
            return 0.7

        # Penalize vague words (English and Chinese), 0.1 per distinct word found
        vague_words = ["some", "thing", "stuff", "大概", "可能", "也许"]
        vague_count = sum(1 for w in vague_words if w in prompt.lower())

        score = 1.0 - (vague_count * 0.1)
        return max(0.0, min(1.0, score))

    def _evaluate_specificity(self, prompt: str, modality: str) -> float:
        """Score specificity."""
        if modality == "image":
            # Look for key technical parameters (English and Chinese keywords)
            detail_indicators = [
                "style", "lighting", "color", "resolution", "ratio",
                "风格", "光照", "色彩", "分辨率", "比例"
            ]
        else:
            detail_indicators = [
                "长度", "format", "structure", "platform", "audience"
            ]

        # Five matched indicators are enough for a full score
        matched = sum(1 for ind in detail_indicators if ind.lower() in prompt.lower())
        return min(1.0, matched / 5)

    def _evaluate_constraints(self, prompt: str) -> float:
        """Score how explicit the constraints are."""
        # Look for explicit constraint phrases
        constraint_patterns = [
            "avoid", "not include", "without", "禁止", "不要",
            "must include", "should be", "必须", "应该",
        ]

        matched = sum(1 for pat in constraint_patterns if pat in prompt.lower())
        return min(1.0, matched * 0.3)

    def _evaluate_safety(self, prompt: str) -> float:
        """Score safety: 0.0 if any unsafe keyword appears, otherwise 1.0."""
        # Keyword screen for unsafe content
        unsafe_patterns = [
            "weapon", "violence", "blood", "nude", "nsfw",
            "武器", "暴力", "血腥", "裸露",
        ]

        has_unsafe = any(pat in prompt.lower() for pat in unsafe_patterns)
        return 0.0 if has_unsafe else 1.0

    def _evaluate_completeness(self, prompt: str, modality: str) -> float:
        """Score completeness: 0.5 base plus 0.1 per required keyword found."""
        base_score = 0.5

        if modality == "image":
            required = ["subject", "style", "quality"]
        else:
            required = ["audience", "platform", "goal"]

        matched = sum(1 for r in required if r.lower() in prompt.lower())
        return min(1.0, base_score + matched * 0.1)

    def _generate_suggestions(self, prompt: str, modality: str) -> List[str]:
        """Generate improvement suggestions."""
        suggestions = []

        if len(prompt) < 50:
            suggestions.append("Prompt is too short; add more descriptive detail")

        if self._evaluate_specificity(prompt, modality) < 0.5:
            suggestions.append("Add concrete parameters such as style, color, and composition")

        if self._evaluate_constraints(prompt) < 0.3:
            suggestions.append("Add 'avoid' constraints that name the elements you do not want")

        if self._evaluate_safety(prompt) == 0:
            suggestions.append("Potentially unsafe content detected; please revise the prompt")

        return suggestions
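
Because `fill()` uses plain string replacement, any placeholder that receives no value stays in the output as literal `{name}` text and would be sent to the model unchanged. A small guard can catch this before a prompt leaves the pipeline. The following is a sketch under that assumption; `find_unfilled` and `fill_strict` are hypothetical helpers, not part of the class above, and the regular expression only recognizes simple identifier-style placeholders.

```python
import re
from typing import Dict, List

# Matches {identifier}-style placeholders such as {brand} or {style_type}.
PLACEHOLDER = re.compile(r"\{([a-zA-Z_][a-zA-Z0-9_]*)\}")


def find_unfilled(text: str) -> List[str]:
    """Return placeholder names still present in the text, in order, without duplicates."""
    seen: List[str] = []
    for name in PLACEHOLDER.findall(text):
        if name not in seen:
            seen.append(name)
    return seen


def fill_strict(template: str, values: Dict[str, object]) -> str:
    """Fill placeholders by plain replacement and fail loudly if any remain."""
    filled = template
    for key, value in values.items():
        filled = filled.replace("{" + key + "}", str(value))
    missing = find_unfilled(filled)
    if missing:
        raise KeyError(f"Unfilled placeholders: {missing}")
    return filled


template = "Product: {product}\nBrand: {brand}\nStyle: {style_type}, {mood} mood"
partially_filled = template.replace("{product}", "MateBook X Pro")
print(find_unfilled(partially_filled))
```

One caveat of this approach: a supplied value that itself contains text like `{x}` would be reported as unfilled, so values from untrusted sources may need escaping first.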

3. ComfyUI workflow orchestration

3.1 ComfyUI core concepts

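ComfyUI node taxonomy:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Input nodes:
├── LoadImage → load an image
├── LoadVideo → load a video (custom node, e.g. from VideoHelperSuite)
├── CLIPTextEncode → encode text into conditioning
└── CheckpointLoaderSimple → load a model checkpoint

Processing nodes:
├── KSampler → sampling
├── VAEDecode → decode a latent into an image
├── VAEEncode → encode an image into a latent
├── ImageScale → resize an image
├── ImagePadForOutpaint → pad an image
└── ControlNetApply → apply a ControlNet

Output nodes:
├── SaveImage → save an image
├── PreviewImage → preview
└── VideoCombine → assemble a video (custom node, e.g. from VideoHelperSuite)

Wiring rules:
  CLIPTextEncode → KSampler (positive/negative conditioning)
  CheckpointLoaderSimple → KSampler (model)
  EmptyLatentImage or VAEEncode → KSampler (latent_image)
  KSampler → VAEDecode (latent)
  VAEDecode → SaveImage (image)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━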
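
In ComfyUI's API format, these wiring rules are expressed as a flat dictionary keyed by node id, where an input that comes from another node is written as `[source_node_id, output_index]`. The sketch below builds a minimal text-to-image graph in that shape and checks that every link points at a node that exists. The checkpoint file name `model.safetensors` and the helper `dangling_links` are assumptions for illustration; the check runs locally and does not contact a ComfyUI server.

```python
from typing import Dict, List

# CheckpointLoaderSimple exposes three outputs: MODEL (0), CLIP (1), VAE (2).
graph: Dict[str, dict] = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "model.safetensors"}},  # hypothetical file name
    "2": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "a red chair, studio lighting", "clip": ["1", 1]}},
    "3": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "blurry, watermark", "clip": ["1", 1]}},
    "4": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    "5": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "positive": ["2", 0], "negative": ["3", 0],
                     "latent_image": ["4", 0], "seed": 42, "steps": 30, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal", "denoise": 1.0}},
    "6": {"class_type": "VAEDecode",
          "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "SaveImage",
          "inputs": {"images": ["6", 0], "filename_prefix": "demo"}},
}


def dangling_links(graph: Dict[str, dict]) -> List[str]:
    """List inputs whose source node id is missing from the graph."""
    problems = []
    for node_id, node in graph.items():
        for name, value in node["inputs"].items():
            is_link = (isinstance(value, list) and len(value) == 2
                       and isinstance(value[0], str))
            if is_link and value[0] not in graph:
                problems.append(f"{node_id}.{name} -> {value[0]}")
    return problems


print(dangling_links(graph))
```

A check like this is cheap to run before queueing, and it catches renumbering mistakes that would otherwise only surface as a server-side validation error.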

3.2 ComfyUI API + Python SDK

python
import json
import os
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

import requests

@dataclass
class ComfyUIWorkflow:
    """
    Container for a node/edge workflow description.
    Note: the /prompt endpoint used below takes a flat dict keyed by
    node id instead, as built by the workflow templates that follow.
    """
    nodes: Dict[str, Dict]
    edges: List[Dict]

    @classmethod
    def from_json(cls, json_str: str) -> "ComfyUIWorkflow":
        data = json.loads(json_str)
        return cls(
            nodes=data.get("nodes", {}),
            edges=data.get("edges", [])
        )

class ComfyUIClient:
    """
    Python client for the ComfyUI HTTP API.
    Supports workflow execution, batch processing, and server inspection.
    """

    def __init__(self, host: str = "http://localhost:8188"):
        self.host = host.rstrip("/")
        self.api_url = f"{self.host}/api"
        self.client_id = f"python_client_{int(time.time())}"

    # ===== System endpoints =====

    def get_system_stats(self) -> Dict:
        """Get system status (devices, memory)."""
        resp = requests.get(f"{self.api_url}/system_stats")
        resp.raise_for_status()
        return resp.json()

    def get_models(self) -> Dict:
        """
        Get node definitions from /object_info.
        Available model files appear as the allowed values of loader
        node inputs (for example ckpt_name on CheckpointLoaderSimple).
        """
        resp = requests.get(f"{self.api_url}/object_info")
        resp.raise_for_status()
        return resp.json()

    def get_queue(self) -> Dict:
        """Get the current queue."""
        resp = requests.get(f"{self.api_url}/queue")
        resp.raise_for_status()
        return resp.json()

    # ===== Workflow execution =====

    def queue_prompt(self, workflow: Dict) -> Dict:
        """Queue a workflow (API-format dict keyed by node id)."""
        payload = {
            "prompt": workflow,
            "client_id": self.client_id,
        }

        resp = requests.post(f"{self.api_url}/prompt", json=payload)
        resp.raise_for_status()
        return resp.json()

    def get_history(self, prompt_id: str) -> Dict:
        """Get the execution history for one prompt."""
        resp = requests.get(f"{self.api_url}/history/{prompt_id}")
        resp.raise_for_status()
        return resp.json()

    def wait_for_completion(self, prompt_id: str,
                          timeout: int = 300,
                          poll_interval: float = 1.0) -> Dict:
        """Poll the history endpoint until the prompt finishes or times out."""
        start = time.time()

        while time.time() - start < timeout:
            history = self.get_history(prompt_id)

            # The history entry only appears once the prompt has finished.
            if prompt_id in history:
                entry = history[prompt_id]
                status = entry.get("status", {})

                if status.get("status_str") == "error":
                    return {
                        "status": "failed",
                        "error": status.get("messages", [])
                    }

                # Older servers may omit the status block; treat the
                # presence of a history entry as completion in that case.
                if status.get("completed", True):
                    return {"status": "completed", "result": entry}

            time.sleep(poll_interval)

        return {"status": "timeout", "prompt_id": prompt_id}

    def execute_workflow(self, workflow: Dict,
                        timeout: int = 300) -> Dict:
        """Queue a workflow and wait for it to finish."""
        # Enqueue
        queue_resp = self.queue_prompt(workflow)
        prompt_id = queue_resp["prompt_id"]

        # Wait
        result = self.wait_for_completion(prompt_id, timeout)
        result["prompt_id"] = prompt_id

        return result

    # ===== Image handling =====

    def upload_image(self, image_path: str, name: Optional[str] = None) -> Dict:
        """Upload an image to ComfyUI's input folder (multipart form upload)."""
        name = name or os.path.basename(image_path)

        with open(image_path, "rb") as f:
            resp = requests.post(
                f"{self.api_url}/upload/image",
                files={"image": (name, f)},
                data={"type": "input"},
            )
        resp.raise_for_status()
        return resp.json()

    def download_output(self, filename: str, output_dir: str = "outputs",
                        subfolder: str = "", folder_type: str = "output") -> str:
        """Download an output image via the /view endpoint."""
        resp = requests.get(
            f"{self.host}/view",
            params={"filename": filename, "subfolder": subfolder, "type": folder_type},
        )
        resp.raise_for_status()

        # Save locally, creating the target folder if needed
        os.makedirs(output_dir, exist_ok=True)
        save_path = os.path.join(output_dir, filename)
        with open(save_path, "wb") as f:
            f.write(resp.content)

        return save_path

# ===== Predefined workflow templates =====

class ImageGenerationWorkflow:
    """
    Standard image generation workflows in ComfyUI API format.
    CheckpointLoaderSimple outputs: MODEL (0), CLIP (1), VAE (2).
    """

    @staticmethod
    def basic_text_to_image(
        model: str,
        prompt: str,
        negative_prompt: str = "",
        steps: int = 30,
        cfg: float = 7.0,
        seed: int = -1,
        width: int = 1024,
        height: int = 1024,
    ) -> Dict:
        """Basic text-to-image workflow."""

        workflow = {
            "1": {  # Load checkpoint
                "class_type": "CheckpointLoaderSimple",
                "inputs": {"ckpt_name": model}
            },
            "2": {  # CLIP text encode (positive)
                "class_type": "CLIPTextEncode",
                "inputs": {
                    "text": prompt,
                    "clip": ["1", 1]  # CLIP output of the checkpoint
                }
            },
            "3": {  # CLIP text encode (negative)
                "class_type": "CLIPTextEncode",
                "inputs": {
                    "text": negative_prompt,
                    "clip": ["1", 1]
                }
            },
            "4": {  # Empty latent that sets the output size
                "class_type": "EmptyLatentImage",
                "inputs": {"width": width, "height": height, "batch_size": 1}
            },
            "5": {  # KSampler
                "class_type": "KSampler",
                "inputs": {
                    "model": ["1", 0],
                    "positive": ["2", 0],
                    "negative": ["3", 0],
                    "latent_image": ["4", 0],
                    # A negative seed means "pick one from the clock"
                    "seed": seed if seed >= 0 else int(time.time()),
                    "steps": steps,
                    "cfg": cfg,
                    "sampler_name": "euler",
                    "scheduler": "normal",
                    "denoise": 1.0,
                }
            },
            "6": {  # VAE decode
                "class_type": "VAEDecode",
                "inputs": {
                    "samples": ["5", 0],
                    "vae": ["1", 2]
                }
            },
            "7": {  # Save image
                "class_type": "SaveImage",
                "inputs": {
                    "images": ["6", 0],
                    "filename_prefix": "ComfyUI_Output"
                }
            }
        }

        return workflow

    @staticmethod
    def image_to_image(
        model: str,
        input_image: str,
        prompt: str,
        strength: float = 0.7,
        **kwargs
    ) -> Dict:
        """
        Image-to-image workflow.
        input_image is a file name in ComfyUI's input folder
        (upload it first); strength maps to the KSampler denoise value.
        """

        workflow = {
            "1": {  # Load checkpoint
                "class_type": "CheckpointLoaderSimple",
                "inputs": {"ckpt_name": model}
            },
            "2": {  # Load image
                "class_type": "LoadImage",
                "inputs": {"image": input_image}
            },
            "3": {  # VAE encode the source image into a latent
                "class_type": "VAEEncode",
                "inputs": {
                    "pixels": ["2", 0],
                    "vae": ["1", 2]
                }
            },
            "4": {  # CLIP text encode (positive)
                "class_type": "CLIPTextEncode",
                "inputs": {
                    "text": prompt,
                    "clip": ["1", 1]
                }
            },
            "5": {  # CLIP text encode (negative); empty unless provided
                "class_type": "CLIPTextEncode",
                "inputs": {
                    "text": kwargs.get("negative_prompt", ""),
                    "clip": ["1", 1]
                }
            },
            "6": {  # KSampler with denoise=strength
                "class_type": "KSampler",
                "inputs": {
                    "model": ["1", 0],
                    "positive": ["4", 0],
                    "negative": ["5", 0],
                    "latent_image": ["3", 0],
                    "seed": int(time.time()),
                    "steps": kwargs.get("steps", 30),
                    "cfg": kwargs.get("cfg", 7.0),
                    "sampler_name": "euler",
                    "scheduler": "normal",
                    "denoise": strength
                }
            },
            "7": {  # VAE decode
                "class_type": "VAEDecode",
                "inputs": {
                    "samples": ["6", 0],
                    "vae": ["1", 2]
                }
            },
            "8": {  # Save image
                "class_type": "SaveImage",
                "inputs": {
                    "images": ["7", 0],
                    "filename_prefix": "ComfyUI_img2img"
                }
            }
        }

        return workflow

    @staticmethod
    def inpainting(
        model: str,
        base_image: str,
        mask_image: str,
        prompt: str,
        **kwargs
    ) -> Dict:
        """
        Inpainting workflow.
        mask_image is assumed to be a black/white image whose white
        area marks the region to repaint; its red channel is used as the mask.
        """

        workflow = {
            "1": {"class_type": "CheckpointLoaderSimple", "inputs": {"ckpt_name": model}},
            "2": {"class_type": "LoadImage", "inputs": {"image": base_image}},
            "3": {"class_type": "LoadImage", "inputs": {"image": mask_image}},
            # Convert the mask picture (IMAGE) into a MASK
            "4": {"class_type": "ImageToMask", "inputs": {"image": ["3", 0], "channel": "red"}},
            "5": {"class_type": "CLIPTextEncode", "inputs": {"text": prompt, "clip": ["1", 1]}},
            "6": {"class_type": "CLIPTextEncode", "inputs": {
                "text": kwargs.get("negative_prompt", ""),
                "clip": ["1", 1]
            }},
            "7": {"class_type": "VAEEncodeForInpaint", "inputs": {
                "pixels": ["2", 0],
                "mask": ["4", 0],
                "vae": ["1", 2],
                "grow_mask_by": 6
            }},
            "8": {"class_type": "KSampler", "inputs": {
                "model": ["1", 0],
                "positive": ["5", 0],
                "negative": ["6", 0],
                "latent_image": ["7", 0],
                "seed": int(time.time()),
                "steps": kwargs.get("steps", 30),
                "cfg": kwargs.get("cfg", 7.0),
                "sampler_name": "euler",
                "scheduler": "normal",
                "denoise": kwargs.get("strength", 1.0)
            }},
            "9": {"class_type": "VAEDecode", "inputs": {
                "samples": ["8", 0],
                "vae": ["1", 2]
            }},
            "10": {"class_type": "SaveImage", "inputs": {
                "images": ["9", 0],
                "filename_prefix": "ComfyUI_inpaint"
            }}
        }

        return workflow


# ===== Batch processing =====

class BatchProcessor:
    """Batch image generation processor."""

    def __init__(self, client: ComfyUIClient, max_workers: int = 2):
        self.client = client
        self.max_workers = max_workers

    def process_batch(
        self,
        items: List[Dict],
        workflow_template: callable,
        progress_callback=None
    ) -> List[Dict]:
        """Run one workflow per item; results arrive in completion order."""
        results = []

        with ThreadPoolExecutor(max_workers=self.max_workers) as executor:
            futures = {}

            for i, item in enumerate(items):
                # Build the workflow from the item's keyword arguments
                workflow = workflow_template(**item)

                # Submit the job
                future = executor.submit(self.client.execute_workflow, workflow)
                futures[future] = (i, item)

            # Collect results as they finish
            for future in as_completed(futures):
                idx, item = futures[future]

                try:
                    result = future.result()
                    results.append({
                        "index": idx,
                        "item": item,
                        "status": result.get("status"),
                        "prompt_id": result.get("prompt_id"),
                        "outputs": self._extract_outputs(result),
                    })
                except Exception as e:
                    results.append({
                        "index": idx,
                        "item": item,
                        "status": "error",
                        "error": str(e),
                    })

                if progress_callback:
                    progress_callback(len(results), len(items))

        return results

    def _extract_outputs(self, result: Dict) -> List[Dict]:
        """Extract image descriptors (filename, subfolder, type) from a result."""
        outputs = []

        # The history entry keeps per-node results under "outputs"
        node_outputs = result.get("result", {}).get("outputs", {})
        for node_id, node_data in node_outputs.items():
            if "images" in node_data:
                outputs.extend(node_data["images"])

        return outputs


# ===== Usage example =====

def main():
    client = ComfyUIClient("http://localhost:8188")

    # Check server status
    stats = client.get_system_stats()
    devices = stats.get("devices") or [{}]
    print(f"GPU: {devices[0].get('name', 'N/A')}")
    print(f"VRAM: {devices[0].get('vram_total', 0) / 1024**3:.1f} GB")

    # Generate one image. The model name must match a file in ComfyUI's
    # checkpoints folder, and CheckpointLoaderSimple needs an all-in-one
    # checkpoint (model + CLIP + VAE). Flux checkpoints are typically
    # sampled with cfg close to 1.0, where the negative prompt has little
    # effect, so tune cfg for the model family in use.
    workflow = ImageGenerationWorkflow.basic_text_to_image(
        model="flux1-dev.safetensors",
        prompt="a sleek smartphone on a marble surface, professional product photography, soft studio lighting, 4K",
        negative_prompt="blurry, low quality, watermark",
        steps=30,
        cfg=7.0,
        width=1024,
        height=1024,
    )

    result = client.execute_workflow(workflow)
    print(f"Status: {result['status']}")

    if result["status"] == "completed":
        # Download every image listed in the history entry's outputs
        node_outputs = result.get("result", {}).get("outputs", {})
        for node_data in node_outputs.values():
            for image in node_data.get("images", []):
                save_path = client.download_output(image["filename"])
                print(f"Saved: {save_path}")

    # Batch processing
    batch_items = [
        {"prompt": f"product photography for item {i}", "model": "flux1-dev.safetensors"}
        for i in range(10)
    ]

    processor = BatchProcessor(client, max_workers=2)
    batch_results = processor.process_batch(
        items=batch_items,
        workflow_template=ImageGenerationWorkflow.basic_text_to_image,
        progress_callback=lambda done, total: print(f"Progress: {done}/{total}")
    )

    success_count = sum(1 for r in batch_results if r["status"] == "completed")
    print(f"Batch completed: {success_count}/{len(batch_items)}")


if __name__ == "__main__":
    main()
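
When a prompt finishes, its history entry lists the produced files per node under an `outputs` key, and each image is described by a file name, a subfolder, and a folder type. The helper below sketches how such an entry can be flattened into a download list. It runs on a hand-written sample entry rather than a live server response, and `collect_images` is a hypothetical name introduced here; field names beyond `filename`, `subfolder`, and `type` are not assumed.

```python
from typing import Dict, List


def collect_images(history_entry: Dict) -> List[Dict[str, str]]:
    """Flatten the per-node image lists of one history entry."""
    images = []
    for node_id, node_output in history_entry.get("outputs", {}).items():
        for image in node_output.get("images", []):
            images.append({
                "node_id": node_id,
                "filename": image["filename"],
                "subfolder": image.get("subfolder", ""),
                "type": image.get("type", "output"),
            })
    return images


# Hand-written sample shaped like a finished history entry (illustrative values).
sample_entry = {
    "outputs": {
        "7": {"images": [
            {"filename": "ComfyUI_Output_00001_.png", "subfolder": "", "type": "output"},
        ]},
        "9": {"text": ["a node without images"]},
    },
    "status": {"status_str": "success", "completed": True},
}

for item in collect_images(sample_entry):
    print(item["node_id"], item["filename"])
```

Carrying `subfolder` and `type` along with the file name matters once a workflow saves into nested folders or emits preview images, because the file name alone no longer identifies the file.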

4. AI content compliance and provenance

4.1 The C2PA content provenance standard

python
import hashlib
import json
import base64
from datetime import datetime, timezone
from typing import Dict, Any, Optional
from dataclasses import dataclass, asdict

# ===== C2PA内容凭证 =====

@dataclass
class ContentCredential:
    """
    C2PA内容凭证
    记录内容的创作来源、修改历史和AI参与程度
    """
    content_id: str
    created_at: str
    creator: str
    
    # AI generation info
    ai_generated: bool = False
    ai_model: Optional[str] = None
    ai_prompt: Optional[str] = None
    ai_confidence: Optional[float] = None
    
    # Source materials
    source_materials: Optional[list] = None
    
    # Modification history
    modifications: Optional[list] = None
    
    # Technical metadata
    technical: Optional[dict] = None
    
    def to_c2pa_manifest(self) -> Dict:
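        """Build a simplified, C2PA-style manifest.

        Illustrative only: the dict is neither signed nor validated against
        the C2PA specification, and some labels (for example
        "c2pa.hashed.uri") may not match spec-defined names.
        """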
        return {
            "claim": {
                "dc:title": f"Content_{self.content_id}",
                "dc:creator": [self.creator],
                "dc:created": self.created_at,
                "claim_generator": "AIGC-Platform/1.0",
            },
            "assertions": {
                "c2pa.actions": self._build_actions(),
                "c2pa.hashed.uri": self._build_hashed_uri(),
                "stds.schema-org.CreativeWork": self._build_creative_work(),
            },
            "signature_info": {
                "time": self.created_at,
                "issuer": "AIGC-Platform-CA",
            }
        }
    
    def _build_actions(self) -> List[Dict]:
        """Build the action history."""
        actions = []
        
        # Creation action
        actions.append({
            "action": "c2pa.created",
            "when": self.created_at,
            "software_agent": self.ai_model or "human",
        })
        
        # AI-generation marker. "c2pa.ai_generated" is a custom label used here
        # for illustration; the C2PA spec instead marks AI output through the
        # digitalSourceType field of a c2pa.created action.
        if self.ai_generated:
            actions.append({
                "action": "c2pa.ai_generated",
                "when": self.created_at,
                "parameters": {
                    "model": self.ai_model,
                    "prompt": self.ai_prompt,
                }
            })
        
        # Modification history
        for mod in (self.modifications or []):
            actions.append({
                "action": "c2pa.edited",
                "when": mod["timestamp"],
                "software_agent": mod["tool"],
                "parameters": {"description": mod.get("description", "")},
            })
        
        return actions
    
    def _build_hashed_uri(self) -> Dict:
        """Build the hashed-URI assertion."""
        return {
            "content": {
                "hash": self._compute_hash(),
                "alg": "sha256",
            }
        }
    
    def _build_creative_work(self) -> Dict:
        """Build the CreativeWork metadata."""
        return {
            "@type": "CreativeWork",
            "author": self.creator,
            "dateCreated": self.created_at,
        }
    
    def _compute_hash(self) -> str:
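        """Compute a fingerprint of the credential metadata.

        Note: this hashes identifying metadata, not the asset bytes, so it
        cannot detect changes to the content itself.
        """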
        data = f"{self.content_id}{self.created_at}{self.ai_prompt or ''}"
        return hashlib.sha256(data.encode()).hexdigest()


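# ===== NSFW content detection =====

class ContentSafetyChecker:
    """
    Content safety review.
    Screens for NSFW and otherwise inappropriate content along several dimensions.
    """
    
    def __init__(self, api_endpoint: Optional[str] = None):
        self.endpoint = api_endpoint
    
    async def check_image(self, image_data: bytes) -> Dict[str, Any]:
        """Image safety check."""
        # A real implementation would call an image moderation API (e.g. Alibaba Cloud or Tencent Cloud).
        # Only the result structure is shown here; the scores below are placeholders.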
        
        results = {
            "nsfw_score": 0.0,
            "violence_score": 0.0,
            "hate_score": 0.0,
            "adult_score": 0.0,
            "sensitive_objects": [],
            "pass": True,
            "reasons": [],
        }
        
        # Rejection threshold
        THRESHOLD = 0.7
        
        if results["nsfw_score"] > THRESHOLD:
            results["pass"] = False
            results["reasons"].append("Adult content detected")
        
        if results["violence_score"] > THRESHOLD:
            results["pass"] = False
            results["reasons"].append("Violent content detected")
        
        if results["sensitive_objects"]:
            results["pass"] = False
            results["reasons"].append(f"Sensitive objects detected: {results['sensitive_objects']}")
        
        return results
    
    async def check_text(self, text: str) -> Dict[str, Any]:
        """Text safety check."""
        results = {
            "spam_score": 0.0,
            "politics_score": 0.0,
            "illegal_score": 0.0,
            "pass": True,
            "reasons": [],
        }
        
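        # Sensitive-word check (placeholder terms; load a real list in production)
        sensitive_words = ["politically_sensitive_term_1", "politically_sensitive_term_2"]
        found = [w for w in sensitive_words if w in text]
        
        if found:
            results["pass"] = False
            results["reasons"].append(f"Sensitive words detected: {found}")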
        
        return results
    
    async def check_batch(self, items: List[Dict]) -> List[Dict]:
        """Batch check."""
        results = []
        for item in items:
            if item["type"] == "image":
                r = await self.check_image(item["data"])
            else:
                r = await self.check_text(item["data"])
            
            results.append({
                "item_id": item["id"],
                "result": r,
                "action": "pass" if r["pass"] else "reject"
            })
        
        return results


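# ===== Copyright detection =====

class CopyrightChecker:
    """
    Copyright checks.
    Detects whether content may raise copyright issues.
    """
    
    def __init__(self):
        self.known_works = {}  # Pre-loaded library of copyrighted works
    
    def check_similarity(self, content_hash: str) -> Dict:
        """Similarity check."""
        # A real implementation would call an image fingerprinting service (e.g. Google Vision, Adobe Content Authenticity)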
        return {
            "similar_works": [],
            "copyright_risk": "low",
            "requires_attribution": False,
        }
    
    def check_brand(self, image_data: bytes) -> Dict:
        """Brand / trademark detection."""
        # Detect brand logos in the image (placeholder result)
        return {
            "detected_brands": [],
            "trademark_risk": "low",
            "requires_permission": [],
        }
    
    def generate_report(self, checks: Dict) -> Dict:
        """Generate a compliance report."""
        return {
            "report_id": f"RPT_{datetime.now(timezone.utc).strftime('%Y%m%d%H%M%S')}",
            "generated_at": datetime.now(timezone.utc).isoformat(),
            "checks": checks,
            "overall_risk": self._calculate_risk(checks),
            "recommendations": self._generate_recommendations(checks),
        }
    
    def _calculate_risk(self, checks: Dict) -> str:
        risks = [
            checks.get("similarity", {}).get("copyright_risk"),
            checks.get("brand", {}).get("trademark_risk"),
        ]
        
        if "high" in risks:
            return "high"
        if "medium" in risks:
            return "medium"
        return "low"
    
    def _generate_recommendations(self, checks: Dict) -> List[str]:
        recs = []
        
        if checks.get("similarity", {}).get("requires_attribution"):
            recs.append("Source attribution required")
        
        if checks.get("brand", {}).get("requires_permission"):
            recs.append("Brand owner authorization required")
        
        return recs
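
A provenance record only protects an asset if its hash is computed over the asset bytes themselves. The sketch below illustrates that binding under stated assumptions: `build_provenance_record` and `verify_provenance_record` are hypothetical helper names, the record is a plain unsigned dict rather than a real C2PA manifest, and the sample bytes are made up. A production pipeline would sign the manifest with dedicated C2PA tooling.

```python
import hashlib
import json
from datetime import datetime, timezone

# IPTC digital source type used to mark media produced by a trained model.
TRAINED_ALGORITHMIC_MEDIA = (
    "http://cv.iptc.org/newscodes/digitalsourcetype/trainedAlgorithmicMedia"
)


def build_provenance_record(asset_bytes: bytes, creator: str, model: str) -> dict:
    """Build an unsigned provenance record bound to the asset bytes (hypothetical helper)."""
    return {
        "creator": creator,
        "created_at": datetime.now(timezone.utc).isoformat(),
        "action": {
            "action": "c2pa.created",
            "softwareAgent": model,
            "digitalSourceType": TRAINED_ALGORITHMIC_MEDIA,
        },
        # Hash of the actual bytes, so any edit to the asset invalidates the record.
        "content_hash": {
            "alg": "sha256",
            "hash": hashlib.sha256(asset_bytes).hexdigest(),
        },
    }


def verify_provenance_record(asset_bytes: bytes, record: dict) -> bool:
    """Return True only if the asset bytes still match the recorded hash."""
    expected = record["content_hash"]["hash"]
    return hashlib.sha256(asset_bytes).hexdigest() == expected


# Made-up bytes standing in for a generated image.
image = b"\x89PNG fake image bytes for illustration"
record = build_provenance_record(image, creator="design-team", model="flux1-dev")

print(json.dumps(record["content_hash"], indent=2))
print(verify_provenance_record(image, record))            # unmodified asset
print(verify_provenance_record(image + b"edit", record))  # tampered asset
```

The first check succeeds and the second fails, because appending even a few bytes changes the SHA-256 digest. Hashing only metadata such as an ID or prompt would not catch that kind of edit.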

5. Industrialized Production Management System

5.1 Content Factory Architecture

python
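import asyncio
import hashlib  # used by ContentAssetManager.store
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from datetime import datetime, timezone  # used by ContentAssetManager.store
from typing import List, Dict, Any, Optional
from enum import Enum
import uuid
import json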

class TaskStatus(Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"

@dataclass
class ContentTask:
    """A content generation task."""
    task_id: str
    task_type: str  # "text" | "image" | "video" | "multi"
    spec: Dict[str, Any]
    status: TaskStatus = TaskStatus.PENDING
    result: Optional[Dict] = None
    error: Optional[str] = None
    created_at: str = ""
    completed_at: Optional[str] = None

class ContentPipeline(ABC):
    """Base class for content generation pipelines."""
    
    @abstractmethod
    async def process(self, task: ContentTask) -> ContentTask:
        pass
    
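    async def run(self, tasks: List[ContentTask]) -> List[ContentTask]:
        """Run tasks concurrently, with at most 5 in flight at once."""
        semaphore = asyncio.Semaphore(5)
        
        async def _limited(task: ContentTask) -> ContentTask:
            # Each task must acquire the semaphore itself; a single
            # `async with` around gather() would not limit concurrency.
            async with semaphore:
                return await self.process(task)
        
        results = await asyncio.gather(
            *[_limited(task) for task in tasks],
            return_exceptions=True
        )
        
        processed = []
        for task, result in zip(tasks, results):
            if isinstance(result, Exception):
                task.status = TaskStatus.FAILED
                task.error = str(result)
            else:
                task = result
            processed.append(task)
        
        return processed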


class EcommerceContentPipeline(ContentPipeline):
    """
    E-commerce content factory.
    Automatically generates: main image + detail page + short-video script.
    """
    
    def __init__(self, llm_client, image_client, safety_checker):
        self.llm = llm_client
        self.image = image_client
        self.safety = safety_checker
    
    async def process(self, task: ContentTask) -> ContentTask:
        """Generate e-commerce content for one task."""
        spec = task.spec
        results = {"items": []}
        
        # 1. Generate the main-image copy (product title)
        title_result = await self.llm.generate(
            prompt=self._build_title_prompt(spec),
            max_tokens=200
        )
        
        # 2. Generate the detail page
        detail_result = await self.llm.generate(
            prompt=self._build_detail_prompt(spec),
            max_tokens=1000
        )
        
        # 3. Generate the main image
        image_prompt = self._build_image_prompt(spec)
        image_result = await self.image.generate(
            prompt=image_prompt,
            style="product_photography",
            size=(1024, 1024)
        )
        
        # 4. Safety check
        safety_result = await self.safety.check_image(image_result["data"])
        if not safety_result["pass"]:
            task.status = TaskStatus.FAILED
            task.error = f"Safety check failed: {safety_result['reasons']}"
            return task
        
        # 5. Assemble the results
        results["items"] = [
            {
                "type": "main_title",
                "content": title_result["text"],
                "platform": spec.get("platform"),
            },
            {
                "type": "detail_page",
                "content": detail_result["text"],
                "platform": spec.get("platform"),
            },
            {
                "type": "main_image",
                "url": image_result["url"],
                "hash": image_result["hash"],
            }
        ]
        
        task.status = TaskStatus.COMPLETED
        task.result = results
        
        return task
    
    def _build_title_prompt(self, spec: Dict) -> str:
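        return f"""
Generate a product title for an e-commerce platform:
Product: {spec['product_name']}
Key selling points: {spec['key_features']}
Target audience: {spec['target_audience']}
Platform: {spec['platform']}

Requirements:
- At most 30 characters
- Include the core selling points
- Follow the title guidelines of {spec['platform']}
- SEO-friendly
"""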
    
    def _build_detail_prompt(self, spec: Dict) -> str:
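        return f"""
Generate e-commerce detail page content:
Product: {spec['product_name']}
Key selling points: {spec['key_features']}
Brand: {spec['brand']}

Requirements:
- 3 to 5 core selling points
- At most 20 characters per selling point
- Combine emotional appeal with functional description
"""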
    
    def _build_image_prompt(self, spec: Dict) -> str:
        return f"""
Product: {spec['product_name']}
Style: {spec.get('style', 'product photography')}
Mood: {spec.get('mood', 'professional')}
Color: {spec.get('color_palette', ['white', 'gray'])}
Platform: {spec['platform']}
"""


class SocialMediaPipeline(ContentPipeline):
    """
    Social media content factory.
    Batch-generates: posts + accompanying images + hashtags.
    """
    
    def __init__(self, llm_client, image_client, prompt_library):
        self.llm = llm_client
        self.image = image_client
        self.prompts = prompt_library
    
    async def process(self, task: ContentTask) -> ContentTask:
        """Generate social media content for one task."""
        spec = task.spec
        
        # Generate several posts in one batch
        posts = []
        for i in range(spec.get("count", 3)):
            post = await self._generate_single_post(spec, i)
            posts.append(post)
        
        task.status = TaskStatus.COMPLETED
        task.result = {"posts": posts}
        return task
    
    async def _generate_single_post(self, spec: Dict, index: int) -> Dict:
        """Generate a single post."""
        # Retrieve the most similar historically successful template
        template = self.prompts.retrieve(
            query=spec["topic"],
            category="social_media",
            top_k=1
        )[0]
        
        # Fill in the template
        prompt = self.prompts.generate_from_template(
            template["template"],
            context=spec.get("context"),
            style=spec.get("style")
        )
        
        # Generate the post copy
        text_result = await self.llm.generate(prompt=prompt)
        
        # Generate the accompanying image
        image_result = await self.image.generate(
            prompt=f"{spec['topic']}, {spec.get('style', 'vibrant colors')}, social media"
        )
        
        return {
            "index": index,
            "text": text_result["text"],
            "image_url": image_result["url"],
            "hashtags": text_result.get("hashtags", []),
        }


# ===== Content asset management =====

class ContentAssetManager:
    """
    Content asset management.
    Handles storage, versioning, and retrieval of generated content.
    """
    
    def __init__(self, storage_backend):
        self.storage = storage_backend
        self.metadata_db = {}
    
    async def store(self, content_type: str, data: bytes, 
                   metadata: Dict) -> str:
        """Store a piece of content and return its asset ID."""
        asset_id = str(uuid.uuid4())
        
        # Compute the content hash
        content_hash = hashlib.sha256(data).hexdigest()
        
        # Store the file
        path = f"{content_type}/{asset_id}"
        await self.storage.put(path, data)
        
        # Store the metadata
        full_metadata = {
            "asset_id": asset_id,
            "content_hash": content_hash,
            "content_type": content_type,
            "size_bytes": len(data),
            "created_at": datetime.now(timezone.utc).isoformat(),
            **metadata,
        }
        
        self.metadata_db[asset_id] = full_metadata
        
        return asset_id
    
    async def retrieve(self, asset_id: str) -> Optional[Dict]:
        """Look up an asset's metadata by ID."""
        return self.metadata_db.get(asset_id)
    
    async def search(self, query: str, filters: Optional[Dict] = None) -> List[Dict]:
        """Search assets by keyword, optionally filtered by exact metadata matches."""
        results = []
        
        for asset_id, meta in self.metadata_db.items():
            # Simple keyword matching; tags is a list, so flatten it to text
            # first (joining a nested list would raise TypeError)
            text_fields = [
                meta.get("description", ""),
                " ".join(meta.get("tags", [])),
                meta.get("content_type", ""),
            ]
            
            if query.lower() in " ".join(text_fields).lower():
                if filters:
                    if all(meta.get(k) == v for k, v in filters.items()):
                        results.append(meta)
                else:
                    results.append(meta)
        
        return results
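
A semaphore caps concurrency only when every task acquires it individually. The following is a minimal, standalone sketch of that pattern, not the pipeline's actual implementation: `run_bounded` and `fake_generation` are hypothetical names, and the 10 ms sleep stands in for an LLM or image call. It also tracks the peak number of tasks in flight so the limit can be checked.

```python
import asyncio


async def run_bounded(jobs, limit: int):
    """Run coroutine factories with at most `limit` in flight (hypothetical helper)."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def guarded(job):
        nonlocal in_flight, peak
        async with semaphore:  # acquired per job, not once around gather()
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await job()
            finally:
                in_flight -= 1

    # return_exceptions=True keeps one failure from cancelling the batch.
    results = await asyncio.gather(
        *(guarded(job) for job in jobs), return_exceptions=True
    )
    return results, peak


async def fake_generation(i: int) -> str:
    await asyncio.sleep(0.01)  # stands in for an LLM or image-generation call
    if i == 3:
        raise RuntimeError("simulated failure")
    return f"asset-{i}"


# Bind i at definition time so each factory generates its own item.
jobs = [lambda i=i: fake_generation(i) for i in range(10)]
results, peak = asyncio.run(run_bounded(jobs, limit=5))
failed = [r for r in results if isinstance(r, Exception)]
print(f"peak concurrency: {peak}, failures: {len(failed)}")
```

Results come back in submission order, with the simulated failure returned as an exception object in its slot, which is the same shape `ContentPipeline.run` relies on when it marks tasks as failed.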

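6. Summary

AIGC Industrialized Production Checklist

Content planning layer:
□ Define content goals and KPIs
□ Define target audiences and platforms
□ Build a prompt template library
□ Configure RAG-enhanced retrieval

Content generation layer:
□ Text-to-image / image-to-image / inpainting
□ ComfyUI workflow orchestration
□ Batch generation and automation
□ Model version management

Content post-processing:
□ Image quality enhancement (upscaling)
□ Batch format conversion
□ Automatic labeling and classification

Compliance and safety layer:
□ NSFW content filtering
□ Copyright detection and provenance tracing
□ C2PA content credentials
□ Manual review process

Asset management:
□ Content version control
□ Tags and categories
□ Copyright management
□ Performance tracking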
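
The compliance and safety items in the checklist can be combined into a single publish decision. The sketch below is one possible way to do that, under stated assumptions: `GateDecision` and `compliance_gate` are hypothetical names, the 0.7 rejection threshold mirrors the one used in the safety checker above, and the 0.4 review threshold is an illustrative value, not a recommendation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GateDecision:
    action: str  # "publish" | "manual_review" | "reject"
    reasons: List[str] = field(default_factory=list)


def compliance_gate(nsfw_score: float, copyright_risk: str, has_credential: bool,
                    reject_at: float = 0.7, review_at: float = 0.4) -> GateDecision:
    """Combine check results into one decision (thresholds are illustrative)."""
    # Hard failures reject immediately.
    if nsfw_score > reject_at:
        return GateDecision("reject", [f"nsfw_score {nsfw_score:.2f} > {reject_at}"])
    if copyright_risk == "high":
        return GateDecision("reject", ["high copyright risk"])

    # Borderline findings go to a human reviewer instead of being auto-published.
    reasons = []
    if nsfw_score > review_at:
        reasons.append("borderline nsfw_score")
    if copyright_risk == "medium":
        reasons.append("medium copyright risk")
    if not has_credential:
        reasons.append("missing content credential")

    return GateDecision("manual_review" if reasons else "publish", reasons)


print(compliance_gate(0.05, "low", True).action)   # clean asset
print(compliance_gate(0.50, "low", True).action)   # borderline score
print(compliance_gate(0.90, "low", True).action)   # above the rejection threshold
```

Routing borderline cases to manual review keeps the automated filters from being the only safeguard, which matches the checklist's separate manual review item.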

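The 2026 Multimodal AI Toolchain

Category | Tools | Role
Text-to-image | Flux 1.0 / SD3 | Mainstream open-source / closed-source models
Text-to-video | Runway Gen-3 / Sora | Video generation
Workflow | ComfyUI | Node-based orchestration
Prompt | Midjourney / DALL-E 3 | High-end image generation
Speech | ElevenLabs / Fish Speech | Speech synthesis
Compliance | C2PA / Adobe CAI | Content provenance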