详解Hugging Face Models 的两大核心筛选体系Tasks(任务分类)和Libraries(框架分类)

详解Hugging Face Models 的两大核心筛选体系Tasks(任务分类)和Libraries(框架分类)

本文按 Tasks(模型能做什么)Libraries(模型如何训练/加载/部署) 两条主线整理 Hugging Face 模型生态,并覆盖 Transformers → PEFT → Safetensors → GGUF → vLLM → Ollama 等主流工程链路。


1. Hugging Face 模型生态总览

Hugging Face Hub 的模型筛选体系主要包含:

维度 作用 示例
Tasks 按输入输出与应用任务筛选 Text Generation、Image Classification、ASR
Libraries 按框架、格式、推理运行时筛选 Transformers、Diffusers、PEFT、GGUF、Safetensors
Languages 按语言筛选 English、Chinese、Multilingual
Licenses 按许可证筛选 Apache-2.0、MIT、Llama、Gemma、OpenRAIL
Other 按部署、推理服务、量化等筛选 Inference、vLLM、Ollama、llama.cpp

Tasks 解决"模型能做什么" ,例如生成文本、识别图像、转写语音。

Libraries 解决"模型怎么用",例如用 Transformers 加载、用 PEFT 微调、用 GGUF 本地部署、用 vLLM 提供 API。


2. Tasks 分类总览

大类 子任务
Multimodal 多模态 Audio-Text-to-Text、Image-Text-to-Text、Image-Text-to-Image、Image-Text-to-Video、VQA、Document QA、Video-Text-to-Text、Visual Document Retrieval、Any-to-Any
Computer Vision 计算机视觉 Depth Estimation、Image Classification、Object Detection、Image Segmentation、Text-to-Image、Image-to-Text、Image-to-Image、Image-to-Video、Video Classification、Text-to-Video、Zero-Shot Image Classification、Mask Generation、Zero-Shot Object Detection、Text-to-3D、Image-to-3D、Image Feature Extraction、Keypoint Detection、Video-to-Video
NLP 自然语言处理 Text Classification、Token Classification、Table QA、Question Answering、Zero-Shot Classification、Translation、Summarization、Feature Extraction、Text Generation、Fill-Mask、Sentence Similarity、Text Ranking
Audio 音频 Text-to-Speech、Text-to-Audio、ASR、Audio-to-Audio、Audio Classification、Voice Activity Detection
Tabular 表格 Tabular Classification、Tabular Regression、Time Series Forecasting
Reinforcement Learning 强化学习 Reinforcement Learning、Robotics
Other 其他 Graph Machine Learning

3. Tasks 对应代表模型清单

3.1 Multimodal 多模态

Image-Text-to-Text / 视觉语言模型

用于图片理解、图表问答、OCR、视觉 Agent、UI 自动化。

模型家族 代表模型
Qwen-VL Qwen-VL、Qwen2-VL、Qwen2.5-VL-3B/7B/32B/72B-Instruct
LLaVA LLaVA-1.5、LLaVA-1.6、LLaVA-NeXT、LLaVA-OneVision
InternVL InternVL2、InternVL2.5、InternVL3
MiniCPM-V MiniCPM-V 2.6、MiniCPM-o
Idefics Idefics2、Idefics3
Phi Vision Phi-3.5-Vision、Phi-4-multimodal
Gemma Vision PaliGemma、Gemma 3 Vision
DeepSeek-VL DeepSeek-VL、DeepSeek-VL2
CogVLM CogVLM、CogVLM2
Florence Florence-2
BLIP BLIP、BLIP-2、InstructBLIP
Molmo Molmo-7B、Molmo-72B
GLM-V GLM-4V、CogAgent
Yi-VL Yi-VL-6B/34B

Visual Question Answering

模型家族 代表模型
LLaVA LLaVA-1.5、LLaVA-NeXT
BLIP BLIP、BLIP-2
Qwen-VL Qwen2.5-VL
InternVL InternVL2.5
Idefics Idefics2
PaliGemma PaliGemma

Document Question Answering

模型家族 代表模型
LayoutLM LayoutLM、LayoutLMv2、LayoutLMv3
Donut Donut-base、Donut-docvqa
Pix2Struct Pix2Struct
Nougat Nougat OCR / scientific PDF
ColPali ColPali、ColQwen2
Qwen-VL Qwen2.5-VL 文档理解
InternVL InternVL 文档理解

Audio-Text-to-Text

模型家族 代表模型
Qwen-Audio Qwen-Audio、Qwen2-Audio
SALMONN SALMONN
SeamlessM4T SeamlessM4T
MiniCPM-o MiniCPM-o
Phi Multimodal Phi-4-multimodal

Image-Text-to-Image

模型家族 代表模型
Stable Diffusion SD 1.5、SD 2.1
SDXL Stable Diffusion XL
Flux FLUX.1-dev、FLUX.1-schnell
SD3 Stable Diffusion 3、SD3.5
Kandinsky Kandinsky 2/3
ControlNet ControlNet
IP-Adapter IP-Adapter
InstructPix2Pix InstructPix2Pix

Image-Text-to-Video / Video-Text-to-Text

任务 代表模型
图文到视频 CogVideoX、Stable Video Diffusion、AnimateDiff、VideoCrafter2、Open-Sora、HunyuanVideo、LTX-Video、Wan Video
视频到文本 Video-LLaVA、VideoChatGPT、Qwen2.5-VL、InternVideo2、LongVA、LLaVA-OneVision

Visual Document Retrieval / Any-to-Any

任务 代表模型
视觉文档检索 ColPali、ColQwen2、CLIP、OpenCLIP、SigLIP
任意模态到任意模态 Qwen2.5-Omni、MiniCPM-o、Phi-4-multimodal、SeamlessM4T

3.2 Computer Vision 计算机视觉

Task 代表模型
Image Classification ResNet、EfficientNet、ConvNeXt、ViT、Swin Transformer、DeiT、BEiT、RegNet、MobileNet、DINOv2、SigLIP
Object Detection DETR、Deformable DETR、YOLOv5/v8/v10、YOLO-NAS、RT-DETR、Grounding DINO、OWL-ViT、Faster R-CNN、Mask R-CNN、Florence-2
Image Segmentation SAM、SAM2、Mask2Former、SegFormer、U-Net、DeepLabv3+、OneFormer、CLIPSeg、MobileSAM、FastSAM、HQ-SAM
Depth Estimation MiDaS、DPT、ZoeDepth、Depth Anything、Depth Anything V2、GLPN、Marigold
Text-to-Image SD 1.5、SD 2.1、SDXL、SD3/SD3.5、Flux、Kandinsky、PixArt-alpha、PixArt-Sigma、Playground v2.5、DeepFloyd IF、LCM
Image-to-Image SD img2img、SDXL Refiner、ControlNet、IP-Adapter、InstructPix2Pix、T2I-Adapter、BrushNet、InstantID
Image-to-Text BLIP、BLIP-2、GIT、Donut、TrOCR、Florence-2、Qwen2.5-VL、PaliGemma
Image-to-Video SVD、AnimateDiff、I2VGen-XL、CogVideoX、LTX-Video、HunyuanVideo、Wan Video
Text-to-Video CogVideoX、Open-Sora、VideoCrafter2、ModelScope T2V、HunyuanVideo、LTX-Video、AnimateDiff、Wan Video
Zero-Shot Image Classification CLIP、OpenCLIP、SigLIP、EVA-CLIP、MetaCLIP、LiT
Zero-Shot Object Detection Grounding DINO、OWL-ViT、OWLv2、Florence-2、GLIP
Mask Generation SAM、SAM2、MobileSAM、FastSAM、HQ-SAM
Text-to-3D Shap-E、Point-E、DreamFusion 类、Magic3D 类、LGM
Image-to-3D TripoSR、Zero123、Wonder3D、LGM、CRM、InstantMesh
Image Feature Extraction CLIP、OpenCLIP、DINOv2、SigLIP、EVA-CLIP、ViT、ConvNeXt
Video Classification VideoMAE、TimeSformer、SlowFast、X-CLIP、InternVideo、Video Swin
Keypoint Detection OpenPose、ViTPose、HRNet、RTMPose、MediaPipe Pose
Video-to-Video AnimateDiff、Video ControlNet、TokenFlow、Rerender-A-Video、SVD editing workflows

3.3 NLP 自然语言处理

Text Generation / LLM

模型家族 代表模型
Llama Llama 2、Llama 3、Llama 3.1、Llama 3.2、Llama 3.3
Qwen Qwen1.5、Qwen2、Qwen2.5、Qwen3、Qwen-Coder
DeepSeek DeepSeek-V2、DeepSeek-V3、DeepSeek-R1、DeepSeek-Coder
Mistral Mistral 7B、Mixtral 8x7B、Mixtral 8x22B、Mistral Large
Gemma Gemma、Gemma 2、Gemma 3
Phi Phi-2、Phi-3、Phi-3.5、Phi-4
Yi Yi-6B、Yi-34B、Yi-1.5
GLM ChatGLM、GLM-4、GLM-Z1
Baichuan Baichuan2、Baichuan-M1
InternLM InternLM2、InternLM2.5、InternLM3
Falcon Falcon、Falcon2、Falcon3
Command R Command R、Command R+
StarCoder StarCoder、StarCoder2
CodeLlama CodeLlama
Codestral Codestral
Granite IBM Granite
OLMo OLMo、OLMo 2
Zephyr / Hermes / OpenChat Zephyr、Nous Hermes、OpenChat、Vicuna、WizardLM、Tulu、Solar、Aya

其他 NLP Tasks

Task 代表模型
Text Classification BERT、RoBERTa、DeBERTa、DistilBERT、ALBERT、ELECTRA、XLNet、ModernBERT、MacBERT、XLM-R
Token Classification / NER BERT NER、RoBERTa NER、DeBERTa NER、XLM-R NER、MacBERT NER、Flair、spaCy Transformers
Question Answering BERT SQuAD、RoBERTa SQuAD、DeBERTa QA、DistilBERT QA、ALBERT QA、Longformer QA、BigBird QA
Table QA TAPAS、TaBERT、TURL、TAPEX、Pix2Struct、LLM 表格问答微调模型
Zero-Shot Classification BART-MNLI、DeBERTa-MNLI、XLM-R-XNLI、ModernBERT-NLI、T5 NLI
Translation MarianMT、M2M100、NLLB-200、SeamlessM4T、mBART50、T5、mT5、MADLAD-400、OPUS-MT
Summarization BART-large-CNN、PEGASUS、T5、Flan-T5、LED、LongT5、PRIMERA、Llama/Qwen/Mistral 指令模型
Feature Extraction / Embedding BGE、BGE-M3、E5、multilingual-E5、GTE、GTE-Qwen、Jina Embeddings、Sentence-BERT、Instructor、UAE、Nomic、Arctic Embed、Stella、ColBERT、Contriever
Sentence Similarity all-MiniLM、all-mpnet-base、BGE、E5、GTE、Jina、Nomic
Text Ranking / Reranker BGE-Reranker、Jina Reranker、Cohere Rerank、ColBERTv2、MonoT5、RankT5、GTE Reranker、Qwen Reranker
Fill-Mask BERT、RoBERTa、DeBERTa、ALBERT、ELECTRA、XLM-R、MacBERT

3.4 Audio 音频

Task 代表模型
Automatic Speech Recognition Whisper tiny/base/small/medium/large-v2/large-v3、Distil-Whisper、Wav2Vec2、HuBERT、WavLM、SeamlessM4T、Paraformer、Conformer、NeMo ASR、SenseVoice
Text-to-Speech Bark、VITS、MMS-TTS、SpeechT5、XTTS-v2、ChatTTS、CosyVoice、F5-TTS、Fish Speech、Parler-TTS
Text-to-Audio AudioLDM、AudioLDM2、MusicGen、Bark、Stable Audio、AudioGen、Tango、Make-An-Audio
Audio-to-Audio RVC、So-VITS-SVC、VoiceFixer、Demucs、AudioSep、MetricGAN+、SepFormer
Audio Classification AST、YAMNet、PANNs、Wav2Vec2 classification、HuBERT classification、BEATs、CLAP
Voice Activity Detection Silero VAD、WebRTC VAD、pyannote.audio、NeMo VAD、SpeechBrain VAD

3.5 Tabular / RL / Graph

大类 Task 代表模型/框架
Tabular Tabular Classification TabNet、FT-Transformer、TabTransformer、SAINT、AutoGluon、XGBoost、LightGBM、CatBoost
Tabular Tabular Regression XGBoost、LightGBM、CatBoost、TabNet、FT-Transformer、AutoGluon
Tabular Time Series Forecasting TimeSeries Transformer、Informer、Autoformer、PatchTST、TimesFM、Chronos、Lag-Llama、TFT、N-BEATS、DeepAR
RL Reinforcement Learning Stable-Baselines3、CleanRL、TRL、Decision Transformer、CQL、IQL、RLHF、RLAIF
Robotics Robotics RT-1、RT-2、OpenVLA、Octo、Diffusion Policy、ACT、RoboFlamingo、LeRobot
Graph Graph Machine Learning GCN、GraphSAGE、GAT、GIN、R-GCN、Graphormer、PyG、DGL、TransE、RotatE、ComplEx

4. Libraries 分类详解

4.1 Transformers

Hugging Face 最核心模型库,覆盖文本、视觉、音频、视频和多模态模型的训练与推理。

常见模型:Llama、Qwen、Gemma、Mistral、DeepSeek、BERT、RoBERTa、Whisper、ViT、CLIP、LLaVA。

python 复制代码
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

4.2 Diffusers

扩散模型生态,适合文生图、图生图、图像修复、ControlNet、LoRA 风格模型、文生视频、图生视频。

代表模型:Stable Diffusion、SDXL、SD3/SD3.5、Flux、Kandinsky、CogVideoX、AnimateDiff、Stable Video Diffusion。

4.3 PEFT

Parameter-Efficient Fine-Tuning,参数高效微调。

技术 说明
LoRA 低秩适配器微调
QLoRA 量化基础模型 + LoRA 微调
AdaLoRA 自适应 LoRA
Prefix Tuning 前缀参数微调
Prompt Tuning 可训练软提示
IA3 激活缩放类高效微调

4.4 Safetensors

安全权重格式,替代 pickle 格式,常见文件包括 model.safetensorsadapter_model.safetensorsmodel-00001-of-000xx.safetensors

优势:安全、加载快、适合大模型分片、支持 metadata。

4.5 GGUF

llama.cpp 生态模型文件格式,适合本地 CPU/GPU 混合推理。

量化 特点
F16 高质量,体积大
Q8_0 高质量量化
Q6_K 质量较高,体积适中
Q5_K_M 常用平衡选择
Q4_K_M 低显存常用
Q3_K_M 更小但质量下降
IQ4 / IQ3 新型极低比特量化

4.6 vLLM

高吞吐 LLM 推理服务框架,适合生产环境 API 服务。

特点:PagedAttention、高并发、连续批处理、OpenAI API 兼容。

bash 复制代码
vllm serve Qwen/Qwen2.5-7B-Instruct

4.7 Ollama

本地大模型运行工具,常基于 GGUF / llama.cpp 生态。

bash 复制代码
ollama run qwen2.5
ollama run llama3.2
ollama run deepseek-r1

4.8 其他重要 Libraries

Library 作用
llama.cpp C/C++ 轻量推理框架,GGUF 核心运行时
Sentence Transformers Embedding、语义搜索、相似度、Reranking
Transformers.js 浏览器 / Node.js 端运行模型
ONNX 跨平台推理格式
OpenVINO Intel CPU/GPU/边缘设备推理优化
MLX Apple Silicon 本地推理与训练
timm PyTorch 图像模型库
TensorFlow / Keras Google 深度学习生态
JAX / Flax TPU 与研究型高性能训练
OpenCLIP CLIP 开源实现
spaCy 工业 NLP 管线
NeMo NVIDIA 语音和大模型训练生态
PaddlePaddle / PaddleOCR 中文 OCR 与飞桨生态
Rust / Candle Rust 高性能端侧推理

5. Libraries 完整生态关系图

5.1 总体生态图

#mermaid-svg-dIiAsd7aIS7YtB4p{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;fill:#333;}@keyframes edge-animation-frame{from{stroke-dashoffset:0;}}@keyframes dash{to{stroke-dashoffset:0;}}#mermaid-svg-dIiAsd7aIS7YtB4p .edge-animation-slow{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 50s linear infinite;stroke-linecap:round;}#mermaid-svg-dIiAsd7aIS7YtB4p .edge-animation-fast{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 20s linear infinite;stroke-linecap:round;}#mermaid-svg-dIiAsd7aIS7YtB4p .error-icon{fill:#552222;}#mermaid-svg-dIiAsd7aIS7YtB4p .error-text{fill:#552222;stroke:#552222;}#mermaid-svg-dIiAsd7aIS7YtB4p .edge-thickness-normal{stroke-width:1px;}#mermaid-svg-dIiAsd7aIS7YtB4p .edge-thickness-thick{stroke-width:3.5px;}#mermaid-svg-dIiAsd7aIS7YtB4p .edge-pattern-solid{stroke-dasharray:0;}#mermaid-svg-dIiAsd7aIS7YtB4p .edge-thickness-invisible{stroke-width:0;fill:none;}#mermaid-svg-dIiAsd7aIS7YtB4p .edge-pattern-dashed{stroke-dasharray:3;}#mermaid-svg-dIiAsd7aIS7YtB4p .edge-pattern-dotted{stroke-dasharray:2;}#mermaid-svg-dIiAsd7aIS7YtB4p .marker{fill:#333333;stroke:#333333;}#mermaid-svg-dIiAsd7aIS7YtB4p .marker.cross{stroke:#333333;}#mermaid-svg-dIiAsd7aIS7YtB4p svg{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;}#mermaid-svg-dIiAsd7aIS7YtB4p p{margin:0;}#mermaid-svg-dIiAsd7aIS7YtB4p .label{font-family:"trebuchet ms",verdana,arial,sans-serif;color:#333;}#mermaid-svg-dIiAsd7aIS7YtB4p .cluster-label text{fill:#333;}#mermaid-svg-dIiAsd7aIS7YtB4p .cluster-label span{color:#333;}#mermaid-svg-dIiAsd7aIS7YtB4p .cluster-label span p{background-color:transparent;}#mermaid-svg-dIiAsd7aIS7YtB4p .label text,#mermaid-svg-dIiAsd7aIS7YtB4p span{fill:#333;color:#333;}#mermaid-svg-dIiAsd7aIS7YtB4p .node rect,#mermaid-svg-dIiAsd7aIS7YtB4p .node circle,#mermaid-svg-dIiAsd7aIS7YtB4p .node ellipse,#mermaid-svg-dIiAsd7aIS7YtB4p .node polygon,#mermaid-svg-dIiAsd7aIS7YtB4p .node path{fill:#ECECFF;stroke:#9370DB;stroke-width:1px;}#mermaid-svg-dIiAsd7aIS7YtB4p .rough-node .label text,#mermaid-svg-dIiAsd7aIS7YtB4p .node .label text,#mermaid-svg-dIiAsd7aIS7YtB4p .image-shape .label,#mermaid-svg-dIiAsd7aIS7YtB4p .icon-shape .label{text-anchor:middle;}#mermaid-svg-dIiAsd7aIS7YtB4p .node .katex path{fill:#000;stroke:#000;stroke-width:1px;}#mermaid-svg-dIiAsd7aIS7YtB4p .rough-node .label,#mermaid-svg-dIiAsd7aIS7YtB4p .node .label,#mermaid-svg-dIiAsd7aIS7YtB4p .image-shape .label,#mermaid-svg-dIiAsd7aIS7YtB4p .icon-shape .label{text-align:center;}#mermaid-svg-dIiAsd7aIS7YtB4p .node.clickable{cursor:pointer;}#mermaid-svg-dIiAsd7aIS7YtB4p .root .anchor path{fill:#333333!important;stroke-width:0;stroke:#333333;}#mermaid-svg-dIiAsd7aIS7YtB4p .arrowheadPath{fill:#333333;}#mermaid-svg-dIiAsd7aIS7YtB4p .edgePath .path{stroke:#333333;stroke-width:2.0px;}#mermaid-svg-dIiAsd7aIS7YtB4p .flowchart-link{stroke:#333333;fill:none;}#mermaid-svg-dIiAsd7aIS7YtB4p .edgeLabel{background-color:rgba(232,232,232, 0.8);text-align:center;}#mermaid-svg-dIiAsd7aIS7YtB4p .edgeLabel p{background-color:rgba(232,232,232, 0.8);}#mermaid-svg-dIiAsd7aIS7YtB4p .edgeLabel rect{opacity:0.5;background-color:rgba(232,232,232, 0.8);fill:rgba(232,232,232, 0.8);}#mermaid-svg-dIiAsd7aIS7YtB4p .labelBkg{background-color:rgba(232, 232, 232, 0.5);}#mermaid-svg-dIiAsd7aIS7YtB4p .cluster rect{fill:#ffffde;stroke:#aaaa33;stroke-width:1px;}#mermaid-svg-dIiAsd7aIS7YtB4p .cluster text{fill:#333;}#mermaid-svg-dIiAsd7aIS7YtB4p .cluster span{color:#333;}#mermaid-svg-dIiAsd7aIS7YtB4p div.mermaidTooltip{position:absolute;text-align:center;max-width:200px;padding:2px;font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:12px;background:hsl(80, 100%, 96.2745098039%);border:1px solid #aaaa33;border-radius:2px;pointer-events:none;z-index:100;}#mermaid-svg-dIiAsd7aIS7YtB4p .flowchartTitleText{text-anchor:middle;font-size:18px;fill:#333;}#mermaid-svg-dIiAsd7aIS7YtB4p rect.text{fill:none;stroke-width:0;}#mermaid-svg-dIiAsd7aIS7YtB4p .icon-shape,#mermaid-svg-dIiAsd7aIS7YtB4p .image-shape{background-color:rgba(232,232,232, 0.8);text-align:center;}#mermaid-svg-dIiAsd7aIS7YtB4p .icon-shape p,#mermaid-svg-dIiAsd7aIS7YtB4p .image-shape p{background-color:rgba(232,232,232, 0.8);padding:2px;}#mermaid-svg-dIiAsd7aIS7YtB4p .icon-shape .label rect,#mermaid-svg-dIiAsd7aIS7YtB4p .image-shape .label rect{opacity:0.5;background-color:rgba(232,232,232, 0.8);fill:rgba(232,232,232, 0.8);}#mermaid-svg-dIiAsd7aIS7YtB4p .label-icon{display:inline-block;height:1em;overflow:visible;vertical-align:-0.125em;}#mermaid-svg-dIiAsd7aIS7YtB4p .node .label-icon path{fill:currentColor;stroke:revert;stroke-width:revert;}#mermaid-svg-dIiAsd7aIS7YtB4p :root{--mermaid-font-family:"trebuchet ms",verdana,arial,sans-serif;} Hugging Face Hub
Transformers
Diffusers
Datasets
Safetensors
PEFT
Sentence Transformers
GGUF Models
LLM / NLP / VLM / ASR / Vision
Text-to-Image / Image-to-Image / Video
LoRA / QLoRA / Adapter
Safe Weight Storage
Embedding / Retrieval / Reranking
llama.cpp
vLLM
TGI
Transformers Pipeline
Accelerate / DeepSpeed / FSDP
Ollama
LM Studio
Jan
llama-cpp-python
OpenAI-Compatible API
Local Chat / Open WebUI
ComfyUI
AUTOMATIC1111
InvokeAI

5.2 LLM 训练、微调、量化、部署链路

#mermaid-svg-Jtwl2m9RNulMTkIw{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;fill:#333;}@keyframes edge-animation-frame{from{stroke-dashoffset:0;}}@keyframes dash{to{stroke-dashoffset:0;}}#mermaid-svg-Jtwl2m9RNulMTkIw .edge-animation-slow{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 50s linear infinite;stroke-linecap:round;}#mermaid-svg-Jtwl2m9RNulMTkIw .edge-animation-fast{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 20s linear infinite;stroke-linecap:round;}#mermaid-svg-Jtwl2m9RNulMTkIw .error-icon{fill:#552222;}#mermaid-svg-Jtwl2m9RNulMTkIw .error-text{fill:#552222;stroke:#552222;}#mermaid-svg-Jtwl2m9RNulMTkIw .edge-thickness-normal{stroke-width:1px;}#mermaid-svg-Jtwl2m9RNulMTkIw .edge-thickness-thick{stroke-width:3.5px;}#mermaid-svg-Jtwl2m9RNulMTkIw .edge-pattern-solid{stroke-dasharray:0;}#mermaid-svg-Jtwl2m9RNulMTkIw .edge-thickness-invisible{stroke-width:0;fill:none;}#mermaid-svg-Jtwl2m9RNulMTkIw .edge-pattern-dashed{stroke-dasharray:3;}#mermaid-svg-Jtwl2m9RNulMTkIw .edge-pattern-dotted{stroke-dasharray:2;}#mermaid-svg-Jtwl2m9RNulMTkIw .marker{fill:#333333;stroke:#333333;}#mermaid-svg-Jtwl2m9RNulMTkIw .marker.cross{stroke:#333333;}#mermaid-svg-Jtwl2m9RNulMTkIw svg{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;}#mermaid-svg-Jtwl2m9RNulMTkIw p{margin:0;}#mermaid-svg-Jtwl2m9RNulMTkIw .label{font-family:"trebuchet ms",verdana,arial,sans-serif;color:#333;}#mermaid-svg-Jtwl2m9RNulMTkIw .cluster-label text{fill:#333;}#mermaid-svg-Jtwl2m9RNulMTkIw .cluster-label span{color:#333;}#mermaid-svg-Jtwl2m9RNulMTkIw .cluster-label span p{background-color:transparent;}#mermaid-svg-Jtwl2m9RNulMTkIw .label text,#mermaid-svg-Jtwl2m9RNulMTkIw span{fill:#333;color:#333;}#mermaid-svg-Jtwl2m9RNulMTkIw .node rect,#mermaid-svg-Jtwl2m9RNulMTkIw .node circle,#mermaid-svg-Jtwl2m9RNulMTkIw .node ellipse,#mermaid-svg-Jtwl2m9RNulMTkIw .node polygon,#mermaid-svg-Jtwl2m9RNulMTkIw .node path{fill:#ECECFF;stroke:#9370DB;stroke-width:1px;}#mermaid-svg-Jtwl2m9RNulMTkIw .rough-node .label text,#mermaid-svg-Jtwl2m9RNulMTkIw .node .label text,#mermaid-svg-Jtwl2m9RNulMTkIw .image-shape .label,#mermaid-svg-Jtwl2m9RNulMTkIw .icon-shape .label{text-anchor:middle;}#mermaid-svg-Jtwl2m9RNulMTkIw .node .katex path{fill:#000;stroke:#000;stroke-width:1px;}#mermaid-svg-Jtwl2m9RNulMTkIw .rough-node .label,#mermaid-svg-Jtwl2m9RNulMTkIw .node .label,#mermaid-svg-Jtwl2m9RNulMTkIw .image-shape .label,#mermaid-svg-Jtwl2m9RNulMTkIw .icon-shape .label{text-align:center;}#mermaid-svg-Jtwl2m9RNulMTkIw .node.clickable{cursor:pointer;}#mermaid-svg-Jtwl2m9RNulMTkIw .root .anchor path{fill:#333333!important;stroke-width:0;stroke:#333333;}#mermaid-svg-Jtwl2m9RNulMTkIw .arrowheadPath{fill:#333333;}#mermaid-svg-Jtwl2m9RNulMTkIw .edgePath .path{stroke:#333333;stroke-width:2.0px;}#mermaid-svg-Jtwl2m9RNulMTkIw .flowchart-link{stroke:#333333;fill:none;}#mermaid-svg-Jtwl2m9RNulMTkIw .edgeLabel{background-color:rgba(232,232,232, 0.8);text-align:center;}#mermaid-svg-Jtwl2m9RNulMTkIw .edgeLabel p{background-color:rgba(232,232,232, 0.8);}#mermaid-svg-Jtwl2m9RNulMTkIw .edgeLabel rect{opacity:0.5;background-color:rgba(232,232,232, 0.8);fill:rgba(232,232,232, 0.8);}#mermaid-svg-Jtwl2m9RNulMTkIw .labelBkg{background-color:rgba(232, 232, 232, 0.5);}#mermaid-svg-Jtwl2m9RNulMTkIw .cluster rect{fill:#ffffde;stroke:#aaaa33;stroke-width:1px;}#mermaid-svg-Jtwl2m9RNulMTkIw .cluster text{fill:#333;}#mermaid-svg-Jtwl2m9RNulMTkIw .cluster span{color:#333;}#mermaid-svg-Jtwl2m9RNulMTkIw div.mermaidTooltip{position:absolute;text-align:center;max-width:200px;padding:2px;font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:12px;background:hsl(80, 100%, 96.2745098039%);border:1px solid #aaaa33;border-radius:2px;pointer-events:none;z-index:100;}#mermaid-svg-Jtwl2m9RNulMTkIw .flowchartTitleText{text-anchor:middle;font-size:18px;fill:#333;}#mermaid-svg-Jtwl2m9RNulMTkIw rect.text{fill:none;stroke-width:0;}#mermaid-svg-Jtwl2m9RNulMTkIw .icon-shape,#mermaid-svg-Jtwl2m9RNulMTkIw .image-shape{background-color:rgba(232,232,232, 0.8);text-align:center;}#mermaid-svg-Jtwl2m9RNulMTkIw .icon-shape p,#mermaid-svg-Jtwl2m9RNulMTkIw .image-shape p{background-color:rgba(232,232,232, 0.8);padding:2px;}#mermaid-svg-Jtwl2m9RNulMTkIw .icon-shape .label rect,#mermaid-svg-Jtwl2m9RNulMTkIw .image-shape .label rect{opacity:0.5;background-color:rgba(232,232,232, 0.8);fill:rgba(232,232,232, 0.8);}#mermaid-svg-Jtwl2m9RNulMTkIw .label-icon{display:inline-block;height:1em;overflow:visible;vertical-align:-0.125em;}#mermaid-svg-Jtwl2m9RNulMTkIw .node .label-icon path{fill:currentColor;stroke:revert;stroke-width:revert;}#mermaid-svg-Jtwl2m9RNulMTkIw :root{--mermaid-font-family:"trebuchet ms",verdana,arial,sans-serif;} Base Model: Llama/Qwen/Mistral/Gemma/DeepSeek
Transformers Load
PEFT LoRA / QLoRA Fine-tuning
Adapter safetensors
Merge LoRA
Full Safetensors Model
vLLM / TGI Production Serving
Convert to GGUF
llama.cpp
Ollama / LM Studio / Jan

5.3 RAG 应用链路

#mermaid-svg-kT5NJM2qUpaV33DB{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;fill:#333;}@keyframes edge-animation-frame{from{stroke-dashoffset:0;}}@keyframes dash{to{stroke-dashoffset:0;}}#mermaid-svg-kT5NJM2qUpaV33DB .edge-animation-slow{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 50s linear infinite;stroke-linecap:round;}#mermaid-svg-kT5NJM2qUpaV33DB .edge-animation-fast{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 20s linear infinite;stroke-linecap:round;}#mermaid-svg-kT5NJM2qUpaV33DB .error-icon{fill:#552222;}#mermaid-svg-kT5NJM2qUpaV33DB .error-text{fill:#552222;stroke:#552222;}#mermaid-svg-kT5NJM2qUpaV33DB .edge-thickness-normal{stroke-width:1px;}#mermaid-svg-kT5NJM2qUpaV33DB .edge-thickness-thick{stroke-width:3.5px;}#mermaid-svg-kT5NJM2qUpaV33DB .edge-pattern-solid{stroke-dasharray:0;}#mermaid-svg-kT5NJM2qUpaV33DB .edge-thickness-invisible{stroke-width:0;fill:none;}#mermaid-svg-kT5NJM2qUpaV33DB .edge-pattern-dashed{stroke-dasharray:3;}#mermaid-svg-kT5NJM2qUpaV33DB .edge-pattern-dotted{stroke-dasharray:2;}#mermaid-svg-kT5NJM2qUpaV33DB .marker{fill:#333333;stroke:#333333;}#mermaid-svg-kT5NJM2qUpaV33DB .marker.cross{stroke:#333333;}#mermaid-svg-kT5NJM2qUpaV33DB svg{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;}#mermaid-svg-kT5NJM2qUpaV33DB p{margin:0;}#mermaid-svg-kT5NJM2qUpaV33DB .label{font-family:"trebuchet ms",verdana,arial,sans-serif;color:#333;}#mermaid-svg-kT5NJM2qUpaV33DB .cluster-label text{fill:#333;}#mermaid-svg-kT5NJM2qUpaV33DB .cluster-label span{color:#333;}#mermaid-svg-kT5NJM2qUpaV33DB .cluster-label span p{background-color:transparent;}#mermaid-svg-kT5NJM2qUpaV33DB .label text,#mermaid-svg-kT5NJM2qUpaV33DB span{fill:#333;color:#333;}#mermaid-svg-kT5NJM2qUpaV33DB .node rect,#mermaid-svg-kT5NJM2qUpaV33DB .node circle,#mermaid-svg-kT5NJM2qUpaV33DB .node ellipse,#mermaid-svg-kT5NJM2qUpaV33DB .node polygon,#mermaid-svg-kT5NJM2qUpaV33DB .node path{fill:#ECECFF;stroke:#9370DB;stroke-width:1px;}#mermaid-svg-kT5NJM2qUpaV33DB .rough-node .label text,#mermaid-svg-kT5NJM2qUpaV33DB .node .label text,#mermaid-svg-kT5NJM2qUpaV33DB .image-shape .label,#mermaid-svg-kT5NJM2qUpaV33DB .icon-shape .label{text-anchor:middle;}#mermaid-svg-kT5NJM2qUpaV33DB .node .katex path{fill:#000;stroke:#000;stroke-width:1px;}#mermaid-svg-kT5NJM2qUpaV33DB .rough-node .label,#mermaid-svg-kT5NJM2qUpaV33DB .node .label,#mermaid-svg-kT5NJM2qUpaV33DB .image-shape .label,#mermaid-svg-kT5NJM2qUpaV33DB .icon-shape .label{text-align:center;}#mermaid-svg-kT5NJM2qUpaV33DB .node.clickable{cursor:pointer;}#mermaid-svg-kT5NJM2qUpaV33DB .root .anchor path{fill:#333333!important;stroke-width:0;stroke:#333333;}#mermaid-svg-kT5NJM2qUpaV33DB .arrowheadPath{fill:#333333;}#mermaid-svg-kT5NJM2qUpaV33DB .edgePath .path{stroke:#333333;stroke-width:2.0px;}#mermaid-svg-kT5NJM2qUpaV33DB .flowchart-link{stroke:#333333;fill:none;}#mermaid-svg-kT5NJM2qUpaV33DB .edgeLabel{background-color:rgba(232,232,232, 0.8);text-align:center;}#mermaid-svg-kT5NJM2qUpaV33DB .edgeLabel p{background-color:rgba(232,232,232, 0.8);}#mermaid-svg-kT5NJM2qUpaV33DB .edgeLabel rect{opacity:0.5;background-color:rgba(232,232,232, 0.8);fill:rgba(232,232,232, 0.8);}#mermaid-svg-kT5NJM2qUpaV33DB .labelBkg{background-color:rgba(232, 232, 232, 0.5);}#mermaid-svg-kT5NJM2qUpaV33DB .cluster rect{fill:#ffffde;stroke:#aaaa33;stroke-width:1px;}#mermaid-svg-kT5NJM2qUpaV33DB .cluster text{fill:#333;}#mermaid-svg-kT5NJM2qUpaV33DB .cluster span{color:#333;}#mermaid-svg-kT5NJM2qUpaV33DB div.mermaidTooltip{position:absolute;text-align:center;max-width:200px;padding:2px;font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:12px;background:hsl(80, 100%, 96.2745098039%);border:1px solid #aaaa33;border-radius:2px;pointer-events:none;z-index:100;}#mermaid-svg-kT5NJM2qUpaV33DB .flowchartTitleText{text-anchor:middle;font-size:18px;fill:#333;}#mermaid-svg-kT5NJM2qUpaV33DB rect.text{fill:none;stroke-width:0;}#mermaid-svg-kT5NJM2qUpaV33DB .icon-shape,#mermaid-svg-kT5NJM2qUpaV33DB .image-shape{background-color:rgba(232,232,232, 0.8);text-align:center;}#mermaid-svg-kT5NJM2qUpaV33DB .icon-shape p,#mermaid-svg-kT5NJM2qUpaV33DB .image-shape p{background-color:rgba(232,232,232, 0.8);padding:2px;}#mermaid-svg-kT5NJM2qUpaV33DB .icon-shape .label rect,#mermaid-svg-kT5NJM2qUpaV33DB .image-shape .label rect{opacity:0.5;background-color:rgba(232,232,232, 0.8);fill:rgba(232,232,232, 0.8);}#mermaid-svg-kT5NJM2qUpaV33DB .label-icon{display:inline-block;height:1em;overflow:visible;vertical-align:-0.125em;}#mermaid-svg-kT5NJM2qUpaV33DB .node .label-icon path{fill:currentColor;stroke:revert;stroke-width:revert;}#mermaid-svg-kT5NJM2qUpaV33DB :root{--mermaid-font-family:"trebuchet ms",verdana,arial,sans-serif;} Documents
Chunking
Embedding: BGE/E5/GTE/Jina
Vector Database
Retriever
Reranker: BGE/Jina/ColBERT
Context
LLM: Qwen/Llama/Mistral/DeepSeek
Answer with Citations

5.4 文生图 / 图生图生态链路

#mermaid-svg-dKgIggMFY2fs47s3{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;fill:#333;}@keyframes edge-animation-frame{from{stroke-dashoffset:0;}}@keyframes dash{to{stroke-dashoffset:0;}}#mermaid-svg-dKgIggMFY2fs47s3 .edge-animation-slow{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 50s linear infinite;stroke-linecap:round;}#mermaid-svg-dKgIggMFY2fs47s3 .edge-animation-fast{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 20s linear infinite;stroke-linecap:round;}#mermaid-svg-dKgIggMFY2fs47s3 .error-icon{fill:#552222;}#mermaid-svg-dKgIggMFY2fs47s3 .error-text{fill:#552222;stroke:#552222;}#mermaid-svg-dKgIggMFY2fs47s3 .edge-thickness-normal{stroke-width:1px;}#mermaid-svg-dKgIggMFY2fs47s3 .edge-thickness-thick{stroke-width:3.5px;}#mermaid-svg-dKgIggMFY2fs47s3 .edge-pattern-solid{stroke-dasharray:0;}#mermaid-svg-dKgIggMFY2fs47s3 .edge-thickness-invisible{stroke-width:0;fill:none;}#mermaid-svg-dKgIggMFY2fs47s3 .edge-pattern-dashed{stroke-dasharray:3;}#mermaid-svg-dKgIggMFY2fs47s3 .edge-pattern-dotted{stroke-dasharray:2;}#mermaid-svg-dKgIggMFY2fs47s3 .marker{fill:#333333;stroke:#333333;}#mermaid-svg-dKgIggMFY2fs47s3 .marker.cross{stroke:#333333;}#mermaid-svg-dKgIggMFY2fs47s3 svg{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;}#mermaid-svg-dKgIggMFY2fs47s3 p{margin:0;}#mermaid-svg-dKgIggMFY2fs47s3 .label{font-family:"trebuchet ms",verdana,arial,sans-serif;color:#333;}#mermaid-svg-dKgIggMFY2fs47s3 .cluster-label text{fill:#333;}#mermaid-svg-dKgIggMFY2fs47s3 .cluster-label span{color:#333;}#mermaid-svg-dKgIggMFY2fs47s3 .cluster-label span p{background-color:transparent;}#mermaid-svg-dKgIggMFY2fs47s3 .label text,#mermaid-svg-dKgIggMFY2fs47s3 span{fill:#333;color:#333;}#mermaid-svg-dKgIggMFY2fs47s3 .node rect,#mermaid-svg-dKgIggMFY2fs47s3 .node circle,#mermaid-svg-dKgIggMFY2fs47s3 .node ellipse,#mermaid-svg-dKgIggMFY2fs47s3 .node polygon,#mermaid-svg-dKgIggMFY2fs47s3 .node path{fill:#ECECFF;stroke:#9370DB;stroke-width:1px;}#mermaid-svg-dKgIggMFY2fs47s3 .rough-node .label text,#mermaid-svg-dKgIggMFY2fs47s3 .node .label text,#mermaid-svg-dKgIggMFY2fs47s3 .image-shape .label,#mermaid-svg-dKgIggMFY2fs47s3 .icon-shape .label{text-anchor:middle;}#mermaid-svg-dKgIggMFY2fs47s3 .node .katex path{fill:#000;stroke:#000;stroke-width:1px;}#mermaid-svg-dKgIggMFY2fs47s3 .rough-node .label,#mermaid-svg-dKgIggMFY2fs47s3 .node .label,#mermaid-svg-dKgIggMFY2fs47s3 .image-shape .label,#mermaid-svg-dKgIggMFY2fs47s3 .icon-shape .label{text-align:center;}#mermaid-svg-dKgIggMFY2fs47s3 .node.clickable{cursor:pointer;}#mermaid-svg-dKgIggMFY2fs47s3 .root .anchor path{fill:#333333!important;stroke-width:0;stroke:#333333;}#mermaid-svg-dKgIggMFY2fs47s3 .arrowheadPath{fill:#333333;}#mermaid-svg-dKgIggMFY2fs47s3 .edgePath .path{stroke:#333333;stroke-width:2.0px;}#mermaid-svg-dKgIggMFY2fs47s3 .flowchart-link{stroke:#333333;fill:none;}#mermaid-svg-dKgIggMFY2fs47s3 .edgeLabel{background-color:rgba(232,232,232, 0.8);text-align:center;}#mermaid-svg-dKgIggMFY2fs47s3 .edgeLabel p{background-color:rgba(232,232,232, 0.8);}#mermaid-svg-dKgIggMFY2fs47s3 .edgeLabel rect{opacity:0.5;background-color:rgba(232,232,232, 0.8);fill:rgba(232,232,232, 0.8);}#mermaid-svg-dKgIggMFY2fs47s3 .labelBkg{background-color:rgba(232, 232, 232, 0.5);}#mermaid-svg-dKgIggMFY2fs47s3 .cluster rect{fill:#ffffde;stroke:#aaaa33;stroke-width:1px;}#mermaid-svg-dKgIggMFY2fs47s3 .cluster text{fill:#333;}#mermaid-svg-dKgIggMFY2fs47s3 .cluster span{color:#333;}#mermaid-svg-dKgIggMFY2fs47s3 div.mermaidTooltip{position:absolute;text-align:center;max-width:200px;padding:2px;font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:12px;background:hsl(80, 100%, 96.2745098039%);border:1px solid #aaaa33;border-radius:2px;pointer-events:none;z-index:100;}#mermaid-svg-dKgIggMFY2fs47s3 .flowchartTitleText{text-anchor:middle;font-size:18px;fill:#333;}#mermaid-svg-dKgIggMFY2fs47s3 rect.text{fill:none;stroke-width:0;}#mermaid-svg-dKgIggMFY2fs47s3 .icon-shape,#mermaid-svg-dKgIggMFY2fs47s3 .image-shape{background-color:rgba(232,232,232, 0.8);text-align:center;}#mermaid-svg-dKgIggMFY2fs47s3 .icon-shape p,#mermaid-svg-dKgIggMFY2fs47s3 .image-shape p{background-color:rgba(232,232,232, 0.8);padding:2px;}#mermaid-svg-dKgIggMFY2fs47s3 .icon-shape .label rect,#mermaid-svg-dKgIggMFY2fs47s3 .image-shape .label rect{opacity:0.5;background-color:rgba(232,232,232, 0.8);fill:rgba(232,232,232, 0.8);}#mermaid-svg-dKgIggMFY2fs47s3 .label-icon{display:inline-block;height:1em;overflow:visible;vertical-align:-0.125em;}#mermaid-svg-dKgIggMFY2fs47s3 .node .label-icon path{fill:currentColor;stroke:revert;stroke-width:revert;}#mermaid-svg-dKgIggMFY2fs47s3 :root{--mermaid-font-family:"trebuchet ms",verdana,arial,sans-serif;} Prompt
Diffusers Pipeline
Base Model: SDXL / Flux / SD3
LoRA / ControlNet / IP-Adapter
Scheduler / Sampler
Generated Image
Upscale / Inpaint / Edit

5.5 本地部署生态链路

#mermaid-svg-y0FwiSxai3nwghoy{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;fill:#333;}@keyframes edge-animation-frame{from{stroke-dashoffset:0;}}@keyframes dash{to{stroke-dashoffset:0;}}#mermaid-svg-y0FwiSxai3nwghoy .edge-animation-slow{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 50s linear infinite;stroke-linecap:round;}#mermaid-svg-y0FwiSxai3nwghoy .edge-animation-fast{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 20s linear infinite;stroke-linecap:round;}#mermaid-svg-y0FwiSxai3nwghoy .error-icon{fill:#552222;}#mermaid-svg-y0FwiSxai3nwghoy .error-text{fill:#552222;stroke:#552222;}#mermaid-svg-y0FwiSxai3nwghoy .edge-thickness-normal{stroke-width:1px;}#mermaid-svg-y0FwiSxai3nwghoy .edge-thickness-thick{stroke-width:3.5px;}#mermaid-svg-y0FwiSxai3nwghoy .edge-pattern-solid{stroke-dasharray:0;}#mermaid-svg-y0FwiSxai3nwghoy .edge-thickness-invisible{stroke-width:0;fill:none;}#mermaid-svg-y0FwiSxai3nwghoy .edge-pattern-dashed{stroke-dasharray:3;}#mermaid-svg-y0FwiSxai3nwghoy .edge-pattern-dotted{stroke-dasharray:2;}#mermaid-svg-y0FwiSxai3nwghoy .marker{fill:#333333;stroke:#333333;}#mermaid-svg-y0FwiSxai3nwghoy .marker.cross{stroke:#333333;}#mermaid-svg-y0FwiSxai3nwghoy svg{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;}#mermaid-svg-y0FwiSxai3nwghoy p{margin:0;}#mermaid-svg-y0FwiSxai3nwghoy .label{font-family:"trebuchet ms",verdana,arial,sans-serif;color:#333;}#mermaid-svg-y0FwiSxai3nwghoy .cluster-label text{fill:#333;}#mermaid-svg-y0FwiSxai3nwghoy .cluster-label span{color:#333;}#mermaid-svg-y0FwiSxai3nwghoy .cluster-label span p{background-color:transparent;}#mermaid-svg-y0FwiSxai3nwghoy .label text,#mermaid-svg-y0FwiSxai3nwghoy span{fill:#333;color:#333;}#mermaid-svg-y0FwiSxai3nwghoy .node rect,#mermaid-svg-y0FwiSxai3nwghoy .node circle,#mermaid-svg-y0FwiSxai3nwghoy .node ellipse,#mermaid-svg-y0FwiSxai3nwghoy .node polygon,#mermaid-svg-y0FwiSxai3nwghoy .node path{fill:#ECECFF;stroke:#9370DB;stroke-width:1px;}#mermaid-svg-y0FwiSxai3nwghoy .rough-node .label text,#mermaid-svg-y0FwiSxai3nwghoy .node .label text,#mermaid-svg-y0FwiSxai3nwghoy .image-shape .label,#mermaid-svg-y0FwiSxai3nwghoy .icon-shape .label{text-anchor:middle;}#mermaid-svg-y0FwiSxai3nwghoy .node .katex path{fill:#000;stroke:#000;stroke-width:1px;}#mermaid-svg-y0FwiSxai3nwghoy .rough-node .label,#mermaid-svg-y0FwiSxai3nwghoy .node .label,#mermaid-svg-y0FwiSxai3nwghoy .image-shape .label,#mermaid-svg-y0FwiSxai3nwghoy .icon-shape .label{text-align:center;}#mermaid-svg-y0FwiSxai3nwghoy .node.clickable{cursor:pointer;}#mermaid-svg-y0FwiSxai3nwghoy .root .anchor path{fill:#333333!important;stroke-width:0;stroke:#333333;}#mermaid-svg-y0FwiSxai3nwghoy .arrowheadPath{fill:#333333;}#mermaid-svg-y0FwiSxai3nwghoy .edgePath .path{stroke:#333333;stroke-width:2.0px;}#mermaid-svg-y0FwiSxai3nwghoy .flowchart-link{stroke:#333333;fill:none;}#mermaid-svg-y0FwiSxai3nwghoy .edgeLabel{background-color:rgba(232,232,232, 0.8);text-align:center;}#mermaid-svg-y0FwiSxai3nwghoy .edgeLabel p{background-color:rgba(232,232,232, 0.8);}#mermaid-svg-y0FwiSxai3nwghoy .edgeLabel rect{opacity:0.5;background-color:rgba(232,232,232, 0.8);fill:rgba(232,232,232, 0.8);}#mermaid-svg-y0FwiSxai3nwghoy .labelBkg{background-color:rgba(232, 232, 232, 0.5);}#mermaid-svg-y0FwiSxai3nwghoy .cluster rect{fill:#ffffde;stroke:#aaaa33;stroke-width:1px;}#mermaid-svg-y0FwiSxai3nwghoy .cluster text{fill:#333;}#mermaid-svg-y0FwiSxai3nwghoy .cluster span{color:#333;}#mermaid-svg-y0FwiSxai3nwghoy div.mermaidTooltip{position:absolute;text-align:center;max-width:200px;padding:2px;font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:12px;background:hsl(80, 100%, 96.2745098039%);border:1px solid #aaaa33;border-radius:2px;pointer-events:none;z-index:100;}#mermaid-svg-y0FwiSxai3nwghoy .flowchartTitleText{text-anchor:middle;font-size:18px;fill:#333;}#mermaid-svg-y0FwiSxai3nwghoy rect.text{fill:none;stroke-width:0;}#mermaid-svg-y0FwiSxai3nwghoy .icon-shape,#mermaid-svg-y0FwiSxai3nwghoy .image-shape{background-color:rgba(232,232,232, 0.8);text-align:center;}#mermaid-svg-y0FwiSxai3nwghoy .icon-shape p,#mermaid-svg-y0FwiSxai3nwghoy .image-shape p{background-color:rgba(232,232,232, 0.8);padding:2px;}#mermaid-svg-y0FwiSxai3nwghoy .icon-shape .label rect,#mermaid-svg-y0FwiSxai3nwghoy .image-shape .label rect{opacity:0.5;background-color:rgba(232,232,232, 0.8);fill:rgba(232,232,232, 0.8);}#mermaid-svg-y0FwiSxai3nwghoy .label-icon{display:inline-block;height:1em;overflow:visible;vertical-align:-0.125em;}#mermaid-svg-y0FwiSxai3nwghoy .node .label-icon path{fill:currentColor;stroke:revert;stroke-width:revert;}#mermaid-svg-y0FwiSxai3nwghoy :root{--mermaid-font-family:"trebuchet ms",verdana,arial,sans-serif;} HF Transformers Model
Safetensors
Quantization
GGUF
llama.cpp
Ollama
LM Studio
Jan
Open WebUI
AnythingLLM
Continue / Cursor / VSCode


6. 常见模型选型路线

6.1 中文通用聊天模型

场景 推荐
本地轻量 Qwen2.5-3B/7B、Gemma 2B/9B、Phi-3.5
中文能力优先 Qwen2.5/Qwen3、InternLM、GLM
推理能力 DeepSeek-R1、Qwen reasoning variants
长文本 Qwen Long、Mistral Large、Llama long-context variants
代码 Qwen-Coder、DeepSeek-Coder、CodeLlama、StarCoder2、Codestral

6.2 RAG 检索系统

模块 推荐
Embedding 中文/多语 BGE-M3、multilingual-E5、GTE-Qwen、Jina Embeddings
Reranker BGE-Reranker、Jina Reranker、ColBERT
Generator Qwen2.5、Llama 3.1/3.3、Mistral、DeepSeek
部署 vLLM / TGI / Ollama
向量库 Milvus、Qdrant、Weaviate、FAISS、pgvector

6.3 图像生成

场景 推荐
高质量通用文生图 Flux、SDXL、SD3.5
本地低成本 SD1.5、SDXL Turbo、LCM
可控生成 ControlNet、IP-Adapter、T2I-Adapter
工作流 ComfyUI、Diffusers
LoRA 风格 Diffusers + safetensors

6.4 语音系统

场景 推荐
ASR Whisper large-v3、Distil-Whisper、SenseVoice
TTS XTTS-v2、CosyVoice、F5-TTS、Bark
VAD Silero VAD、pyannote
说话人分离 pyannote.audio
音频理解 Qwen2-Audio、SALMONN

6.5 多模态 Agent

模块 推荐
视觉理解 Qwen2.5-VL、InternVL、LLaVA-OneVision
OCR/文档 Qwen2.5-VL、Donut、LayoutLM、ColPali
视频理解 Video-LLaVA、InternVideo、Qwen-VL
工具调用 Qwen、Llama、DeepSeek、GLM
部署 Transformers / vLLM / Ollama

7. 本地部署与生产部署推荐组合

7.1 个人电脑本地运行

硬件 推荐方案
Mac M 系列 Ollama、MLX、LM Studio
Windows NVIDIA GPU Ollama、LM Studio、llama.cpp CUDA、vLLM
无独显 CPU GGUF Q4_K_M / Q5_K_M
低显存 8GB 3B/7B 量化模型
16GB 显存 7B/14B 量化或半精度
24GB 显存 14B/32B 量化,部分 7B FP16
48GB+ 32B/70B 量化或多卡

7.2 企业生产服务

场景 推荐
高并发聊天 vLLM
Hugging Face 原生部署 TGI
多模型服务 Ray Serve / KServe
GPU 集群 vLLM + Kubernetes
RAG 服务 vLLM + embedding service + vector DB
边缘部署 ONNX / OpenVINO / llama.cpp
浏览器端 Transformers.js

7.3 微调路线

需求 推荐
小数据指令微调 PEFT LoRA
低显存微调 QLoRA
全量微调 DeepSpeed / FSDP
偏好对齐 DPO / ORPO / GRPO / PPO
图像 LoRA Diffusers LoRA
语音微调 Transformers / NeMo / SpeechBrain

8. Tasks × Libraries 速查表

Task 常用 Libraries 常见模型
Text Generation Transformers、vLLM、GGUF、Ollama Llama、Qwen、DeepSeek、Mistral、Gemma
Image-Text-to-Text Transformers Qwen-VL、LLaVA、InternVL、Idefics
Text-to-Image Diffusers SDXL、Flux、SD3、Kandinsky
Image-to-Image Diffusers ControlNet、IP-Adapter、InstructPix2Pix
Text-to-Video Diffusers CogVideoX、Open-Sora、HunyuanVideo
ASR Transformers、SpeechBrain、NeMo Whisper、Wav2Vec2、SenseVoice
TTS Transformers、SpeechBrain Bark、VITS、XTTS、CosyVoice
Embedding Sentence Transformers、Transformers BGE、E5、GTE、Jina
Reranker Sentence Transformers、Transformers BGE-Reranker、Jina、ColBERT
Image Classification Transformers、timm ViT、ConvNeXt、ResNet
Object Detection Transformers、timm DETR、YOLO、GroundingDINO
Segmentation Transformers、timm SAM、SAM2、SegFormer
Tabular Scikit-learn、Joblib、PyTorch XGBoost、LightGBM、TabNet
Time Series Transformers PatchTST、Chronos、TimesFM
Robotics LeRobot、Transformers OpenVLA、RT-2、Octo

9. 学习路线

应用开发者路线

text 复制代码
Transformers 基础
  → Pipeline / AutoModel / AutoTokenizer
  → Embedding + RAG
  → vLLM / Ollama 部署
  → Agent / Tool Calling

本地大模型路线

text 复制代码
Ollama / LM Studio
  → GGUF 量化理解
  → Modelfile / Prompt Template
  → Open WebUI
  → RAG 本地知识库

训练微调路线

text 复制代码
PyTorch
  → Transformers Trainer
  → PEFT LoRA
  → QLoRA
  → DPO / RLHF
  → 模型合并 / 量化 / 发布到 Hub

多模态路线

text 复制代码
CLIP / BLIP
  → LLaVA / Qwen-VL / InternVL
  → OCR / Document QA
  → Video Understanding
  → Multimodal Agent

生成式视觉路线

text 复制代码
Diffusers
  → Stable Diffusion / SDXL
  → ControlNet
  → LoRA
  → Flux / SD3
  → ComfyUI 工作流

10. 总结

Hugging Face 的核心价值在于把模型、数据集、框架、格式、推理服务和社区生态连接在一起。

最重要的生态主线:

text 复制代码
模型发现:Hugging Face Hub
模型加载:Transformers / Diffusers
模型微调:PEFT / TRL / Accelerate
模型权重:Safetensors
模型量化:bitsandbytes / GPTQ / AWQ / GGUF
本地推理:llama.cpp / Ollama / LM Studio / MLX
生产部署:vLLM / TGI / Kubernetes
应用系统:RAG / Agent / 多模态 / 语音 / 图像生成

建议优先掌握:Transformers、Sentence Transformers、PEFT、Diffusers、GGUF/Ollama、vLLM、RAG 与 Agent 工程化。


参考资料