详解Hugging Face Models 的两大核心筛选体系Tasks(任务分类)和Libraries(框架分类)
本文按 Tasks(模型能做什么) 与 Libraries(模型如何训练/加载/部署) 两条主线整理 Hugging Face 模型生态,并覆盖 Transformers → PEFT → Safetensors → GGUF → vLLM → Ollama 等主流工程链路。
1. Hugging Face 模型生态总览
Hugging Face Hub 的模型筛选体系主要包含:
| 维度 |
作用 |
示例 |
| Tasks |
按输入输出与应用任务筛选 |
Text Generation、Image Classification、ASR |
| Libraries |
按框架、格式、推理运行时筛选 |
Transformers、Diffusers、PEFT、GGUF、Safetensors |
| Languages |
按语言筛选 |
English、Chinese、Multilingual |
| Licenses |
按许可证筛选 |
Apache-2.0、MIT、Llama、Gemma、OpenRAIL |
| Other |
按部署、推理服务、量化等筛选 |
Inference、vLLM、Ollama、llama.cpp |
Tasks 解决"模型能做什么" ,例如生成文本、识别图像、转写语音。
Libraries 解决"模型怎么用",例如用 Transformers 加载、用 PEFT 微调、用 GGUF 本地部署、用 vLLM 提供 API。
2. Tasks 分类总览
| 大类 |
子任务 |
| Multimodal 多模态 |
Audio-Text-to-Text、Image-Text-to-Text、Image-Text-to-Image、Image-Text-to-Video、VQA、Document QA、Video-Text-to-Text、Visual Document Retrieval、Any-to-Any |
| Computer Vision 计算机视觉 |
Depth Estimation、Image Classification、Object Detection、Image Segmentation、Text-to-Image、Image-to-Text、Image-to-Image、Image-to-Video、Video Classification、Text-to-Video、Zero-Shot Image Classification、Mask Generation、Zero-Shot Object Detection、Text-to-3D、Image-to-3D、Image Feature Extraction、Keypoint Detection、Video-to-Video |
| NLP 自然语言处理 |
Text Classification、Token Classification、Table QA、Question Answering、Zero-Shot Classification、Translation、Summarization、Feature Extraction、Text Generation、Fill-Mask、Sentence Similarity、Text Ranking |
| Audio 音频 |
Text-to-Speech、Text-to-Audio、ASR、Audio-to-Audio、Audio Classification、Voice Activity Detection |
| Tabular 表格 |
Tabular Classification、Tabular Regression、Time Series Forecasting |
| Reinforcement Learning 强化学习 |
Reinforcement Learning、Robotics |
| Other 其他 |
Graph Machine Learning |
3. Tasks 对应代表模型清单
3.1 Multimodal 多模态
Image-Text-to-Text / 视觉语言模型
用于图片理解、图表问答、OCR、视觉 Agent、UI 自动化。
| 模型家族 |
代表模型 |
| Qwen-VL |
Qwen-VL、Qwen2-VL、Qwen2.5-VL-3B/7B/32B/72B-Instruct |
| LLaVA |
LLaVA-1.5、LLaVA-1.6、LLaVA-NeXT、LLaVA-OneVision |
| InternVL |
InternVL2、InternVL2.5、InternVL3 |
| MiniCPM-V |
MiniCPM-V 2.6、MiniCPM-o |
| Idefics |
Idefics2、Idefics3 |
| Phi Vision |
Phi-3.5-Vision、Phi-4-multimodal |
| Gemma Vision |
PaliGemma、Gemma 3 Vision |
| DeepSeek-VL |
DeepSeek-VL、DeepSeek-VL2 |
| CogVLM |
CogVLM、CogVLM2 |
| Florence |
Florence-2 |
| BLIP |
BLIP、BLIP-2、InstructBLIP |
| Molmo |
Molmo-7B、Molmo-72B |
| GLM-V |
GLM-4V、CogAgent |
| Yi-VL |
Yi-VL-6B/34B |
Visual Question Answering
| 模型家族 |
代表模型 |
| LLaVA |
LLaVA-1.5、LLaVA-NeXT |
| BLIP |
BLIP、BLIP-2 |
| Qwen-VL |
Qwen2.5-VL |
| InternVL |
InternVL2.5 |
| Idefics |
Idefics2 |
| PaliGemma |
PaliGemma |
Document Question Answering
| 模型家族 |
代表模型 |
| LayoutLM |
LayoutLM、LayoutLMv2、LayoutLMv3 |
| Donut |
Donut-base、Donut-docvqa |
| Pix2Struct |
Pix2Struct |
| Nougat |
Nougat OCR / scientific PDF |
| ColPali |
ColPali、ColQwen2 |
| Qwen-VL |
Qwen2.5-VL 文档理解 |
| InternVL |
InternVL 文档理解 |
Audio-Text-to-Text
| 模型家族 |
代表模型 |
| Qwen-Audio |
Qwen-Audio、Qwen2-Audio |
| SALMONN |
SALMONN |
| SeamlessM4T |
SeamlessM4T |
| MiniCPM-o |
MiniCPM-o |
| Phi Multimodal |
Phi-4-multimodal |
Image-Text-to-Image
| 模型家族 |
代表模型 |
| Stable Diffusion |
SD 1.5、SD 2.1 |
| SDXL |
Stable Diffusion XL |
| Flux |
FLUX.1-dev、FLUX.1-schnell |
| SD3 |
Stable Diffusion 3、SD3.5 |
| Kandinsky |
Kandinsky 2/3 |
| ControlNet |
ControlNet |
| IP-Adapter |
IP-Adapter |
| InstructPix2Pix |
InstructPix2Pix |
Image-Text-to-Video / Video-Text-to-Text
| 任务 |
代表模型 |
| 图文到视频 |
CogVideoX、Stable Video Diffusion、AnimateDiff、VideoCrafter2、Open-Sora、HunyuanVideo、LTX-Video、Wan Video |
| 视频到文本 |
Video-LLaVA、VideoChatGPT、Qwen2.5-VL、InternVideo2、LongVA、LLaVA-OneVision |
Visual Document Retrieval / Any-to-Any
| 任务 |
代表模型 |
| 视觉文档检索 |
ColPali、ColQwen2、CLIP、OpenCLIP、SigLIP |
| 任意模态到任意模态 |
Qwen2.5-Omni、MiniCPM-o、Phi-4-multimodal、SeamlessM4T |
3.2 Computer Vision 计算机视觉
| Task |
代表模型 |
| Image Classification |
ResNet、EfficientNet、ConvNeXt、ViT、Swin Transformer、DeiT、BEiT、RegNet、MobileNet、DINOv2、SigLIP |
| Object Detection |
DETR、Deformable DETR、YOLOv5/v8/v10、YOLO-NAS、RT-DETR、Grounding DINO、OWL-ViT、Faster R-CNN、Mask R-CNN、Florence-2 |
| Image Segmentation |
SAM、SAM2、Mask2Former、SegFormer、U-Net、DeepLabv3+、OneFormer、CLIPSeg、MobileSAM、FastSAM、HQ-SAM |
| Depth Estimation |
MiDaS、DPT、ZoeDepth、Depth Anything、Depth Anything V2、GLPN、Marigold |
| Text-to-Image |
SD 1.5、SD 2.1、SDXL、SD3/SD3.5、Flux、Kandinsky、PixArt-alpha、PixArt-Sigma、Playground v2.5、DeepFloyd IF、LCM |
| Image-to-Image |
SD img2img、SDXL Refiner、ControlNet、IP-Adapter、InstructPix2Pix、T2I-Adapter、BrushNet、InstantID |
| Image-to-Text |
BLIP、BLIP-2、GIT、Donut、TrOCR、Florence-2、Qwen2.5-VL、PaliGemma |
| Image-to-Video |
SVD、AnimateDiff、I2VGen-XL、CogVideoX、LTX-Video、HunyuanVideo、Wan Video |
| Text-to-Video |
CogVideoX、Open-Sora、VideoCrafter2、ModelScope T2V、HunyuanVideo、LTX-Video、AnimateDiff、Wan Video |
| Zero-Shot Image Classification |
CLIP、OpenCLIP、SigLIP、EVA-CLIP、MetaCLIP、LiT |
| Zero-Shot Object Detection |
Grounding DINO、OWL-ViT、OWLv2、Florence-2、GLIP |
| Mask Generation |
SAM、SAM2、MobileSAM、FastSAM、HQ-SAM |
| Text-to-3D |
Shap-E、Point-E、DreamFusion 类、Magic3D 类、LGM |
| Image-to-3D |
TripoSR、Zero123、Wonder3D、LGM、CRM、InstantMesh |
| Image Feature Extraction |
CLIP、OpenCLIP、DINOv2、SigLIP、EVA-CLIP、ViT、ConvNeXt |
| Video Classification |
VideoMAE、TimeSformer、SlowFast、X-CLIP、InternVideo、Video Swin |
| Keypoint Detection |
OpenPose、ViTPose、HRNet、RTMPose、MediaPipe Pose |
| Video-to-Video |
AnimateDiff、Video ControlNet、TokenFlow、Rerender-A-Video、SVD editing workflows |
3.3 NLP 自然语言处理
Text Generation / LLM
| 模型家族 |
代表模型 |
| Llama |
Llama 2、Llama 3、Llama 3.1、Llama 3.2、Llama 3.3 |
| Qwen |
Qwen1.5、Qwen2、Qwen2.5、Qwen3、Qwen-Coder |
| DeepSeek |
DeepSeek-V2、DeepSeek-V3、DeepSeek-R1、DeepSeek-Coder |
| Mistral |
Mistral 7B、Mixtral 8x7B、Mixtral 8x22B、Mistral Large |
| Gemma |
Gemma、Gemma 2、Gemma 3 |
| Phi |
Phi-2、Phi-3、Phi-3.5、Phi-4 |
| Yi |
Yi-6B、Yi-34B、Yi-1.5 |
| GLM |
ChatGLM、GLM-4、GLM-Z1 |
| Baichuan |
Baichuan2、Baichuan-M1 |
| InternLM |
InternLM2、InternLM2.5、InternLM3 |
| Falcon |
Falcon、Falcon2、Falcon3 |
| Command R |
Command R、Command R+ |
| StarCoder |
StarCoder、StarCoder2 |
| CodeLlama |
CodeLlama |
| Codestral |
Codestral |
| Granite |
IBM Granite |
| OLMo |
OLMo、OLMo 2 |
| Zephyr / Hermes / OpenChat |
Zephyr、Nous Hermes、OpenChat、Vicuna、WizardLM、Tulu、Solar、Aya |
其他 NLP Tasks
| Task |
代表模型 |
| Text Classification |
BERT、RoBERTa、DeBERTa、DistilBERT、ALBERT、ELECTRA、XLNet、ModernBERT、MacBERT、XLM-R |
| Token Classification / NER |
BERT NER、RoBERTa NER、DeBERTa NER、XLM-R NER、MacBERT NER、Flair、spaCy Transformers |
| Question Answering |
BERT SQuAD、RoBERTa SQuAD、DeBERTa QA、DistilBERT QA、ALBERT QA、Longformer QA、BigBird QA |
| Table QA |
TAPAS、TaBERT、TURL、TAPEX、Pix2Struct、LLM 表格问答微调模型 |
| Zero-Shot Classification |
BART-MNLI、DeBERTa-MNLI、XLM-R-XNLI、ModernBERT-NLI、T5 NLI |
| Translation |
MarianMT、M2M100、NLLB-200、SeamlessM4T、mBART50、T5、mT5、MADLAD-400、OPUS-MT |
| Summarization |
BART-large-CNN、PEGASUS、T5、Flan-T5、LED、LongT5、PRIMERA、Llama/Qwen/Mistral 指令模型 |
| Feature Extraction / Embedding |
BGE、BGE-M3、E5、multilingual-E5、GTE、GTE-Qwen、Jina Embeddings、Sentence-BERT、Instructor、UAE、Nomic、Arctic Embed、Stella、ColBERT、Contriever |
| Sentence Similarity |
all-MiniLM、all-mpnet-base、BGE、E5、GTE、Jina、Nomic |
| Text Ranking / Reranker |
BGE-Reranker、Jina Reranker、Cohere Rerank、ColBERTv2、MonoT5、RankT5、GTE Reranker、Qwen Reranker |
| Fill-Mask |
BERT、RoBERTa、DeBERTa、ALBERT、ELECTRA、XLM-R、MacBERT |
3.4 Audio 音频
| Task |
代表模型 |
| Automatic Speech Recognition |
Whisper tiny/base/small/medium/large-v2/large-v3、Distil-Whisper、Wav2Vec2、HuBERT、WavLM、SeamlessM4T、Paraformer、Conformer、NeMo ASR、SenseVoice |
| Text-to-Speech |
Bark、VITS、MMS-TTS、SpeechT5、XTTS-v2、ChatTTS、CosyVoice、F5-TTS、Fish Speech、Parler-TTS |
| Text-to-Audio |
AudioLDM、AudioLDM2、MusicGen、Bark、Stable Audio、AudioGen、Tango、Make-An-Audio |
| Audio-to-Audio |
RVC、So-VITS-SVC、VoiceFixer、Demucs、AudioSep、MetricGAN+、SepFormer |
| Audio Classification |
AST、YAMNet、PANNs、Wav2Vec2 classification、HuBERT classification、BEATs、CLAP |
| Voice Activity Detection |
Silero VAD、WebRTC VAD、pyannote.audio、NeMo VAD、SpeechBrain VAD |
3.5 Tabular / RL / Graph
| 大类 |
Task |
代表模型/框架 |
| Tabular |
Tabular Classification |
TabNet、FT-Transformer、TabTransformer、SAINT、AutoGluon、XGBoost、LightGBM、CatBoost |
| Tabular |
Tabular Regression |
XGBoost、LightGBM、CatBoost、TabNet、FT-Transformer、AutoGluon |
| Tabular |
Time Series Forecasting |
TimeSeries Transformer、Informer、Autoformer、PatchTST、TimesFM、Chronos、Lag-Llama、TFT、N-BEATS、DeepAR |
| RL |
Reinforcement Learning |
Stable-Baselines3、CleanRL、TRL、Decision Transformer、CQL、IQL、RLHF、RLAIF |
| Robotics |
Robotics |
RT-1、RT-2、OpenVLA、Octo、Diffusion Policy、ACT、RoboFlamingo、LeRobot |
| Graph |
Graph Machine Learning |
GCN、GraphSAGE、GAT、GIN、R-GCN、Graphormer、PyG、DGL、TransE、RotatE、ComplEx |
4. Libraries 分类详解
Hugging Face 最核心模型库,覆盖文本、视觉、音频、视频和多模态模型的训练与推理。
常见模型:Llama、Qwen、Gemma、Mistral、DeepSeek、BERT、RoBERTa、Whisper、ViT、CLIP、LLaVA。
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "Qwen/Qwen2.5-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
4.2 Diffusers
扩散模型生态,适合文生图、图生图、图像修复、ControlNet、LoRA 风格模型、文生视频、图生视频。
代表模型:Stable Diffusion、SDXL、SD3/SD3.5、Flux、Kandinsky、CogVideoX、AnimateDiff、Stable Video Diffusion。
4.3 PEFT
Parameter-Efficient Fine-Tuning,参数高效微调。
| 技术 |
说明 |
| LoRA |
低秩适配器微调 |
| QLoRA |
量化基础模型 + LoRA 微调 |
| AdaLoRA |
自适应 LoRA |
| Prefix Tuning |
前缀参数微调 |
| Prompt Tuning |
可训练软提示 |
| IA3 |
激活缩放类高效微调 |
4.4 Safetensors
安全权重格式,替代 pickle 格式,常见文件包括 model.safetensors、adapter_model.safetensors、model-00001-of-000xx.safetensors。
优势:安全、加载快、适合大模型分片、支持 metadata。
4.5 GGUF
llama.cpp 生态模型文件格式,适合本地 CPU/GPU 混合推理。
| 量化 |
特点 |
| F16 |
高质量,体积大 |
| Q8_0 |
高质量量化 |
| Q6_K |
质量较高,体积适中 |
| Q5_K_M |
常用平衡选择 |
| Q4_K_M |
低显存常用 |
| Q3_K_M |
更小但质量下降 |
| IQ4 / IQ3 |
新型极低比特量化 |
4.6 vLLM
高吞吐 LLM 推理服务框架,适合生产环境 API 服务。
特点:PagedAttention、高并发、连续批处理、OpenAI API 兼容。
vllm serve Qwen/Qwen2.5-7B-Instruct
4.7 Ollama
本地大模型运行工具,常基于 GGUF / llama.cpp 生态。
ollama run qwen2.5
ollama run llama3.2
ollama run deepseek-r1
4.8 其他重要 Libraries
| Library |
作用 |
| llama.cpp |
C/C++ 轻量推理框架,GGUF 核心运行时 |
| Sentence Transformers |
Embedding、语义搜索、相似度、Reranking |
| Transformers.js |
浏览器 / Node.js 端运行模型 |
| ONNX |
跨平台推理格式 |
| OpenVINO |
Intel CPU/GPU/边缘设备推理优化 |
| MLX |
Apple Silicon 本地推理与训练 |
| timm |
PyTorch 图像模型库 |
| TensorFlow / Keras |
Google 深度学习生态 |
| JAX / Flax |
TPU 与研究型高性能训练 |
| OpenCLIP |
CLIP 开源实现 |
| spaCy |
工业 NLP 管线 |
| NeMo |
NVIDIA 语音和大模型训练生态 |
| PaddlePaddle / PaddleOCR |
中文 OCR 与飞桨生态 |
| Rust / Candle |
Rust 高性能端侧推理 |
5. Libraries 完整生态关系图
5.1 总体生态图
#mermaid-svg-dIiAsd7aIS7YtB4p{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;fill:#333;}@keyframes edge-animation-frame{from{stroke-dashoffset:0;}}@keyframes dash{to{stroke-dashoffset:0;}}#mermaid-svg-dIiAsd7aIS7YtB4p .edge-animation-slow{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 50s linear infinite;stroke-linecap:round;}#mermaid-svg-dIiAsd7aIS7YtB4p .edge-animation-fast{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 20s linear infinite;stroke-linecap:round;}#mermaid-svg-dIiAsd7aIS7YtB4p .error-icon{fill:#552222;}#mermaid-svg-dIiAsd7aIS7YtB4p .error-text{fill:#552222;stroke:#552222;}#mermaid-svg-dIiAsd7aIS7YtB4p .edge-thickness-normal{stroke-width:1px;}#mermaid-svg-dIiAsd7aIS7YtB4p .edge-thickness-thick{stroke-width:3.5px;}#mermaid-svg-dIiAsd7aIS7YtB4p .edge-pattern-solid{stroke-dasharray:0;}#mermaid-svg-dIiAsd7aIS7YtB4p .edge-thickness-invisible{stroke-width:0;fill:none;}#mermaid-svg-dIiAsd7aIS7YtB4p .edge-pattern-dashed{stroke-dasharray:3;}#mermaid-svg-dIiAsd7aIS7YtB4p .edge-pattern-dotted{stroke-dasharray:2;}#mermaid-svg-dIiAsd7aIS7YtB4p .marker{fill:#333333;stroke:#333333;}#mermaid-svg-dIiAsd7aIS7YtB4p .marker.cross{stroke:#333333;}#mermaid-svg-dIiAsd7aIS7YtB4p svg{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;}#mermaid-svg-dIiAsd7aIS7YtB4p p{margin:0;}#mermaid-svg-dIiAsd7aIS7YtB4p .label{font-family:"trebuchet ms",verdana,arial,sans-serif;color:#333;}#mermaid-svg-dIiAsd7aIS7YtB4p .cluster-label text{fill:#333;}#mermaid-svg-dIiAsd7aIS7YtB4p .cluster-label span{color:#333;}#mermaid-svg-dIiAsd7aIS7YtB4p .cluster-label span p{background-color:transparent;}#mermaid-svg-dIiAsd7aIS7YtB4p .label text,#mermaid-svg-dIiAsd7aIS7YtB4p span{fill:#333;color:#333;}#mermaid-svg-dIiAsd7aIS7YtB4p .node rect,#mermaid-svg-dIiAsd7aIS7YtB4p .node circle,#mermaid-svg-dIiAsd7aIS7YtB4p .node ellipse,#mermaid-svg-dIiAsd7aIS7YtB4p .node polygon,#mermaid-svg-dIiAsd7aIS7YtB4p .node path{fill:#ECECFF;stroke:#9370DB;stroke-width:1px;}#mermaid-svg-dIiAsd7aIS7YtB4p .rough-node .label text,#mermaid-svg-dIiAsd7aIS7YtB4p .node .label text,#mermaid-svg-dIiAsd7aIS7YtB4p .image-shape .label,#mermaid-svg-dIiAsd7aIS7YtB4p .icon-shape .label{text-anchor:middle;}#mermaid-svg-dIiAsd7aIS7YtB4p .node .katex path{fill:#000;stroke:#000;stroke-width:1px;}#mermaid-svg-dIiAsd7aIS7YtB4p .rough-node .label,#mermaid-svg-dIiAsd7aIS7YtB4p .node .label,#mermaid-svg-dIiAsd7aIS7YtB4p .image-shape .label,#mermaid-svg-dIiAsd7aIS7YtB4p .icon-shape .label{text-align:center;}#mermaid-svg-dIiAsd7aIS7YtB4p .node.clickable{cursor:pointer;}#mermaid-svg-dIiAsd7aIS7YtB4p .root .anchor path{fill:#333333!important;stroke-width:0;stroke:#333333;}#mermaid-svg-dIiAsd7aIS7YtB4p .arrowheadPath{fill:#333333;}#mermaid-svg-dIiAsd7aIS7YtB4p .edgePath .path{stroke:#333333;stroke-width:2.0px;}#mermaid-svg-dIiAsd7aIS7YtB4p .flowchart-link{stroke:#333333;fill:none;}#mermaid-svg-dIiAsd7aIS7YtB4p .edgeLabel{background-color:rgba(232,232,232, 0.8);text-align:center;}#mermaid-svg-dIiAsd7aIS7YtB4p .edgeLabel p{background-color:rgba(232,232,232, 0.8);}#mermaid-svg-dIiAsd7aIS7YtB4p .edgeLabel rect{opacity:0.5;background-color:rgba(232,232,232, 0.8);fill:rgba(232,232,232, 0.8);}#mermaid-svg-dIiAsd7aIS7YtB4p .labelBkg{background-color:rgba(232, 232, 232, 0.5);}#mermaid-svg-dIiAsd7aIS7YtB4p .cluster rect{fill:#ffffde;stroke:#aaaa33;stroke-width:1px;}#mermaid-svg-dIiAsd7aIS7YtB4p .cluster text{fill:#333;}#mermaid-svg-dIiAsd7aIS7YtB4p .cluster span{color:#333;}#mermaid-svg-dIiAsd7aIS7YtB4p div.mermaidTooltip{position:absolute;text-align:center;max-width:200px;padding:2px;font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:12px;background:hsl(80, 100%, 96.2745098039%);border:1px solid #aaaa33;border-radius:2px;pointer-events:none;z-index:100;}#mermaid-svg-dIiAsd7aIS7YtB4p .flowchartTitleText{text-anchor:middle;font-size:18px;fill:#333;}#mermaid-svg-dIiAsd7aIS7YtB4p rect.text{fill:none;stroke-width:0;}#mermaid-svg-dIiAsd7aIS7YtB4p .icon-shape,#mermaid-svg-dIiAsd7aIS7YtB4p .image-shape{background-color:rgba(232,232,232, 0.8);text-align:center;}#mermaid-svg-dIiAsd7aIS7YtB4p .icon-shape p,#mermaid-svg-dIiAsd7aIS7YtB4p .image-shape p{background-color:rgba(232,232,232, 0.8);padding:2px;}#mermaid-svg-dIiAsd7aIS7YtB4p .icon-shape .label rect,#mermaid-svg-dIiAsd7aIS7YtB4p .image-shape .label rect{opacity:0.5;background-color:rgba(232,232,232, 0.8);fill:rgba(232,232,232, 0.8);}#mermaid-svg-dIiAsd7aIS7YtB4p .label-icon{display:inline-block;height:1em;overflow:visible;vertical-align:-0.125em;}#mermaid-svg-dIiAsd7aIS7YtB4p .node .label-icon path{fill:currentColor;stroke:revert;stroke-width:revert;}#mermaid-svg-dIiAsd7aIS7YtB4p :root{--mermaid-font-family:"trebuchet ms",verdana,arial,sans-serif;} Hugging Face Hub
Transformers
Diffusers
Datasets
Safetensors
PEFT
Sentence Transformers
GGUF Models
LLM / NLP / VLM / ASR / Vision
Text-to-Image / Image-to-Image / Video
LoRA / QLoRA / Adapter
Safe Weight Storage
Embedding / Retrieval / Reranking
llama.cpp
vLLM
TGI
Transformers Pipeline
Accelerate / DeepSpeed / FSDP
Ollama
LM Studio
Jan
llama-cpp-python
OpenAI-Compatible API
Local Chat / Open WebUI
ComfyUI
AUTOMATIC1111
InvokeAI
5.2 LLM 训练、微调、量化、部署链路
#mermaid-svg-Jtwl2m9RNulMTkIw{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;fill:#333;}@keyframes edge-animation-frame{from{stroke-dashoffset:0;}}@keyframes dash{to{stroke-dashoffset:0;}}#mermaid-svg-Jtwl2m9RNulMTkIw .edge-animation-slow{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 50s linear infinite;stroke-linecap:round;}#mermaid-svg-Jtwl2m9RNulMTkIw .edge-animation-fast{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 20s linear infinite;stroke-linecap:round;}#mermaid-svg-Jtwl2m9RNulMTkIw .error-icon{fill:#552222;}#mermaid-svg-Jtwl2m9RNulMTkIw .error-text{fill:#552222;stroke:#552222;}#mermaid-svg-Jtwl2m9RNulMTkIw .edge-thickness-normal{stroke-width:1px;}#mermaid-svg-Jtwl2m9RNulMTkIw .edge-thickness-thick{stroke-width:3.5px;}#mermaid-svg-Jtwl2m9RNulMTkIw .edge-pattern-solid{stroke-dasharray:0;}#mermaid-svg-Jtwl2m9RNulMTkIw .edge-thickness-invisible{stroke-width:0;fill:none;}#mermaid-svg-Jtwl2m9RNulMTkIw .edge-pattern-dashed{stroke-dasharray:3;}#mermaid-svg-Jtwl2m9RNulMTkIw .edge-pattern-dotted{stroke-dasharray:2;}#mermaid-svg-Jtwl2m9RNulMTkIw .marker{fill:#333333;stroke:#333333;}#mermaid-svg-Jtwl2m9RNulMTkIw .marker.cross{stroke:#333333;}#mermaid-svg-Jtwl2m9RNulMTkIw svg{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;}#mermaid-svg-Jtwl2m9RNulMTkIw p{margin:0;}#mermaid-svg-Jtwl2m9RNulMTkIw .label{font-family:"trebuchet ms",verdana,arial,sans-serif;color:#333;}#mermaid-svg-Jtwl2m9RNulMTkIw .cluster-label text{fill:#333;}#mermaid-svg-Jtwl2m9RNulMTkIw .cluster-label span{color:#333;}#mermaid-svg-Jtwl2m9RNulMTkIw .cluster-label span p{background-color:transparent;}#mermaid-svg-Jtwl2m9RNulMTkIw .label text,#mermaid-svg-Jtwl2m9RNulMTkIw span{fill:#333;color:#333;}#mermaid-svg-Jtwl2m9RNulMTkIw .node rect,#mermaid-svg-Jtwl2m9RNulMTkIw .node circle,#mermaid-svg-Jtwl2m9RNulMTkIw .node ellipse,#mermaid-svg-Jtwl2m9RNulMTkIw .node polygon,#mermaid-svg-Jtwl2m9RNulMTkIw .node path{fill:#ECECFF;stroke:#9370DB;stroke-width:1px;}#mermaid-svg-Jtwl2m9RNulMTkIw .rough-node .label text,#mermaid-svg-Jtwl2m9RNulMTkIw .node .label text,#mermaid-svg-Jtwl2m9RNulMTkIw .image-shape .label,#mermaid-svg-Jtwl2m9RNulMTkIw .icon-shape .label{text-anchor:middle;}#mermaid-svg-Jtwl2m9RNulMTkIw .node .katex path{fill:#000;stroke:#000;stroke-width:1px;}#mermaid-svg-Jtwl2m9RNulMTkIw .rough-node .label,#mermaid-svg-Jtwl2m9RNulMTkIw .node .label,#mermaid-svg-Jtwl2m9RNulMTkIw .image-shape .label,#mermaid-svg-Jtwl2m9RNulMTkIw .icon-shape .label{text-align:center;}#mermaid-svg-Jtwl2m9RNulMTkIw .node.clickable{cursor:pointer;}#mermaid-svg-Jtwl2m9RNulMTkIw .root .anchor path{fill:#333333!important;stroke-width:0;stroke:#333333;}#mermaid-svg-Jtwl2m9RNulMTkIw .arrowheadPath{fill:#333333;}#mermaid-svg-Jtwl2m9RNulMTkIw .edgePath .path{stroke:#333333;stroke-width:2.0px;}#mermaid-svg-Jtwl2m9RNulMTkIw .flowchart-link{stroke:#333333;fill:none;}#mermaid-svg-Jtwl2m9RNulMTkIw .edgeLabel{background-color:rgba(232,232,232, 0.8);text-align:center;}#mermaid-svg-Jtwl2m9RNulMTkIw .edgeLabel p{background-color:rgba(232,232,232, 0.8);}#mermaid-svg-Jtwl2m9RNulMTkIw .edgeLabel rect{opacity:0.5;background-color:rgba(232,232,232, 0.8);fill:rgba(232,232,232, 0.8);}#mermaid-svg-Jtwl2m9RNulMTkIw .labelBkg{background-color:rgba(232, 232, 232, 0.5);}#mermaid-svg-Jtwl2m9RNulMTkIw .cluster rect{fill:#ffffde;stroke:#aaaa33;stroke-width:1px;}#mermaid-svg-Jtwl2m9RNulMTkIw .cluster text{fill:#333;}#mermaid-svg-Jtwl2m9RNulMTkIw .cluster span{color:#333;}#mermaid-svg-Jtwl2m9RNulMTkIw div.mermaidTooltip{position:absolute;text-align:center;max-width:200px;padding:2px;font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:12px;background:hsl(80, 100%, 96.2745098039%);border:1px solid #aaaa33;border-radius:2px;pointer-events:none;z-index:100;}#mermaid-svg-Jtwl2m9RNulMTkIw .flowchartTitleText{text-anchor:middle;font-size:18px;fill:#333;}#mermaid-svg-Jtwl2m9RNulMTkIw rect.text{fill:none;stroke-width:0;}#mermaid-svg-Jtwl2m9RNulMTkIw .icon-shape,#mermaid-svg-Jtwl2m9RNulMTkIw .image-shape{background-color:rgba(232,232,232, 0.8);text-align:center;}#mermaid-svg-Jtwl2m9RNulMTkIw .icon-shape p,#mermaid-svg-Jtwl2m9RNulMTkIw .image-shape p{background-color:rgba(232,232,232, 0.8);padding:2px;}#mermaid-svg-Jtwl2m9RNulMTkIw .icon-shape .label rect,#mermaid-svg-Jtwl2m9RNulMTkIw .image-shape .label rect{opacity:0.5;background-color:rgba(232,232,232, 0.8);fill:rgba(232,232,232, 0.8);}#mermaid-svg-Jtwl2m9RNulMTkIw .label-icon{display:inline-block;height:1em;overflow:visible;vertical-align:-0.125em;}#mermaid-svg-Jtwl2m9RNulMTkIw .node .label-icon path{fill:currentColor;stroke:revert;stroke-width:revert;}#mermaid-svg-Jtwl2m9RNulMTkIw :root{--mermaid-font-family:"trebuchet ms",verdana,arial,sans-serif;} Base Model: Llama/Qwen/Mistral/Gemma/DeepSeek
Transformers Load
PEFT LoRA / QLoRA Fine-tuning
Adapter safetensors
Merge LoRA
Full Safetensors Model
vLLM / TGI Production Serving
Convert to GGUF
llama.cpp
Ollama / LM Studio / Jan
5.3 RAG 应用链路
#mermaid-svg-kT5NJM2qUpaV33DB{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;fill:#333;}@keyframes edge-animation-frame{from{stroke-dashoffset:0;}}@keyframes dash{to{stroke-dashoffset:0;}}#mermaid-svg-kT5NJM2qUpaV33DB .edge-animation-slow{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 50s linear infinite;stroke-linecap:round;}#mermaid-svg-kT5NJM2qUpaV33DB .edge-animation-fast{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 20s linear infinite;stroke-linecap:round;}#mermaid-svg-kT5NJM2qUpaV33DB .error-icon{fill:#552222;}#mermaid-svg-kT5NJM2qUpaV33DB .error-text{fill:#552222;stroke:#552222;}#mermaid-svg-kT5NJM2qUpaV33DB .edge-thickness-normal{stroke-width:1px;}#mermaid-svg-kT5NJM2qUpaV33DB .edge-thickness-thick{stroke-width:3.5px;}#mermaid-svg-kT5NJM2qUpaV33DB .edge-pattern-solid{stroke-dasharray:0;}#mermaid-svg-kT5NJM2qUpaV33DB .edge-thickness-invisible{stroke-width:0;fill:none;}#mermaid-svg-kT5NJM2qUpaV33DB .edge-pattern-dashed{stroke-dasharray:3;}#mermaid-svg-kT5NJM2qUpaV33DB .edge-pattern-dotted{stroke-dasharray:2;}#mermaid-svg-kT5NJM2qUpaV33DB .marker{fill:#333333;stroke:#333333;}#mermaid-svg-kT5NJM2qUpaV33DB .marker.cross{stroke:#333333;}#mermaid-svg-kT5NJM2qUpaV33DB svg{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;}#mermaid-svg-kT5NJM2qUpaV33DB p{margin:0;}#mermaid-svg-kT5NJM2qUpaV33DB .label{font-family:"trebuchet ms",verdana,arial,sans-serif;color:#333;}#mermaid-svg-kT5NJM2qUpaV33DB .cluster-label text{fill:#333;}#mermaid-svg-kT5NJM2qUpaV33DB .cluster-label span{color:#333;}#mermaid-svg-kT5NJM2qUpaV33DB .cluster-label span p{background-color:transparent;}#mermaid-svg-kT5NJM2qUpaV33DB .label text,#mermaid-svg-kT5NJM2qUpaV33DB span{fill:#333;color:#333;}#mermaid-svg-kT5NJM2qUpaV33DB .node rect,#mermaid-svg-kT5NJM2qUpaV33DB .node circle,#mermaid-svg-kT5NJM2qUpaV33DB .node ellipse,#mermaid-svg-kT5NJM2qUpaV33DB .node polygon,#mermaid-svg-kT5NJM2qUpaV33DB .node path{fill:#ECECFF;stroke:#9370DB;stroke-width:1px;}#mermaid-svg-kT5NJM2qUpaV33DB .rough-node .label text,#mermaid-svg-kT5NJM2qUpaV33DB .node .label text,#mermaid-svg-kT5NJM2qUpaV33DB .image-shape .label,#mermaid-svg-kT5NJM2qUpaV33DB .icon-shape .label{text-anchor:middle;}#mermaid-svg-kT5NJM2qUpaV33DB .node .katex path{fill:#000;stroke:#000;stroke-width:1px;}#mermaid-svg-kT5NJM2qUpaV33DB .rough-node .label,#mermaid-svg-kT5NJM2qUpaV33DB .node .label,#mermaid-svg-kT5NJM2qUpaV33DB .image-shape .label,#mermaid-svg-kT5NJM2qUpaV33DB .icon-shape .label{text-align:center;}#mermaid-svg-kT5NJM2qUpaV33DB .node.clickable{cursor:pointer;}#mermaid-svg-kT5NJM2qUpaV33DB .root .anchor path{fill:#333333!important;stroke-width:0;stroke:#333333;}#mermaid-svg-kT5NJM2qUpaV33DB .arrowheadPath{fill:#333333;}#mermaid-svg-kT5NJM2qUpaV33DB .edgePath .path{stroke:#333333;stroke-width:2.0px;}#mermaid-svg-kT5NJM2qUpaV33DB .flowchart-link{stroke:#333333;fill:none;}#mermaid-svg-kT5NJM2qUpaV33DB .edgeLabel{background-color:rgba(232,232,232, 0.8);text-align:center;}#mermaid-svg-kT5NJM2qUpaV33DB .edgeLabel p{background-color:rgba(232,232,232, 0.8);}#mermaid-svg-kT5NJM2qUpaV33DB .edgeLabel rect{opacity:0.5;background-color:rgba(232,232,232, 0.8);fill:rgba(232,232,232, 0.8);}#mermaid-svg-kT5NJM2qUpaV33DB .labelBkg{background-color:rgba(232, 232, 232, 0.5);}#mermaid-svg-kT5NJM2qUpaV33DB .cluster rect{fill:#ffffde;stroke:#aaaa33;stroke-width:1px;}#mermaid-svg-kT5NJM2qUpaV33DB .cluster text{fill:#333;}#mermaid-svg-kT5NJM2qUpaV33DB .cluster span{color:#333;}#mermaid-svg-kT5NJM2qUpaV33DB div.mermaidTooltip{position:absolute;text-align:center;max-width:200px;padding:2px;font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:12px;background:hsl(80, 100%, 96.2745098039%);border:1px solid #aaaa33;border-radius:2px;pointer-events:none;z-index:100;}#mermaid-svg-kT5NJM2qUpaV33DB .flowchartTitleText{text-anchor:middle;font-size:18px;fill:#333;}#mermaid-svg-kT5NJM2qUpaV33DB rect.text{fill:none;stroke-width:0;}#mermaid-svg-kT5NJM2qUpaV33DB .icon-shape,#mermaid-svg-kT5NJM2qUpaV33DB .image-shape{background-color:rgba(232,232,232, 0.8);text-align:center;}#mermaid-svg-kT5NJM2qUpaV33DB .icon-shape p,#mermaid-svg-kT5NJM2qUpaV33DB .image-shape p{background-color:rgba(232,232,232, 0.8);padding:2px;}#mermaid-svg-kT5NJM2qUpaV33DB .icon-shape .label rect,#mermaid-svg-kT5NJM2qUpaV33DB .image-shape .label rect{opacity:0.5;background-color:rgba(232,232,232, 0.8);fill:rgba(232,232,232, 0.8);}#mermaid-svg-kT5NJM2qUpaV33DB .label-icon{display:inline-block;height:1em;overflow:visible;vertical-align:-0.125em;}#mermaid-svg-kT5NJM2qUpaV33DB .node .label-icon path{fill:currentColor;stroke:revert;stroke-width:revert;}#mermaid-svg-kT5NJM2qUpaV33DB :root{--mermaid-font-family:"trebuchet ms",verdana,arial,sans-serif;} Documents
Chunking
Embedding: BGE/E5/GTE/Jina
Vector Database
Retriever
Reranker: BGE/Jina/ColBERT
Context
LLM: Qwen/Llama/Mistral/DeepSeek
Answer with Citations
5.4 文生图 / 图生图生态链路
#mermaid-svg-dKgIggMFY2fs47s3{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;fill:#333;}@keyframes edge-animation-frame{from{stroke-dashoffset:0;}}@keyframes dash{to{stroke-dashoffset:0;}}#mermaid-svg-dKgIggMFY2fs47s3 .edge-animation-slow{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 50s linear infinite;stroke-linecap:round;}#mermaid-svg-dKgIggMFY2fs47s3 .edge-animation-fast{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 20s linear infinite;stroke-linecap:round;}#mermaid-svg-dKgIggMFY2fs47s3 .error-icon{fill:#552222;}#mermaid-svg-dKgIggMFY2fs47s3 .error-text{fill:#552222;stroke:#552222;}#mermaid-svg-dKgIggMFY2fs47s3 .edge-thickness-normal{stroke-width:1px;}#mermaid-svg-dKgIggMFY2fs47s3 .edge-thickness-thick{stroke-width:3.5px;}#mermaid-svg-dKgIggMFY2fs47s3 .edge-pattern-solid{stroke-dasharray:0;}#mermaid-svg-dKgIggMFY2fs47s3 .edge-thickness-invisible{stroke-width:0;fill:none;}#mermaid-svg-dKgIggMFY2fs47s3 .edge-pattern-dashed{stroke-dasharray:3;}#mermaid-svg-dKgIggMFY2fs47s3 .edge-pattern-dotted{stroke-dasharray:2;}#mermaid-svg-dKgIggMFY2fs47s3 .marker{fill:#333333;stroke:#333333;}#mermaid-svg-dKgIggMFY2fs47s3 .marker.cross{stroke:#333333;}#mermaid-svg-dKgIggMFY2fs47s3 svg{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;}#mermaid-svg-dKgIggMFY2fs47s3 p{margin:0;}#mermaid-svg-dKgIggMFY2fs47s3 .label{font-family:"trebuchet ms",verdana,arial,sans-serif;color:#333;}#mermaid-svg-dKgIggMFY2fs47s3 .cluster-label text{fill:#333;}#mermaid-svg-dKgIggMFY2fs47s3 .cluster-label span{color:#333;}#mermaid-svg-dKgIggMFY2fs47s3 .cluster-label span p{background-color:transparent;}#mermaid-svg-dKgIggMFY2fs47s3 .label text,#mermaid-svg-dKgIggMFY2fs47s3 span{fill:#333;color:#333;}#mermaid-svg-dKgIggMFY2fs47s3 .node rect,#mermaid-svg-dKgIggMFY2fs47s3 .node circle,#mermaid-svg-dKgIggMFY2fs47s3 .node ellipse,#mermaid-svg-dKgIggMFY2fs47s3 .node polygon,#mermaid-svg-dKgIggMFY2fs47s3 .node path{fill:#ECECFF;stroke:#9370DB;stroke-width:1px;}#mermaid-svg-dKgIggMFY2fs47s3 .rough-node .label text,#mermaid-svg-dKgIggMFY2fs47s3 .node .label text,#mermaid-svg-dKgIggMFY2fs47s3 .image-shape .label,#mermaid-svg-dKgIggMFY2fs47s3 .icon-shape .label{text-anchor:middle;}#mermaid-svg-dKgIggMFY2fs47s3 .node .katex path{fill:#000;stroke:#000;stroke-width:1px;}#mermaid-svg-dKgIggMFY2fs47s3 .rough-node .label,#mermaid-svg-dKgIggMFY2fs47s3 .node .label,#mermaid-svg-dKgIggMFY2fs47s3 .image-shape .label,#mermaid-svg-dKgIggMFY2fs47s3 .icon-shape .label{text-align:center;}#mermaid-svg-dKgIggMFY2fs47s3 .node.clickable{cursor:pointer;}#mermaid-svg-dKgIggMFY2fs47s3 .root .anchor path{fill:#333333!important;stroke-width:0;stroke:#333333;}#mermaid-svg-dKgIggMFY2fs47s3 .arrowheadPath{fill:#333333;}#mermaid-svg-dKgIggMFY2fs47s3 .edgePath .path{stroke:#333333;stroke-width:2.0px;}#mermaid-svg-dKgIggMFY2fs47s3 .flowchart-link{stroke:#333333;fill:none;}#mermaid-svg-dKgIggMFY2fs47s3 .edgeLabel{background-color:rgba(232,232,232, 0.8);text-align:center;}#mermaid-svg-dKgIggMFY2fs47s3 .edgeLabel p{background-color:rgba(232,232,232, 0.8);}#mermaid-svg-dKgIggMFY2fs47s3 .edgeLabel rect{opacity:0.5;background-color:rgba(232,232,232, 0.8);fill:rgba(232,232,232, 0.8);}#mermaid-svg-dKgIggMFY2fs47s3 .labelBkg{background-color:rgba(232, 232, 232, 0.5);}#mermaid-svg-dKgIggMFY2fs47s3 .cluster rect{fill:#ffffde;stroke:#aaaa33;stroke-width:1px;}#mermaid-svg-dKgIggMFY2fs47s3 .cluster text{fill:#333;}#mermaid-svg-dKgIggMFY2fs47s3 .cluster span{color:#333;}#mermaid-svg-dKgIggMFY2fs47s3 div.mermaidTooltip{position:absolute;text-align:center;max-width:200px;padding:2px;font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:12px;background:hsl(80, 100%, 96.2745098039%);border:1px solid #aaaa33;border-radius:2px;pointer-events:none;z-index:100;}#mermaid-svg-dKgIggMFY2fs47s3 .flowchartTitleText{text-anchor:middle;font-size:18px;fill:#333;}#mermaid-svg-dKgIggMFY2fs47s3 rect.text{fill:none;stroke-width:0;}#mermaid-svg-dKgIggMFY2fs47s3 .icon-shape,#mermaid-svg-dKgIggMFY2fs47s3 .image-shape{background-color:rgba(232,232,232, 0.8);text-align:center;}#mermaid-svg-dKgIggMFY2fs47s3 .icon-shape p,#mermaid-svg-dKgIggMFY2fs47s3 .image-shape p{background-color:rgba(232,232,232, 0.8);padding:2px;}#mermaid-svg-dKgIggMFY2fs47s3 .icon-shape .label rect,#mermaid-svg-dKgIggMFY2fs47s3 .image-shape .label rect{opacity:0.5;background-color:rgba(232,232,232, 0.8);fill:rgba(232,232,232, 0.8);}#mermaid-svg-dKgIggMFY2fs47s3 .label-icon{display:inline-block;height:1em;overflow:visible;vertical-align:-0.125em;}#mermaid-svg-dKgIggMFY2fs47s3 .node .label-icon path{fill:currentColor;stroke:revert;stroke-width:revert;}#mermaid-svg-dKgIggMFY2fs47s3 :root{--mermaid-font-family:"trebuchet ms",verdana,arial,sans-serif;} Prompt
Diffusers Pipeline
Base Model: SDXL / Flux / SD3
LoRA / ControlNet / IP-Adapter
Scheduler / Sampler
Generated Image
Upscale / Inpaint / Edit
5.5 本地部署生态链路
#mermaid-svg-y0FwiSxai3nwghoy{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;fill:#333;}@keyframes edge-animation-frame{from{stroke-dashoffset:0;}}@keyframes dash{to{stroke-dashoffset:0;}}#mermaid-svg-y0FwiSxai3nwghoy .edge-animation-slow{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 50s linear infinite;stroke-linecap:round;}#mermaid-svg-y0FwiSxai3nwghoy .edge-animation-fast{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 20s linear infinite;stroke-linecap:round;}#mermaid-svg-y0FwiSxai3nwghoy .error-icon{fill:#552222;}#mermaid-svg-y0FwiSxai3nwghoy .error-text{fill:#552222;stroke:#552222;}#mermaid-svg-y0FwiSxai3nwghoy .edge-thickness-normal{stroke-width:1px;}#mermaid-svg-y0FwiSxai3nwghoy .edge-thickness-thick{stroke-width:3.5px;}#mermaid-svg-y0FwiSxai3nwghoy .edge-pattern-solid{stroke-dasharray:0;}#mermaid-svg-y0FwiSxai3nwghoy .edge-thickness-invisible{stroke-width:0;fill:none;}#mermaid-svg-y0FwiSxai3nwghoy .edge-pattern-dashed{stroke-dasharray:3;}#mermaid-svg-y0FwiSxai3nwghoy .edge-pattern-dotted{stroke-dasharray:2;}#mermaid-svg-y0FwiSxai3nwghoy .marker{fill:#333333;stroke:#333333;}#mermaid-svg-y0FwiSxai3nwghoy .marker.cross{stroke:#333333;}#mermaid-svg-y0FwiSxai3nwghoy svg{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;}#mermaid-svg-y0FwiSxai3nwghoy p{margin:0;}#mermaid-svg-y0FwiSxai3nwghoy .label{font-family:"trebuchet ms",verdana,arial,sans-serif;color:#333;}#mermaid-svg-y0FwiSxai3nwghoy .cluster-label text{fill:#333;}#mermaid-svg-y0FwiSxai3nwghoy .cluster-label span{color:#333;}#mermaid-svg-y0FwiSxai3nwghoy .cluster-label span p{background-color:transparent;}#mermaid-svg-y0FwiSxai3nwghoy .label text,#mermaid-svg-y0FwiSxai3nwghoy span{fill:#333;color:#333;}#mermaid-svg-y0FwiSxai3nwghoy .node rect,#mermaid-svg-y0FwiSxai3nwghoy .node circle,#mermaid-svg-y0FwiSxai3nwghoy .node ellipse,#mermaid-svg-y0FwiSxai3nwghoy .node polygon,#mermaid-svg-y0FwiSxai3nwghoy .node path{fill:#ECECFF;stroke:#9370DB;stroke-width:1px;}#mermaid-svg-y0FwiSxai3nwghoy .rough-node .label text,#mermaid-svg-y0FwiSxai3nwghoy .node .label text,#mermaid-svg-y0FwiSxai3nwghoy .image-shape .label,#mermaid-svg-y0FwiSxai3nwghoy .icon-shape .label{text-anchor:middle;}#mermaid-svg-y0FwiSxai3nwghoy .node .katex path{fill:#000;stroke:#000;stroke-width:1px;}#mermaid-svg-y0FwiSxai3nwghoy .rough-node .label,#mermaid-svg-y0FwiSxai3nwghoy .node .label,#mermaid-svg-y0FwiSxai3nwghoy .image-shape .label,#mermaid-svg-y0FwiSxai3nwghoy .icon-shape .label{text-align:center;}#mermaid-svg-y0FwiSxai3nwghoy .node.clickable{cursor:pointer;}#mermaid-svg-y0FwiSxai3nwghoy .root .anchor path{fill:#333333!important;stroke-width:0;stroke:#333333;}#mermaid-svg-y0FwiSxai3nwghoy .arrowheadPath{fill:#333333;}#mermaid-svg-y0FwiSxai3nwghoy .edgePath .path{stroke:#333333;stroke-width:2.0px;}#mermaid-svg-y0FwiSxai3nwghoy .flowchart-link{stroke:#333333;fill:none;}#mermaid-svg-y0FwiSxai3nwghoy .edgeLabel{background-color:rgba(232,232,232, 0.8);text-align:center;}#mermaid-svg-y0FwiSxai3nwghoy .edgeLabel p{background-color:rgba(232,232,232, 0.8);}#mermaid-svg-y0FwiSxai3nwghoy .edgeLabel rect{opacity:0.5;background-color:rgba(232,232,232, 0.8);fill:rgba(232,232,232, 0.8);}#mermaid-svg-y0FwiSxai3nwghoy .labelBkg{background-color:rgba(232, 232, 232, 0.5);}#mermaid-svg-y0FwiSxai3nwghoy .cluster rect{fill:#ffffde;stroke:#aaaa33;stroke-width:1px;}#mermaid-svg-y0FwiSxai3nwghoy .cluster text{fill:#333;}#mermaid-svg-y0FwiSxai3nwghoy .cluster span{color:#333;}#mermaid-svg-y0FwiSxai3nwghoy div.mermaidTooltip{position:absolute;text-align:center;max-width:200px;padding:2px;font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:12px;background:hsl(80, 100%, 96.2745098039%);border:1px solid #aaaa33;border-radius:2px;pointer-events:none;z-index:100;}#mermaid-svg-y0FwiSxai3nwghoy .flowchartTitleText{text-anchor:middle;font-size:18px;fill:#333;}#mermaid-svg-y0FwiSxai3nwghoy rect.text{fill:none;stroke-width:0;}#mermaid-svg-y0FwiSxai3nwghoy .icon-shape,#mermaid-svg-y0FwiSxai3nwghoy .image-shape{background-color:rgba(232,232,232, 0.8);text-align:center;}#mermaid-svg-y0FwiSxai3nwghoy .icon-shape p,#mermaid-svg-y0FwiSxai3nwghoy .image-shape p{background-color:rgba(232,232,232, 0.8);padding:2px;}#mermaid-svg-y0FwiSxai3nwghoy .icon-shape .label rect,#mermaid-svg-y0FwiSxai3nwghoy .image-shape .label rect{opacity:0.5;background-color:rgba(232,232,232, 0.8);fill:rgba(232,232,232, 0.8);}#mermaid-svg-y0FwiSxai3nwghoy .label-icon{display:inline-block;height:1em;overflow:visible;vertical-align:-0.125em;}#mermaid-svg-y0FwiSxai3nwghoy .node .label-icon path{fill:currentColor;stroke:revert;stroke-width:revert;}#mermaid-svg-y0FwiSxai3nwghoy :root{--mermaid-font-family:"trebuchet ms",verdana,arial,sans-serif;} HF Transformers Model
Safetensors
Quantization
GGUF
llama.cpp
Ollama
LM Studio
Jan
Open WebUI
AnythingLLM
Continue / Cursor / VSCode
6. 常见模型选型路线
6.1 中文通用聊天模型
| 场景 |
推荐 |
| 本地轻量 |
Qwen2.5-3B/7B、Gemma 2B/9B、Phi-3.5 |
| 中文能力优先 |
Qwen2.5/Qwen3、InternLM、GLM |
| 推理能力 |
DeepSeek-R1、Qwen reasoning variants |
| 长文本 |
Qwen Long、Mistral Large、Llama long-context variants |
| 代码 |
Qwen-Coder、DeepSeek-Coder、CodeLlama、StarCoder2、Codestral |
6.2 RAG 检索系统
| 模块 |
推荐 |
| Embedding 中文/多语 |
BGE-M3、multilingual-E5、GTE-Qwen、Jina Embeddings |
| Reranker |
BGE-Reranker、Jina Reranker、ColBERT |
| Generator |
Qwen2.5、Llama 3.1/3.3、Mistral、DeepSeek |
| 部署 |
vLLM / TGI / Ollama |
| 向量库 |
Milvus、Qdrant、Weaviate、FAISS、pgvector |
6.3 图像生成
| 场景 |
推荐 |
| 高质量通用文生图 |
Flux、SDXL、SD3.5 |
| 本地低成本 |
SD1.5、SDXL Turbo、LCM |
| 可控生成 |
ControlNet、IP-Adapter、T2I-Adapter |
| 工作流 |
ComfyUI、Diffusers |
| LoRA 风格 |
Diffusers + safetensors |
6.4 语音系统
| 场景 |
推荐 |
| ASR |
Whisper large-v3、Distil-Whisper、SenseVoice |
| TTS |
XTTS-v2、CosyVoice、F5-TTS、Bark |
| VAD |
Silero VAD、pyannote |
| 说话人分离 |
pyannote.audio |
| 音频理解 |
Qwen2-Audio、SALMONN |
6.5 多模态 Agent
| 模块 |
推荐 |
| 视觉理解 |
Qwen2.5-VL、InternVL、LLaVA-OneVision |
| OCR/文档 |
Qwen2.5-VL、Donut、LayoutLM、ColPali |
| 视频理解 |
Video-LLaVA、InternVideo、Qwen-VL |
| 工具调用 |
Qwen、Llama、DeepSeek、GLM |
| 部署 |
Transformers / vLLM / Ollama |
7. 本地部署与生产部署推荐组合
7.1 个人电脑本地运行
| 硬件 |
推荐方案 |
| Mac M 系列 |
Ollama、MLX、LM Studio |
| Windows NVIDIA GPU |
Ollama、LM Studio、llama.cpp CUDA、vLLM |
| 无独显 CPU |
GGUF Q4_K_M / Q5_K_M |
| 低显存 8GB |
3B/7B 量化模型 |
| 16GB 显存 |
7B/14B 量化或半精度 |
| 24GB 显存 |
14B/32B 量化,部分 7B FP16 |
| 48GB+ |
32B/70B 量化或多卡 |
7.2 企业生产服务
| 场景 |
推荐 |
| 高并发聊天 |
vLLM |
| Hugging Face 原生部署 |
TGI |
| 多模型服务 |
Ray Serve / KServe |
| GPU 集群 |
vLLM + Kubernetes |
| RAG 服务 |
vLLM + embedding service + vector DB |
| 边缘部署 |
ONNX / OpenVINO / llama.cpp |
| 浏览器端 |
Transformers.js |
7.3 微调路线
| 需求 |
推荐 |
| 小数据指令微调 |
PEFT LoRA |
| 低显存微调 |
QLoRA |
| 全量微调 |
DeepSpeed / FSDP |
| 偏好对齐 |
DPO / ORPO / GRPO / PPO |
| 图像 LoRA |
Diffusers LoRA |
| 语音微调 |
Transformers / NeMo / SpeechBrain |
8. Tasks × Libraries 速查表
| Task |
常用 Libraries |
常见模型 |
| Text Generation |
Transformers、vLLM、GGUF、Ollama |
Llama、Qwen、DeepSeek、Mistral、Gemma |
| Image-Text-to-Text |
Transformers |
Qwen-VL、LLaVA、InternVL、Idefics |
| Text-to-Image |
Diffusers |
SDXL、Flux、SD3、Kandinsky |
| Image-to-Image |
Diffusers |
ControlNet、IP-Adapter、InstructPix2Pix |
| Text-to-Video |
Diffusers |
CogVideoX、Open-Sora、HunyuanVideo |
| ASR |
Transformers、SpeechBrain、NeMo |
Whisper、Wav2Vec2、SenseVoice |
| TTS |
Transformers、SpeechBrain |
Bark、VITS、XTTS、CosyVoice |
| Embedding |
Sentence Transformers、Transformers |
BGE、E5、GTE、Jina |
| Reranker |
Sentence Transformers、Transformers |
BGE-Reranker、Jina、ColBERT |
| Image Classification |
Transformers、timm |
ViT、ConvNeXt、ResNet |
| Object Detection |
Transformers、timm |
DETR、YOLO、GroundingDINO |
| Segmentation |
Transformers、timm |
SAM、SAM2、SegFormer |
| Tabular |
Scikit-learn、Joblib、PyTorch |
XGBoost、LightGBM、TabNet |
| Time Series |
Transformers |
PatchTST、Chronos、TimesFM |
| Robotics |
LeRobot、Transformers |
OpenVLA、RT-2、Octo |
9. 学习路线
应用开发者路线
Transformers 基础
→ Pipeline / AutoModel / AutoTokenizer
→ Embedding + RAG
→ vLLM / Ollama 部署
→ Agent / Tool Calling
本地大模型路线
Ollama / LM Studio
→ GGUF 量化理解
→ Modelfile / Prompt Template
→ Open WebUI
→ RAG 本地知识库
训练微调路线
PyTorch
→ Transformers Trainer
→ PEFT LoRA
→ QLoRA
→ DPO / RLHF
→ 模型合并 / 量化 / 发布到 Hub
多模态路线
CLIP / BLIP
→ LLaVA / Qwen-VL / InternVL
→ OCR / Document QA
→ Video Understanding
→ Multimodal Agent
生成式视觉路线
Diffusers
→ Stable Diffusion / SDXL
→ ControlNet
→ LoRA
→ Flux / SD3
→ ComfyUI 工作流
10. 总结
Hugging Face 的核心价值在于把模型、数据集、框架、格式、推理服务和社区生态连接在一起。
最重要的生态主线:
模型发现:Hugging Face Hub
模型加载:Transformers / Diffusers
模型微调:PEFT / TRL / Accelerate
模型权重:Safetensors
模型量化:bitsandbytes / GPTQ / AWQ / GGUF
本地推理:llama.cpp / Ollama / LM Studio / MLX
生产部署:vLLM / TGI / Kubernetes
应用系统:RAG / Agent / 多模态 / 语音 / 图像生成
建议优先掌握:Transformers、Sentence Transformers、PEFT、Diffusers、GGUF/Ollama、vLLM、RAG 与 Agent 工程化。
参考资料