《solopreneur》从零到一第 3 期

多 LLM 配置实战：OpenRouter / 本地模型 / 火山引擎全接入 | 第 3 期

系列：《solopreneur 从零到一》第 3 期

GitHub：https://github.com/lllooollpp/solopreneur-

上一篇：第 2 期 - 5 分钟快速上手

solopreneur 通过 LiteLLM 做 Provider 统一抽象层，理论上支持 LiteLLM 能接的所有模型。本篇系统讲解 6 类常用配置方案，并对比它们的适用场景。

Provider 架构总览

复制代码

你的 Prompt
    │
    ▼
solopreneur Agent Loop
    │
    ▼
Provider Layer（内置选择逻辑）
    ├─ GitHub Copilot Token Pool  ← 优先级最高（可配置）
    ├─ OpenRouter                 ← 推荐首选
    ├─ Anthropic                  ← Claude 官方
    ├─ OpenAI                     ← GPT 系列
    ├─ Gemini                     ← Google
    ├─ Groq                       ← 超快推理
    ├─ 火山引擎（智谱）             ← 国内友好
    └─ vLLM / Ollama              ← 本地模型

选择逻辑：

如果 copilot_priority: true 且 Token 池可用 → 用 Copilot
否则按 model 字段前缀路由到对应 Provider
本地模型检测到 api_base 包含 localhost → 自动锁模型

方案一：OpenRouter（推荐新手 & 多模型切换）

为什么推荐 OpenRouter？

一个 Key，访问 200+ 模型（Claude、GPT-4o、Gemini、Mistral、Llama 等）
按 token 计费，无月费
支持免费额度（部分模型）
国内直连（相对稳定）

配置

json 复制代码

{
  "providers": {
    "openrouter": {
      "apiKey": "sk-or-v1-xxxxxxxxxxxxxx"
    }
  },
  "agents": {
    "defaults": {
      "model": "anthropic/claude-sonnet-4"
    }
  }
}

常用模型列表

模型 ID	说明	参考价格（每百万 token）
`anthropic/claude-sonnet-4`	最强综合	$3/$ 15
`anthropic/claude-3-5-haiku`	快速轻量	$0.25/$ 1.25
`openai/gpt-4o`	GPT 旗舰	$2.5/$ 10
`openai/gpt-4o-mini`	经济快速	$0.15/$ 0.6
`google/gemini-2.0-flash-exp`	免费！	$0
`meta-llama/llama-3.3-70b`	开源顶配	$0.27/$ 0.85

在 solopreneur Web UI 的对话框，可以实时切换模型：

复制代码

对话框 → 模型选择下拉 → 选择任意已配置的模型

方案二：Anthropic 官方（Claude 官方接口）

适合中高频使用，直接走 Anthropic 账单，不经中间商。

json 复制代码

{
  "providers": {
    "anthropic": {
      "apiKey": "sk-ant-api03-xxxxxxxxx",
      "apiBase": ""
    }
  },
  "agents": {
    "defaults": {
      "model": "claude-3-5-sonnet-20241022"
    }
  }
}

注意：未填 apiBase 时使用官方默认地址。如果你有中转代理，填入代理地址即可。

方案三：OpenAI（GPT 系列）

json 复制代码

{
  "providers": {
    "openai": {
      "apiKey": "sk-xxxxxxxx",
      "apiBase": "https://api.openai.com/v1"
    }
  },
  "agents": {
    "defaults": {
      "model": "gpt-4o"
    }
  }
}

如果使用国内代理（如 api2d、one-api），把 apiBase 改为代理地址：

json 复制代码

"apiBase": "https://your-proxy.com/v1"

方案四：火山引擎（智谱 AI / GLM-4 系列）

国内首选，延迟低，支持中文场景，注册即送免费额度。

注册获取 Key

访问 https://open.bigmodel.cn/
注册账户，进入「API 密钥」页面
创建一个 Key

配置 solopreneur

json 复制代码

{
  "providers": {
    "zhipu": {
      "apiKey": "xxxxxxxxxxxxxxxxx.xxxxxxxxxxxxxxxx"
    }
  },
  "agents": {
    "defaults": {
      "model": "glm-4-plus"
    }
  }
}

可用模型

模型	特点
`glm-4-plus`	性能最强
`glm-4`	标准版
`glm-4-flash`	极速版，免费
`glm-4-air`	高性价比

GLM-4-Flash 完全免费，日常测试和轻量任务用它就够了：

json 复制代码

"model": "glm-4-flash"

方案五：本地模型（Ollama / vLLM）

完全本地运行，零费用，适合：

代码量大、不想付费的场景
数据敏感、不能上传到云端的项目
研究和实验

5.1 使用 Ollama

先安装 Ollama：https://ollama.ai

bash 复制代码

# 下载模型
ollama pull qwen2.5-coder:32b    # 推荐：代码能力强
ollama pull llama3.3:70b         # 通用大模型

# 启动服务（默认端口 11434）
ollama serve

配置 solopreneur：

json 复制代码

{
  "providers": {
    "vllm": {
      "apiKey": "ollama",
      "api_base": "http://localhost:11434/v1"
    }
  },
  "agents": {
    "defaults": {
      "model": "ollama/qwen2.5-coder:32b"
    }
  }
}

5.2 使用 vLLM

适合有 GPU 显卡的用户：

bash 复制代码

pip install vllm
python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen2.5-Coder-32B-Instruct \
    --port 8001

配置：

json 复制代码

{
  "providers": {
    "vllm": {
      "apiKey": "EMPTY",
      "api_base": "http://localhost:8001/v1"
    }
  },
  "agents": {
    "defaults": {
      "model": "vllm/Qwen/Qwen2.5-Coder-32B-Instruct"
    }
  }
}

本地模型特殊优化

solopreneur 对本地模型有专门优化：

自动锁模型 ：检测到 api_base 包含 localhost 时，Web UI 的模型选择器会自动禁用，防止误覆盖
Token 估算：本地模型通常不返回 usage 数据，solopreneur 会基于字符数自动估算，避免统计为 0
从配置读取模型名：避免后端 API 调用，直接用配置文件指定的名称

方案六：多 Provider 混用

最灵活的方案：同时配置多个 Provider，按任务切换。

json 复制代码

{
  "providers": {
    "copilot_priority": false,
    "openrouter": { "apiKey": "sk-or-v1-xxx" },
    "anthropic": { "apiKey": "sk-ant-xxx" },
    "openai": { "apiKey": "sk-xxx" },
    "zhipu": { "apiKey": "xxx" },
    "vllm": { "apiKey": "none", "api_base": "http://localhost:11434/v1" }
  },
  "agents": {
    "defaults": {
      "model": "anthropic/claude-sonnet-4"
    }
  }
}

然后在 Web UI 对话框实时切换：

复制代码

精细代码任务  → anthropic/claude-sonnet-4
日常问答      → glm-4-flash（免费）
本地代码分析  → ollama/qwen2.5-coder:32b

验证配置

运行以下命令检查配置状态：

bash 复制代码

solopreneur status

输出示例：

复制代码

✅ OpenRouter     sk-or-v1-xxx... (configured)
✅ Anthropic      sk-ant-xxx...   (configured)
❌ OpenAI         (not configured)
✅ Zhipu          xxx...          (configured)
✅ vLLM           localhost:11434 (configured)
❌ GitHub Copilot (no tokens)

Active model: anthropic/claude-sonnet-4

各 Provider 选型建议

场景	推荐 Provider
刚上手、想快速试用	OpenRouter + GLM-4-Flash（免费额度）
追求最强编码能力	Anthropic Claude Sonnet 4
数据敏感/完全本地	Ollama + Qwen2.5-Coder
国内高频使用、低延迟	火山引擎 GLM-4-Flash
白嫖 GitHub Copilot	见第 4 期 🔥

下一期预告

第 4 期：GitHub Copilot 白嫖指南 ------ 多账号 Token 池详解，429 自动熔断

如果你有多个 GitHub 账号订阅了 Copilot，solopreneur 提供了完整的 Token 池管理能力，轮询负载均衡 + 429 自动熔断，官方 API 零费用跑 Claude/GPT。

GitHub：https://github.com/lllooollpp/solopreneur-

《solopreneur》 从零到一 第 3 期

多 LLM 配置实战：OpenRouter / 本地模型 / 火山引擎全接入 | 第 3 期

Provider 架构总览

方案一：OpenRouter（推荐新手 & 多模型切换）

为什么推荐 OpenRouter？

配置

常用模型列表

方案二：Anthropic 官方（Claude 官方接口）

方案三：OpenAI（GPT 系列）

方案四：火山引擎（智谱 AI / GLM-4 系列）

注册获取 Key

配置 solopreneur

可用模型

方案五：本地模型（Ollama / vLLM）

5.1 使用 Ollama

5.2 使用 vLLM

本地模型特殊优化

方案六：多 Provider 混用

验证配置

各 Provider 选型建议

下一期预告

《solopreneur》从零到一第 3 期