Runnable demo code for gpt-5-mini and gemini-3.1-flash-image-preview-0.5k via 一展
Preface:
The 一展 API platform is my alma mater's token relay platform.
Lately my agent experiments keep involving both text-to-text and text-to-image calls, so I'm recording a demo that is verified to run, to avoid forgetting the details later. The models used are:
- gpt-5-mini (text-to-text)
- gemini-3.1-flash-image-preview-0.5k (text-to-image)
Demo code:
```python
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
中转 Demo,可以把这个文件直接放进 agent 项目里,然后让 agent 参考这个文件改造项目。
-----------------------------------------------------------------------
根据你当前测试报告,已经验证成功过的调用方式有:
1) 文本(OpenAI 兼容)
- 候选 base_url 可以写成:
a. https://api.shunyu.tech
b. https://api.shunyu.tech/v1
c. https://api.shunyu.tech/v1/chat/completions
- 但它们最后都打到了同一个实际地址:
https://api.shunyu.tech/v1/chat/completions
- 所以项目里最推荐保留的写法是:
base_url = https://api.shunyu.tech/v1
endpoint = /chat/completions
2) 图片(OpenAI 兼容 images/generations)
- 候选 base_url 测通的有:
a. https://api.shunyu.tech
b. https://api.shunyu.tech/v1
c. https://api.shunyu.tech/v1/images/generations
d. https://api.shunyu.tech/v1/chat/completions
- 但它们最终被重写后,成功请求的核心地址都是:
https://api.shunyu.tech/v1/images/generations
- 所以项目里最推荐保留的写法是:
base_url = https://api.shunyu.tech/v1
endpoint = /images/generations
3) 图片(其他也测通,但这里不作为正文实现)
- OpenAI chat/completions 产图:可用
- Gemini 原生 /v1beta/models/...:generateContent:可用
- 但这两种在 agent 项目里都不如 /images/generations 简洁,所以这里只写最稳、最适合接入的一种。
-----------------------------------------------------------------------
"""
from __future__ import annotations

import base64
from pathlib import Path
from typing import Any, Dict, Optional

import requests

# ============================================================
# 1) Configuration
# ============================================================
# If you rotate keys later, this is the only place to change.
API_KEY = "put your own key here"

# ------------------------------
# Text: only the recommended, verified pattern is kept here.
# Requests actually hit: https://api.shunyu.tech/v1/chat/completions
# ------------------------------
TEXT_BASE_URL = "https://api.shunyu.tech/v1"
TEXT_MODEL = "gpt-5-mini"

# ------------------------------
# Images: only the recommended, verified pattern is kept here.
# Requests actually hit: https://api.shunyu.tech/v1/images/generations
# ------------------------------
IMAGE_BASE_URL = "https://api.shunyu.tech/v1"
IMAGE_MODEL = "gemini-3.1-flash-image-preview-0.5k"

TIMEOUT = 90
OUTPUT_DIR = Path("./_shunyu_demo_outputs")
# ============================================================
# 2) Basic utilities
# ============================================================
def _ensure_dir(path: Path) -> None:
    path.mkdir(parents=True, exist_ok=True)


def _headers() -> Dict[str, str]:
    return {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }


def _save_base64_image(b64_data: str, output_path: Path) -> Path:
    _ensure_dir(output_path.parent)
    raw = base64.b64decode(b64_data)
    output_path.write_bytes(raw)
    return output_path
# ============================================================
# 3) Core function #1: generate_text()
# ============================================================
def generate_text(
    prompt: str,
    *,
    system_prompt: str = "You are a helpful assistant.",
    max_tokens: int = 256,
    temperature: float = 0.2,
    return_raw: bool = False,
) -> str | Dict[str, Any]:
    """
    Call the 顺语 relay's text endpoint.

    Why max_tokens defaults to 256:
    - In an earlier test with max_tokens=32, all 32 tokens were consumed by
      reasoning and message.content came back empty.
    - So when wiring this into an agent, start from at least 128-256.

    Args:
    - prompt: user input
    - system_prompt: system prompt
    - max_tokens: maximum output tokens
    - temperature: sampling temperature
    - return_raw: if True, return the full response dict; if False, return only the text string
    """
    endpoint = f"{TEXT_BASE_URL.rstrip('/')}/chat/completions"
    payload = {
        "model": TEXT_MODEL,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": prompt},
        ],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    resp = requests.post(endpoint, headers=_headers(), json=payload, timeout=TIMEOUT)
    resp.raise_for_status()
    data = resp.json()

    try:
        text = data["choices"][0]["message"]["content"] or ""
    except Exception:
        text = ""

    if return_raw:
        return {
            "ok": True,
            "endpoint": endpoint,
            "model": data.get("model", TEXT_MODEL),
            "text": text,
            "raw": data,
        }
    return text
# ============================================================
# 4) Core function #2: generate_image()
# ============================================================
def generate_image(
    prompt: str,
    *,
    output_path: str = "./_shunyu_demo_outputs/generated_image.jpg",
    return_raw: bool = False,
) -> str | Dict[str, Any]:
    """
    Call the 顺语 relay's image endpoint.

    Only the recommended route is kept here:
        POST https://api.shunyu.tech/v1/images/generations
    It is the best fit for an agent project because:
    - the request structure is clear
    - the response format is stable
    - it is more direct than generating images via chat/completions
    - it avoids maintaining a second request-body schema, as native Gemini would require

    Args:
    - prompt: image prompt
    - output_path: local save path
    - return_raw: if True, return the full response dict; if False, return only the saved image path
    """
    endpoint = f"{IMAGE_BASE_URL.rstrip('/')}/images/generations"
    payload = {
        "model": IMAGE_MODEL,
        "prompt": prompt,
        "n": 1,
        "response_format": "b64_json",
    }
    resp = requests.post(endpoint, headers=_headers(), json=payload, timeout=TIMEOUT)
    resp.raise_for_status()
    data = resp.json()

    b64_data: Optional[str] = None
    try:
        b64_data = data["data"][0]["b64_json"]
    except Exception:
        b64_data = None
    if not b64_data:
        raise RuntimeError("The image endpoint returned success, but data[0].b64_json was missing.")

    saved_path = _save_base64_image(b64_data, Path(output_path))
    if return_raw:
        return {
            "ok": True,
            "endpoint": endpoint,
            "model": IMAGE_MODEL,
            "image_path": str(saved_path),
            "raw": data,
        }
    return str(saved_path)
# ============================================================
# 5) Minimal runnable example
# ============================================================
if __name__ == "__main__":
    _ensure_dir(OUTPUT_DIR)

    print("=" * 80)
    print("1) Text demo")
    print("=" * 80)
    text = generate_text("Answer in one sentence: who are you?", max_tokens=256)
    print("Text result:", repr(text))

    print("\n" + "=" * 80)
    print("2) Image demo")
    print("=" * 80)
    image_path = generate_image(
        "Generate a clean square image of a cute orange cat drinking coffee at a desk, simple background, no text.",
        output_path="./_shunyu_demo_outputs/generated_image.jpg",
    )
    print("Image saved to:", image_path)
```