Runnable demo code for gpt-5-mini and gemini-3.1-flash-image-preview-0.5k on 一展

Preface:

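The 一展 API platform is my alma mater's token relay platform.
I have recently been running agent experiments that repeatedly involve text-to-text and text-to-image generation, so I am recording a demo that runs end to end, to avoid forgetting it later. The models used are:
gpt-5-mini (text-to-text)
gemini-3.1-flash-image-preview-0.5k (text-to-image)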
Demo code:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Relay demo. You can drop this file straight into an agent project and have the agent adapt the project using it as a reference.

-----------------------------------------------------------------------
According to your current test report, the call patterns already verified to work are:

1) Text (OpenAI-compatible)
   - Candidate base_url values:
     a. https://api.shunyu.tech
     b. https://api.shunyu.tech/v1
     c. https://api.shunyu.tech/v1/chat/completions
   - All of them ended up hitting the same actual address:
     https://api.shunyu.tech/v1/chat/completions
   - So the recommended form to keep in a project is:
     base_url = https://api.shunyu.tech/v1
     endpoint = /chat/completions

2) Images (OpenAI-compatible images/generations)
   - Candidate base_url values that tested successfully:
     a. https://api.shunyu.tech
     b. https://api.shunyu.tech/v1
     c. https://api.shunyu.tech/v1/images/generations
     d. https://api.shunyu.tech/v1/chat/completions
   - After rewriting, the core address of every successful request was:
     https://api.shunyu.tech/v1/images/generations
   - So the recommended form to keep in a project is:
     base_url = https://api.shunyu.tech/v1
     endpoint = /images/generations

3) Images (other routes that also tested successfully but are not implemented here)
   - Image generation through OpenAI chat/completions: works
   - Gemini native /v1beta/models/...:generateContent: works
   - Both are less concise than /images/generations in an agent project, so only the most stable, easiest-to-integrate route is written here.
-----------------------------------------------------------------------

"""

from __future__ import annotations

import base64
from pathlib import Path
from typing import Any, Dict, Optional

import requests

# ============================================================
# 1) Direct configuration
# ============================================================
# If you change the key later, this is the only place to edit.
API_KEY = "YOUR_API_KEY_HERE"

# ------------------------------
# Text: only the recommended, verified route is kept here.
# Requests actually go to: https://api.shunyu.tech/v1/chat/completions
# ------------------------------
TEXT_BASE_URL = "https://api.shunyu.tech/v1"
TEXT_MODEL = "gpt-5-mini"

# ------------------------------
# Images: only the recommended, verified route is kept here.
# Requests actually go to: https://api.shunyu.tech/v1/images/generations
# ------------------------------
IMAGE_BASE_URL = "https://api.shunyu.tech/v1"
IMAGE_MODEL = "gemini-3.1-flash-image-preview-0.5k"

TIMEOUT = 90
OUTPUT_DIR = Path("./_shunyu_demo_outputs")


# ============================================================
# 2) Basic utility functions
# ============================================================
def _ensure_dir(path: Path) -> None:
    path.mkdir(parents=True, exist_ok=True)


def _headers() -> Dict[str, str]:
    return {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }


def _save_base64_image(b64_data: str, output_path: Path) -> Path:
    _ensure_dir(output_path.parent)
    raw = base64.b64decode(b64_data)
    output_path.write_bytes(raw)
    return output_path


# ============================================================
# 3) Core function one: generate_text()
# ============================================================
def generate_text(
    prompt: str,
    *,
    system_prompt: str = "You are a helpful assistant.",
    max_tokens: int = 256,
    temperature: float = 0.2,
    return_raw: bool = False,
) -> str | Dict[str, Any]:
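    """
    Call the text endpoint of the 顺语 (Shunyu) relay.

    Why max_tokens defaults to 256 here:
    - In your earlier test with max_tokens=32, all 32 tokens were consumed by reasoning,
      and message.content ended up empty.
    - So when integrating into an agent for real, start from at least 128 to 256.

    Parameters:
    - prompt: user input
    - system_prompt: system prompt
    - max_tokens: maximum number of output tokens
    - temperature: sampling temperature
    - return_raw: if True, return a dict that wraps the full response; if False, return only the text string
    """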
    endpoint = f"{TEXT_BASE_URL.rstrip('/')}/chat/completions"
    payload = {
        "model": TEXT_MODEL,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": prompt},
        ],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }

    resp = requests.post(endpoint, headers=_headers(), json=payload, timeout=TIMEOUT)
    resp.raise_for_status()
    data = resp.json()

    # Extract the reply text; fall back to "" if the response shape is unexpected.
    try:
        text = data["choices"][0]["message"]["content"] or ""
    except (KeyError, IndexError, TypeError):
        text = ""

    if return_raw:
        return {
            "ok": True,
            "endpoint": endpoint,
            "model": data.get("model", TEXT_MODEL),
            "text": text,
            "raw": data,
        }
    return text


# ============================================================
# 4) Core function two: generate_image()
# ============================================================
def generate_image(
    prompt: str,
    *,
    output_path: str = "./_shunyu_demo_outputs/generated_image.jpg",
    return_raw: bool = False,
) -> str | Dict[str, Any]:
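    """
    Call the image endpoint of the 顺语 (Shunyu) relay.

    Only the recommended route is kept here:
    POST https://api.shunyu.tech/v1/images/generations

    It is the best fit for an agent project because:
    - the interface structure is clear
    - the response format is stable
    - it is more direct than generating images through chat/completions
    - it does not require maintaining a separate request body structure, as the Gemini native route does

    Parameters:
    - prompt: image prompt
    - output_path: local path to save the image to
    - return_raw: if True, return a dict that wraps the full response; if False, return only the saved image path
    """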
    endpoint = f"{IMAGE_BASE_URL.rstrip('/')}/images/generations"
    payload = {
        "model": IMAGE_MODEL,
        "prompt": prompt,
        "n": 1,
        "response_format": "b64_json",
    }

    resp = requests.post(endpoint, headers=_headers(), json=payload, timeout=TIMEOUT)
    resp.raise_for_status()
    data = resp.json()

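    # Pull the base64 payload; leave it as None if the response shape is unexpected.
    b64_data: Optional[str] = None
    try:
        b64_data = data["data"][0]["b64_json"]
    except (KeyError, IndexError, TypeError):
        b64_data = None

    if not b64_data:
        raise RuntimeError("The image endpoint returned success, but data[0].b64_json was missing.")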

    saved_path = _save_base64_image(b64_data, Path(output_path))

    if return_raw:
        return {
            "ok": True,
            "endpoint": endpoint,
            "model": IMAGE_MODEL,
            "image_path": str(saved_path),
            "raw": data,
        }
    return str(saved_path)


# ============================================================
# 5) Minimal runnable example
# ============================================================
if __name__ == "__main__":
    _ensure_dir(OUTPUT_DIR)

    print("=" * 80)
    print("1) Text call demo")
    print("=" * 80)
    text = generate_text("Please answer in one sentence of Chinese: who are you?", max_tokens=256)
    print("Text result:", repr(text))

    print("\n" + "=" * 80)
    print("2) Image call demo")
    print("=" * 80)
    image_path = generate_image(
        "Generate a clean square image of a cute orange cat drinking coffee at a desk, simple background, no text.",
        output_path="./_shunyu_demo_outputs/generated_image.jpg",
    )
    print("Image saved to:", image_path)
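
The generate_text() docstring notes that with max_tokens=32 the whole budget went to reasoning and message.content came back empty. Below is a minimal sketch of how a caller might detect that case from the raw response dict. It assumes the relay returns the standard OpenAI-compatible fields choices[0].finish_reason and usage.completion_tokens_details.reasoning_tokens, which the test report above does not confirm. diagnose_empty_text is a hypothetical helper name and is not part of the demo.

```python
from typing import Any, Dict


def diagnose_empty_text(data: Dict[str, Any]) -> str:
    """Classify a chat/completions response dict (hypothetical helper).

    Returns one of: "ok", "truncated_by_reasoning", "truncated", "empty".
    Assumes OpenAI-compatible field names; missing fields are tolerated.
    """
    # Tolerate a missing or empty "choices" list.
    choices = data.get("choices") or [{}]
    choice = choices[0] or {}
    message = choice.get("message") or {}
    content = message.get("content") or ""
    if content.strip():
        return "ok"

    # Assumption: reasoning usage is reported under this nested key.
    usage = data.get("usage") or {}
    details = usage.get("completion_tokens_details") or {}
    reasoning_tokens = details.get("reasoning_tokens") or 0

    if choice.get("finish_reason") == "length":
        # The token budget ran out before any visible text was produced.
        return "truncated_by_reasoning" if reasoning_tokens > 0 else "truncated"
    return "empty"


if __name__ == "__main__":
    # Hand-written sample shaped like the max_tokens=32 case described above.
    sample = {
        "choices": [{"message": {"content": ""}, "finish_reason": "length"}],
        "usage": {"completion_tokens_details": {"reasoning_tokens": 32}},
    }
    print(diagnose_empty_text(sample))
```

One possible use is to call generate_text(..., return_raw=True), pass the "raw" entry to this helper, and retry with a larger max_tokens when the result is "truncated_by_reasoning".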