1. Communication protocol
Plain HTTPS REST calls; no special protocol involved. Streaming responses use HTTP Server-Sent Events (SSE), which is also just HTTP.
2. Request structure
URL: https://api.openai.com/v1/chat/completions
Method: POST
Request headers:
Authorization: Bearer <API_KEY>
Content-Type: application/json
Core request-body fields:
| Field | Required | Description |
|---|---|---|
| model | ✅ | Model ID, e.g. gpt-4o, gpt-4o-mini |
| messages | ✅ | List of messages, each with role + content |
| max_tokens | No | Maximum number of output tokens |
| temperature | No | Randomness: 0 = deterministic, 2 = highly random |
| top_p | No | Nucleus sampling; tune this or temperature, not both |
| stream | No | Whether to stream the response (default false) |
| frequency_penalty | No | Reduces repetition, -2.0 to 2.0 |
| presence_penalty | No | Encourages new topics, -2.0 to 2.0 |
messages role types:
| role | Description |
|---|---|
| system | System instruction; sets the model's behavior |
| user | User input |
| assistant | Model reply (used for multi-turn context) |
| tool | Tool execution result (used with function calling) |
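Putting the pieces above together, a minimal single-turn request might look like this sketch (it assumes the key sits in the OPENAI_API_KEY environment variable; the system prompt and sampling values are illustrative):

```python
import os
import requests

API_URL = "https://api.openai.com/v1/chat/completions"

def build_request(user_text, model="gpt-4o"):
    """Assemble headers and body for a single-turn chat completion."""
    headers = {
        "Authorization": f"Bearer {os.environ.get('OPENAI_API_KEY', '')}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_text},
        ],
        "temperature": 0.7,
        "max_tokens": 256,
    }
    return headers, payload

if __name__ == "__main__":
    headers, payload = build_request("Hello")
    resp = requests.post(API_URL, headers=headers, json=payload, timeout=60)
    print(resp.json()["choices"][0]["message"]["content"])
```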
3. Response structure
{
"id": "chatcmpl-xxx",
"object": "chat.completion",
"created": 1746123456,
"model": "gpt-4o",
"choices": [{
"index": 0,
"message": {
"role": "assistant",
"content": "回复内容"
},
"finish_reason": "stop"
}],
"usage": {
"prompt_tokens": 28,
"completion_tokens": 142,
"total_tokens": 170
}
}
Streaming response (SSE): the payload above is split across multiple HTTP chunks:
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"}}]}
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":","}}]}
data: [DONE]
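A minimal consumer for such a stream could look like this sketch: each `data:` line carries one JSON chunk whose `delta` holds a fragment of the reply, and `[DONE]` terminates the stream. The chunk strings below are illustrative; a real client would read them via `requests.post(..., stream=True)` and `iter_lines()`:

```python
import json

def parse_sse_line(line):
    """Extract the delta text from one 'data: {...}' SSE line.
    Returns None for [DONE] and for non-data lines (e.g. keep-alive comments)."""
    if not line.startswith("data: "):
        return None
    body = line[len("data: "):]
    if body.strip() == "[DONE]":
        return None
    chunk = json.loads(body)
    # The delta may omit "content" (the first chunk often carries only "role").
    return chunk["choices"][0]["delta"].get("content")

# Replaying a hypothetical stream of two content chunks:
lines = [
    'data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hel"}}]}',
    'data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
text = "".join(p for p in (parse_sse_line(l) for l in lines) if p)
print(text)  # Hello
```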
4. Function calling (tool calls)
Lets the model trigger external tools, in a three-step loop:
User asks a question
│
▼
Model decides to call a tool → returns tool_calls (finish_reason = "tool_calls")
│
▼
Application code executes the tool and obtains the result
│
▼
Append the result to messages with role="tool" and send another request
│
▼
Model integrates the result and produces the final reply (finish_reason = "stop")
tool_calls format:
{
"tool_calls": [{
"id": "call_abc123",
"type": "function",
"function": {
"name": "get_weather",
"arguments": "{\"city\":\"北京\",\"unit\":\"celsius\"}"
}
}]
}
tool message format:
{
"role": "tool",
"tool_call_id": "call_abc123",
"name": "get_weather",
"content": "{\"temperature\":\"22°C\",\"weather\":\"晴\"}"
}
Multiple tools can be chained by repeating the "execute → return → re-request" loop.
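That execute → return → re-request loop can be sketched as below (`call_api` stands for one POST to the endpoint and `get_weather` is a hypothetical local tool; both names are assumptions, not part of the API):

```python
import json

def run_tool_loop(messages, tools, call_api, local_tools):
    """Keep calling the model until it stops asking for tools.

    call_api(messages, tools) -> the assistant message dict from one response
    (a stand-in for an HTTP POST to /v1/chat/completions);
    local_tools maps tool names to Python callables.
    """
    while True:
        assistant_msg = call_api(messages, tools)
        messages.append(assistant_msg)
        tool_calls = assistant_msg.get("tool_calls")
        if not tool_calls:           # finish_reason == "stop": final answer
            return assistant_msg["content"]
        for call in tool_calls:      # the model may request several tools per turn
            fn = call["function"]
            result = local_tools[fn["name"]](**json.loads(fn["arguments"]))
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(result, ensure_ascii=False),
            })
```

The loop terminates when a response arrives without tool_calls; until then, every tool result is appended with the matching tool_call_id so the model can line results up with its requests.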
5. MCP (Model Context Protocol)
A standardized tool-calling protocol led by Anthropic. Essentially a uniform wrapper around function calling:
- The MCP server exposes tool definitions automatically
- The OpenAI runtime discovers and calls them dynamically
- No need for each application to maintain its own JSON Schema
{
"tools": [{
"type": "mcp",
"mcp": {
"name": "gitmcp",
"server_label": "gitmcp",
"server_url": "https://gitmcp.io/openai/codex"
}
}]
}
6. State management
Stateless: every request must carry the full conversation history in the messages array; the server keeps no context. You maintain the messages list yourself and send the complete history each time.
import requests  # url/headers/api_key are set up as in section 2

messages = []  # maintained entirely on the client side
while True:
    user_input = input("You: ")
    messages.append({"role": "user", "content": user_input})
    resp = requests.post(url, headers=headers, json={
        "model": "gpt-4o",
        "messages": messages  # the full history goes with every request
    })
    assistant_msg = resp.json()["choices"][0]["message"]
    messages.append(assistant_msg)
    print(f"AI: {assistant_msg['content']}")
7. Common error codes
| HTTP status | Meaning |
|---|---|
| 200 | Success |
| 400 | Malformed request (missing required field, etc.) |
| 401 | API key invalid or missing |
| 403 | No permission (organization/project restriction) |
| 429 | Rate limited; check the Retry-After response header |
| 500 | OpenAI internal server error |
8. Error response examples
401 --- API key invalid or missing
{
"error": {
"message": "Invalid API key provided. You can find your API key at https://platform.openai.com/account/api-keys.",
"type": "invalid_request_error",
"code": "invalid_api_key",
"param": null,
"code": "invalid_api_key"
}
}
Common causes:
- API key mistyped, or characters lost while copying
- The key has been deleted
- The wrong key was picked up from an environment variable
400 --- Malformed request
{
"error": {
"message": "Missing required parameter: 'messages'",
"type": "invalid_request_error",
"param": "messages",
"code": null
}
}
{
"error": {
"message": "Invalid JSON: Unexpected token at position 42.",
"type": "invalid_request_error",
"param": null,
"code": "json_parse_error"
}
}
Common causes:
- Missing a required field such as model or messages
- Invalid JSON (extra comma, missing quote, etc.)
- Malformed messages array
429 --- Rate limited
{
"error": {
"message": "Rate limit reached for model 'gpt-4o' with TPM limit of 300000. Limit will reset at 1746124800.",
"type": "rate_limit_exceeded_error",
"param": null,
"code": "tpm_limit_exceeded"
}
}
The response headers carry the rate-limit details:
X-RateLimit-Limit: 5000000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1746124800
Retry-After: 45
Common causes:
- TPM (tokens per minute) limit exceeded
- RPM (requests per minute) limit exceeded
- Plan quota exhausted
How to handle: wait for Retry-After seconds, then retry.
400 --- Model does not support a feature
{
"error": {
"message": "Model 'gpt-3.5-turbo' does not support response_format parameter.",
"type": "invalid_request_error",
"param": "response_format",
"code": null
}
}
400 --- Context too long
{
"error": {
"message": "This model's maximum context window is 128000 tokens. Your messages resulted in 150000 tokens.",
"type": "invalid_request_error",
"param": "messages",
"code": "context_window_exceeded"
}
}
Common cause: the conversation history has grown past the model's single-request context window.
How to handle: truncate or compress the messages history.
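One crude mitigation is a helper that keeps the system message plus only the most recent turns; a sketch (message count stands in for a real token counter such as tiktoken, and the cutoff value is illustrative):

```python
def truncate_history(messages, max_messages=20):
    """Keep the system message (if any) plus the most recent turns."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    keep = max_messages - len(system)
    if keep <= 0:
        return system
    return system + rest[-keep:]

history = [{"role": "system", "content": "You are helpful."}]
history += [{"role": "user", "content": f"msg {i}"} for i in range(50)]
short = truncate_history(history, max_messages=10)
print(len(short))         # 10
print(short[0]["role"])   # system
```

Counting messages is only an approximation; a production version would sum actual token counts per message against the model's window before trimming.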
500 --- OpenAI internal server error
{
"error": {
"message": "The server had an error while processing your request. Please try again in 30 seconds.",
"type": "server_error",
"param": null,
"code": null
}
}
Common cause: a fault on OpenAI's side, usually transient.
How to handle: wait and retry; a few retries usually succeed.
403 --- Insufficient organization or project permissions
{
"error": {
"message": "Your organization does not have access to this model.",
"type": "invalid_request_error",
"param": "model",
"code": "model_not_found"
}
}
Common causes:
- The model is not included in your subscription plan
- The project-level API key lacks access to that model
9. Suggested error-handling code
import os
import time
import requests

api_key = os.environ["OPENAI_API_KEY"]

def chat_with_retry(messages, model="gpt-4o"):
    url = "https://api.openai.com/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    payload = {"model": model, "messages": messages}
    while True:
        try:
            resp = requests.post(url, headers=headers, json=payload, timeout=60)
            data = resp.json()
            if resp.status_code == 200:
                return data
            elif resp.status_code == 429:
                # Rate limited: wait, then retry
                retry_after = int(resp.headers.get("Retry-After", 5))
                print(f"Rate limited, waiting {retry_after} s...")
                time.sleep(retry_after)
            elif resp.status_code == 500:
                # Server error: brief wait, then retry
                print("Server error, retrying...")
                time.sleep(2)
            elif resp.status_code == 401:
                raise Exception("Invalid API key: " + data["error"]["message"])
            elif resp.status_code == 400:
                # Context too long etc.; cannot be recovered automatically
                raise Exception(f"Bad request: {data['error']['message']}")
            else:
                raise Exception(f"Unexpected error {resp.status_code}: {data}")
        except requests.exceptions.Timeout:
            print("Request timed out, retrying...")
            time.sleep(2)