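Preface
💡 Pain points: Do your OpenAI API calls keep failing? Unsure how to handle streaming responses? Never used function calling? No plan for cost control?
🎯 Solution: a hands-on tour of the OpenAI API, from basic calls through advanced features to production best practices.
OpenAI API capabilities at a glance: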
[Diagram] OpenAI API: text generation (Chat Completions, function calling, JSON mode); multimodal (Vision, DALL-E image generation); speech (Whisper speech-to-text, TTS text-to-speech); vectors and search (Embeddings, vector storage).
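How the API and the Python SDK evolved:
| SDK version | Released | Key change |
|---|---|---|
| Before 0.27.0 | Before March 2023 | Legacy Completions API only |
| 0.27.0 | March 2023 | Chat Completions added (`openai.ChatCompletion`) |
| 0.27.x | June 2023 | Function calling available |
| 1.0.0+ | November 2023 onward | Rewritten client (`OpenAI()`), followed by `tools` with parallel function calling, JSON mode, and Vision |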
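1. Quick Start
1.1 Installation and Configuration
```bash
# ===== Install the OpenAI SDK =====
# Python (quote the requirement so the shell does not treat ">" as a redirect)
pip install "openai>=1.0.0"
# Node.js
npm install openai
# Go
go get github.com/openai/openai-go
# Environment variables
export OPENAI_API_KEY="sk-..."
export OPENAI_BASE_URL="https://api.openai.com/v1"  # optional: custom endpoint
# Windows PowerShell equivalent (run in PowerShell, not bash):
# $env:OPENAI_API_KEY = "sk-..."
```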
```python
# ===== Basic configuration =====
import os
from openai import OpenAI

# Initialize the client
client = OpenAI(
    api_key=os.getenv("OPENAI_API_KEY"),
    # base_url="https://api.openai.com/v1",  # default
    # timeout=60.0,                          # request timeout in seconds
    # max_retries=2,                         # automatic retries
)

# A simple call
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)
print(response.choices[0].message.content)
```
1.2 Choosing a Model
```python
# ===== Model selection guide =====
# Prices are USD per 1M tokens and are a snapshot; check the official
# pricing page before relying on them.
models = {
    "gpt-4o": {
        "description": "Multimodal flagship model, faster than gpt-4-turbo",
        "context_window": 128000,
        "max_output": 16384,
        "price_input": 5.00,
        "price_output": 15.00,
        "best_for": ["multimodal", "latency-sensitive", "cheaper than gpt-4-turbo"],
    },
    "gpt-4o-mini": {
        "description": "Low-cost multimodal model",
        "context_window": 128000,
        "max_output": 16384,
        "price_input": 0.15,
        "price_output": 0.60,
        "best_for": ["large-scale applications", "simple tasks"],
    },
    "gpt-4-turbo": {
        "description": "Strong reasoning",
        "context_window": 128000,
        "max_output": 4096,
        "price_input": 10.00,
        "price_output": 30.00,
        "best_for": ["complex reasoning", "long documents"],
    },
    "gpt-3.5-turbo": {
        "description": "Older model, low cost",
        "context_window": 16385,
        "max_output": 4096,
        "price_input": 0.50,
        "price_output": 1.50,
        "best_for": ["simple tasks", "cost first"],
    },
}

# Selection heuristic
def select_model(task_complexity, need_vision, budget_sensitive):
    if need_vision:
        return "gpt-4o-mini" if budget_sensitive else "gpt-4o"
    if task_complexity == "high":
        return "gpt-4-turbo"
    # Complexity is not "high" from here on, so the cheap model is enough
    # whenever budget matters. (The original fallback to gpt-3.5-turbo
    # could never be reached.)
    if budget_sensitive:
        return "gpt-4o-mini"
    return "gpt-4o"
```
1.3 Basic Chat
```python
# ===== Basic chat examples =====
import os
from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

def simple_chat():
    """Single-turn chat."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a professional Python developer."},
            {"role": "user", "content": "Write a quicksort implementation."},
        ],
        temperature=0.7,        # 0-2, higher is more random
        max_tokens=1024,        # maximum output tokens
        top_p=1.0,              # nucleus sampling
        frequency_penalty=0.0,  # penalize frequently repeated tokens
        presence_penalty=0.0,   # penalize tokens that already appeared
    )
    return response.choices[0].message.content

def multi_turn_chat():
    """Multi-turn chat: the full history is resent on every call."""
    messages = [
        {"role": "system", "content": "You are a helpful assistant."}
    ]
    while True:
        user_input = input("You: ")
        if user_input.lower() in ["exit", "quit"]:
            break
        messages.append({"role": "user", "content": user_input})
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            temperature=0.7,
        )
        assistant_reply = response.choices[0].message.content
        messages.append({"role": "assistant", "content": assistant_reply})
        print(f"AI: {assistant_reply}")

def stream_chat():
    """Streaming response."""
    stream = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "user", "content": "Tell me a long story."}
        ],
        stream=True,  # enable streaming
    )
    print("AI: ", end="")
    for chunk in stream:
        # Guard against chunks without choices and deltas without text
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)
    print()  # final newline
```
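When the streamed text is needed as a value rather than printed, the deltas have to be joined by hand. The following is a minimal sketch of that accumulation step. `collect_stream` and `fake_chunk` are hypothetical helpers, and the fake chunks only imitate the `choices[0].delta.content` shape so the example runs without calling the API.

```python
from types import SimpleNamespace

def collect_stream(chunks):
    """Join the text deltas of a chat completion stream into one string."""
    parts = []
    for chunk in chunks:
        # Some chunks may carry no choices at all; skip them.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        # Deltas without text (content is None or empty) are ignored.
        if delta.content:
            parts.append(delta.content)
    return "".join(parts)

def fake_chunk(text):
    """Hypothetical stand-in that mimics the shape of a streamed chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

demo_stream = [
    fake_chunk(None),
    fake_chunk("Once "),
    fake_chunk("upon a time"),
    SimpleNamespace(choices=[]),
]
story = collect_stream(demo_stream)
print(story)
```

With a real stream, the iterable returned by `client.chat.completions.create(..., stream=True)` would be passed in place of `demo_stream`.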
2. Advanced Features
2.1 Function Calling
```python
# ===== Function calling in practice =====
import json
from openai import OpenAI

client = OpenAI()

# Declare the functions the model may call
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the weather for a given city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name, e.g. Beijing",
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit",
                    },
                },
                "required": ["city"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "search_database",
            "description": "Search the database",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Search keywords",
                    }
                },
                "required": ["query"],
            },
        },
    },
]

# Mock implementations
def get_weather(city: str, unit: str = "celsius"):
    """Get the weather (mock data, stored in Celsius)."""
    weather_data = {
        "Beijing": {"temp": 25, "condition": "sunny"},
        "Shanghai": {"temp": 28, "condition": "cloudy"},
    }
    data = weather_data.get(city, {"temp": 20, "condition": "unknown"})
    if unit == "fahrenheit":
        # Convert instead of only swapping the unit label
        temperature = f"{data['temp'] * 9 / 5 + 32:.0f}°F"
    else:
        temperature = f"{data['temp']}°C"
    return {
        "city": city,
        "temperature": temperature,
        "condition": data["condition"],
    }

def search_database(query: str):
    """Search the database (mock)."""
    return [
        {"id": 1, "title": f"Result 1 for {query}"},
        {"id": 2, "title": f"Result 2 for {query}"},
    ]

# Main flow
def chat_with_functions(user_message: str):
    """Chat with function calling."""
    messages = [
        {"role": "system", "content": "You are a helpful assistant that can use tools to look up information."},
        {"role": "user", "content": user_message},
    ]
    # First call: let the model decide whether to call a function
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=tools,
        tool_choice="auto",  # the model chooses whether to call a function
    )
    message = response.choices[0].message

    # No function call: return the reply directly
    if not message.tool_calls:
        return message.content

    # Append the assistant message, including its tool calls
    messages.append({
        "role": "assistant",
        "content": message.content,
        "tool_calls": [
            {
                "id": tc.id,
                "type": tc.type,
                "function": {
                    "name": tc.function.name,
                    "arguments": tc.function.arguments,
                },
            }
            for tc in message.tool_calls
        ],
    })

    # Execute each requested call
    for tool_call in message.tool_calls:
        function_name = tool_call.function.name
        function_args = json.loads(tool_call.function.arguments)
        if function_name == "get_weather":
            result = get_weather(**function_args)
        elif function_name == "search_database":
            result = search_database(**function_args)
        else:
            result = {"error": "Unknown function"}
        # Append the result, linked to the call by tool_call_id
        messages.append({
            "tool_call_id": tool_call.id,
            "role": "tool",
            "name": function_name,
            "content": json.dumps(result, ensure_ascii=False),
        })

    # Second call: let the model answer using the function results
    final_response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    return final_response.choices[0].message.content

# Test
if __name__ == "__main__":
    result = chat_with_functions("What's the weather like in Beijing today?")
    print(result)
```
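The `if`/`elif` dispatch above grows with every tool, and `json.loads` can fail when the model emits malformed arguments. One possible alternative, sketched under assumptions: a registry dict plus a wrapper that always returns a JSON string suitable for a `tool` message. `TOOL_REGISTRY` and `run_tool` are hypothetical names, and the `get_weather` stub is simplified so the block runs on its own.

```python
import json

def get_weather(city, unit="celsius"):
    """Simplified stub standing in for a real tool implementation."""
    return {"city": city, "unit": unit}

# Hypothetical registry: tool name -> Python callable
TOOL_REGISTRY = {"get_weather": get_weather}

def run_tool(name, raw_arguments):
    """Run one tool call and return a JSON string for the 'tool' message."""
    func = TOOL_REGISTRY.get(name)
    if func is None:
        return json.dumps({"error": f"Unknown function: {name}"})
    try:
        args = json.loads(raw_arguments)
        result = func(**args)
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed JSON, a non-object payload, or wrong parameter names
        return json.dumps({"error": f"Invalid arguments: {exc}"})
    return json.dumps(result, ensure_ascii=False)

print(run_tool("get_weather", '{"city": "Beijing"}'))
print(run_tool("get_weather", "{not json"))
```

Returning the error as tool output, instead of raising, gives the model a chance to correct its arguments on the next turn.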
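2.2 Parallel Function Calling
```python
# ===== Parallel function calling =====
# The model can request several function calls in a single response.
import json
from openai import OpenAI

client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"}
                },
                "required": ["city"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "get_news",
            "description": "Get the news",
            "parameters": {
                "type": "object",
                "properties": {
                    "topic": {"type": "string"}
                },
                "required": ["topic"],
            },
        },
    },
]

# Mock implementations so the example runs on its own
def get_weather(city: str):
    return {"city": city, "temperature": "25°C", "condition": "sunny"}

def get_news(topic: str):
    return [{"title": f"Latest {topic} headline (mock)"}]

# For a question that needs both tools, the model may return
# two tool calls in one response.
def parallel_function_calls():
    messages = [
        {"role": "user", "content": "What's the weather in Beijing? Any tech news today?"}
    ]
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=tools,
    )
    message = response.choices[0].message
    results = {}
    # May contain two tool_calls
    if message.tool_calls:
        print(f"The model requested {len(message.tool_calls)} function call(s):")
        for tc in message.tool_calls:
            print(f"  - {tc.function.name}: {tc.function.arguments}")
        # Executed sequentially here; independent calls could also run
        # concurrently (threads or asyncio).
        for tc in message.tool_calls:
            func_name = tc.function.name
            args = json.loads(tc.function.arguments)
            if func_name == "get_weather":
                results[tc.id] = get_weather(**args)
            elif func_name == "get_news":
                results[tc.id] = get_news(**args)
    return results
```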
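When the tool calls in one response are independent and I/O-bound, they can be executed concurrently. The following sketch uses a thread pool and builds the `tool` messages in the original order. The handlers, `run_tool_calls`, and `fake_call` are hypothetical; the fake calls only imitate the `id` / `function.name` / `function.arguments` shape.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from types import SimpleNamespace

def get_weather(city):
    return {"city": city, "condition": "sunny"}

def get_news(topic):
    return [{"title": f"Headline about {topic}"}]

HANDLERS = {"get_weather": get_weather, "get_news": get_news}

def execute_tool_call(tool_call):
    """Run one tool call and wrap the result as a 'tool' message."""
    handler = HANDLERS[tool_call.function.name]
    args = json.loads(tool_call.function.arguments)
    return {
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": json.dumps(handler(**args), ensure_ascii=False),
    }

def run_tool_calls(tool_calls, max_workers=4):
    """Execute tool calls in worker threads; map() keeps the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(execute_tool_call, tool_calls))

def fake_call(call_id, name, arguments):
    """Hypothetical stand-in for an SDK tool call object."""
    function = SimpleNamespace(name=name, arguments=json.dumps(arguments))
    return SimpleNamespace(id=call_id, function=function)

calls = [
    fake_call("call_1", "get_weather", {"city": "Beijing"}),
    fake_call("call_2", "get_news", {"topic": "tech"}),
]
tool_messages = run_tool_calls(calls)
for msg in tool_messages:
    print(msg["tool_call_id"], msg["content"])
```

Each returned message can be appended to `messages` after the assistant message that carried the tool calls.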
2.3 JSON Mode
```python
# ===== JSON mode =====
# Force the model to return syntactically valid JSON.
# JSON mode requires the word "JSON" to appear somewhere in the messages.
import json
from openai import OpenAI

client = OpenAI()

def get_json_response():
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a JSON generator. Return JSON only."},
            {"role": "user", "content": "Generate a user JSON object with name, age and city."},
        ],
        response_format={"type": "json_object"},  # force JSON output
    )
    content = response.choices[0].message.content
    print(content)
    # Example output (will vary): {"name": "Zhang San", "age": 30, "city": "Beijing"}
    # The content can be parsed directly
    data = json.loads(content)
    return data

# Structured output through a forced function call (an alternative to
# JSON mode: the function's parameter schema describes the shape you want)
def structured_output():
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": "Extract the entities from this text: Apple was founded in 1976 and is headquartered in Cupertino, California."}
        ],
        tools=[{
            "type": "function",
            "function": {
                "name": "extract_entities",
                "description": "Extract entities",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "entities": {
                            "type": "array",
                            "items": {
                                "type": "object",
                                "properties": {
                                    "text": {"type": "string"},
                                    "type": {"type": "string", "enum": ["PERSON", "ORG", "LOC", "DATE"]},
                                },
                            },
                        }
                    },
                    "required": ["entities"],
                },
            },
        }],
        tool_choice={"type": "function", "function": {"name": "extract_entities"}},
    )
    message = response.choices[0].message
    if message.tool_calls:
        args = json.loads(message.tool_calls[0].function.arguments)
        return args
    return None
```
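JSON mode guarantees syntactically valid JSON, not that the expected fields are present, so a small validation step after parsing is usually worthwhile. A minimal sketch, assuming the name/age/city shape from the example above; `REQUIRED_FIELDS` and `parse_user_json` are hypothetical names.

```python
import json

# Hypothetical schema for the user object requested in the example
REQUIRED_FIELDS = {"name": str, "age": int, "city": str}

def parse_user_json(content):
    """Parse model output and check required keys and types.

    Raises ValueError with a readable message when the check fails.
    """
    try:
        data = json.loads(content)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key, expected_type in REQUIRED_FIELDS.items():
        if key not in data:
            raise ValueError(f"missing field: {key}")
        if not isinstance(data[key], expected_type):
            raise ValueError(f"field {key!r} should be {expected_type.__name__}")
    return data

user = parse_user_json('{"name": "Zhang San", "age": 30, "city": "Beijing"}')
print(user["name"], user["age"], user["city"])
```

On a `ValueError`, a caller might retry the request or fall back to a default; which is appropriate depends on the application.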
3. Multimodal
3.1 Vision
```python
# ===== Vision =====
import base64
from openai import OpenAI

client = OpenAI()

def encode_image(image_path: str) -> str:
    """Encode an image file as base64."""
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

def vision_chat(image_path: str, question: str):
    """Ask a question about a local image."""
    base64_image = encode_image(image_path)
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {
                        "type": "image_url",
                        "image_url": {
                            # Assumes a JPEG file; adjust the MIME type for other formats
                            "url": f"data:image/jpeg;base64,{base64_image}",
                            "detail": "high",  # "low" | "high" | "auto"
                        },
                    },
                ],
            }
        ],
        max_tokens=1024,
    )
    return response.choices[0].message.content

def vision_with_url(image_url: str, question: str):
    """Ask a question about an image referenced by URL."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": image_url,
                            "detail": "high",
                        },
                    },
                ],
            }
        ],
    )
    return response.choices[0].message.content

# Multiple images in one request
def multiple_images():
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Compare these two images: what is similar and what differs?"},
                    {
                        "type": "image_url",
                        "image_url": {"url": "https://example.com/image1.jpg"},
                    },
                    {
                        "type": "image_url",
                        "image_url": {"url": "https://example.com/image2.jpg"},
                    },
                ],
            }
        ],
    )
    return response.choices[0].message.content
```
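The base64 example hard-codes `image/jpeg` in the data URL, which mislabels PNG or other formats. A small sketch of deriving the MIME type from the file name instead; `to_data_url` is a hypothetical helper, and it takes the bytes as an argument so the example runs without a file on disk.

```python
import base64
import mimetypes

def to_data_url(image_path, image_bytes):
    """Build a data URL whose MIME type is guessed from the file extension."""
    mime, _ = mimetypes.guess_type(image_path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognized image type: {image_path}")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"

url = to_data_url("photo.png", b"abc")
print(url)
```

The result can be used as the `url` value of an `image_url` content part.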
3.2 Image Generation (DALL-E)
```python
# ===== DALL-E image generation =====
from openai import OpenAI

client = OpenAI()

def generate_image(prompt: str, size: str = "1024x1024"):
    """Generate an image."""
    response = client.images.generate(
        model="dall-e-3",    # or "dall-e-2"
        prompt=prompt,
        size=size,           # dall-e-2: "256x256" | "512x512" | "1024x1024"
                             # dall-e-3: "1024x1024" | "1792x1024" | "1024x1792"
        quality="standard",  # "standard" | "hd" (dall-e-3)
        n=1,                 # dall-e-2 can return several images; dall-e-3 only 1
        style="vivid",       # "vivid" | "natural" (dall-e-3)
    )
    image_url = response.data[0].url
    revised_prompt = response.data[0].revised_prompt  # dall-e-3 rewrites the prompt
    print(f"Image URL: {image_url}")
    print(f"Revised prompt: {revised_prompt}")
    return image_url

def edit_image(image_path: str, mask_path: str, prompt: str):
    """Edit an image (requires a mask)."""
    # Context managers close the files once the upload is done
    with open(image_path, "rb") as image, open(mask_path, "rb") as mask:
        response = client.images.edit(
            model="dall-e-2",  # of the DALL-E models, only dall-e-2 supports edits
            image=image,
            mask=mask,
            prompt=prompt,
            n=1,
            size="1024x1024",
        )
    return response.data[0].url

def create_variation(image_path: str):
    """Create a variation of an image."""
    with open(image_path, "rb") as image:
        response = client.images.create_variation(
            model="dall-e-2",  # only dall-e-2 supports variations
            image=image,
            n=1,
            size="1024x1024",
        )
    return response.data[0].url
```
3.3 Speech to Text (Whisper)
```python
# ===== Whisper speech recognition =====
from openai import OpenAI

client = OpenAI()

def transcribe_audio(audio_path: str):
    """Transcribe an audio file."""
    with open(audio_path, "rb") as audio_file:
        transcript = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
            language="zh",           # optional: naming the language improves accuracy
            response_format="text",  # "json" | "text" | "srt" | "verbose_json" | "vtt"
            temperature=0.0,
        )
    return transcript

def translate_audio(audio_path: str):
    """Translate audio (any supported language to English)."""
    with open(audio_path, "rb") as audio_file:
        translation = client.audio.translations.create(
            model="whisper-1",
            file=audio_file,
            response_format="text",
        )
    return translation

# Batch processing
def batch_transcribe(audio_files: list):
    """Transcribe several files, recording errors per file."""
    results = {}
    for audio_path in audio_files:
        try:
            results[audio_path] = transcribe_audio(audio_path)
        except Exception as e:
            results[audio_path] = f"ERROR: {e}"
    return results
```
3.4 Text to Speech (TTS)
```python
# ===== TTS text to speech =====
from openai import OpenAI

client = OpenAI()

def text_to_speech(text: str, output_path: str):
    """Convert text to speech and stream the audio into a file."""
    with client.audio.speech.with_streaming_response.create(
        model="tts-1",          # "tts-1" | "tts-1-hd"
        voice="alloy",          # "alloy" | "echo" | "fable" | "onyx" | "nova" | "shimmer"
        input=text,
        speed=1.0,              # 0.25 - 4.0
        response_format="mp3",  # "mp3" | "opus" | "aac" | "flac" | "wav" | "pcm"
    ) as response:
        response.stream_to_file(output_path)
    print(f"Audio saved to: {output_path}")

def get_speech_bytes(text: str) -> bytes:
    """Return the complete audio as bytes (buffered in memory, not streamed)."""
    response = client.audio.speech.create(
        model="tts-1",
        voice="alloy",
        input=text,
    )
    # Binary audio data
    return response.content
```
4. Vectors and Search
4.1 Embeddings
```python
# ===== Embeddings =====
from openai import OpenAI
import numpy as np

client = OpenAI()

def get_embedding(text: str, model: str = "text-embedding-3-small"):
    """Get the embedding vector for a text."""
    response = client.embeddings.create(
        model=model,
        input=text,
        encoding_format="float",  # "float" | "base64"
    )
    embedding = response.data[0].embedding
    return np.array(embedding)

def cosine_similarity(vec1, vec2):
    """Compute cosine similarity."""
    return np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2))

def semantic_search(query: str, documents: list):
    """Semantic search: rank documents by similarity to the query."""
    # Embed the query
    query_embedding = get_embedding(query)
    # Embed the documents (one request per document; see batch_embeddings
    # below for a more efficient approach)
    doc_embeddings = [get_embedding(doc) for doc in documents]
    # Compute similarities
    similarities = [
        (doc, cosine_similarity(query_embedding, doc_emb))
        for doc, doc_emb in zip(documents, doc_embeddings)
    ]
    # Sort by similarity, highest first
    similarities.sort(key=lambda x: x[1], reverse=True)
    return similarities

# Batch embedding (more efficient)
def batch_embeddings(texts: list, batch_size: int = 100):
    """Embed texts in batches, one request per batch."""
    all_embeddings = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i:i + batch_size]
        response = client.embeddings.create(
            model="text-embedding-3-small",
            input=batch,
        )
        # Renamed so the local variable no longer shadows the function name
        vectors = [d.embedding for d in response.data]
        all_embeddings.extend(vectors)
    return np.array(all_embeddings)
```
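If document vectors are normalized once when they are indexed, each query reduces to a dot product per document. The sketch below illustrates that idea with the standard library only and made-up 3-dimensional vectors in place of real embeddings; `build_index` and `search` are hypothetical names.

```python
import math

def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def build_index(doc_vectors):
    """Normalize every document vector once, at indexing time."""
    return {doc_id: normalize(vec) for doc_id, vec in doc_vectors.items()}

def search(index, query_vec, k=2):
    """Return the k most similar documents as (doc_id, cosine similarity)."""
    q = normalize(query_vec)
    scored = [
        (doc_id, sum(a * b for a, b in zip(q, vec)))
        for doc_id, vec in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy vectors standing in for real embeddings
index = build_index({
    "refund": [1.0, 0.0, 0.0],
    "shipping": [0.0, 1.0, 0.0],
    "returns": [0.9, 0.1, 0.0],
})
hits = search(index, [2.0, 0.0, 0.0], k=2)
print(hits)
```

For more than a few thousand documents, a NumPy matrix product or a vector database is the more practical choice.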
4.2 Vector Database Integration
```python
# ===== Vector database integration (Pinecone example) =====
import os
from openai import OpenAI
from pinecone import Pinecone, ServerlessSpec

# Initialize
client = OpenAI()
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

# Create the index if it does not exist yet
index_name = "document-embeddings"
if index_name not in pc.list_indexes().names():
    pc.create_index(
        name=index_name,
        dimension=1536,  # dimension of text-embedding-3-small
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-west-2"),
    )
index = pc.Index(index_name)

def upsert_documents(documents: list):
    """Insert documents into the vector database."""
    # Embed all documents in a single request
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=[doc["content"] for doc in documents],
    )
    vectors = [
        {
            "id": doc["id"],
            "values": item.embedding,
            "metadata": {"content": doc["content"], "title": doc["title"]},
        }
        for doc, item in zip(documents, response.data)
    ]
    # Insert into Pinecone
    index.upsert(vectors=vectors)
    print(f"Inserted {len(vectors)} documents")

def search_documents(query: str, top_k: int = 5):
    """Search documents."""
    # Embed the query
    query_emb = client.embeddings.create(
        model="text-embedding-3-small",
        input=query,
    )
    # Search
    results = index.query(
        vector=query_emb.data[0].embedding,
        top_k=top_k,
        include_metadata=True,
    )
    return [
        {
            "score": match.score,
            "title": match.metadata["title"],
            "content": match.metadata["content"],
        }
        for match in results.matches
    ]
```
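For large document sets, both the embedding request and the upsert are easier to manage when the work is split into fixed-size batches. A minimal sketch of such a splitter; `chunked` is a hypothetical helper and the batch size of 100 is an assumption, not a documented limit.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Hypothetical documents standing in for real content
documents = [{"id": f"doc-{i}", "content": f"text {i}"} for i in range(250)]
batch_sizes = [len(batch) for batch in chunked(documents, 100)]
print(batch_sizes)
```

Each batch could then be embedded and passed to `index.upsert(vectors=...)` in turn.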
5. Error Handling and Retries
5.1 Error Types
```python
# ===== OpenAI API error types =====
import time

from openai import (
    OpenAI,
    APIConnectionError,
    APITimeoutError,
    AuthenticationError,
    BadRequestError,
    InternalServerError,
    RateLimitError,
)
# Other exported error classes: APIError (base class), ConflictError,
# NotFoundError, PermissionDeniedError, UnprocessableEntityError.
# Note: `openai.Timeout` is the httpx timeout configuration class, not an
# exception. Timeouts raise APITimeoutError.

client = OpenAI()

def robust_api_call(messages: list, max_retries: int = 3):
    """Robust API call with retries."""
    for attempt in range(max_retries):
        last_attempt = attempt == max_retries - 1
        try:
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=messages,
                timeout=30.0,  # 30-second timeout
            )
            return response.choices[0].message.content
        except RateLimitError:
            # Rate limited: exponential backoff
            if last_attempt:
                raise
            wait_time = 2 ** attempt
            print(f"Rate limited, retrying in {wait_time}s...")
            time.sleep(wait_time)
        except APITimeoutError as e:
            # Timeout. Must be listed before APIConnectionError, its parent class.
            print(f"Request timed out: {e}")
            if last_attempt:
                raise
            time.sleep(2)
        except APIConnectionError as e:
            # Network connection error
            print(f"Connection error: {e}")
            if last_attempt:
                raise
            time.sleep(2)
        except InternalServerError as e:
            # OpenAI server error (HTTP 5xx)
            print(f"OpenAI server error: {e}")
            if last_attempt:
                raise
            time.sleep(5)
        except AuthenticationError as e:
            # Invalid API key: retrying will not help
            print(f"Authentication failed, check your API key: {e}")
            raise
        except BadRequestError as e:
            # Invalid request parameters: retrying will not help
            print(f"Invalid request: {e}")
            raise
        except Exception as e:
            # Anything else
            print(f"Unexpected error: {e}")
            if last_attempt:
                raise
            time.sleep(1)
    raise RuntimeError("Maximum number of retries reached")
```
5.2 Retry Decorator
```python
# ===== Retry decorator =====
import asyncio
import functools
import time

from openai import (
    AsyncOpenAI,
    OpenAI,
    APIConnectionError,
    InternalServerError,
    RateLimitError,
)

def retry_on_error(max_retries: int = 3, backoff_factor: float = 2.0):
    """Retry decorator with exponential backoff."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except (RateLimitError, APIConnectionError, InternalServerError) as e:
                    if attempt == max_retries - 1:
                        raise
                    wait_time = backoff_factor ** attempt
                    print(f"Error: {e}, retrying in {wait_time}s...")
                    time.sleep(wait_time)
            raise RuntimeError("Maximum number of retries reached")
        return wrapper
    return decorator

# Using the decorator
@retry_on_error(max_retries=5, backoff_factor=2.0)
def call_openai_api(messages: list):
    """Call the OpenAI API."""
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
    )
    return response.choices[0].message.content

# Async version
def async_retry_on_error(max_retries: int = 3, backoff_factor: float = 2.0):
    """Async retry decorator."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return await func(*args, **kwargs)
                except (RateLimitError, APIConnectionError, InternalServerError) as e:
                    if attempt == max_retries - 1:
                        raise
                    wait_time = backoff_factor ** attempt
                    print(f"Error: {e}, retrying in {wait_time}s...")
                    await asyncio.sleep(wait_time)
            raise RuntimeError("Maximum number of retries reached")
        return wrapper
    return decorator

@async_retry_on_error(max_retries=5)
async def async_call_openai(messages: list):
    """Call the OpenAI API asynchronously."""
    client = AsyncOpenAI()
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
    )
    return response.choices[0].message.content
```
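With plain exponential backoff, many clients that failed at the same moment also retry at the same moment. Adding random jitter and a cap on the delay is a common refinement. The following sketch is independent of the SDK: `backoff_delays` and `call_with_retry` are hypothetical helpers, `ConnectionError` stands in for the retryable OpenAI exceptions, and `sleep` is injectable so the demo runs instantly.

```python
import random
import time

def backoff_delays(retries, base=1.0, factor=2.0, cap=30.0, rng=None):
    """Delays with 'full jitter': uniform in [0, min(cap, base * factor**n)]."""
    if rng is None:
        rng = random.Random()
    return [
        rng.uniform(0, min(cap, base * factor ** attempt))
        for attempt in range(retries)
    ]

def call_with_retry(func, retryable=(ConnectionError,), max_attempts=3,
                    sleep=time.sleep, rng=None):
    """Call func(), retrying on the given exception types with jittered backoff."""
    delays = backoff_delays(max_attempts - 1, rng=rng)
    for attempt in range(max_attempts):
        try:
            return func()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(delays[attempt])

# Demo: a function that fails twice, then succeeds
attempts = []
slept = []

def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("temporary failure")
    return "ok"

outcome = call_with_retry(flaky, max_attempts=3, sleep=slept.append,
                          rng=random.Random(1))
print(outcome, len(attempts), len(slept))
```

In real use, `retryable` would be a tuple such as `(RateLimitError, APIConnectionError, InternalServerError)` and `sleep` would stay at its default.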
6. Cost Control
6.1 Counting Tokens
```python
# ===== Token counting =====
import tiktoken

# USD per 1M tokens. A snapshot; check the official pricing page.
PRICING = {
    "gpt-4o": {"input": 5.00, "output": 15.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
    "gpt-4-turbo": {"input": 10.00, "output": 30.00},
    "gpt-3.5-turbo": {"input": 0.50, "output": 1.50},
}

def count_tokens(text: str, model: str = "gpt-4o"):
    """Count the tokens in a text.

    Chat requests add a few tokens of overhead per message, so this
    slightly underestimates what the API bills.
    """
    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        # Unknown model: fall back to cl100k_base
        encoding = tiktoken.get_encoding("cl100k_base")
    return len(encoding.encode(text))

def cost_from_tokens(input_tokens: int, output_tokens: int, model: str):
    """Compute the cost for known token counts; None for an unknown model."""
    if model not in PRICING:
        return None
    input_cost = (input_tokens / 1_000_000) * PRICING[model]["input"]
    output_cost = (output_tokens / 1_000_000) * PRICING[model]["output"]
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "input_cost_usd": input_cost,
        "output_cost_usd": output_cost,
        "total_cost_usd": input_cost + output_cost,
    }

def estimate_cost(input_text: str, output_text: str, model: str = "gpt-4o-mini"):
    """Estimate the cost from the raw texts."""
    return cost_from_tokens(
        count_tokens(input_text, model),
        count_tokens(output_text, model),
        model,
    )

# Monitor token usage
def track_token_usage(response, model: str = "gpt-4o-mini"):
    """Print the usage reported by the API and the resulting cost."""
    usage = response.usage
    print(f"Input tokens: {usage.prompt_tokens}")
    print(f"Output tokens: {usage.completion_tokens}")
    print(f"Total tokens: {usage.total_tokens}")
    # Compute the cost from the billed token counts, not from re-tokenized text
    cost = cost_from_tokens(usage.prompt_tokens, usage.completion_tokens, model)
    if cost is not None:
        print(f"Estimated cost: ${cost['total_cost_usd']:.6f}")
    return usage
```
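Per-call numbers become more useful when they are accumulated and compared against a budget. A minimal sketch of such a tracker, assuming the per-1M-token prices quoted in this article (which may be out of date); `UsageTracker` is a hypothetical class, and the token counts would come from `response.usage` in real use.

```python
# USD per 1M tokens, copied from the article's table (snapshot, may be stale)
PRICING_USD_PER_1M = {
    "gpt-4o": {"input": 5.00, "output": 15.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}

class UsageTracker:
    """Accumulate token usage and cost across calls."""

    def __init__(self, budget_usd):
        self.budget_usd = budget_usd
        self.prompt_tokens = 0
        self.completion_tokens = 0
        self.cost_usd = 0.0

    def record(self, model, prompt_tokens, completion_tokens):
        """Add one call's usage and return the cost of that call."""
        price = PRICING_USD_PER_1M[model]
        cost = (prompt_tokens * price["input"]
                + completion_tokens * price["output"]) / 1_000_000
        self.prompt_tokens += prompt_tokens
        self.completion_tokens += completion_tokens
        self.cost_usd += cost
        return cost

    @property
    def over_budget(self):
        return self.cost_usd > self.budget_usd

tracker = UsageTracker(budget_usd=0.40)
call_cost = tracker.record("gpt-4o-mini", prompt_tokens=1_000_000,
                           completion_tokens=500_000)
print(f"cost so far: ${tracker.cost_usd:.2f}, over budget: {tracker.over_budget}")
```

What happens once `over_budget` is true (logging, switching to a cheaper model, refusing requests) is left to the application.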
6.2 Cost Optimization Strategies
```python
# ===== Cost optimization strategies =====
import hashlib
import json
import os
import time

import tiktoken
from openai import OpenAI

client = OpenAI()

def count_tokens(text: str, model: str = "gpt-4o"):
    """Same helper as in 6.1, repeated so this block runs on its own."""
    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        encoding = tiktoken.get_encoding("cl100k_base")
    return len(encoding.encode(text))

# Strategy 1: use a cheaper model
def use_cheaper_model():
    """Use a cheap model for simple tasks."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # roughly 25-33x cheaper than gpt-4o at the prices listed in 1.2
        messages=[
            {"role": "user", "content": "1+1=?"}
        ],
    )
    return response.choices[0].message.content

# Strategy 2: reduce input tokens (compress the context)
def compress_context(messages: list, max_tokens: int = 4096):
    """Compress the context by keeping only recent messages."""
    # Keep system messages
    system_msgs = [m for m in messages if m["role"] == "system"]
    # Keep only the 10 most recent non-system messages
    recent_msgs = [m for m in messages if m["role"] != "system"]
    recent_msgs = recent_msgs[-10:]
    compressed = system_msgs + recent_msgs
    # Check the token count (messages without string content are skipped)
    total_tokens = sum(
        count_tokens(m["content"])
        for m in compressed
        if isinstance(m.get("content"), str)
    )
    if total_tokens > max_tokens:
        # Compress further
        compressed = system_msgs + recent_msgs[-5:]
    return compressed

# Strategy 3: cache responses (avoid repeated calls)
class CachedOpenAI:
    """OpenAI client with a file-based cache."""

    def __init__(self, cache_dir: str = ".openai_cache"):
        self.client = OpenAI()
        self.cache_dir = cache_dir
        os.makedirs(cache_dir, exist_ok=True)

    def _cache_key(self, messages: list, model: str):
        """Build the cache key."""
        content = json.dumps({"messages": messages, "model": model}, ensure_ascii=False)
        return hashlib.md5(content.encode()).hexdigest()

    def _get_cache(self, cache_key: str):
        """Read from the cache."""
        cache_path = os.path.join(self.cache_dir, f"{cache_key}.json")
        if os.path.exists(cache_path):
            with open(cache_path, "r", encoding="utf-8") as f:
                return json.load(f)
        return None

    def _set_cache(self, cache_key: str, response):
        """Write to the cache."""
        cache_path = os.path.join(self.cache_dir, f"{cache_key}.json")
        with open(cache_path, "w", encoding="utf-8") as f:
            json.dump(response, f, ensure_ascii=False)

    def chat(self, messages: list, model: str = "gpt-4o-mini"):
        """Chat with caching."""
        cache_key = self._cache_key(messages, model)
        # Check the cache ("is not None" so an empty reply still counts as a hit)
        cached = self._get_cache(cache_key)
        if cached is not None:
            print("Cache hit")
            return cached
        # Call the API
        response = self.client.chat.completions.create(
            model=model,
            messages=messages,
        )
        result = response.choices[0].message.content
        # Write to the cache
        self._set_cache(cache_key, result)
        return result

# Strategy 4: batch processing (Batch API)
def batch_api_example():
    """Use the Batch API (cheaper)."""
    # The Batch API is 50% cheaper than the real-time API
    # and suits offline processing.
    # Create the batch job
    batch = client.batches.create(
        input_file_id="file-xxx",  # an uploaded JSONL file containing the tasks
        endpoint="/v1/chat/completions",
        completion_window="24h",
    )
    batch_id = batch.id
    print(f"Batch ID: {batch_id}")
    # Query the job status
    batch_status = client.batches.retrieve(batch_id)
    print(f"Status: {batch_status.status}")
    # Wait for a terminal state
    while batch_status.status not in ["completed", "failed", "cancelled", "expired"]:
        time.sleep(60)
        batch_status = client.batches.retrieve(batch_id)
    # Fetch the results
    if batch_status.status == "completed":
        result_file_id = batch_status.output_file_id
        # Download the result file
        # ...
```
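The Batch API reads its tasks from a JSONL file in which every line is one request with a `custom_id`, `method`, `url`, and `body`. A short sketch of producing that content from a list of prompts; `build_batch_jsonl` and the `task-N` id scheme are assumptions for illustration.

```python
import json

def build_batch_jsonl(prompts, model="gpt-4o-mini"):
    """Return JSONL text with one chat completion request per prompt."""
    lines = []
    for i, prompt in enumerate(prompts):
        request = {
            "custom_id": f"task-{i}",  # used to match results to inputs
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        lines.append(json.dumps(request, ensure_ascii=False))
    return "\n".join(lines) + "\n"

jsonl_text = build_batch_jsonl(["Summarize document A", "Summarize document B"])
print(jsonl_text)
```

The text would be written to a file and uploaded with `client.files.create(file=..., purpose="batch")`; the returned file id is what goes into `input_file_id`.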
7. Production Case Studies
7.1 Case Study: Customer Service Bot
```python
# ===== Case study: customer service bot =====
from openai import OpenAI
import json
import time

client = OpenAI()

class CustomerServiceBot:
    """Customer service bot."""

    # One system prompt for both API calls, so the tone stays consistent
    SYSTEM_PROMPT = "You are a customer service assistant. Use the tools to help customers solve their problems."

    def __init__(self, knowledge_base: list):
        self.client = client
        self.knowledge_base = knowledge_base
        self.conversation_history = []
        # Tool definitions
        self.tools = [
            {
                "type": "function",
                "function": {
                    "name": "search_knowledge_base",
                    "description": "Search the knowledge base",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "query": {"type": "string"}
                        },
                        "required": ["query"],
                    },
                },
            },
            {
                "type": "function",
                "function": {
                    "name": "create_ticket",
                    "description": "Create a support ticket",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "title": {"type": "string"},
                            "description": {"type": "string"},
                        },
                        "required": ["title", "description"],
                    },
                },
            },
        ]

    def search_knowledge_base(self, query: str):
        """Search the knowledge base (mock: substring match)."""
        results = []
        for doc in self.knowledge_base:
            if query.lower() in doc["content"].lower():
                results.append(doc)
        return results[:3]  # top 3

    def create_ticket(self, title: str, description: str):
        """Create a ticket (mock)."""
        ticket_id = f"TICKET-{int(time.time())}"
        return {
            "ticket_id": ticket_id,
            "title": title,
            "description": description,
            "status": "open",
        }

    def _messages(self):
        return [{"role": "system", "content": self.SYSTEM_PROMPT}] + self.conversation_history

    def chat(self, user_message: str):
        """Handle one user message."""
        # Add the user message to the history
        self.conversation_history.append({
            "role": "user",
            "content": user_message,
        })
        # Call the API
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=self._messages(),
            tools=self.tools,
            tool_choice="auto",
        )
        message = response.choices[0].message

        # Handle tool calls
        if message.tool_calls:
            self.conversation_history.append({
                "role": "assistant",
                "content": message.content,
                "tool_calls": [
                    {
                        "id": tc.id,
                        "type": tc.type,
                        "function": {
                            "name": tc.function.name,
                            "arguments": tc.function.arguments,
                        },
                    }
                    for tc in message.tool_calls
                ],
            })
            # Execute the tools
            for tool_call in message.tool_calls:
                func_name = tool_call.function.name
                func_args = json.loads(tool_call.function.arguments)
                if func_name == "search_knowledge_base":
                    result = self.search_knowledge_base(**func_args)
                elif func_name == "create_ticket":
                    result = self.create_ticket(**func_args)
                else:
                    result = {"error": "Unknown function"}
                self.conversation_history.append({
                    "tool_call_id": tool_call.id,
                    "role": "tool",
                    "name": func_name,
                    "content": json.dumps(result, ensure_ascii=False),
                })
            # Call the API again with the tool results
            final_response = self.client.chat.completions.create(
                model="gpt-4o",
                messages=self._messages(),
            )
            assistant_reply = final_response.choices[0].message.content
        else:
            # No tool call
            assistant_reply = message.content

        self.conversation_history.append({
            "role": "assistant",
            "content": assistant_reply,
        })
        return assistant_reply

# Usage
if __name__ == "__main__":
    # Initialize the knowledge base
    kb = [
        {"id": 1, "title": "How to get a refund", "content": "Refund process..."},
        {"id": 2, "title": "Shipping time", "content": "Shipping takes 3-5 days..."},
    ]
    bot = CustomerServiceBot(kb)
    print("Customer service bot started. Type 'exit' to quit.")
    while True:
        user_input = input("Customer: ")
        if user_input.lower() == "exit":
            break
        reply = bot.chat(user_input)
        print(f"Agent: {reply}")
```
7.2 案例:内容审核系统
python
# ===== Case study: content moderation system =====
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


class ContentModerator:
    """Content moderation system."""

    def __init__(self):
        self.client = client
        self.moderation_cache = {}  # text -> cached moderation result

    def moderate_text(self, text: str):
        """Moderate one text, reusing the cached result when there is one."""
        # Check the cache (keyed on the text itself, so hash collisions cannot mix up results)
        if text in self.moderation_cache:
            return self.moderation_cache[text]
        # Call the Moderation API
        response = self.client.moderations.create(
            model="text-moderation-latest",
            input=text
        )
        result = response.results[0]
        # Cache the result
        self.moderation_cache[text] = result
        return result

    def is_flagged(self, text: str):
        """Return True if the text was flagged."""
        result = self.moderate_text(text)
        return result.flagged

    def get_violations(self, text: str):
        """Return the flagged categories with their scores."""
        result = self.moderate_text(text)
        violations = []
        scores = result.category_scores
        # model_dump() yields {field_name: bool}; field names match the score attributes
        for category, flagged in result.categories.model_dump().items():
            if flagged:
                violations.append({
                    "category": category,
                    "score": getattr(scores, category)
                })
        return violations

    def batch_moderate(self, texts: list):
        """Moderate several texts in a single request."""
        results = []
        # The Moderation API accepts a list of inputs
        response = self.client.moderations.create(
            model="text-moderation-latest",
            input=texts
        )
        for text, result in zip(texts, response.results):
            # Cache each result so get_violations() does not call the API again per text
            self.moderation_cache[text] = result
            results.append({
                "text": text,
                "flagged": result.flagged,
                "violations": self.get_violations(text)
            })
        return results


# Usage
if __name__ == '__main__':
    moderator = ContentModerator()
    # Single-text moderation
    text = "This post contains prohibited content..."
    if moderator.is_flagged(text):
        violations = moderator.get_violations(text)
        print(f"Violations: {violations}")
    else:
        print("Content is safe")
    # Batch moderation
    texts = [
        "Normal content",
        "Violating content 1",
        "Violating content 2"
    ]
    results = moderator.batch_moderate(texts)
    for r in results:
        print(f"Text: {r['text'][:20]}..., flagged: {r['flagged']}")
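The `flagged` field is a yes/no signal. When a service also wants a "send to human review" tier, one option is to compare the scores of the flagged categories against its own threshold. The sketch below assumes the list-of-dicts shape returned by `get_violations` above; the `decide` helper and the 0.8 threshold are hypothetical and not part of the OpenAI API.

```python
from typing import Dict, List

# Hypothetical threshold; tune it against your own labeled data.
BLOCK_THRESHOLD = 0.8


def decide(violations: List[Dict], block_at: float = BLOCK_THRESHOLD) -> str:
    """Map a list of {"category", "score"} dicts to "allow", "review", or "block"."""
    if not violations:
        return "allow"  # nothing was flagged
    top_score = max(v["score"] for v in violations)
    # High-confidence violations are blocked; the rest go to a human reviewer
    return "block" if top_score >= block_at else "review"


if __name__ == "__main__":
    sample = [{"category": "harassment", "score": 0.55}]
    print(decide(sample))
```

Keeping the decision in a pure function like this makes the policy easy to unit-test without calling the API.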
8. Summary
8.1 Key Points
OpenAI API in practice:
Basic calls: model selection, parameter tuning
Advanced features: function calling, JSON mode, streaming responses
Multimodal: Vision, DALL-E, Whisper, TTS
Production practice: error handling, cost control, performance optimization
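8.2 Best Practices
| Practice | Description |
|---|---|
| Model selection | Use gpt-4o-mini for simple tasks and gpt-4o for complex ones |
| Function calling | Let the model call external tools to extend what it can do |
| Streaming responses | Improve the user experience by reducing perceived wait time |
| Error handling | Implement retries and handle rate limits |
| Cost control | Use caching, the Batch API, and cheaper models |
| Moderation API | User input must be moderated in production |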
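The error-handling row can be sketched as a small retry wrapper with exponential backoff and jitter. This is a minimal illustration, not the SDK's own mechanism: `call_with_retry` and its parameters are hypothetical names, and the delays are example values.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying on the given exceptions with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles on each attempt (1s, 2s, 4s, ...), capped at max_delay
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Up to 10% jitter spreads out retries from concurrent clients
            sleep(delay + random.uniform(0, delay * 0.1))
            attempt += 1
```

With the official Python SDK, a call could be wrapped as `call_with_retry(lambda: client.chat.completions.create(...), retry_on=(openai.RateLimitError, openai.APIConnectionError))`. The SDK client also accepts a `max_retries` option, which may be enough on its own for simple cases.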
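8.3 Cost Optimization
| Strategy | Savings |
|---|---|
| Use gpt-4o-mini | Roughly 30× cheaper than gpt-4o at launch pricing; check current rates |
| Batch API | 50% cheaper than the real-time API |
| Cache results | Avoids repeated calls |
| Compress context | Fewer input tokens |
| Limit max_tokens | Caps output length |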
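To compare these strategies for a concrete workload, the arithmetic can be sketched as below. The `Price` class and `estimate_cost` function are hypothetical helpers, and the prices in the demo are placeholders rather than official rates; only the 50% Batch API discount comes from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """Price in USD per 1M tokens (fill in from the current pricing page)."""
    input_per_m: float
    output_per_m: float


def estimate_cost(price: Price, input_tokens: int, output_tokens: int,
                  batch: bool = False, batch_discount: float = 0.5) -> float:
    """Estimate the USD cost of a workload, optionally with the Batch API discount."""
    cost = (input_tokens * price.input_per_m
            + output_tokens * price.output_per_m) / 1_000_000
    return cost * (1 - batch_discount) if batch else cost


if __name__ == "__main__":
    # Placeholder prices for a "large" and a "small" model, not official numbers
    large = Price(input_per_m=2.0, output_per_m=8.0)
    small = Price(input_per_m=0.2, output_per_m=0.8)
    for name, price in (("large", large), ("small", small)):
        realtime = estimate_cost(price, 1_000_000, 500_000)
        batched = estimate_cost(price, 1_000_000, 500_000, batch=True)
        print(f"{name}: real-time ${realtime:.2f}, batch ${batched:.2f}")
```

Because input and output tokens are priced separately, trimming context and capping `max_tokens` reduce different terms of the same sum.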
This article is based on the official OpenAI API documentation. Questions are welcome in the comments!