一招解决 AI 数据格式问题：让 AI 乖乖返回你要的数据结构

前言

在实际开发中，我们经常需要 AI 生成各种格式的数据。但是 AI 返回的数据格式往往不够规范，需要额外的处理。本文将介绍一个万能方法，让 AI 生成符合预期的数据结构。

核心思路

使用 Zod 定义数据结构
将 Zod Schema 转换为 JSON Schema
在提示词中加入格式说明
解析并验证 AI 返回的数据

代码实现

首先，我们需要安装必要的依赖：

bash 复制代码

npm install zod zod-to-json-schema

然后，实现核心代码:

ts 复制代码

import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';

class StructuredOutputParser<T extends z.ZodTypeAny> {
  schema: T;
  constructor(schema: T) {
    this.schema = schema;
  }
  getFormatInstructions() {
    return `输出应格式化为符合以下 JSON Schema 的 JSON 实例。
    例如，对于 Schema { "properties": { "foo": { "title": "Foo", "description": "a list of strings", "type": "array", "items": { "type": "string" } } }, "required": ["foo"] }，
    对象 { "foo": ["bar", "baz"] } 是格式正确的 JSON 实例。
    对象 { "properties": { "foo": ["bar", "baz"] } } 不是 格式正确的 JSON 实例。
    以下是输出的 Schema:
\`\`\`json
${JSON.stringify(zodToJsonSchema(this.schema))}
\`\`\`
`;
  }
  async parse(text: string): Promise<z.infer<T>> {
    try {
      const json = text.includes('```') ? text.trim().split(/```(?:json)?/)[1] : text.trim();

      const escapedJson = json
        .replace(/"([^"\\]*(\\.[^"\\]*)*)"/g, (_match, capturedGroup) => {
          const escapedInsideQuotes = capturedGroup.replace(/\n/g, '\\n');
          return `"${escapedInsideQuotes}"`;
        })
        .replace(/\n/g, '');

      return await this.schema.parseAsync(JSON.parse(escapedJson));
    } catch (e) {
      throw new Error(`Failed to parse. Text: "${text}". Error: ${e}`);
    }
  }
}

class ParameterPrompt {
  private outputParser(schema: z.ZodTypeAny) {
    return new StructuredOutputParser(schema);
  }

  private createZodSchema(value: any): z.ZodTypeAny {
    if (typeof value === 'string') {
      return z.string();
    }
    if (typeof value === 'number') {
      return z.number();
    }
    if (typeof value === 'boolean') {
      return z.boolean();
    }
    if (Array.isArray(value)) {
      if (value.length > 0) {
        return z.array(this.createZodSchema(value[0]));
      }
      return z.array(z.any());
    }
    if (value === null) {
      return z.any().nullable();
    }
    if (typeof value === 'object' && value !== null) {
      return this.createObjectSchema(value);
    }
    return z.any();
  }

  private createObjectSchema(obj: Record<string, any>): z.ZodObject<any> {
    const schemaShape = Object.entries(obj).reduce(
      (acc, [key, value]) => {
        acc[key] = this.createZodSchema(value);
        return acc;
      },
      {} as Record<string, z.ZodTypeAny>,
    );

    return z.object(schemaShape);
  }

  public getPrompt(prompt: string, parameter: object) {
    const generatedZodSchema = this.createObjectSchema(parameter);
    const parser = this.outputParser(generatedZodSchema);
    const formatInstructions = parser.getFormatInstructions();
    const promptTemplate =
      prompt.indexOf('{{parameter}}') !== -1
        ? prompt.replace('{{parameter}}', formatInstructions)
        : prompt + formatInstructions;
    return { prompt: promptTemplate, parser };
  }
}

示例

生成人物信息

ts 复制代码

const { prompt, parser } = new ParameterPrompt().getPrompt(
    '请生成一个虚拟人物信息', 
    { 
        name: '名称',
        age: 18,
        description: '描述',
        hobbies: ['爱好']
    }
);

const result = await fetch('AI接口', {
    method: 'POST',
    body: JSON.stringify({ prompt })
});

const data = await parser.parse(result);
// 输出示例：
// {
//   "name": "李四",
//   "age": 25,
//   "description": "一个热爱生活的年轻人",
//   "hobbies": ["阅读", "旅行", "摄影"]
// }

生成商品数据

ts 复制代码

const { prompt, parser } = new ParameterPrompt().getPrompt(
    '请生成一个商品信息', 
    {
        id: 1,
        title: '商品名称',
        price: 99.9,
        tags: ['标签'],
        specs: {
            color: '颜色',
            size: '尺寸'
        }
    }
);
const result = await fetch('AI接口', {
    method: 'POST',
    body: JSON.stringify({ prompt })
});
const data = await parser.parse(result);
// 输出示例：
// {
//   "id": 101,
//   "title": "智能蓝牙耳机",
//   "price": 299.99,
//   "tags": ["电子产品", "蓝牙", "无线"],
//   "specs": {
//     "color": "黑色",
//     "size": "标准"
//   }
// }

总结

这个方案通过结合 Zod 的类型系统和自动化的 Schema 生成，提供了一种优雅的方式来处理 AI 生成的结构化数据。它不仅确保了数据格式的准确性，还提高了开发效率。希望这个方法能帮助大家更好地使用 AI 生成所需的数据结构。如果觉得有帮助，欢迎点赞转发，也欢迎在评论区分享你的想法和建议！

一招解决 AI 数据格式问题：让 AI 乖乖返回你要的数据结构

一招解决 AI 数据格式问题：让 AI 乖乖返回你要的数据结构

前言

核心思路

代码实现

示例

总结

参考资料