Text2SQL methods: ChatGPT-based zero-shot approaches (ChatGPT SQL and C3)

ChatGPT SQL

ChatGPT SQL comes from the March 2023 paper 《A comprehensive evaluation of ChatGPT's zero-shot Text-to-SQL capability》 (github). The paper analyzes ChatGPT's text2sql ability and the experiments show it is impressive: although it does not reach the SOTA of the time, it is only 14% below SOTA without any training, and its robustness is only 7.8% behind SOTA.

The paper prompts ChatGPT in a zero-shot way to generate SQL, using the demo prompt shown by OpenAI without any deliberate tuning, as illustrated in Figure 1 of the paper: the upper half shows text2sql in the single-turn setting, and the lower half shows text2sql in the multi-turn setting.
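For concreteness, below is a minimal sketch of how such a single-turn zero-shot prompt can be assembled. The wording only approximates OpenAI's demo format, and the helper name is illustrative rather than code from the paper.

python

    def build_zero_shot_prompt(schema: dict[str, list[str]], question: str) -> str:
        """Assemble a single-turn zero-shot text2sql prompt in the spirit of
        OpenAI's demo format (approximate wording, not the paper's exact prompt)."""
        lines = [
            "### Complete sqlite SQL query only and with no explanation",
            "### SQLite SQL tables, with their properties:",
            "#",
        ]
        for table, columns in schema.items():
            lines.append(f"# {table} ( {', '.join(columns)} )")
        lines += ["#", f"### {question}", "SELECT"]
        return "\n".join(lines)

    # usage with a fragment of the Spider car_1 schema
    prompt = build_zero_shot_prompt(
        {"car_makers": ["id", "maker", "fullname", "country"],
         "cars_data": ["id", "mpg", "cylinders", "year"]},
        "What is the name of the different car makers who produced a car in 1970?",
    )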

C3

The text2sql method C3 comes from the July 2023 paper 《C3: Zero-shot Text-to-SQL with ChatGPT》 (github). It prompts ChatGPT in a zero-shot way to generate SQL.

C3's prompt consists of three components: Clear Prompting (CP), Calibration with Hints (CH), and Consistent Output (CO).

Clear Prompting: consists of two parts, clear layout and clear context, as shown in Figure 2(c) of the paper.

  • clear layout: use # to separate the instruction, the context (the database schema), and the question in the prompt. Because experiments show that the instruction in Figure 2(b) of the paper tends to produce SQL with redundant columns, the clause "and do not select extra columns that are not explicitly requested in the query" is appended to the instruction.
  • clear context: select the tables and columns relevant to the question via schema linking. Schema linking is done by zero-shot prompting ChatGPT and consists of Table Recall and Column Recall:

    • Table Recall: ChatGPT is zero-shot prompted to select tables in three steps, with self-consistency for stability: ChatGPT generates 10 result sets, each containing the top-4 tables, and the final answer is the result set that occurs most frequently among the 10.

      python
      """
      Given the database schema and question, perform the following actions: 
      1 - Rank all the tables based on the possibility of being used in the SQL according to the question from the most relevant to the least relevant, Table or its column that matches more with the question words is highly relevant and must be placed ahead. 
      2 - Check whether you consider all the tables. 
      3 - Output a list object in the order of step 2, Your output should contain all the tables. The format should be like: 
      [
      "table_1", "table_2", ...
      ]
      
      Schema:
      # continents ( contid, continent )
      # countries ( countryid, countryname, continent )
      # car_makers ( id, maker, fullname, country )
      # model_list ( modelid, maker, model )
      # car_names ( makeid, model, make )
      # cars_data ( id, mpg, cylinders, edispl, horsepower, weight, accelerate, year )
      Question:
      ### What is the name of the different car makers who produced a car in 1970?
      """
    • Column Recall: ChatGPT is likewise zero-shot prompted to recall columns, this time in two steps, again with self-consistency for stability: for each table ChatGPT generates 10 result sets, and the final answer keeps the 5 columns that appear most frequently across them (a voting sketch covering both recall steps follows after the prompt below).

      python
      """
      Given the database tables and question, perform the following actions: 
      1 - Rank the columns in each table based on the possibility of being used in the SQL, Column that matches more with the question words or the foreign key is highly relevant and must be placed ahead.
      You should output them in the order of the most relevant to the least relevant. 
      Explain why you choose each column. 
      2 - Output a JSON object that contains all the columns in each table according to your explanation. The format should be like: 
      { 
      "table_1": ["column_1", "column_2", ......],
      "table_2": ["column_1", "column_2", ......],
      "table_3": ["column_1", "column_2", ......],
      ...... 
      } 
      
      Schema: 
      # car_makers ( id, maker, fullname, country )
      # model_list ( modelid, maker, model )
      # car_names ( makeid, model, make ) 
      # cars_data ( id, mpg, cylinders, edispl, horsepower, weight, accelerate, year )
      Foreign keys: 
      # model_list.maker = car_makers.id 
      # car_names.model = model_list.model 
      # cars_data.id = car_names.makeid 
      
      Question:
      ### What is the name of the different car makers who produced a car in 1970?
      
      """

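      The voting sketch referenced above, assuming the 10 completions have already been parsed into Python lists/dicts; the function names are illustrative and not taken from the C3 repository.

      python

      from collections import Counter

      def vote_tables(samples: list[list[str]], top_k: int = 4) -> list[str]:
          """Table Recall voting: each sample is one ranked table list from ChatGPT;
          keep the top-k tables of each sample and return the top-k set that
          occurs most often across the 10 samples."""
          candidate_sets = [tuple(sorted(s[:top_k])) for s in samples]
          return list(Counter(candidate_sets).most_common(1)[0][0])

      def vote_columns(samples: list[dict[str, list[str]]], top_k: int = 5) -> dict[str, list[str]]:
          """Column Recall voting: each sample maps table -> ranked column list;
          for every table keep the k columns that appear most frequently."""
          counts: dict[str, Counter] = {}
          for sample in samples:
              for table, cols in sample.items():
                  counts.setdefault(table, Counter()).update(cols)
          return {t: [c for c, _ in cnt.most_common(top_k)] for t, cnt in counts.items()}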
Calibration with Hints: an analysis of the SQL generated by ChatGPT shows that it is prone to bias-induced errors such as those in Figure 3 of the paper, so the two hints shown in the upper-right part of Figure 1 of the paper are added to the prompt.
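One straightforward way to attach such hints is to prepend them as an earlier turn of the conversation so that they act as prior knowledge for the actual request. This is only a sketch: the hint text below paraphrases the bias patterns the paper targets (selecting extra columns, and over-using LEFT JOIN/IN/OR instead of JOIN/INTERSECT), and neither the wording nor the message layout is necessarily what the official implementation uses.

python

    # Hedged sketch: paraphrased hints, illustrative message layout.
    CALIBRATION_HINTS = (
        "Tips for writing the SQL:\n"
        "1. In the SELECT clause, only select columns that the question explicitly asks for.\n"
        "2. Prefer JOIN and INTERSECT over LEFT JOIN, IN and OR when they express the same logic."
    )

    def build_messages(clear_prompt: str) -> list[dict]:
        """Inject the calibration hints as a prior conversation turn before the
        Clear Prompting text that contains the schema and the question."""
        return [
            {"role": "user", "content": CALIBRATION_HINTS},
            {"role": "assistant", "content": "Understood. I will follow these tips when writing SQL."},
            {"role": "user", "content": clear_prompt},
        ]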

Consistent Output: execution-based self-consistency is used. The LLM first samples multiple SQL outputs; the generated SQL queries are then executed on the database and the execution results are recorded. After discarding the queries that fail to execute, a vote over the execution results selects the final SQL.
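A minimal sketch of execution-based self-consistency on a SQLite database (the function and variable names are illustrative, not the paper's code):

python

    import sqlite3
    from collections import Counter

    def execution_vote(candidate_sqls: list[str], db_path: str) -> str | None:
        """Run every candidate SQL, drop the ones that raise an error, group the
        rest by execution result, and return one SQL from the largest group."""
        conn = sqlite3.connect(db_path)
        results = {}  # sql -> hashable execution result
        for sql in candidate_sqls:
            try:
                rows = conn.execute(sql).fetchall()
                # order-insensitive comparison key (a simplification that ignores duplicates)
                results[sql] = frozenset(rows)
            except sqlite3.Error:
                continue  # discard queries that fail to execute
        conn.close()
        if not results:
            return None
        top_result, _ = Counter(results.values()).most_common(1)[0]
        # any SQL whose execution result matches the winning result is acceptable
        return next(sql for sql, res in results.items() if res == top_result)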

A GitHub issue asks about the running time, and the author replied as follows:

Time taken for recalling table: approximately 7s per sample.

Time taken for recalling column: approximately 25s per sample.

Time taken for generating SQL: approximately 2s per sample.

The time spent also depends on the internet status and the rate limits of API calls.

When using self-consistency, as the author replied in the issue, multiple completions are generated in a single call via the n parameter of the ChatGPT API, and all n completions share the same input tokens.
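A sketch of sampling n completions in one call. The paper used the ChatGPT API of the time, while the snippet below uses the current openai>=1.0 Python client, so treat the exact client code as an assumption; the model name and the message content are placeholders.

python

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    messages = [{"role": "user", "content": "...the C3 prompt built above..."}]  # placeholder
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder model name
        messages=messages,
        n=10,                   # sample 10 completions in a single request
        temperature=0.7,        # non-zero temperature so the samples differ
    )
    candidate_sqls = [choice.message.content for choice in resp.choices]
    # the prompt (input) tokens are shared by all n completions and billed once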
