自然语言处理从入门到应用——LangChain:提示(Prompts)-[提示模板:创建自定义提示模板和含有Few-Shot示例的提示模板]

分类目录:《自然语言处理从入门到应用》总目录


创建自定义提示模板

假设我们希望LLM根据函数名称生成该函数的英文语言解释。为了实现这个任务,我们将创建一个自定义的提示模板,以函数名称作为输入,并格式化提示模板以提供函数的源代码。LangChain提供了一组默认的提示模板,可用于生成各种任务的提示。但是,在某些情况下,默认的提示模板可能无法满足我们的需求。例如,我们可能希望创建一个具有特定动态指令的提示模板,以适应我们的语言模型。在这种情况下,我们可以创建自定义的提示模板。

有两种不同的提示模板:

  • 字符串提示模板:提供一个简单的字符串格式提示
  • 聊天提示模板:生成一个更结构化的聊天API使用的提示

在本文中,我们将使用字符串提示模板创建一个自定义提示。要创建自定义字符串提示模板,有两个要求:

  • 它具有input_variables属性,用于公开提示模板期望的输入变量
  • 它公开一个format方法,该方法接受与预期的input_variables相对应的关键字参数,并返回格式化的提示

我们将创建一个自定义的提示模板,它以函数名称作为输入,并格式化提示以提供函数的源代码。为了实现这一点,让我们首先创建一个函数,该函数将根据函数名称返回函数的源代码。

dart 复制代码
import inspect

def get_source_code(function_name):
    # Get the source code of the function
    return inspect.getsource(function_name)

接下来,我们将创建一个自定义的提示模板,该模板以函数名称作为输入,并格式化提示模板以提供函数的源代码:

from langchain.prompts import StringPromptTemplate
from pydantic import BaseModel, validator

class FunctionExplainerPromptTemplate(StringPromptTemplate, BaseModel):
    """一个自定义的提示模板,接受函数名作为输入,并格式化提示模板以提供函数的源代码。"""

    @validator("input_variables")
    def validate_input_variables(cls, v):
        """验证输入变量的正确性。"""
        if len(v) != 1 or "function_name" not in v:
            raise ValueError("function_name必须是唯一的输入变量。")
        return v

    def format(self, **kwargs) -> str:
        # 获取函数的源代码
        source_code = get_source_code(kwargs["function_name"])

        # 生成要发送给语言模型的提示
        prompt = f"""
        给定函数名和源代码,生成一个关于函数的英文语言解释。
        函数名:{kwargs["function_name"].__name__}
        源代码:
        {source_code}
        解释:
        """
        return prompt
    
    def _prompt_type(self):
        return "function-explainer"

现在我们已经创建了一个自定义的提示模板,我们可以使用它来生成我们任务的提示:

dart 复制代码
fn_explainer = FunctionExplainerPromptTemplate(input_variables=["function_name"])

# 为函数"get_source_code"生成一个提示
prompt = fn_explainer.format(function_name=get_source_code)
print(prompt)

输出:

dart 复制代码
给定函数名和源代码,生成一个关于函数的英文语言解释。
函数名:get_source_code
源代码:
def get_source_code(function_name):
    # Get the source code of the function
    return inspect.getsource(function_name)

解释:

创建含有Few-Shot示例的提示模板

在下文中,我们将学习如何创建含有Few-Shot示例的提示模板。我们将使用FewShotPromptTemplate类来创建一个含有Few-Shot示例的提示模板。该类可以接受一组示例或者一个ExampleSelector对象。在下文中,我们将分别为自我提问与搜索配置Few-Shot示例讨论这两种选项。

使用示例集

首先,创建一个Few-Shot示例的列表。每个示例应该是一个字典,其中键是输入变量,值是这些输入变量的值。

dart 复制代码
from langchain.prompts.few_shot import FewShotPromptTemplate
from langchain.prompts.prompt import PromptTemplate

examples = [
  {
    "question": "Who lived longer, Muhammad Ali or Alan Turing?",
    "answer": 
"""
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali
"""
  },
  {
    "question": "When was the founder of craigslist born?",
    "answer": 
"""
Are follow up questions needed here: Yes.
Follow up: Who was the founder of craigslist?
Intermediate answer: Craigslist was founded by Craig Newmark.
Follow up: When was Craig Newmark born?
Intermediate answer: Craig Newmark was born on December 6, 1952.
So the final answer is: December 6, 1952
"""
  },
  {
    "question": "Who was the maternal grandfather of George Washington?",
    "answer":
"""
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball
"""
  },
  {
    "question": "Are both the directors of Jaws and Casino Royale from the same country?",
    "answer":
"""
Are follow up questions needed here: Yes.
Follow up: Who is the director of Jaws?
Intermediate Answer: The director of Jaws is Steven Spielberg.
Follow up: Where is Steven Spielberg from?
Intermediate Answer: The United States.
Follow up: Who is the director of Casino Royale?
Intermediate Answer: The director of Casino Royale is Martin Campbell.
Follow up: Where is Martin Campbell from?
Intermediate Answer: New Zealand.
So the final answer is: No
"""
  }
]

然后,我们可以为Few Shot示例创建格式化程序。配置一个将Few Shot示例格式化为字符串的格式化程序。该格式化程序应该是一个PromptTemplate对象。

dart 复制代码
example_prompt = PromptTemplate(input_variables=["question", "answer"], template="Question: {question}\n{answer}")

print(example_prompt.format(**examples[0]))
Question: Who lived longer, Muhammad Ali or Alan Turing?

Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali

最后,创建一个FewShotPromptTemplate对象。该对象接受Few Shot示例和Few Shot示例的格式化程序作为输入。

dart 复制代码
prompt = FewShotPromptTemplate(
    examples=examples, 
    example_prompt=example_prompt, 
    suffix="Question: {input}", 
    input_variables=["input"]
)

print(prompt.format(input="Who was the father of Mary Ball Washington?"))

输出:

dart 复制代码
    Question: Who lived longer, Muhammad Ali or Alan Turing?
    
    Are follow up questions needed here: Yes.
    Follow up: How old was Muhammad Ali when he died?
    Intermediate answer: Muhammad Ali was 74 years old when he died.
    Follow up: How old was Alan Turing when he died?
    Intermediate answer: Alan Turing was 41 years old when he died.
    So the final answer is: Muhammad Ali
    
    
    Question: When was the founder of craigslist born?
    
    Are follow up questions needed here: Yes.
    Follow up: Who was the founder of craigslist?
    Intermediate answer: Craigslist was founded by Craig Newmark.
    Follow up: When was Craig Newmark born?
    Intermediate answer: Craig Newmark was born on December 6, 1952.
    So the final answer is: December 6, 1952
    
    
    Question: Who was the maternal grandfather of George Washington?
    
    Are follow up questions needed here: Yes.
    Follow up: Who was the mother of George Washington?
    Intermediate answer: The mother of George Washington was Mary Ball Washington.
    Follow up: Who was the father of Mary Ball Washington?
    Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
    So the final answer is: Joseph Ball
    
    
    Question: Are both the directors of Jaws and Casino Royale from the same country?
    
    Are follow up questions needed here: Yes.
    Follow up: Who is the director of Jaws?
    Intermediate Answer: The director of Jaws is Steven Spielberg.
    Follow up: Where is Steven Spielberg from?
    Intermediate Answer: The United States.
    Follow up: Who is the director of Casino Royale?
    Intermediate Answer: The director of Casino Royale is Martin Campbell.
    Follow up: Where is Martin Campbell from?
    Intermediate Answer: New Zealand.
    So the final answer is: No
    
    
    Question: Who was the father of Mary Ball Washington?
使用示例选择器

我们将重复使用上文中的示例集和格式化程序。但是,与其直接将示例输入到FewShotPromptTemplate对象中,我们将把它们输入到一个ExampleSelector对象中。在下文中,我们将使用SemanticSimilarityExampleSelector类。该类根据示例与输入之间的相似度选择Few-Shot示例。它使用嵌入模型计算输入与Few-Shot示例之间的相似度,并使用向量存储执行最近邻搜索。

dart 复制代码
from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings

example_selector = SemanticSimilarityExampleSelector.from_examples(
    # 这是可供选择的示例列表。
    examples,
    # 这是用于生成嵌入的嵌入类,用于衡量语义相似度。
    OpenAIEmbeddings(),
    # 这是用于存储嵌入并进行相似度搜索的向量存储类。
    Chroma,
    # 这是要生成的示例数量。
    k=1
)

# 选择与输入最相似的示例。
question = "Who was the father of Mary Ball Washington?"
selected_examples = example_selector.select_examples({"question": question})
print(f"Examples most similar to the input: {question}")
for example in selected_examples:
    print("\n")
    for k, v in example.items():
        print(f"{k}: {v}")

输出:

dart 复制代码
Running Chroma using direct local API.
Using DuckDB in-memory for database. Data will be transient.
Examples most similar to the input: Who was the father of Mary Ball Washington?


question: Who was the maternal grandfather of George Washington?
answer: 
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball

我们还可以将示例选择器应用于FewShotPromptTemplate。创建一个FewShotPromptTemplate对象。该对象接收示例选择器和用于Few-Shot示例的格式化程序:

dart 复制代码
prompt = FewShotPromptTemplate(
    example_selector=example_selector, 
    example_prompt=example_prompt, 
    suffix="Question: {input}", 
    input_variables=["input"]
)

print(prompt.format(input="Who was the father of Mary Ball Washington?"))

输出:

Question: Who was the maternal grandfather of George Washington?

Are follow up questions needed here: Yes.

Follow up: Who was the mother of George Washington?

Intermediate answer: The mother of George Washington was Mary Ball Washington.

Follow up: Who was the father of Mary Ball Washington?

Intermediate answer: The father of Mary Ball Washington was Joseph Ball.

So the final answer is: Joseph Ball

Question: Who was the father of Mary Ball Washington?

参考文献:

[1] LangChain 🦜️🔗 中文网,跟着LangChain一起学LLM/GPT开发:https://www.langchain.com.cn/

[2] LangChain中文网 - LangChain 是一个用于开发由语言模型驱动的应用程序的框架:http://www.cnlangchain.com/

相关推荐
Power202466633 分钟前
NLP论文速读|LongReward:基于AI反馈来提升长上下文大语言模型
人工智能·深度学习·机器学习·自然语言处理·nlp
数据猎手小k36 分钟前
AIDOVECL数据集:包含超过15000张AI生成的车辆图像数据集,目的解决旨在解决眼水平分类和定位问题。
人工智能·分类·数据挖掘
好奇龙猫41 分钟前
【学习AI-相关路程-mnist手写数字分类-win-硬件:windows-自我学习AI-实验步骤-全连接神经网络(BPnetwork)-操作流程(3) 】
人工智能·算法
沉下心来学鲁班1 小时前
复现LLM:带你从零认识语言模型
人工智能·语言模型
数据猎手小k1 小时前
AndroidLab:一个系统化的Android代理框架,包含操作环境和可复现的基准测试,支持大型语言模型和多模态模型。
android·人工智能·机器学习·语言模型
YRr YRr1 小时前
深度学习:循环神经网络(RNN)详解
人工智能·rnn·深度学习
sp_fyf_20241 小时前
计算机前沿技术-人工智能算法-大语言模型-最新研究进展-2024-11-01
人工智能·深度学习·神经网络·算法·机器学习·语言模型·数据挖掘
红客5971 小时前
Transformer和BERT的区别
深度学习·bert·transformer
多吃轻食1 小时前
大模型微调技术 --> 脉络
人工智能·深度学习·神经网络·自然语言处理·embedding
charles_vaez2 小时前
开源模型应用落地-glm模型小试-glm-4-9b-chat-快速体验(一)
深度学习·语言模型·自然语言处理