🔥🔥🔥 一文搞懂 Langchain Models （四）

接上文🔥🔥🔥 一文搞懂 Langchain Models （三）。

提示模板（Prompt Templates ）

在构建动态的、面向用户的应用程序时，一般不会对提示进行硬编码。我们需要能够在提示模板中使用用户输入来构建提示。LangChain 提供了构建这些提示模板和动态插入输入的类。

提示模板允许您传入变量值以动态调整传递给LLM的内容。

以下是来自文档的一个示例：

python 复制代码

rom langchain.prompts.chat import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
)
...

system_template="You are a helpful assistant that translates {input_language} to {output_language}."
system_message_prompt = SystemMessagePromptTemplate.from_template(system_template)

# SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input_language', 'output_language'], output_parser=None, partial_variables={}, template='You are a helpful assistant that translates {input_language} to {output_language}.', template_format='f-string', validate_template=True), additional_kwargs={})

human_template="{text}"
human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)

# HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['text'], output_parser=None, partial_variables={}, template='{text}', template_format='f-string', validate_template=True), additional_kwargs={})

chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])

# ChatPromptTemplate(input_variables=['output_language', 'input_language', 'text'], output_parser=None, partial_variables={}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input_language', 'output_language'], output_parser=None, partial_variables={}, template='You are a helpful assistant that translates {input_language} to {output_language}.', template_format='f-string', validate_template=True), additional_kwargs={}), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['text'], output_parser=None, partial_variables={}, template='{text}', template_format='f-string', validate_template=True), additional_kwargs={})])

# get a chat completion from the formatted messages
chat(chat_prompt.format_prompt(input_language="English", output_language="French", text="I love programming.").to_messages())

# AIMessage(content="J'adore la programmation.", additional_kwargs={})

在底层，LangChain 使用了内置的字符串库的 Formatter 类来解析使用传入变量的模板文本。这就是为什么模板中的变量周围有大括号（{}）。

以下是模板发生的事情的一个例子。

python 复制代码

from string import Formatter

formatter = Formatter()
format_string = "Hello, {name}! You are {age} years old."
result = formatter.format(format_string, name="John", age=30)

print(result)  # Output: "Hello, John! You are 30 years old."

LangChain 提供了关于 Chat Models 的几个指南：

如何使用少数示例 "少数示例提示（Few-Shot Prompting）"是一种技术，在提示中提供预期响应的示例，以便对LLM进行"条件"设置，以指导如何回应。
如何流式传输响应通过流式传输，您可以在从 LLM 接收到响应文本时立即显示，而无需等待整个响应。这无疑增加了"聊天"的体验。

嵌入模型（Text Embedding Models）

嵌入是一种将单词、短语或句子转换为固定大小的数字列表的方法，用于自然语言处理（NLP）。这通过将它们转换为数值形式（称为"向量"）来帮助算法更好地理解和处理文本数据。嵌入展示了单词的意义和结构，相似的单词具有相似的嵌入。

一旦我们将单词转换成这些"向量"，就可以使用数学方法来计算单词之间的相似性或差异性。这被证明非常强大，也是为什么最新的 LLMs 比以前的系统更加实用。

以下是将句子 "This is how embeddings work" 转换为嵌入的高级示例。

将句子分词为单词：["This", "is", "how", "embeddings", "work"]

使用预训练的嵌入模型将每个单词转换为其相应的嵌入向量。每个向量通常表示为一组固定长度的浮点数：

现在，句子 "This is how embeddings work" 可以表示为一系列嵌入向量：

text 复制代码

[   
	[0.12, -0.23, 0.56, ..., 0.07],
    [-0.15, 0.28, 0.31, ..., -0.03],
    [0.42, -0.12, -0.67, ..., 0.09],
    [0.22, 0.16, 0.08, ..., -0.24],
    [-0.04, -0.32, 0.25, ..., 0.13]
]

一旦你拥有了向量序列，你可以运行诸如语义搜索之类的查询，以返回最相关的结果。