Building LangChain Agents to Automate Workflows: A Worked Example

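LangChain is one of the most popular frameworks for building LLM-based applications. Its comprehensive set of tools and components lets us build end-to-end AI solutions with almost any LLM.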

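Perhaps LangChain's core capability is its agents. They are autonomous or semi-autonomous tools that can perform tasks, make decisions, and interact with other tools and APIs. They represent a major leap in automating complex workflows with LLMs.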

First, set up the development environment:

$ conda create -n langchain python=3.9 -y
$ conda activate langchain

Install the LangChain packages and a few other required libraries:

$ pip install langchain langchain_openai langchain_community langgraph ipykernel python-dotenv

Add the newly created Conda environment to Jupyter as a kernel:

$ ipython kernel install --user --name=langchain

Create a .env file to store secrets such as API keys:

$ touch .env
$ vim .env  # Paste your OPENAI key
OPENAI_API_KEY='YOUR_KEY_HERE'

Retrieve the OpenAI (or DeepSeek) API key from the .env file:

import os
from dotenv import load_dotenv
load_dotenv()
api_key = os.getenv('OPENAI_API_KEY')
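
Note that os.getenv() returns None when the variable is missing, so a typo in the .env file only surfaces later as an authentication error. A minimal sketch of a fail-fast check, using only the standard library; the helper name require_env is hypothetical and not part of python-dotenv or LangChain:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error.

    Hypothetical helper for illustration; call it after load_dotenv().
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value
```

With this helper, api_key = require_env('OPENAI_API_KEY') would stop immediately with a readable message when the key is absent.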

The rest of this article uses OpenAI as the example.

Test that everything works by querying OpenAI's GPT-3.5 (the default language model):

from langchain_openai import OpenAI
llm = OpenAI(openai_api_key=api_key)
question = "Is Messi the best footballer of all time?"
output = llm.invoke(question)
print(output[:75])

Let's take some time to think about the agent framework.

Chains vs. Agents

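The defining characteristic of agents is their ability to choose the best sequence of actions to solve a problem given a set of tools.

For example, suppose we have the following:

- A weather API
- An ML model for clothing recommendations
- The Strava API for recommending cycling routes
- A user preference database
- An image recognition model
- A language model (text generation)

Traditional problem-solving would involve chaining a fixed selection of tools from this list: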

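Chain 1: Weather-based clothing recommender

1) Call the weather API to get weather data
2) Feed the weather data into the ML clothing model
3) Generate clothing recommendations
4) Present the results to the user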

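Chain 2: Weather-based cycling route suggester

1) Call the weather API
2) Call the Strava API for popular routes
3) Filter the routes based on weather conditions
4) Present suitable routes to the user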

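Chain 3: Outfit photo analyzer

1) Receive the user's outfit photo
2) Use the image recognition model to identify clothing items
3) Compare against the user preference database
4) Generate feedback using the text generation model
5) Present the analysis to the user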

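Each chain solves a specific problem using a predetermined sequence of steps and a subset of the available tools.

Chains cannot adapt beyond their defined scope. They also require three separate development branches, which is inefficient in time and resources.

Now, imagine an agentic system with access to all of these tools. It would be able to:

- Understand the user's query or problem (through natural language, with a language model)
- Assess which tools are relevant to the problem (reasoning)
- Dynamically create a workflow using the most appropriate tools
- Execute the workflow, making real-time adjustments as needed (acting)
- Evaluate the outcome and learn from past interactions

For example, if a user asks, "What should I wear on my bike ride today?", the agent could check the weather API, analyze suitable cycling routes through Strava, take the user's past preferences into account to recommend appropriate clothing, and generate a personalized response.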

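In short, an agent can:

- Handle a wide variety of problems using the same set of tools
- Create a custom workflow for each unique situation
- Adapt its approach to the specific context and the user's needs
- Learn from interactions to improve future performance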
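
To make the contrast between a chain and an agent concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical: the stub tools, the KEYWORDS table, and plan_tools() are illustrative names, and the keyword match only stands in for the reasoning an LLM would perform. It is not how LangChain selects tools.

```python
from typing import Dict, List, Tuple


# Stub tools standing in for real APIs (hypothetical, for illustration only).
def get_weather() -> str:
    return "rainy, 12C"


def recommend_outfit(weather: str) -> str:
    return "waterproof jacket" if "rainy" in weather else "t-shirt"


def clothing_chain() -> str:
    """A chain: the same fixed steps run in the same order every time."""
    return recommend_outfit(get_weather())


# Keywords that make each tool relevant; a crude stand-in for LLM reasoning.
KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "weather": ("wear", "ride", "weather"),
    "routes": ("ride", "route"),
    "outfit": ("wear", "outfit"),
}


def plan_tools(query: str) -> List[str]:
    """An agent-like step: pick the relevant tools for this particular query."""
    q = query.lower()
    return [name for name, words in KEYWORDS.items() if any(w in q for w in words)]


print(clothing_chain())
print(plan_tools("What should I wear on my bike ride today?"))
print(plan_tools("Suggest a route"))
```

The chain always produces a clothing suggestion, whereas the planning step selects a different subset of tools depending on the question.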

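One of LangChain's most important applications is turning a language model, which by itself only produces text, into a reasoning engine that can use the resources at its disposal to take appropriate action.

In short, LangChain makes it possible to develop powerful autonomous agents that interact with the outside world.

A LangChain agent is made up of several components, such as chat models, prompt templates, external tools, and other related constructs. To build successful agents, we need to examine each component and understand its purpose.

Creating LangChain agents involves many moving parts. The first and most obvious is the language model.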

from langchain_openai import OpenAI
llm = OpenAI(api_key=api_key, model="gpt-3.5-turbo-instruct")
question = "What is special about the number 73?"
output = llm.invoke(question)
print(output[:100])

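Language models, such as OpenAI's GPT-3.5 Turbo, accept and generate strings and are suited to answering individual user queries.

Newer and more powerful models are usually chat models, which take a sequence of messages as input and return a chat message as output (instead of plain text).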

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessageChunk, SystemMessage
# Initialize the model
chat_model = ChatOpenAI(api_key=api_key, model='gpt-4o-mini')
# Write the messages
messages = [SystemMessage(content='You are a grumpy pirate.'),
            HumanMessage(content="What's up?")]
output = chat_model.invoke(messages)

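In other words, chat models allow us to hold a conversation in natural language.

In the example above, we initialize GPT-4o-mini with a system message followed by a user query. Note the use of the SystemMessage and HumanMessage classes. The output is a message object, which is the expected behavior of chat models; besides the text, it carries other useful metadata.

The most effective way to query language or chat models is with prompt templates. They let us structure queries consistently and insert variables dynamically, making our interactions with the model more flexible and reusable. For example: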

from langchain_core.prompts import PromptTemplate
query_template = "Tell me about {book_name} by {author}."
prompt = PromptTemplate(input_variables=["book_name", "author"], template=query_template)
prompt.invoke({"book_name": "Song of Ice and Fire", "author": "GRRM"})

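The class requires us to create a string containing placeholders for the variables to substitute, written in curly-brace notation. We then pass this template string to the PromptTemplate class along with the variable names to construct our prompt.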

Calling .invoke() shows how the prompt will be passed to the model.
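
As a rough mental model, the placeholder substitution resembles Python's built-in str.format(). The sketch below uses only the standard library; it is an analogy under that assumption, not PromptTemplate's actual implementation.

```python
# The same template string used with PromptTemplate above.
query_template = "Tell me about {book_name} by {author}."

# Fill the curly-brace placeholders with concrete values.
filled = query_template.format(book_name="Song of Ice and Fire", author="GRRM")
print(filled)

# Leaving out a variable fails loudly instead of sending a half-filled prompt.
try:
    query_template.format(book_name="Song of Ice and Fire")
except KeyError as missing:
    print(f"Missing template variable: {missing}")
```

Declaring the variable names up front plays a similar role: a missing value is caught before the prompt ever reaches the model.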

Passing this prompt template to a language model requires chaining the two together with the pipe operator:

from langchain_openai import OpenAI
llm = OpenAI(api_key=api_key)
# Create a chain
chain = prompt | llm
# Invoke the chain
output = chain.invoke({"book_name": "Deathly Hallows", "author": "J.K. Rowling"})
print(output[:100])

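The pipe operator (|) is part of the LangChain Expression Language (LCEL), which is designed to chain multiple LangChain components and tools.

When we use the pipe operator on LangChain objects, we create an instance of the RunnableSequence class. A runnable sequence represents a chain of objects that support the .invoke() method, such as prompt templates and language or chat models.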
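
The idea behind the pipe operator can be sketched with Python's __or__ method. The Step class and fake_llm below are hypothetical stand-ins written for illustration; they are not LangChain's RunnableSequence and make no API calls.

```python
from typing import Any, Callable


class Step:
    """A toy runnable: wraps a function and exposes .invoke()."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# A prompt step that fills a template, and a fake model that echoes its input.
prompt_step = Step(lambda v: "Tell me about {book_name} by {author}.".format(**v))
fake_llm = Step(lambda text: f"[model reply to: {text}]")

toy_chain = prompt_step | fake_llm
result = toy_chain.invoke({"book_name": "Deathly Hallows", "author": "J.K. Rowling"})
print(result)
```

Because every step shares the same .invoke() interface, any number of steps can be composed in the same way.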

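Likewise, with the ChatPromptTemplate class we can easily create chat models with different personalities. The initial input is usually a system prompt that tells the chat model how to behave: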

from langchain_core.prompts import ChatPromptTemplate
chat_model = ChatOpenAI(api_key=api_key, model="gpt-4o-mini")
template = ChatPromptTemplate([
   ('system', 'You are a helpful AI bot. Your specialty is {specialty}.'),
   ('human', 'Explain the concept of {concept} based on your expertise.')
])

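The class expects a list of role-based messages as input. Each member of the list must be a (role, message) tuple, with variable placeholders defined where needed.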
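
To see what filling such a list of (role, message) tuples amounts to, here is a standard-library sketch. The format_messages() helper is a hypothetical name used for illustration and does not reproduce ChatPromptTemplate's internals.

```python
from typing import List, Tuple

# The same (role, message) tuples as in the template above.
role_templates: List[Tuple[str, str]] = [
    ("system", "You are a helpful AI bot. Your specialty is {specialty}."),
    ("human", "Explain the concept of {concept} based on your expertise."),
]


def format_messages(templates: List[Tuple[str, str]], **variables: str) -> List[Tuple[str, str]]:
    """Fill the placeholders in every message while keeping each role attached."""
    return [(role, text.format(**variables)) for role, text in templates]


filled_messages = format_messages(role_templates, specialty="economics", concept="time")
for role, text in filled_messages:
    print(f"{role}: {text}")
```

Each message keeps its role, so the model can tell the standing instruction (system) apart from the user's request (human).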

Once that is ready, we can use the same pipe operator to create chat models with different behaviors:

specialties = ["psychology", "economics", "politics"]
concept = "time"
# Call the model with different personalities
for s in specialties:
   chain = template | chat_model   
   output = chain.invoke({"specialty": s, "concept": concept})   
   print(output.content[:100], end="\n" + "-" * 25 + '\n')

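In addition, agents can choose a combination of the tools at their disposal to solve a specific problem, using the LLM as the underlying reasoning engine.

LangChain offers integrations with dozens of popular APIs and services so that agents can interact with the rest of the world. Most of them are available in the langchain_community package, while some are in langchain_core.

For example, here is how to use the ArXiv tool to retrieve paper summaries on various topics: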

pip install -Uq arxiv  # Install arXiv Python SDK


from langchain_community.tools import ArxivQueryRun
tool = ArxivQueryRun()
print(tool.invoke('Photosynthesis')[:250])


Published: 2019-08-28
Title: Photosynthesis on Exoplanets and Exomoons from Reflected Light
Authors: Manasvi Lingam, Abraham Loeb
Summary: Photosynthesis offers a convenient means of sustaining biospheres. We
quantify the constraints for photosynthes

There is also an alternative way to load tools:

from langchain_community.agent_toolkits.load_tools import load_tools
tools = load_tools(["arxiv", "dalle-image-generator"])

Above, we use the load_tools() function to load the arXiv and Dall-E image generator tools at the same time. Tools loaded with this function share the same usage syntax:

# Call arXiv
print(tools[0].invoke("Kaggle")[:150])

The load_tools function requires us to know the string name of each tool class, such as 'arxiv' for ArxivQueryRun. We can quickly check the string name of any tool by running the get_all_tool_names function:

from langchain_community.agent_toolkits.load_tools import get_all_tool_names
get_all_tool_names()[:10]


['sleep','wolfram-alpha','google-search','google-search-results-json','searx-search-results-json','bing-search','metaphor-search','ddg-search','google-lens','google-serper']

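When building agents, it is recommended to load tools through their class constructors, which lets us configure each tool's specific behavior.

In the following example, we will build an agent that can explain any topic through three mediums: text, image, or video. More specifically, based on the question asked, the agent will decide which format to use to explain the topic.

After configuring the environment, the first step is to define the tools we will give to our agent.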

from langchain_community.tools import WikipediaQueryRun  # pip install wikipedia
from langchain_community.utilities import WikipediaAPIWrapper
from langchain_community.tools import YouTubeSearchTool  # pip install youtube_search
from langchain_community.tools.openai_dalle_image_generation import (   
  OpenAIDALLEImageGenerationTool
)
from langchain_community.utilities.dalle_image_generator import DallEAPIWrapper

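We import five classes:

- WikipediaAPIWrapper: configures how to access the Wikipedia API
- WikipediaQueryRun: generates Wikipedia page summaries
- YouTubeSearchTool: searches YouTube for videos on a topic
- DallEAPIWrapper: configures how to access OpenAI's Dall-E endpoint
- OpenAIDALLEImageGenerationTool: generates images from a prompt

When a user queries our agent, it will decide whether to explain the topic with a Wikipedia article in text form, to create an image with Dall-E for visual understanding, or to suggest YouTube videos for deeper comprehension.

Let's initialize the tools, starting with Wikipedia: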

wiki_api_wrapper = WikipediaAPIWrapper(top_k_results=1, doc_content_chars_max=250)
wikipedia = WikipediaQueryRun(description="A tool to explain things in text format. Use this tool if you think the user's asked concept is best explained through text.", api_wrapper=wiki_api_wrapper)
print(wikipedia.invoke("Mobius strip"))

The Dall-E image generator:

dalle_api_wrapper = DallEAPIWrapper(model="dall-e-3", size="1792x1024")
dalle = OpenAIDALLEImageGenerationTool(  
   api_wrapper=dalle_api_wrapper, description="A tool to generate images. Use this tool if you think the user's asked concept is best explained through an image."
)
output = dalle.invoke("A mountain bike illustration.")
print(output)

The YouTube search tool:

youtube = YouTubeSearchTool(
   description="A tool to search YouTube videos. Use this tool if you think the user's asked concept can be best explained by watching a video."
)
youtube.run("Oiling a bike's chain")

Now, we put these tools into a list:

tools = [wikipedia, dalle, youtube]

We can already bind this set of tools to a chat model without creating an agent:

chat_model = ChatOpenAI(api_key=api_key)
model_with_tools = chat_model.bind_tools(tools)

Let's try calling the model with a simple message:

response = model_with_tools.invoke([HumanMessage("What's up?!")])
print(f"Text response: {response.content}")
print(f"Tools used in the response: {response.tool_calls}")
Model output:
Text response: Hello! How can I assist you today?
Tools used in the response: []

The output shows that none of the bound tools were used when generating the answer.

Now, let's ask a question that calls for a specific tool:

response = model_with_tools.invoke([
   HumanMessage("Can you generate an image of a mountain bike?")
])
print(f"Text response: {response.content}")
print(f"Tools used in the response: {response.tool_calls}")

Text response:
Tools used in the response: [{'name': 'openai_dalle', 'args': {'query': 'mountain bike'}, 'id': 'call_92GBfmsYtPi9TpGuIOFB1pG8', 'type': 'tool_call'}]

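We can see there is no text output, but OpenAI's Dall-E is mentioned. The tool has not been called; the model is only suggesting that we use it.

To actually call it, that is, to take action, we need to create an agent.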
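
Conceptually, the missing step is to look up each suggested tool by name and run it with the suggested arguments. The sketch below does this for a tool call shaped like the one printed above. The REGISTRY dictionary, fake_image_tool(), the example.com URL, and the call ID are hypothetical placeholders; no real image is generated and this is not the agent's actual implementation.

```python
from typing import Any, Callable, Dict, List


def fake_image_tool(query: str) -> str:
    """Stand-in for an image generator: returns a placeholder URL."""
    return f"https://example.com/images/{query.replace(' ', '-')}.png"


# Map tool names (as they appear in tool_calls) to callables.
REGISTRY: Dict[str, Callable[..., str]] = {"openai_dalle": fake_image_tool}


def run_tool_calls(tool_calls: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Execute each suggested tool call and collect the results."""
    results = []
    for call in tool_calls:
        tool = REGISTRY[call["name"]]
        results.append({"tool_call_id": call["id"], "content": tool(**call["args"])})
    return results


suggested = [{"name": "openai_dalle", "args": {"query": "mountain bike"},
              "id": "call_1", "type": "tool_call"}]
tool_results = run_tool_calls(suggested)
print(tool_results)
```

An agent automates this loop: it executes the suggested calls, feeds the results back to the model, and repeats until the model answers in plain text.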

With the model and the tools defined, we create the agent.

LangChain provides a high-level create_react_agent() function in its langgraph package for quickly creating ReAct (Reason and Act) agents:

from langgraph.prebuilt import create_react_agent
system_prompt = SystemMessage("You are a helpful bot named Chandler.")
agent = create_react_agent(chat_model, tools, state_modifier=system_prompt)

While initializing the agent with the chat model and the list of tools, we pass a system prompt that tells the model how to behave in general.

from pprint import pprint
response = agent.invoke({"messages": HumanMessage("What's up?")})
pprint(response["messages"])


[HumanMessage(content="What's up?", id='133b9380-cfe1-495a-98f7-b835c874bd57'),
AIMessage(content='Hello! How can I assist you today?', response_metadata={'token_usage': {'completion_tokens': 10, 'prompt_tokens': 112, 'total_tokens': 122}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-0a920c15-39b4-4857-ab51-6ea905575dba-0', usage_metadata={'input_tokens': 112, 'output_tokens': 10, 'total_tokens': 122})]

We received a plausible response: a simple text answer with no tool calls. Now, let's ask something more to the point:

response = agent.invoke({"messages": [
   HumanMessage('Explain how photosynthesis works.')
]})
print(len(response['messages']))

This time, there are four messages. Let's look at the message class names and their contents:

for message in response['messages']:
   print(      
       f"{message.__class__.__name__}: {message.content}"   
   )  # Print message class name and its content   
   print("-" * 20, end="\n")
   
   
HumanMessage: Explain how photosynthesis works.
--------------------
AIMessage:
--------------------
ToolMessage: Page: Photosynthesis
Summary: Photosynthesis ( FOH-tə-SINTH-ə-sis) is a system of biological processes by which photosynthetic organisms, such as most plants, algae, and cyanobacteria, convert light energy, typically from sunlight, into the chemical
--------------------
AIMessage: Photosynthesis is a biological process where photosynthetic organisms like plants, algae, and cyanobacteria convert light energy, usually from sunlight, into chemical energy. This process involves capturing light energy through pigments like chlorophyll in plant cells and using it to convert carbon dioxide and water into glucose and oxygen. Glucose serves as a source of energy for the plant, while oxygen is released into the atmosphere as a byproduct. Photosynthesis is crucial for the survival of plants and the balance of oxygen and carbon dioxide in the atmosphere.
--------------------

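The third message comes from a tool call; it is a summary of the Wikipedia page on photosynthesis. The last message comes from the chat model, which uses the content of the tool call when composing its answer.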
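
The four-message pattern (human, AI tool request, tool result, AI answer) can be reproduced with a toy loop. Everything below is a hypothetical sketch: fake_model() and fake_wikipedia() are stubs, messages are plain dictionaries, and the loop only approximates the reason-and-act cycle rather than reproducing langgraph's implementation.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]


def fake_wikipedia(query: str) -> str:
    """Stub tool: returns a canned summary instead of calling Wikipedia."""
    return f"Stub summary for '{query}'"


TOY_TOOLS: Dict[str, Callable[..., str]] = {"wikipedia": fake_wikipedia}


def fake_model(messages: List[Message]) -> Message:
    """Stub model: request a tool first, then answer once a tool result exists."""
    last = messages[-1]
    if last["role"] == "tool":
        return {"role": "ai", "content": f"Answer based on tool output: {last['content']}",
                "tool_calls": []}
    return {"role": "ai", "content": "",
            "tool_calls": [{"name": "wikipedia", "args": {"query": last["content"]}}]}


def run_agent(question: str, max_steps: int = 5) -> List[Message]:
    """Alternate between the model (reason) and the tools (act) until it answers."""
    messages: List[Message] = [{"role": "human", "content": question}]
    for _ in range(max_steps):
        reply = fake_model(messages)
        messages.append(reply)
        if not reply["tool_calls"]:
            break  # A plain-text answer ends the loop.
        for call in reply["tool_calls"]:
            output = TOY_TOOLS[call["name"]](**call["args"])
            messages.append({"role": "tool", "content": output})
    return messages


history = run_agent("photosynthesis")
print([m["role"] for m in history])
```

The empty content of the second message mirrors the blank AIMessage in the output above: that turn carries a tool request rather than text.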

Let's quickly write a function to modularize the steps we just took:

def execute(agent, query):
    # Send the query to the agent as a single human message
    response = agent.invoke({'messages': [HumanMessage(query)]})
    for message in response['messages']:
        # Print message class name and its content
        print(f"{message.__class__.__name__}: {message.content}")
        print("-" * 20, end="\n")
    return response

Now, let's update our system prompt with detailed instructions on how the agent should behave:

system_prompt = SystemMessage(
   """   
   You are a helpful bot named Chandler. Your task is to explain topics   
   asked by the user via three mediums: text, image or video.   
   If the asked topic is best explained in text format, use the Wikipedia tool.   
   If the topic is best explained by showing a picture of it, generate an image   
   of the topic using Dall-E image generator and print the image URL.   
   Finally, if video is the best medium to explain the topic, conduct a YouTube search on it   
   and return found video links.   
   """
)

Recreate the agent with the new system prompt:

agent = create_react_agent(chat_model, tools, state_modifier=system_prompt)
response = execute(agent, query='Explain the Fourier Series visually.')

Output:

HumanMessage: Explain the Fourier Series visually.
--------------------
AIMessage:
--------------------
ToolMessage: https://oaidalleapiprodscus.blob.core.windows.net/private/org-qRwX4bsgcnaYHHwbxFBdZxUy/user-LOXQPflMtXxamV72hac9oS2O/img-iY3gXXBzWapRWKdmkd9cXEIN.png?st=2024-08-09T18%3A36%3A46Z&se=2024-08-09T20%3A36%3A46Z&sp=r&sv=2024-08-04&sr=b&rscd=inline&rsct=image/png&skoid=d505667d-d6c1-4a0a-bac7-5c84a87759f8&sktid=a48cca56-e6da-484e-a814-9c849652bcb3&skt=2024-08-09T17%3A56%3A46Z&ske=2024-08-10T17%3A56%3A46Z&sks=b&skv=2024-08-04&sig=Ly8F4jvakFeEtpZ/6jMliLq%2BG3Xs%2Bz1AmX1sL06%2BQ3U%3D
--------------------
AIMessage: Here is a visual representation of the Fourier Series:
![Fourier Series](https://oaidalleapiprodscus.blob.core.windows.net/private/org-qRwX4bsgcnaYHHwbxFBdZxUy/user-LOXQPflMtXxamV72hac9oS2O/img-iY3gXXBzWapRWKdmkd9cXEIN.png)
The Fourier Series is a way to represent a function as the sum of simple sine waves. It is used in many areas of mathematics and physics to analyze and understand periodic phenomena.
--------------------

So far, our agent is stateless, which means it does not remember our previous interactions:

response = execute(agent, query="What did I ask you in the previous query?")
HumanMessage: What did I ask you in the previous query?

--------------------
AIMessage: I'm sorry, I cannot remember the previous query as I don't have access to the conversation history. How can I assist you today?
--------------------

The easiest way to add chat message history to an agent is with langgraph's SqliteSaver class:

from langgraph.checkpoint.sqlite import SqliteSaver
memory = SqliteSaver.from_conn_string(':agent_history:')
agent = create_react_agent(chat_model, tools, checkpointer=memory, state_modifier=system_prompt)

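We initialize memory with the .from_conn_string() method of the SqliteSaver class, which creates a database file. Then we pass memory to the checkpointer parameter of the create_react_agent() function.

Now, we need to create a configuration dictionary: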

config = {'configurable': {'thread_id': 'a1b2c3'}}

The dictionary above defines a thread ID that distinguishes one conversation from another; it is passed to the .invoke() method of our agent.
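
The role of the thread ID can be illustrated with an in-memory store keyed by thread. The ThreadMemory class below is a hypothetical sketch built on the standard library; it does not reproduce SqliteSaver, which persists checkpoints to a database.

```python
from collections import defaultdict
from typing import DefaultDict, List


class ThreadMemory:
    """Toy message store: one independent history per thread ID."""

    def __init__(self) -> None:
        self._threads: DefaultDict[str, List[str]] = defaultdict(list)

    def append(self, thread_id: str, message: str) -> None:
        self._threads[thread_id].append(message)

    def history(self, thread_id: str) -> List[str]:
        # Return a copy so callers cannot mutate the stored history.
        return list(self._threads[thread_id])


toy_memory = ThreadMemory()
toy_memory.append("123", "Explain how to oil a bike's chain")
toy_memory.append("a1b2c3", "Explain the Fourier Series visually.")
toy_memory.append("123", "What have I asked you so far?")

print(toy_memory.history("123"))
print(toy_memory.history("a1b2c3"))
```

Messages sent under one thread ID never appear in another thread's history, which is how separate conversations stay isolated.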

Let's update the execute() function to include this behavior:

def execute(agent, query, thread_id="a1b2c3"):  
   config = {"configurable": {"thread_id": thread_id}}   
   response = agent.invoke({'messages': [HumanMessage(query)]}, config=config)   
   for message in response["messages"]:     
      print(        
         f"{message.__class__.__name__}: {message.content}"       
      )  # Print message class name and its content       
      print("-" * 20, end="\n")   
   return response

Let's test it again:

response = execute( 
   agent, query="Explain how to oil a bike's chain using a YouTube video", thread_id="123"
)


HumanMessage: Explain how to oil a bike's chain using a YouTube video
--------------------
AIMessage:
--------------------
ToolMessage: ['https://www.youtube.com/watch?v=X1Vze17bhgk&pp=ygUXaG93IHRvIG9pbCBhIGJpa2UgY2hhaW4%3D', 'https://www.youtube.com/watch?v=ubKCHtZ20-0&pp=ygUXaG93IHRvIG9pbCBhIGJpa2UgY2hhaW4%3D']
--------------------
AIMessage: I found a couple of YouTube videos that explain how to oil a bike chain:
1. [Video 1: How to oil a bike chain](https://www.youtube.com/watch?v=X1Vze17bhgk&pp=ygUXaG93IHRvIG9pbCBhIGJpa2UgY2hhaW4%3D)
2. [Video 2: Step-by-step guide on oiling a bike chain](https://www.youtube.com/watch?v=ubKCHtZ20-0&pp=ygUXaG93IHRvIG9pbCBhIGJpa2UgY2hhaW4%3D)
You can watch these videos to learn how to oil your bike's chain effectively.
--------------------

Now, let's ask the agent about our previous queries:

response = execute(agent, query='What have I asked you so far?', thread_id='123')
print(response)

{'messages': [HumanMessage(content="Explain how to oil a bike's chain using a YouTube video",     id='8254142b-fb77-4958-8ad9-0a0283c6611a'), ...
]

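As expected, the agent returns the previous messages!

At this point, all we need is a chat UI like ChatGPT's, and we have ourselves a custom chatbot.

A later article may cover another example, askDataAgent.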