[LLM] LangChain: Custom Tool Calling

Contents

- 10.1 The @tool decorator
- 10.2 StructuredTool
- 10.3 Handling tool errors
- 10.4 Calling built-in toolkits and extension tools
- 10.5 Customizing default tools
- 10.6 How to use built-in toolkits

Approach: a plain function wrapped in a Tool object (the simplest option)

from langchain.tools import Tool

# 1. Define plain functions
def weather_query(city: str) -> str:
    """Look up the weather (plain function)."""
    weather_data = {
        "Beijing": {"temperature": "25°C", "weather": "sunny"},
        "Shanghai": {"temperature": "28°C", "weather": "cloudy"},
    }
    if city in weather_data:
        data = weather_data[city]
        return f"{city}: {data['temperature']}, {data['weather']}"
    return f"No weather found for {city}"

def attraction_search(query: str) -> str:
    """Search for attractions (plain function)."""
    return f"Found 5 attractions matching '{query}': ..."

# 2. Create Tool objects
weather_tool = Tool(
    name="weather_query",
    description="Look up the weather for a city",
    func=weather_query  # pass the function directly
)

attraction_tool = Tool(
    name="attraction_search",
    description="Search for tourist attractions",
    func=attraction_search
)

# 3. Collect the tools in a list
tools = [weather_tool, attraction_tool]
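To make the role of the tool list more concrete, here is a minimal, standard-library-only sketch of what an agent runtime conceptually does with it: look a tool up by the name the model asked for, then call it. `TOOL_REGISTRY` and `dispatch` are hypothetical names used for illustration. They are not part of LangChain, and the real dispatch logic is more involved.

```python
from typing import Callable, Dict


def weather_query(city: str) -> str:
    """Look up the weather for a city (toy data)."""
    weather_data = {"Beijing": "25°C, sunny", "Shanghai": "28°C, cloudy"}
    if city in weather_data:
        return f"{city}: {weather_data[city]}"
    return f"No weather found for {city}"


def attraction_search(query: str) -> str:
    """Search for attractions (placeholder result)."""
    return f"Found attractions matching '{query}'"


# Hypothetical registry: tool name -> callable, mirroring the name/func pairs of a tool list.
TOOL_REGISTRY: Dict[str, Callable[[str], str]] = {
    "weather_query": weather_query,
    "attraction_search": attraction_search,
}


def dispatch(tool_name: str, tool_input: str) -> str:
    """Run the requested tool, or report that it does not exist."""
    func = TOOL_REGISTRY.get(tool_name)
    if func is None:
        return f"Unknown tool: {tool_name}"
    return func(tool_input)


print(dispatch("weather_query", "Beijing"))
print(dispatch("translate", "hello"))
```

The sketch also hints at why the name and description matter: the name is the only handle the model has for selecting a tool.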

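LangChain offers three ways to create a tool:

- The @tool decorator: the simplest way to define a custom tool.
- The StructuredTool.from_function class method: similar to the @tool decorator, but it allows more configuration and lets you specify both synchronous and asynchronous implementations.
- Subclassing BaseTool: the most flexible approach, giving the most control at the cost of more work and code.

The @tool decorator or StructuredTool.from_function is sufficient for most use cases.

Tip: models perform better when a tool has a well-chosen name, description, and JSON schema.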

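10.1 The @tool decorator

The @tool decorator is the simplest way to define a custom tool. By default it uses the function name as the tool name, which you can override by passing a string as the first argument. It also uses the function's docstring as the tool description, so a docstring is required.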

A synchronous example:

# Example: tools_decorator.py
from langchain_core.tools import tool

@tool
def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b

# Inspect some attributes of the tool
print(multiply.name)
print(multiply.description)
print(multiply.args)

# Output (the description format can differ between langchain-core versions)
multiply
multiply(a: int, b: int) -> int - Multiply two numbers.
{'a': {'title': 'A', 'type': 'integer'}, 'b': {'title': 'B', 'type': 'integer'}}
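To see why the docstring is required, it can help to look at how a decorator could derive this metadata in the first place. The following is a simplified, standard-library-only sketch of the idea, not LangChain's actual implementation. `describe_tool` is a hypothetical helper.

```python
import inspect
from typing import Any, Callable, Dict


def describe_tool(func: Callable[..., Any]) -> Dict[str, Any]:
    """Derive a name, description, and argument types from a plain function."""
    doc = inspect.getdoc(func)
    if not doc:
        # Without a docstring there is nothing to tell the model what the tool does.
        raise ValueError(f"{func.__name__} needs a docstring to serve as its description")
    params = inspect.signature(func).parameters
    args = {
        name: getattr(p.annotation, "__name__", str(p.annotation))
        for name, p in params.items()
    }
    return {"name": func.__name__, "description": doc, "args": args}


def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b


def undocumented(a: int) -> int:
    return a


print(describe_tool(multiply))
try:
    describe_tool(undocumented)
except ValueError as exc:
    print(exc)
```

The function name, docstring, and type hints are the only information available, which is why all three are worth choosing carefully.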

An asynchronous implementation looks like this:

# Example: tools_async.py
from langchain_core.tools import tool

@tool
async def amultiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b

You can also customize the tool name and the JSON schema of its arguments by passing them to the decorator:

from langchain_core.tools import tool
from pydantic import BaseModel, Field

class CalculatorInput(BaseModel):
    a: int = Field(description="first number")
    b: int = Field(description="second number")

@tool("multiplication-tool", args_schema=CalculatorInput, return_direct=True)
def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b

# Inspect some attributes of the tool
print(multiply.name)
print(multiply.description)
print(multiply.args)
print(multiply.return_direct)

# Output
multiplication-tool
multiplication-tool(a: int, b: int) -> int - Multiply two numbers.
{'a': {'title': 'A', 'description': 'first number', 'type': 'integer'}, 'b': {'title': 'B', 'description': 'second number', 'type': 'integer'}}
True

10.2 StructuredTool

The StructuredTool.from_function class method offers more configurability than the @tool decorator without requiring much extra code.

from langchain_core.tools import StructuredTool
import asyncio

def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b

async def amultiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b

async def main():
    # func is used by invoke, coroutine by ainvoke
    calculator = StructuredTool.from_function(func=multiply, coroutine=amultiply)
    print(calculator.invoke({"a": 2, "b": 3}))
    print(await calculator.ainvoke({"a": 2, "b": 5}))

# Run the async main function
asyncio.run(main())

# Output
6
10

You can also configure custom arguments:

from langchain_core.tools import StructuredTool
from pydantic import BaseModel, Field
import asyncio

class CalculatorInput(BaseModel):
    a: int = Field(description="first number")
    b: int = Field(description="second number")

def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b

# An async function that could be passed as the coroutine
async def async_addition(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

async def main():
    calculator = StructuredTool.from_function(
        func=multiply,
        name="Calculator",
        description="multiply numbers",
        args_schema=CalculatorInput,
        return_direct=True,
        # coroutine=async_addition,  # <- optionally specify an async implementation
    )
    print(calculator.invoke({"a": 2, "b": 3}))
    # print(await calculator.ainvoke({"a": 2, "b": 5}))
    print(calculator.name)
    print(calculator.description)
    print(calculator.args)

# Run the async main function
asyncio.run(main())

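10.3 Handling tool errors

If you use tools with an agent, you will likely want an error-handling strategy so that the agent can recover from an error and continue. A simple strategy is to raise a ToolException inside the tool and specify an error handler with handle_tool_error. When an error handler is specified, the exception is caught and the handler decides what output the tool returns. You can set handle_tool_error to True, a string, or a function. If it is a function, it should take a ToolException as its argument and return a value.

Note that raising a ToolException alone is not enough. You must first set handle_tool_error on the tool, because its default value is False and the exception would otherwise propagate to the caller.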

from langchain_core.tools import StructuredTool, ToolException

def get_weather(city: str) -> str:
    """Get the weather for the given city."""
    raise ToolException(f"Error: there is no city named {city}.")

# Example: tools_exception.py
get_weather_tool = StructuredTool.from_function(
    func=get_weather,
    handle_tool_error=True,
)
get_weather_tool.invoke({"city": "foobar"})

# Output
'Error: there is no city named foobar.'

You can also set handle_tool_error to a string that is always returned:

# Example: tools_exception_handle.py
get_weather_tool = StructuredTool.from_function(
    func=get_weather,
    handle_tool_error="City not found",
)
get_weather_tool.invoke({"city": "foobar"})

# Output
'City not found'

Handling the error with a function:

# Example: tools_exception_handle_error.py
def _handle_error(error: ToolException) -> str:
    return f"The following error occurred during tool execution: `{error.args[0]}`"

get_weather_tool = StructuredTool.from_function(
    func=get_weather,
    handle_tool_error=_handle_error,
)
get_weather_tool.invoke({"city": "foobar"})

# Output
'The following error occurred during tool execution: `Error: there is no city named foobar.`'
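The four possible settings (False, True, a string, a function) can be summarized in one small dispatch routine. This is a standard-library-only sketch of the semantics described in this section, not LangChain's source code. `ToolError` and `run_tool` are hypothetical stand-ins.

```python
from typing import Callable, Union


class ToolError(Exception):
    """Hypothetical stand-in for LangChain's ToolException."""


Handler = Union[bool, str, Callable[[ToolError], str]]


def run_tool(func: Callable[[str], str], arg: str, handle_tool_error: Handler = False) -> str:
    """Call func and apply the error-handling setting if it raises ToolError."""
    try:
        return func(arg)
    except ToolError as exc:
        if handle_tool_error is False:
            raise  # default: the error propagates to the caller
        if handle_tool_error is True:
            return exc.args[0] if exc.args else "Tool execution error"
        if isinstance(handle_tool_error, str):
            return handle_tool_error  # fixed message, regardless of the error
        return handle_tool_error(exc)  # custom function decides the output


def get_weather(city: str) -> str:
    raise ToolError(f"Error: there is no city named {city}.")


print(run_tool(get_weather, "foobar", handle_tool_error=True))
print(run_tool(get_weather, "foobar", handle_tool_error="City not found"))
print(run_tool(get_weather, "foobar", handle_tool_error=lambda e: f"Tool failed: {e.args[0]}"))
```

Returning the error as a normal string is what lets an agent read the message and try again instead of crashing.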

10.4 Calling built-in toolkits and extension tools

Using the Wikipedia tool:

# Example: tools_wikipedia.py
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

api_wrapper = WikipediaAPIWrapper(top_k_results=1, doc_content_chars_max=100)
tool = WikipediaQueryRun(api_wrapper=api_wrapper)
print(tool.invoke({"query": "langchain"}))

# Output
Page: LangChain
Summary: LangChain is a framework designed to simplify the creation of applications

print(f"Name: {tool.name}")
print(f"Description: {tool.description}")
print(f"args schema: {tool.args}")
print(f"returns directly?: {tool.return_direct}")

# Output
Name: wikipedia
Description: A wrapper around Wikipedia. Useful for when you need to answer general questions about people, places, companies, facts, historical events, or other subjects. Input should be a search query.
args schema: {'query': {'title': 'Query', 'description': 'query to look up on wikipedia', 'type': 'string'}}
returns directly?: False

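10.5 Customizing default tools

You can also change a built-in tool's name, description, and the JSON schema of its arguments.

When defining the JSON schema for the arguments, the inputs must stay the same as the function's, so do not change them. You can, however, easily give each input a custom description.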

# Example: tools_custom.py
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper
from pydantic import BaseModel, Field

# Same wrapper settings as in the previous example
api_wrapper = WikipediaAPIWrapper(top_k_results=1, doc_content_chars_max=100)

class WikiInputs(BaseModel):
    """Inputs to the Wikipedia tool."""
    query: str = Field(
        description="query to look up in Wikipedia, should be 3 or less words"
    )

tool = WikipediaQueryRun(
    name="wiki-tool",
    description="look up things in wikipedia",
    args_schema=WikiInputs,
    api_wrapper=api_wrapper,
    return_direct=True,
)
print(tool.run("langchain"))

# Output
Page: LangChain
Summary: LangChain is a framework designed to simplify the creation of applications

print(f"Name: {tool.name}")
print(f"Description: {tool.description}")
print(f"args schema: {tool.args}")
print(f"returns directly?: {tool.return_direct}")

# Output
Name: wiki-tool
Description: look up things in wikipedia
args schema: {'query': {'title': 'Query', 'description': 'query to look up in Wikipedia, should be 3 or less words', 'type': 'string'}}
returns directly?: True

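10.6 How to use built-in toolkits

A toolkit is a collection of tools designed to be used together for a specific task. Toolkits come with convenient loading methods.

For a full list of the ready-made toolkits available, see the Integrations page of the LangChain documentation.

Every toolkit exposes a get_tools method that returns a list of tools.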

# Initialize a toolkit
toolkit = ExampleToolkit(...)
# Get the list of tools
tools = toolkit.get_tools()
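To illustrate the idea behind a toolkit, here is a small standard-library-only sketch: a class that bundles two tools sharing one SQLite connection and hands them out through `get_tools`. `MiniSQLToolkit` and its methods are hypothetical. They are not LangChain classes, and real toolkits return proper tool objects rather than dictionaries.

```python
import sqlite3
from typing import Dict, List


class MiniSQLToolkit:
    """Hypothetical toolkit: bundles tools that share one SQLite connection."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def list_tables(self, _: str = "") -> str:
        """Return a comma-separated list of table names."""
        rows = self.conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        return ", ".join(name for (name,) in rows)

    def table_schema(self, table: str) -> str:
        """Return the CREATE statement of one table."""
        row = self.conn.execute(
            "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?", (table,)
        ).fetchone()
        return row[0] if row else f"No such table: {table}"

    def get_tools(self) -> List[Dict[str, object]]:
        """Return every tool in the kit as a name/description/func record."""
        return [
            {"name": "list_tables", "description": "List table names.", "func": self.list_tables},
            {"name": "table_schema", "description": "Show a table's CREATE statement.", "func": self.table_schema},
        ]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache (prompt TEXT NOT NULL, response TEXT)")
toolkit = MiniSQLToolkit(conn)
tools = toolkit.get_tools()
print([t["name"] for t in tools])
print(tools[0]["func"](""))
```

The point of the grouping is the shared resource: every tool in the kit operates on the same connection, so they can be loaded and passed to an agent as one unit.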

Using the SQLDatabase toolkit to read the table schema of the langchain.db database:

from langchain_community.agent_toolkits.sql.toolkit import SQLDatabaseToolkit
from langchain_community.utilities import SQLDatabase
from langchain_openai import ChatOpenAI
from langchain_community.agent_toolkits.sql.base import create_sql_agent
from langchain.agents.agent_types import AgentType
 
db = SQLDatabase.from_uri("sqlite:///langchain.db")
toolkit = SQLDatabaseToolkit(db=db, llm=ChatOpenAI(temperature=0))
print(toolkit.get_tools())

# Output
[QuerySQLDataBaseTool(description="Input to this tool is a detailed and correct SQL query, output is a result from the database. If the query is not correct, an error message will be returned. If an error is returned, rewrite the query, check the query, and try again. If you encounter an issue with Unknown column 'xxxx' in 'field list', use sql_db_schema to query the correct table fields.", db=<langchain_community.utilities.sql_database.SQLDatabase object at 0x000001D15C9749B0>), InfoSQLDatabaseTool(description='Input to this tool is a comma-separated list of tables, output is the schema and sample rows for those tables. Be sure that the tables actually exist by calling sql_db_list_tables first! Example Input: table1, table2, table3', db=<langchain_community.utilities.sql_database.SQLDatabase object at 0x000001D15C9749B0>), ListSQLDatabaseTool(db=<langchain_community.utilities.sql_database.SQLDatabase object at 0x000001D15C9749B0>), QuerySQLCheckerTool(description='Use this tool to double check if your query is correct before executing it. 
Always use this tool before executing a query with sql_db_query!', db=<langchain_community.utilities.sql_database.SQLDatabase object at 0x000001D15C9749B0>, llm=ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x000001D17B4453D0>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x000001D17B446D80>, temperature=0.0, openai_api_key=SecretStr('**********'), openai_proxy=''), llm_chain=LLMChain(prompt=PromptTemplate(input_variables=['dialect', 'query'], template='\n{query}\nDouble check the {dialect} query above for common mistakes, including:\n- Using NOT IN with NULL values\n- Using UNION when UNION ALL should have been used\n- Using BETWEEN for exclusive ranges\n- Data type mismatch in predicates\n- Properly quoting identifiers\n- Using the correct number of arguments for functions\n- Casting to the correct data type\n- Using the proper columns for joins\n\nIf there are any of the above mistakes, rewrite the query. If there are no mistakes, just reproduce the original query.\n\nOutput the final SQL query only.\n\nSQL Query: '), llm=ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x000001D17B4453D0>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x000001D17B446D80>, temperature=0.0, openai_api_key=SecretStr('**********'), openai_proxy='')))]

 
agent_executor = create_sql_agent(
    llm=ChatOpenAI(temperature=0, model="gpt-4"),
    toolkit=toolkit,
    verbose=True,
    agent_type=AgentType.OPENAI_FUNCTIONS
)
agent_executor.invoke("Describe the full_llm_cache table")

# Output
Entering new SQL Agent Executor chain...
Invoking: sql_db_schema with `{'table_names': 'full_llm_cache'}`
CREATE TABLE full_llm_cache (
        prompt VARCHAR NOT NULL,
        llm VARCHAR NOT NULL,
        idx INTEGER NOT NULL,
        response VARCHAR,
        PRIMARY KEY (prompt, llm, idx)
)
/*
3 rows from full_llm_cache table:
prompt        llm        idx        response
[{"lc": 1, "type": "constructor", "id": ["langchain", "schema", "messages", "HumanMessage"], "kwargs        {"id": ["langchain", "chat_models", "openai", "ChatOpenAI"], "kwargs": {"max_retries": 2, "model_nam        0        {"lc": 1, "type": "constructor", "id": ["langchain", "schema", "output", "ChatGeneration"], "kwargs"
*/
The `full_llm_cache` table has the following structure:
- `prompt`: A VARCHAR field that is part of the primary key. It cannot be NULL.
- `llm`: A VARCHAR field that is also part of the primary key. It cannot be NULL.
- `idx`: An INTEGER field that is part of the primary key as well. It cannot be NULL.
- `response`: A VARCHAR field that can contain NULL values.
Here are some sample rows from the `full_llm_cache` table:
| prompt | llm | idx | response |
|--------|-----|-----|----------|
| [{"lc": 1, "type": "constructor", "id": ["langchain", "schema", "messages", "HumanMessage"], "kwargs | {"id": ["langchain", "chat_models", "openai", "ChatOpenAI"], "kwargs": {"max_retries": 2, "model_nam | 0 | {"lc": 1, "type": "constructor", "id": ["langchain", "schema", "output", "ChatGeneration"], "kwargs" |
> Finished chain.