[LangChain Series 16] Language Models: LLMs (Part 2)

Original post: [LangChain Series 16] Language Models: LLMs (Part 2)

In this article:

  • Caching

  • Serialization

  • Streaming

  • Token usage

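The previous post, [LangChain Series 15] Language Models: LLMs (Part 1), covered the async API, custom LLMs, Fake LLM, and HumanInput LLM. This post covers the second part of LLMs: caching, serialization, streaming, and token usage.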

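01 Caching

LangChain provides an optional caching layer for LLMs. Caching has two benefits: it reduces the number of API calls made to the LLM provider, which saves money, and it speeds up responses for repeated requests.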
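To make the idea concrete, here is a minimal sketch of a prompt-keyed cache in plain Python. It is not LangChain's implementation: TinyLLMCache, fake_llm, and the "model=demo,n=2" configuration string are hypothetical names invented for this illustration. It only shows why a repeated prompt costs a single real call.

```python
from typing import Callable, Dict, List, Tuple


class TinyLLMCache:
    """Hypothetical prompt-keyed cache; not LangChain's InMemoryCache."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], str] = {}

    def call(self, llm: Callable[[str], str], llm_config: str, prompt: str) -> str:
        # Key on both the prompt and the model configuration, so the same
        # prompt sent to a differently configured model is not mixed up.
        key = (prompt, llm_config)
        if key not in self._store:
            self._store[key] = llm(prompt)  # cache miss: one real call
        return self._store[key]


api_calls: List[str] = []


def fake_llm(prompt: str) -> str:
    # Stand-in for a slow, billed API request.
    api_calls.append(prompt)
    return f"answer to: {prompt}"


cache = TinyLLMCache()
first = cache.call(fake_llm, "model=demo,n=2", "Tell me a joke")
second = cache.call(fake_llm, "model=demo,n=2", "Tell me a joke")
print(first == second, len(api_calls))  # True 1
```

The second call never reaches fake_llm, which is where both the cost saving and the speedup come from.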

In-memory cache

from langchain.globals import set_llm_cache
from langchain.llms import OpenAI
from langchain.cache import InMemoryCache

set_llm_cache(InMemoryCache())

# To make the caching really obvious, let's use a slower model.
llm = OpenAI(model_name="text-davinci-002", n=2, best_of=2)

# The first time, it is not yet in cache, so it should take longer
llm.predict("Tell me a joke")

Running the code produces:

    CPU times: user 35.9 ms, sys: 28.6 ms, total: 64.6 ms
    Wall time: 4.83 s


    "\n\nWhy couldn't the bicycle stand up by itself? It was...two tired!"

# The second time it is, so it goes faster
llm.predict("Tell me a joke")

    CPU times: user 238 µs, sys: 143 µs, total: 381 µs
    Wall time: 1.76 ms


    '\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'

The first call misses the cache. The second call hits it and returns far faster (about 1.76 ms of wall time versus 4.83 s).

SQLite cache

# We can do the same thing with a SQLite cache
from langchain.cache import SQLiteCache
set_llm_cache(SQLiteCache(database_path=".langchain.db"))
# The first time, it is not yet in cache, so it should take longer
llm.predict("Tell me a joke")


    CPU times: user 17 ms, sys: 9.76 ms, total: 26.7 ms
    Wall time: 825 ms


    '\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'

# The second time it is, so it goes faster
llm.predict("Tell me a joke")

    CPU times: user 2.46 ms, sys: 1.23 ms, total: 3.7 ms
    Wall time: 2.67 ms


    '\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'

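As before, the first call misses the cache and the second call hits it, so it returns much faster (2.67 ms versus 825 ms of wall time). A .langchain.db cache file is also created locally.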
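The difference from the in-memory cache is persistence: entries live in a database file and survive a restart. The sketch below shows that idea with Python's standard sqlite3 module. TinySQLiteCache and its llm_cache table layout are assumptions made up for this example and say nothing about how SQLiteCache stores its data.

```python
import sqlite3
from typing import Optional


class TinySQLiteCache:
    """Hypothetical on-disk cache; the table layout is invented for this sketch."""

    def __init__(self, database_path: str) -> None:
        self._conn = sqlite3.connect(database_path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS llm_cache ("
            "prompt TEXT, llm_config TEXT, response TEXT, "
            "PRIMARY KEY (prompt, llm_config))"
        )

    def lookup(self, prompt: str, llm_config: str) -> Optional[str]:
        # Return the stored response, or None on a cache miss.
        row = self._conn.execute(
            "SELECT response FROM llm_cache WHERE prompt = ? AND llm_config = ?",
            (prompt, llm_config),
        ).fetchone()
        return row[0] if row else None

    def update(self, prompt: str, llm_config: str, response: str) -> None:
        # Store (or overwrite) the response for this prompt and configuration.
        self._conn.execute(
            "INSERT OR REPLACE INTO llm_cache VALUES (?, ?, ?)",
            (prompt, llm_config, response),
        )
        self._conn.commit()


# ":memory:" keeps the demo self-contained; pass a file path to persist.
cache = TinySQLiteCache(":memory:")
miss = cache.lookup("Tell me a joke", "model=demo")
cache.update("Tell me a joke", "model=demo", "a cached joke")
hit = cache.lookup("Tell me a joke", "model=demo")
print(miss, hit)  # None a cached joke
```

With a real file path instead of ":memory:", a second process opening the same file would see the stored entry.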

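Optional caching in chains

Within a chain, you can turn caching on or off for individual nodes. In the map-reduce chain below, for example, the results of the map step are cached while the result of the reduce step is not.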

from langchain.text_splitter import CharacterTextSplitter
from langchain.chains.mapreduce import MapReduceChain
from langchain.docstore.document import Document
from langchain.chains.summarize import load_summarize_chain

text_splitter = CharacterTextSplitter()
llm = OpenAI(model_name="text-davinci-002")
no_cache_llm = OpenAI(model_name="text-davinci-002", cache=False)

with open('../../../state_of_the_union.txt') as f:
    state_of_the_union = f.read()
texts = text_splitter.split_text(state_of_the_union)
docs = [Document(page_content=t) for t in texts[:3]]

chain = load_summarize_chain(llm, chain_type="map_reduce", reduce_llm=no_cache_llm)
chain.run(docs)


    CPU times: user 452 ms, sys: 60.3 ms, total: 512 ms
    Wall time: 5.09 s


    '\n\nPresident Biden is discussing the American Rescue Plan and the Bipartisan Infrastructure Law, which will create jobs and help Americans. He also talks about his vision for America, which includes investing in education and infrastructure. In response to Russian aggression in Ukraine, the United States is joining with European allies to impose sanctions and isolate Russia. American forces are being mobilized to protect NATO countries in the event that Putin decides to keep moving west. The Ukrainians are bravely fighting back, but the next few weeks will be hard for them. Putin will pay a high price for his actions in the long run. Americans should not be alarmed, as the United States is taking action to protect its interests and allies.'
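The selective caching above comes down to which step consults a cache. Here is a toy sketch with stub functions instead of real models: map_llm, reduce_llm, and summarize are hypothetical names, and the "summaries" are just uppercased strings. The map step is cached per chunk, and the reduce step always runs.

```python
from typing import Dict, List

calls = {"map": 0, "reduce": 0}
map_cache: Dict[str, str] = {}


def map_llm(chunk: str) -> str:
    # Stand-in for the per-document model call (cached below).
    calls["map"] += 1
    return chunk.upper()


def reduce_llm(summaries: List[str]) -> str:
    # Stand-in for the combining model call (never cached).
    calls["reduce"] += 1
    return " | ".join(summaries)


def summarize(chunks: List[str]) -> str:
    partial = []
    for chunk in chunks:
        if chunk not in map_cache:  # only the map step uses the cache
            map_cache[chunk] = map_llm(chunk)
        partial.append(map_cache[chunk])
    return reduce_llm(partial)  # the reduce step runs on every call


first = summarize(["a", "b", "c"])
second = summarize(["a", "b", "c"])
print(calls)  # {'map': 3, 'reduce': 2}
```

Running the chain twice costs three map calls in total but two reduce calls, which is the behavior you would expect when only the reduce LLM has caching disabled.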

02 Serialization

LangChain can serialize an LLM's configuration, which makes it easy to store and share. Two file formats are supported: JSON and YAML.

Loading an LLM

JSON file

cat llm.json

{
    "model_name": "text-davinci-003",
    "temperature": 0.7,
    "max_tokens": 256,
    "top_p": 1.0,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "n": 1,
    "best_of": 1,
    "request_timeout": null,
    "_type": "openai"
}

from langchain.llms.loading import load_llm

llm = load_llm("llm.json")

YAML file

cat llm.yaml

_type: openai
best_of: 1
frequency_penalty: 0.0
max_tokens: 256
model_name: text-davinci-003
n: 1
presence_penalty: 0.0
request_timeout: null
temperature: 0.7
top_p: 1.0

llm = load_llm("llm.yaml")

Saving an LLM

llm.save("llm.json")

llm.save("llm.yaml")
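The files above store plain parameters plus a _type tag that tells the loader which class to rebuild. Here is a rough sketch of that save/load round trip using only the standard library. TinyLLMConfig, LOADERS, load_config, and the "tiny" tag are hypothetical stand-ins, not LangChain classes, and the real loader may work differently.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass


@dataclass
class TinyLLMConfig:
    """Hypothetical stand-in for an LLM's serializable parameters."""

    model_name: str = "text-davinci-003"
    temperature: float = 0.7
    max_tokens: int = 256

    def save(self, path: str) -> None:
        payload = asdict(self)
        payload["_type"] = "tiny"  # type tag: tells the loader what to build
        with open(path, "w", encoding="utf-8") as f:
            json.dump(payload, f, indent=4)


# Registry mapping each type tag to the class that should be rebuilt.
LOADERS = {"tiny": TinyLLMConfig}


def load_config(path: str) -> TinyLLMConfig:
    with open(path, encoding="utf-8") as f:
        payload = json.load(f)
    cls = LOADERS[payload.pop("_type")]
    return cls(**payload)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "llm.json")
    original = TinyLLMConfig(temperature=0.2)
    original.save(path)
    restored = load_config(path)

print(restored == original)  # True
```

Only configuration is written to disk in this sketch, so the restored object is an equal copy of the settings rather than a snapshot of any model state.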

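03 Streaming

Some LLMs provide streaming responses, so you can start processing output as soon as it arrives instead of waiting for the whole result.

LangChain supports streaming from LLMs through callbacks: you pass a CallbackHandler that implements the on_llm_new_token method.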

from langchain.llms import OpenAI
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

llm = OpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()], temperature=0)
resp = llm("Write me a song about sparkling water.")

Running the code produces:

    Verse 1
    I'm sippin' on sparkling water,
    It's so refreshing and light,
    It's the perfect way to quench my thirst
    On a hot summer night.

    Chorus
    Sparkling water, sparkling water,
    It's the best way to stay hydrated,
    It's so crisp and so clean,
    It's the perfect way to stay refreshed.

    Verse 2
    I'm sippin' on sparkling water,
    It's so bubbly and bright,
    It's the perfect way to cool me down
    On a hot summer night.

    Chorus
    Sparkling water, sparkling water,
    It's the best way to stay hydrated,
    It's so crisp and so clean,
    It's the perfect way to stay refreshed.

    Verse 3
    I'm sippin' on sparkling water,
    It's so light and so clear,
    It's the perfect way to keep me cool
    On a hot summer night.

    Chorus
    Sparkling water, sparkling water,
    It's the best way to stay hydrated,
    It's so crisp and so clean,
    It's the perfect way to stay refreshed.

Note: token_usage is not available when streaming from an LLM.
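To show how the callback mechanism fits together, here is a small sketch that drives a handler from a fake token stream. CollectingHandler and fake_streaming_llm are hypothetical. Only the hook name on_llm_new_token is taken from the text above, and a real handler would subclass LangChain's callback base class rather than stand alone.

```python
from typing import Any, Iterable, List


class CollectingHandler:
    """Hypothetical handler; mirrors only the on_llm_new_token hook name."""

    def __init__(self) -> None:
        self.tokens: List[str] = []

    def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        # Called once per token, as soon as that token is available.
        self.tokens.append(token)


def fake_streaming_llm(tokens: Iterable[str], handlers: List[CollectingHandler]) -> str:
    # Stand-in for a streaming model: notify every handler per token,
    # then return the assembled text once the stream ends.
    pieces = []
    for token in tokens:
        for handler in handlers:
            handler.on_llm_new_token(token)
        pieces.append(token)
    return "".join(pieces)


handler = CollectingHandler()
text = fake_streaming_llm(["Spark", "ling", " water"], [handler])
print(handler.tokens, repr(text))
```

The handler sees each token before the full response exists, which is what lets a UI print text as it is generated.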

04 Token usage

LLM calls are billed mainly by token count, so it is useful to track token usage. Currently, this is only supported for the OpenAI API.

The simplest use is to call the LLM directly.

from langchain.llms import OpenAI
from langchain.callbacks import get_openai_callback
llm = OpenAI(openai_api_key="...", model_name="text-davinci-002", n=2, best_of=2)
with get_openai_callback() as cb:
    result = llm("Tell me a joke")
    result2 = llm("Tell me a joke")
    print(cb)

Running the code produces:

Tokens Used: 84      
Prompt Tokens: 8      
Completion Tokens: 76
Successful Requests: 2
Total Cost (USD): $0.00168

If the LLM is called inside a chain or an agent, the callback records token usage across all steps.

from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.agents import AgentType
from langchain.callbacks import get_openai_callback
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)
tools = load_tools(["serpapi", "llm-math"], llm=llm)
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)
with get_openai_callback() as cb:
    response = agent.run(
        "Who is Olivia Wilde's boyfriend? What is his current age raised to the 0.23 power?"
    )
    print(f"Total Tokens: {cb.total_tokens}")
    print(f"Prompt Tokens: {cb.prompt_tokens}")
    print(f"Completion Tokens: {cb.completion_tokens}")
    print(f"Total Cost (USD): ${cb.total_cost}")

Running the code produces:

    > Entering new AgentExecutor chain...
     I need to find out who Olivia Wilde's boyfriend is and then calculate his age raised to the 0.23 power.
    Action: Search
    Action Input: "Olivia Wilde boyfriend"
    Observation: Sudeikis and Wilde's relationship ended in November 2020. Wilde was publicly served with court documents regarding child custody while she was presenting Don't Worry Darling at CinemaCon 2022. In January 2021, Wilde began dating singer Harry Styles after meeting during the filming of Don't Worry Darling.
    Thought: I need to find out Harry Styles' age.
    Action: Search
    Action Input: "Harry Styles age"
    Observation: 29 years
    Thought: I need to calculate 29 raised to the 0.23 power.
    Action: Calculator
    Action Input: 29^0.23
    Observation: Answer: 2.169459462491557
    
    Thought: I now know the final answer.
    Final Answer: Harry Styles, Olivia Wilde's boyfriend, is 29 years old and his age raised to the 0.23 power is 2.169459462491557.
    
    > Finished chain.
    Total Tokens: 1506
    Prompt Tokens: 1350
    Completion Tokens: 156
    Total Cost (USD): $0.03012
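The numbers in both outputs of this section can be checked by hand. The sketch below assumes a flat price of $0.02 per 1,000 tokens. That rate is inferred from the printed totals and is not stated in the article, so treat it as an assumption. cost_usd is a hypothetical helper, not part of the callback API.

```python
from decimal import Decimal

# Assumption: a flat price of $0.02 per 1,000 tokens, inferred from the
# printed totals rather than stated in the article.
RATE_PER_1K = Decimal("0.02")


def cost_usd(prompt_tokens: int, completion_tokens: int) -> Decimal:
    # Total tokens are prompt plus completion; cost scales linearly.
    total = prompt_tokens + completion_tokens
    return Decimal(total) * RATE_PER_1K / Decimal(1000)


direct_call = cost_usd(8, 76)      # direct LLM calls: 84 tokens in total
agent_run = cost_usd(1350, 156)    # agent run: 1506 tokens in total
print(direct_call, agent_run)
```

Under that assumed rate, 84 tokens come to $0.00168 and 1506 tokens come to $0.03012, matching the Total Cost lines printed above.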

Summary

That wraps up the second part of LLMs, which covered four topics: caching, serialization, streaming, and token usage.
