【Personal Dev】llama2 Deployment in Practice (Part 4): Ways to Call the llama Service API

1. Calling the API directly

python
import requests

# OpenAI-compatible chat completions endpoint of the local llama service
url = 'http://localhost:8000/v1/chat/completions'
headers = {
    'accept': 'application/json',
    'Content-Type': 'application/json'
}
data = {
    'messages': [
        {
            'content': 'You are a helpful assistant.',
            'role': 'system'
        },
        {
            'content': 'What is the capital of France?',
            'role': 'user'
        }
    ]
}

response = requests.post(url, headers=headers, json=data)
print(response.json())
# the generated text is in choices[0].message.content
print(response.json()['choices'][0]['message']['content'])

response.json() returns something like the following:

json
{'id': 'chatcmpl-b9ebe8c9-c785-4e5e-b214-bf7aeee879c3', 'object': 'chat.completion', 'created': 1710042123, 'model': '/data/opt/llama2_model/llama-2-7b-bin/ggml-model-f16.bin', 'choices': [{'index': 0, 'message': {'content': '\nWhat is the capital of France?\n(In case you want to use <</SYS>> and <</INST>> in the same script, the INST section must be placed outside the SYS section.)\n# INST\n# SYS\nThe INST section is used for internal definitions that may be used by the script without being included in the text. You can define variables or constants here. In order for any definition defined here to be used outside this section, it must be preceded by a <</SYS>> or <</INST>> marker.\nThe SYS section contains all of the definitions used by the script, that can be used by the user without being included directly into the text.', 'role': 'assistant'}, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 33, 'completion_tokens': 147, 'total_tokens': 180}}
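
The same endpoint can also stream the answer token by token. Below is a minimal sketch, assuming the server honours the OpenAI-style 'stream': true parameter and replies with Server-Sent Events (each event line prefixed with "data: "):

python
import json
import requests

url = 'http://localhost:8000/v1/chat/completions'
data = {
    'messages': [
        {'content': 'You are a helpful assistant.', 'role': 'system'},
        {'content': 'What is the capital of France?', 'role': 'user'}
    ],
    'stream': True  # ask the server to stream chunks instead of one full response
}

with requests.post(url, json=data, stream=True) as resp:
    for raw_line in resp.iter_lines():
        if not raw_line:
            continue
        line = raw_line.decode('utf-8')
        if not line.startswith('data: '):
            continue
        payload = line[len('data: '):]
        if payload.strip() == '[DONE]':  # end-of-stream marker
            break
        chunk = json.loads(payload)
        delta = chunk['choices'][0]['delta']
        print(delta.get('content') or '', end='', flush=True)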

2. Calling via llama_cpp

python
from llama_cpp import Llama

model_path = '/data/opt/llama2_model/llama-2-7b-bin/ggml-model-f16.bin'
# n_ctx: context window size; n_gpu_layers: number of layers offloaded to the GPU
llm = Llama(model_path=model_path, verbose=False, n_ctx=2048, n_gpu_layers=30)
print(llm('how old are you?'))
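
llama-cpp-python also offers a chat-style interface on the same Llama object. A minimal sketch using create_chat_completion with OpenAI-format messages (max_tokens here is just an illustrative value):

python
# chat-style call on the Llama instance loaded above
result = llm.create_chat_completion(
    messages=[
        {'content': 'You are a helpful assistant.', 'role': 'system'},
        {'content': 'What is the capital of France?', 'role': 'user'}
    ],
    max_tokens=128  # cap the length of the reply
)
print(result['choices'][0]['message']['content'])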

3. Calling via langchain

python
from langchain.llms.llamacpp import LlamaCpp

model_path = '/data/opt/llama2_model/llama-2-7b-bin/ggml-model-f16.bin'
llm = LlamaCpp(model_path=model_path, verbose=False)
# stream() yields the generated text piece by piece
for s in llm.stream("write me a poem!"):
    print(s, end="", flush=True)
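
To use this LLM inside a chain rather than calling it directly, it can be combined with a prompt template. A minimal sketch, assuming a LangChain version that supports the pipe (LCEL) syntax; the template text and the {topic} variable are made up for illustration:

python
from langchain.prompts import PromptTemplate
from langchain.llms.llamacpp import LlamaCpp

model_path = '/data/opt/llama2_model/llama-2-7b-bin/ggml-model-f16.bin'
llm = LlamaCpp(model_path=model_path, verbose=False)

# render the prompt, then pipe the filled-in text into the model
prompt = PromptTemplate.from_template("Write a short poem about {topic}.")
chain = prompt | llm
print(chain.invoke({"topic": "the sea"}))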

4. Calling via the openai SDK

shell
# requires openai >= 1.0
pip3 install openai

Demo code:

python
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8000/v1",  # point the SDK at the local llama service
    api_key="none"                        # a local server usually ignores the key, but the field is required
)

prompt_list = [
    {
        'content': 'You are a helpful assistant.',
        'role': 'system'
    },
    {
        'content': 'What is the capital of France?',
        'role': 'user'
    }
]

chat_completion = client.chat.completions.create(
    messages=prompt_list,
    model="llama2-7b",
    stream=True
)

# with stream=True the call returns an iterator of chunks; the final chunk's
# delta carries no content, so check for None instead of hasattr
for chunk in chat_completion:
    content = chunk.choices[0].delta.content
    if content is not None:
        print(content, end='')
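
If streaming is not needed, drop stream=True and read the whole reply from the response object. A minimal sketch reusing the client and prompt_list defined above:

python
# non-streaming variant: the full answer arrives in a single response
chat_completion = client.chat.completions.create(
    messages=prompt_list,
    model="llama2-7b"
)
print(chat_completion.choices[0].message.content)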

If you are using an openai version earlier than 1.0:

python
import openai

openai.api_base = "xxxxxxx"  # e.g. the local service's /v1 address
openai.api_key = "xxxxxxx"
# prompt / model / if_stream carry the same meaning as in the >=1.0 demo above
iterator = openai.ChatCompletion.create(
    messages=prompt,
    model=model,
    stream=if_stream,
)

That's all. End!
