gemini-pro-vision 看图说话

一、安装

复制代码
   pip install -U langchain-google-vertexai

二、设置访问权限

申请服务账号json格式key

三、完整代码

复制代码
import gradio as gr
import json
import base64
from pathlib import Path
import os
import time
import requests
from fastapi import FastAPI, UploadFile, File
from fastapi.middleware.cors import CORSMiddleware
import uvicorn
from langchain_core.messages import HumanMessage
from langchain_google_vertexai import ChatVertexAI
from langchain_core.output_parsers import StrOutputParser

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "xxx.json"
app = FastAPI()
app.add_middleware(
        CORSMiddleware,
        allow_origins=["*"],
        allow_credentials=True,
        allow_methods=["*"],
        allow_headers=["*"],
    )

def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

def generate(model, prompt, images_base64):
    llm = ChatVertexAI(model_name=model)
    # example
    message = HumanMessage(
        content=[
            {
                "type": "text",
                "text": prompt,
            },
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{images_base64}"}},
        ]
    )
    parser = StrOutputParser()
    result = llm.invoke([message])
    parserResult = parser.invoke(result)
    return parserResult

def respond(model, image_path, prompt, chat_history):
    print(model, image_path, prompt)
    images_base64 = [encode_image(image_path)]
    bot_message = generate(model, prompt, images_base64)
    chat_history.append((prompt, bot_message))
    time.sleep(1)
    return "", chat_history

with gr.Blocks() as demo:
    gr.Image(value='xxx.png',height=30,width=70, interactive=False, show_download_button=False, show_label=False)
    gr.HTML("""<h1 align="center">图片问答</h1>""")
    
    model = gr.Textbox(value="gemini-pro-vision",label="gemini多模态模型:")
    with gr.Row():
        with gr.Column(scale=1):
            image_path = gr.Image(label="上传图片:",type="filepath", value='Picture1.png')
        with gr.Column(scale=3):
            chatbot = gr.Chatbot()
    prompt = gr.Textbox(label="用户:",value="大童在保险行业的地位如何?使用中文回答。")
    
    clear = gr.ClearButton([prompt, chatbot])
            
    prompt.submit(respond, [model, image_path, prompt, chatbot], [prompt, chatbot])

app = gr.mount_gradio_app(app, demo, path="/")

if __name__ == '__main__':
    uvicorn.run(app='web_gemini:app', host='0.0.0.0', port=8500, workers=1)

四、运行效果

相关推荐
qq_414256576 分钟前
JavaScript中类继承中super关键字的调用执行逻辑
jvm·数据库·python
tanis_207712 分钟前
MinerU2.5-Pro 中文 PDF 识别准确率全解:OmniDocBench v1.6 权威基准数据
人工智能·python·pdf
ㄟ留恋さ寂寞31 分钟前
html如何修改备注
jvm·数据库·python
2401_8844541533 分钟前
c++如何读取YAML格式配置文件_yaml-cpp库快速入门【详解】
jvm·数据库·python
2301_775639891 小时前
mysql升级时如何使用Ansible进行自动化部署_mysql自动化管理
jvm·数据库·python
CLX05051 小时前
怎样设置外键的更新级联操作_ON UPDATE CASCADE配置
jvm·数据库·python
Gerardisite1 小时前
企微批量群发消息指南:用 QiWe 省掉人工操作
java·python·机器人·企业微信
zjy277771 小时前
SQL Server如何实现编写表与字段注释_Navicat兼容操作步骤
jvm·数据库·python
m0_702036531 小时前
CSS移动端实现响应式导航菜单_利用媒体查询切换显示隐藏状态
jvm·数据库·python
m0_596749091 小时前
Go语言怎么用Jaeger_Go语言Jaeger链路追踪教程【实用】
jvm·数据库·python