gemini-pro-vision 看图说话

一、安装

复制代码
   pip install -U langchain-google-vertexai

二、设置访问权限

申请服务账号json格式key

三、完整代码

复制代码
import gradio as gr
import json
import base64
from pathlib import Path
import os
import time
import requests
from fastapi import FastAPI, UploadFile, File
from fastapi.middleware.cors import CORSMiddleware
import uvicorn
from langchain_core.messages import HumanMessage
from langchain_google_vertexai import ChatVertexAI
from langchain_core.output_parsers import StrOutputParser

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "xxx.json"
app = FastAPI()
app.add_middleware(
        CORSMiddleware,
        allow_origins=["*"],
        allow_credentials=True,
        allow_methods=["*"],
        allow_headers=["*"],
    )

def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

def generate(model, prompt, images_base64):
    llm = ChatVertexAI(model_name=model)
    # example
    message = HumanMessage(
        content=[
            {
                "type": "text",
                "text": prompt,
            },
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{images_base64}"}},
        ]
    )
    parser = StrOutputParser()
    result = llm.invoke([message])
    parserResult = parser.invoke(result)
    return parserResult

def respond(model, image_path, prompt, chat_history):
    print(model, image_path, prompt)
    images_base64 = [encode_image(image_path)]
    bot_message = generate(model, prompt, images_base64)
    chat_history.append((prompt, bot_message))
    time.sleep(1)
    return "", chat_history

with gr.Blocks() as demo:
    gr.Image(value='xxx.png',height=30,width=70, interactive=False, show_download_button=False, show_label=False)
    gr.HTML("""<h1 align="center">图片问答</h1>""")
    
    model = gr.Textbox(value="gemini-pro-vision",label="gemini多模态模型:")
    with gr.Row():
        with gr.Column(scale=1):
            image_path = gr.Image(label="上传图片:",type="filepath", value='Picture1.png')
        with gr.Column(scale=3):
            chatbot = gr.Chatbot()
    prompt = gr.Textbox(label="用户:",value="大童在保险行业的地位如何?使用中文回答。")
    
    clear = gr.ClearButton([prompt, chatbot])
            
    prompt.submit(respond, [model, image_path, prompt, chatbot], [prompt, chatbot])

app = gr.mount_gradio_app(app, demo, path="/")

if __name__ == '__main__':
    uvicorn.run(app='web_gemini:app', host='0.0.0.0', port=8500, workers=1)

四、运行效果

相关推荐
程序员杰哥2 分钟前
性能测试详解
自动化测试·软件测试·python·测试工具·职场和发展·测试用例·性能测试
人工智能AI技术24 分钟前
【Agent从入门到实践】42实战:用Docker打包Agent,实现一键部署
人工智能·python
开发者小天1 小时前
python中的class类
开发语言·python
idwangzhen1 小时前
GEO优化系统哪家更专业
python·信息可视化
diediedei1 小时前
机器学习模型部署:将模型转化为Web API
jvm·数据库·python
m0_561359671 小时前
使用Python自动收发邮件
jvm·数据库·python
naruto_lnq2 小时前
用Python批量处理Excel和CSV文件
jvm·数据库·python
b2077212 小时前
Flutter for OpenHarmony 身体健康状况记录App实战 - 提醒设置实现
python·flutter·macos·cocoa·harmonyos
2301_822365032 小时前
数据分析与科学计算
jvm·数据库·python
河北小博博2 小时前
分布式系统稳定性基石:熔断与限流的深度解析(附Python实战)
java·开发语言·python