gemini-pro-vision 看图说话

一、安装

复制代码
   pip install -U langchain-google-vertexai

二、设置访问权限

申请服务账号json格式key

三、完整代码

复制代码
import gradio as gr
import json
import base64
from pathlib import Path
import os
import time
import requests
from fastapi import FastAPI, UploadFile, File
from fastapi.middleware.cors import CORSMiddleware
import uvicorn
from langchain_core.messages import HumanMessage
from langchain_google_vertexai import ChatVertexAI
from langchain_core.output_parsers import StrOutputParser

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "xxx.json"
app = FastAPI()
app.add_middleware(
        CORSMiddleware,
        allow_origins=["*"],
        allow_credentials=True,
        allow_methods=["*"],
        allow_headers=["*"],
    )

def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

def generate(model, prompt, images_base64):
    llm = ChatVertexAI(model_name=model)
    # example
    message = HumanMessage(
        content=[
            {
                "type": "text",
                "text": prompt,
            },
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{images_base64}"}},
        ]
    )
    parser = StrOutputParser()
    result = llm.invoke([message])
    parserResult = parser.invoke(result)
    return parserResult

def respond(model, image_path, prompt, chat_history):
    print(model, image_path, prompt)
    images_base64 = [encode_image(image_path)]
    bot_message = generate(model, prompt, images_base64)
    chat_history.append((prompt, bot_message))
    time.sleep(1)
    return "", chat_history

with gr.Blocks() as demo:
    gr.Image(value='xxx.png',height=30,width=70, interactive=False, show_download_button=False, show_label=False)
    gr.HTML("""<h1 align="center">图片问答</h1>""")
    
    model = gr.Textbox(value="gemini-pro-vision",label="gemini多模态模型:")
    with gr.Row():
        with gr.Column(scale=1):
            image_path = gr.Image(label="上传图片:",type="filepath", value='Picture1.png')
        with gr.Column(scale=3):
            chatbot = gr.Chatbot()
    prompt = gr.Textbox(label="用户:",value="大童在保险行业的地位如何?使用中文回答。")
    
    clear = gr.ClearButton([prompt, chatbot])
            
    prompt.submit(respond, [model, image_path, prompt, chatbot], [prompt, chatbot])

app = gr.mount_gradio_app(app, demo, path="/")

if __name__ == '__main__':
    uvicorn.run(app='web_gemini:app', host='0.0.0.0', port=8500, workers=1)

四、运行效果

相关推荐
二川bro5 小时前
量子计算入门:Python量子编程基础
python
夏天的味道٥6 小时前
@JsonIgnore对Date类型不生效
开发语言·python
tsumikistep6 小时前
【前后端】接口文档与导入
前端·后端·python·硬件架构
小白学大数据7 小时前
Python爬虫伪装策略:如何模拟浏览器正常访问JSP站点
java·开发语言·爬虫·python
头发还在的女程序员8 小时前
三天搞定招聘系统!附完整源码
开发语言·python
温轻舟8 小时前
Python自动办公工具06-设置Word文档中表格的格式
开发语言·python·word·自动化工具·温轻舟
花酒锄作田8 小时前
[python]FastAPI-Tracking ID 的设计
python·fastapi
AI-智能8 小时前
别啃文档了!3 分钟带小白跑完 Dify 全链路:从 0 到第一个 AI 工作流
人工智能·python·自然语言处理·llm·embedding·agent·rag
d***956210 小时前
爬虫自动化(DrissionPage)
爬虫·python·自动化
APIshop10 小时前
Python 零基础写爬虫:一步步抓取商品详情(超细详解)
开发语言·爬虫·python