Multimodal on PaddlePaddle AI Studio? A Hands-On MiniGPT4 Walkthrough!

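MiniGPT4 is a vision-language model. It connects a frozen visual encoder (the ViT and Q-Former from BLIP-2) to a frozen Vicuna large language model through a single trainable projection layer, which lets it describe images and answer questions about them in natural language. The project author uses MiniGPT4-7B for this hands-on walkthrough.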

Author: 衍哲

Try it here:
https://aistudio.baidu.com/aistudio/projectdetail/6556667

One-click fork

Fork the project and run it. For the runtime environment, an A100 (40 GB) or better is recommended.
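As a rough, back-of-the-envelope illustration of why a large-memory card is suggested, the sketch below estimates the memory needed just to hold the model weights. The figure of 7 billion parameters is read off the model name, and the helper `weight_memory_gib` and the bytes-per-parameter values are assumptions made for illustration. The estimate ignores the vision encoder, activations, and the generation cache, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Return the memory needed to store the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


# Assumption: about 7e9 parameters for the language model (from the name "7B").
params = 7e9
print(f"float32: {weight_memory_gib(params, 4):.1f} GiB")
print(f"float16: {weight_memory_gib(params, 2):.1f} GiB")
```

Under these assumptions the weights alone take roughly 26 GiB in float32 and 13 GiB in float16, before any working memory is counted.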

Install the required modules

import os
# Install the latest PaddleNLP pre-release build
os.system("pip install --pre --upgrade paddlenlp -f https://www.paddlepaddle.org.cn/whl/paddlenlp.html")
# Install the PaddlePaddle develop (nightly) GPU build for CUDA 11.2
os.system("pip install paddlepaddle-gpu==0.0.0.post112 -f https://www.paddlepaddle.org.cn/whl/linux/gpu/develop.html")
os.system("pip install tqdm")
!pip install ipywidgets

Import the required modules

%%capture
import os
# Use only GPU 0; set before Paddle is imported so the setting takes effect
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
# Ask Paddle to allocate GPU memory as CUDA managed (unified) memory
os.environ["FLAGS_use_cuda_managed_memory"] = "true"
import requests
from PIL import Image
import gradio as gr
from tqdm import tqdm
import ipywidgets as widgets
from IPython.display import display
import csv
from itertools import islice
from paddlenlp.transformers import MiniGPT4ForConditionalGeneration, MiniGPT4Processor

Download the MiniGPT4 weights and configuration files

!mkdir -p minigpt4

%%capture
!wget -O ./minigpt4/model_config.json "https://bj.bcebos.com/v1/ai-studio-online/924ed883c17b4b8b88b4a1f98e24d34b3b00160ac9bd4b3ba478aff6974e0e9d?responseContentDisposition=attachment%3B%20filename%3Dmodel_config.json"
!wget -O ./minigpt4/model_state.pdparams "https://bj.bcebos.com/v1/ai-studio-online/18bd53eaa2854263ba31fb4d75f31a5f0d38421a6da64525bff6da230389fc36?responseContentDisposition=attachment%3B%20filename%3Dmodel_state.pdparams"
!wget -O ./minigpt4/generation_config.json "https://bj.bcebos.com/v1/ai-studio-online/f0b2129d6a934a97abcaa139ac1f28e33a6940004c7a4c859737f282640cf332?responseContentDisposition=attachment%3B%20filename%3Dgeneration_config.json"
!wget -O ./minigpt4/preprocessor_config.json "https://bj.bcebos.com/v1/ai-studio-online/748c332837d34f389d762f487470b1a7221edd36ccb5484b913bd2d3855ee9f6?responseContentDisposition=attachment%3B%20filename%3Dpreprocessor_config.json"
!wget -O ./minigpt4/sentencepiece.bpe.model "https://bj.bcebos.com/v1/ai-studio-online/0139a1bfcdf84058b77cea4631837340ea94f5fcc37445929a3414f05d07579b?responseContentDisposition=attachment%3B%20filename%3Dsentencepiece.bpe.model"
!wget -O ./minigpt4/special_tokens_map.json "https://bj.bcebos.com/v1/ai-studio-online/90b16a96d4f94200ab417b39dcf3bce4ddef5885625c4d0c8e70b3f659cb6993?responseContentDisposition=attachment%3B%20filename%3Dspecial_tokens_map.json"
!wget -O ./minigpt4/tokenizer.json "https://bj.bcebos.com/v1/ai-studio-online/e877a685eb86499cb87e1c4cbf85353856506d12e9a841a292e780aa4a9e188a?responseContentDisposition=attachment%3B%20filename%3Dtokenizer.json"
!wget -O ./minigpt4/tokenizer_config.json "https://bj.bcebos.com/v1/ai-studio-online/f93064db167c4075b1f86d6878cac9303fb8df418f7a42a7900785a6e188cc44?responseContentDisposition=attachment%3B%20filename%3Dtokenizer_config.json"

Sample wget output (truncated):

--2023-07-27 10:54:29--  https://bj.bcebos.com/v1/ai-studio-online/924ed883c17b4b8b88b4a1f98e24d34b3b00160ac9bd4b3ba478aff6974e0e9d?responseContentDisposition=attachment%3B%20filename%3Dmodel_config.json
Resolving bj.bcebos.com (bj.bcebos.com)... 182.61.200.195, 182.61.200.229, 2409:8c04:1001:1002:0:ff:b001:368a
Connecting to bj.bcebos.com (bj.bcebos.com)|182.61.200.195|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5628 (5.5K) [application/octet-stream]
Saving to: 'minigpt4/model_config.json'
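
Because `wget -O` creates the output file even when a download fails, it can help to confirm that every file arrived before loading the model. The following is a minimal sketch of such a check. The file names are taken from the download commands, while the helper `find_missing_files` is hypothetical and not part of the project.

```python
import os

# File names taken from the download commands.
EXPECTED_FILES = [
    "model_config.json",
    "model_state.pdparams",
    "generation_config.json",
    "preprocessor_config.json",
    "sentencepiece.bpe.model",
    "special_tokens_map.json",
    "tokenizer.json",
    "tokenizer_config.json",
]


def find_missing_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected files that are absent or empty in model_dir."""
    missing = []
    for name in expected:
        path = os.path.join(model_dir, name)
        # A failed download can leave a zero-byte file behind.
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(name)
    return missing


print(find_missing_files("./minigpt4"))
```

An empty list means all eight files are present and non-empty. It does not prove that the contents are intact.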

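Instantiate the MiniGPT4 model and processor

model_path = './minigpt4'
# Load the weights and configuration from the local directory
model = MiniGPT4ForConditionalGeneration.from_pretrained(model_path)
model.eval()  # inference mode
# The processor prepares the image and text inputs and decodes the outputs
processor = MiniGPT4Processor.from_pretrained(model_path)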

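Model inference

Input: image URL + prompt (single image, single-turn dialogue)

A variant that takes a locally uploaded image is also available; see the project for details.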

def predict_per_url_prompt(url=None, text=None):
    """Answer one question about a single image fetched from a URL."""
    if url is None:
        url = "https://paddlenlp.bj.bcebos.com/data/images/mugs.png"
    image = Image.open(requests.get(url, stream=True).raw)
    if text is None:
        text = "describe this image"

    # <ImageHere> and <TextHere> mark where the image and the question go
    prompt = "Give the following image: <Img>ImageContent</Img>. You will be able to see the image once I provide it to you. Please answer my questions.###Human: <Img><ImageHere></Img> <TextHere>###Assistant:"

    inputs = processor([image], text, prompt)

    generate_kwargs = {
        "max_length": 300,
        "num_beams": 1,
        "top_p": 1.0,
        "repetition_penalty": 1.0,
        "length_penalty": 0,
        "temperature": 1,
        "decode_strategy": "greedy_search",
        "eos_token_id": [[835], [2277, 29937]],  # token-id sequences that end generation
    }
    outputs = model.generate(**inputs, **generate_kwargs)
    msg = processor.batch_decode(outputs[0])
    # Drop the last five characters of the decoded text, as in the original project
    return msg[0][0:-5]
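
The prompt template contains two placeholders, `<ImageHere>` and `<TextHere>`. As a minimal sketch of how such a template could be filled, the code below substitutes the question for the text placeholder and splits the result around the image slot. This assumes plain string substitution, and the real `MiniGPT4Processor` may work differently. `fill_prompt` is a hypothetical helper, not part of PaddleNLP.

```python
PROMPT = (
    "Give the following image: <Img>ImageContent</Img>. "
    "You will be able to see the image once I provide it to you. "
    "Please answer my questions."
    "###Human: <Img><ImageHere></Img> <TextHere>###Assistant:"
)


def fill_prompt(prompt, text, image_tag="<ImageHere>", text_tag="<TextHere>"):
    """Substitute the question, then split the prompt around the image slot."""
    if prompt.count(image_tag) != 1 or text_tag not in prompt:
        raise ValueError("prompt must contain one image tag and a text tag")
    filled = prompt.replace(text_tag, text)
    # Split only at the first image tag, so the text segments stay intact.
    before_image, after_image = filled.split(image_tag, 1)
    return before_image, after_image


before, after = fill_prompt(PROMPT, "describe this image")
print(repr(before[-15:]))
print(repr(after))
```

In this picture, the image features would sit between the two text segments when the model input is assembled.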

Input: path of a directory of locally uploaded images + prompt (multiple images, single-turn dialogue)

def predict_dir_and_one_prompt_out_list(dir_path=None, text=None):
    """Ask the same question about every image in a directory; return a list of answers."""
    import os
    # The assertion message must be a string; the original passed print(...), which is None
    assert dir_path is not None and os.path.isdir(dir_path), \
        "Please pass a directory path, not an image file path"
    if text is None:
        text = "describe this image"

    # The prompt and generation settings are the same for every image
    prompt = "Give the following image: <Img>ImageContent</Img>. You will be able to see the image once I provide it to you. Please answer my questions.###Human: <Img><ImageHere></Img> <TextHere>###Assistant:"
    generate_kwargs = {
        "max_length": 300,
        "num_beams": 1,
        "top_p": 1.0,
        "repetition_penalty": 1.0,
        "length_penalty": 0,
        "temperature": 1,
        "decode_strategy": "greedy_search",
        "eos_token_id": [[835], [2277, 29937]],  # token-id sequences that end generation
    }

    output = []
    for per_image_name in tqdm(os.listdir(dir_path)):
        image = Image.open(os.path.join(dir_path, per_image_name))
        inputs = processor([image], text, prompt)
        outputs = model.generate(**inputs, **generate_kwargs)
        msg = processor.batch_decode(outputs[0])
        # Drop the last five characters of the decoded text, as in the original project
        output.append(msg[0][0:-5])
    return output
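
The batch function returns a plain list of descriptions. One way to keep them is to write name and description pairs to a CSV file. The sketch below does this, assuming the file names are collected in the same order in which the images were processed. `save_descriptions` and the column names are hypothetical and not taken from the project.

```python
import csv


def save_descriptions(csv_path, image_names, descriptions):
    """Write one (image name, description) row per image to a CSV file."""
    if len(image_names) != len(descriptions):
        raise ValueError("image_names and descriptions differ in length")
    # utf-8-sig helps spreadsheet tools detect the encoding of Chinese text.
    with open(csv_path, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.writer(f)
        writer.writerow(["image", "description"])
        writer.writerows(zip(image_names, descriptions))
```

Calling `os.listdir(dir_path)` once and using that same list both for the loop and for `image_names` keeps the rows aligned.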

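Results

Input: 描述这张图片,使用中文 ("Describe this image, in Chinese")

Output (the model replies in Chinese; translated here): The image shows a female character dressed in red and white, holding a golden sword. Her hair is white and her eyes are red. She stands on a patch of grass, holding the hilt of the sword. The character looks like a hero, and her clothing and equipment convey her strength and courage.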

predict_per_url_prompt(url='https://ai-studio-static-online.cdn.bcebos.com/d283b05404bd44b69b9be868fddb67616296858284bf4ad587e29432de66e930', text="描述这张图片,使用中文")
'这张图片显示了一个女性角色,穿着红色和白色的服装,手持一根金色的剑。她的头发是白色的,眼睛是红色的。她站在一张草地上,手持剑的柄子。这个角色看起来像是一个英雄,她的服装和装备显示出她的力量和勇气'
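
The prediction functions trim the decoded text with a fixed slice, `[0:-5]`. A more defensive variant cuts the text at the first separator instead of removing a fixed number of characters. The sketch below does this on the assumption that generation stops at the `###` separator used in the prompt template. `trim_at_stop` is a hypothetical helper, not part of the project.

```python
def trim_at_stop(decoded, stop="###"):
    """Return the text before the first stop marker, stripped of whitespace."""
    head, _, _ = decoded.partition(stop)
    return head.strip()
```

If the marker is absent, the whole string is returned with surrounding whitespace removed, so a short answer is never truncated by accident.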

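For more ways to experiment, fork the project with one click and fine-tune the model.

Click the link below to try more large-model applications right away.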

https://aistudio.baidu.com/aistudio/application/center
