[Tongyi Lab] Open-Source Text-to-Image Model

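Text-to-image results

The prompt is a line of classical Chinese poetry: 孤帆远影碧空尽,唯见长江天际流。 ("The lone sail's distant shadow fades into the blue sky; only the Yangtze can be seen flowing to the horizon.") The images were generated from it in different styles.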

Model address

Chinese StableDiffusion - General Domain

Initialize the pipeline

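```python
import torch
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# Use half precision on GPU and full precision on CPU
task = Tasks.text_to_image_synthesis
model_id = 'damo/multi-modal_chinese_stable_diffusion_v1.0'
pipe = pipeline(task=task, model=model_id, torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32)
```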

Generate an image

```python
import cv2
from PIL import Image

# Negative prompt: content the model should steer away from
negative_prompt = (
    "blood, gore, violence, murder, kill, dead, corpse, "
    "horrible, frightening, scary, monster, ghost, skeleton, zombie, "
    "sex, nudity, pornography, adult, erotic, mature, "
    "drugs, alcohol, smoking, tobacco, illegal, "
    "dark, night, storm, thunder, lightning, apocalypse, disaster, "
    "gun, knife, sword, bomb, explosion, firearm, "
    "mean, angry, sadistic, hostile, aggressive, bullying, "
    "dangerous, unsafe, hazardous, poison, toxic, pollution"
)

output = pipe(
    {
        # Poem line followed by the style keyword 中国画 (Chinese painting)
        'text': '孤帆远影碧空尽,唯见长江天际流。中国画',
        'num_inference_steps': 120,
        'guidance_scale': 11,
        'negative_prompt': negative_prompt,
    }
)

# The output is an OpenCV-style NumPy array (BGR), so cv2 can write it directly
img = output['output_imgs'][0]
cv2.imwrite('result1.png', img)

# Alternatively, reverse the channel order (BGR -> RGB) and save with PIL
img = Image.fromarray(img[:, :, ::-1])
img.save('result1.png')
```
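
The example above selects a style by appending a keyword (中国画, "Chinese painting") to the poem. A minimal sketch of building one prompt per style is shown here. Apart from 中国画, the style keywords and the `build_styled_prompts` helper are illustrative assumptions, not something taken from the model's documentation.

```python
def build_styled_prompts(poem, styles):
    """Return one prompt per style by appending the style keyword to the poem."""
    return {style: f"{poem}{style}" for style in styles}


poem = '孤帆远影碧空尽,唯见长江天际流。'
# 中国画 comes from the example above; 水彩画 and 油画 are assumed keywords.
styles = ['中国画', '水彩画', '油画']
prompts = build_styled_prompts(poem, styles)

for index, (style, prompt) in enumerate(prompts.items()):
    print(index, style, '->', prompt)
    # With the pipeline initialized, each prompt could be rendered like this:
    # output = pipe({'text': prompt, 'num_inference_steps': 120, 'guidance_scale': 11})
    # cv2.imwrite(f'result_{index}.png', output['output_imgs'][0])
```

Whether a given keyword produces a recognizable style depends on the model, so it is worth comparing the outputs by eye.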

Full code for wrapping the model as an HTTP API

```python
from flask import Flask, request, send_file
import io
import torch
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
from PIL import Image

app = Flask(__name__)

# Initialize the pipeline once at startup (float16 on GPU, float32 on CPU)
task = Tasks.text_to_image_synthesis
model_id = 'damo/multi-modal_chinese_stable_diffusion_v1.0'
pipe = pipeline(task=task, model=model_id, torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32)

# Negative prompt: content the model should steer away from
NEGATIVE_PROMPT = (
    "blood, gore, violence, murder, kill, dead, corpse, "
    "horrible, frightening, scary, monster, ghost, skeleton, zombie, "
    "sex, nudity, pornography, adult, erotic, mature, "
    "drugs, alcohol, smoking, tobacco, illegal, "
    "dark, night, storm, thunder, lightning, apocalypse, disaster, "
    "gun, knife, sword, bomb, explosion, firearm, "
    "mean, angry, sadistic, hostile, aggressive, bullying, "
    "dangerous, unsafe, hazardous, poison, toxic, pollution"
)


@app.route('/generate', methods=['POST'])
def generate_image():
    # Tolerate a missing or non-JSON body instead of raising
    data = request.get_json(silent=True) or {}
    text = data.get('text', '')
    guidance_scale = data.get('guidance_scale', 9)

    if not text:
        return {'error': 'No text provided'}, 400

    output = pipe(
        {
            'text': text,
            'num_inference_steps': 120,
            'guidance_scale': guidance_scale,
            'negative_prompt': NEGATIVE_PROMPT,
        }
    )

    # The pipeline returns BGR arrays; reverse the channels to get RGB for PIL
    img = output['output_imgs'][0]
    img = Image.fromarray(img[:, :, ::-1])

    # Serialize the image to an in-memory PNG
    img_byte_arr = io.BytesIO()
    img.save(img_byte_arr, format='PNG')
    img_byte_arr.seek(0)

    return send_file(img_byte_arr, mimetype='image/png')


if __name__ == '__main__':
    app.run(debug=False, host='0.0.0.0', port=5000)
```
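
The endpoint passes `guidance_scale` from the request straight to the pipeline. A minimal sketch of validating the payload first might look like the following. The `parse_generate_payload` helper and the 1 to 20 range are assumptions made for illustration, not limits documented for the model.

```python
def parse_generate_payload(data, default_scale=9.0, min_scale=1.0, max_scale=20.0):
    """Validate a /generate payload and return (text, guidance_scale).

    Raises ValueError with a readable message when the payload is unusable.
    The scale bounds are illustrative assumptions, not model limits.
    """
    if not isinstance(data, dict):
        raise ValueError('JSON object expected')

    text = data.get('text', '')
    if not isinstance(text, str) or not text.strip():
        raise ValueError('No text provided')

    raw_scale = data.get('guidance_scale', default_scale)
    try:
        scale = float(raw_scale)
    except (TypeError, ValueError):
        raise ValueError('guidance_scale must be a number') from None

    # Clamp into the assumed range instead of rejecting the request
    scale = max(min_scale, min(max_scale, scale))
    return text.strip(), scale
```

Inside `generate_image`, a `ValueError` raised by this helper could be caught and turned into a 400 response carrying the message.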

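Run the code in a Python environment

The first run downloads the model files, which takes a while. Once the server starts successfully, it prints the following: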

```text
 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:5000
 * Running on http://10.10.10.132:5000
```

Test with Postman
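
As a scripted alternative to Postman, the same request can be sent from Python. The sketch below uses only the standard library and assumes the server from the previous section is reachable at `http://127.0.0.1:5000`. The helper names and the timeout value are illustrative.

```python
import json
import urllib.request


def build_generate_request(text, guidance_scale=9, base_url='http://127.0.0.1:5000'):
    """Build a POST request for the /generate endpoint with a JSON body."""
    body = json.dumps({'text': text, 'guidance_scale': guidance_scale}).encode('utf-8')
    return urllib.request.Request(
        f'{base_url}/generate',
        data=body,
        headers={'Content-Type': 'application/json'},
        method='POST',
    )


def save_generated_image(text, path, **kwargs):
    """Send the request and write the returned PNG bytes to `path`."""
    req = build_generate_request(text, **kwargs)
    # Generation can be slow, so allow a generous timeout (assumed value)
    with urllib.request.urlopen(req, timeout=600) as response:
        with open(path, 'wb') as f:
            f.write(response.read())
```

With the server running, `save_generated_image('孤帆远影碧空尽,唯见长江天际流。中国画', 'result.png', guidance_scale=11)` would write the returned image to `result.png`.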

Source code download link
