Fixing the "token count exceeds CLIP's maximum length" error when loading Stable Diffusion models with diffusers

1. Stable Diffusion 1.5

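When a diffusion model from Hugging Face is loaded with diffusers, a long prompt keeps triggering a report that the input exceeds CLIP's maximum length limit (77 tokens for the CLIP text encoder).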
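The limit comes from CLIP's text encoder, which has a fixed context of 77 positions; two of them are taken by the start and end markers, leaving 75 for the prompt itself. Long-prompt tools typically work around this by encoding the prompt in 75-token windows and concatenating the results. A minimal sketch of the windowing step, assuming made-up token IDs (`chunk_token_ids` is a hypothetical helper, not part of diffusers or compel):

```python
BOS, EOS = 49406, 49407   # start/end token IDs of the CLIP tokenizer
MAX_LEN = 77              # CLIP's fixed context length
CONTENT = MAX_LEN - 2     # positions left for prompt tokens


def chunk_token_ids(ids, content=CONTENT):
    """Split token IDs into windows, wrap each in BOS/EOS, pad with EOS."""
    chunks = []
    # max(len(ids), 1) guarantees one (empty) window for an empty prompt
    for start in range(0, max(len(ids), 1), content):
        window = [BOS] + ids[start:start + content] + [EOS]
        window += [EOS] * (content + 2 - len(window))  # pad the last window
        chunks.append(window)
    return chunks


ids = list(range(1000, 1180))  # 180 hypothetical prompt tokens
chunks = chunk_token_ids(ids)
print(len(chunks), [len(c) for c in chunks])
```

Each window can then be passed through the text encoder separately, so no single call exceeds the 77-position limit.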

Solution: use the compel library to build the prompt embeddings yourself and pass them to the pipeline.

python
import torch
from compel import Compel
from diffusers import AutoPipelineForText2Image

device = torch.device("cuda:3")

# Base model
model_path = "/data1/zhikun.zhao/huggingface_test/hubd/stable-diffusion-v1-5"
pipeline = AutoPipelineForText2Image.from_pretrained(
    model_path, torch_dtype=torch.float32
).to(device)

# Load the LoRA adapter
pipeline.load_lora_weights(
    "/data1/zhikun.zhao/huggingface_test/hubd/adapter/c_adapt1",
    weight_name="zhenshi.safetensors",
    adapter_name="zhenshi",
)

# Fix the seed so results are reproducible
generator = torch.Generator(device).manual_seed(31)

prompt = "score_7_up, realhuman, photo_\\(medium\\), (dreamy, haze:1.2), (shot on GoPro hero:1.3), instagram, ultra-realistic, high quality, high resolution, RAW photo, 8k, 4k, soft shadows, artistic, shy, bashful, innocent, interior, dramatic, dynamic composition, 18yo woman, medium shot, closeup, petite 18-year-old woman, (hazel eyes,lip piercing,long silver straight hairs,Layered Curls cut, effect ,Sad expression, Downturned mouth, drooping eyelids, furrowed brows:0.8), wearing a figure-hugging dress with a plunging neckline and lace details, paired with black opaque tights pantyhose and knee-high leather boots, The look is bold and daring, perfect for a night out, detailed interior space, "
negative_prompt = "score_1, skinny, slim, ribs, abs, 2girls, piercings, bimbo breasts, professional, bokeh, blurry, text"

# truncate_long_prompts=False: encode the prompt in chunks rather than cutting it at CLIP's limit
compel = Compel(
    tokenizer=pipeline.tokenizer,
    text_encoder=pipeline.text_encoder,
    truncate_long_prompts=False,
)
conditioning = compel.build_conditioning_tensor(prompt)
# compel.build_conditioning_tensor(text) and compel(text) are interchangeable
negative_conditioning = compel.build_conditioning_tensor(negative_prompt)
[conditioning, negative_conditioning] = compel.pad_conditioning_tensors_to_same_length(
    [conditioning, negative_conditioning]
)

out = pipeline(
    prompt_embeds=conditioning,
    negative_prompt_embeds=negative_conditioning,
    num_images_per_prompt=1, generator=generator,
    num_inference_steps=50,  # 50 steps is usually enough
    height=1024, width=1024,
    guidance_scale=7,  # prompt adherence: higher follows the text more closely, but too high degrades quality
)
image = out.images[0]
image.save("img/test.png")
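The call to `pad_conditioning_tensors_to_same_length` matters because the positive and negative embeddings must have the same sequence length before the pipeline can combine them for guidance, and a long prompt usually ends up longer than a short negative prompt. The following is a toy sketch of that padding step, using plain Python lists in place of tensors; `pad_to_same_length` and the filler row are hypothetical and do not reproduce compel's internals.

```python
def pad_to_same_length(sequences, pad_row):
    """Right-pad every sequence of embedding rows to the longest length."""
    target = max(len(seq) for seq in sequences)
    return [seq + [pad_row] * (target - len(seq)) for seq in sequences]


EMBED_DIM = 4  # toy width; the SD 1.5 text encoder uses 768

# A long prompt encoded as two 77-token chunks versus a short negative prompt.
positive = [[1.0] * EMBED_DIM for _ in range(154)]
negative = [[-1.0] * EMBED_DIM for _ in range(77)]
pad_row = [0.0] * EMBED_DIM  # stand-in for the embedding of an empty prompt

positive, negative = pad_to_same_length([positive, negative], pad_row)
print(len(positive), len(negative))
```

After padding, both sequences have the same number of rows, which is the property the pipeline relies on.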

2. Stable Diffusion XL 1.0

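With the SDXL 1.0 model, the approach above fails with a message that pooled_prompt_embeds must be supplied whenever prompt_embeds is passed.

Modify the pipeline call as follows. Here conditioning and negative_conditioning are each indexed as an (embeddings, pooled embeddings) pair, which is what compel returns when it is constructed for SDXL's two text encoders with requires_pooled set:

python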
out = pipeline(
    prompt_embeds=conditioning[0], pooled_prompt_embeds=conditioning[1],
    negative_prompt_embeds=negative_conditioning[0],
    negative_pooled_prompt_embeds=negative_conditioning[1],
    num_images_per_prompt=1, generator=generator,
    num_inference_steps=50,  # 50 steps is usually enough
    height=1024, width=768,
    guidance_scale=3,  # prompt adherence: higher follows the text more closely, but too high degrades quality
)
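The snippet above does not show how `conditioning` and `negative_conditioning` are built for SDXL. One way to obtain them, sketched here as an assumption based on compel's documented SDXL usage rather than the author's actual code, is to give compel both tokenizers and both text encoders and request the pooled output from the second one. `build_sdxl_embeddings` is a hypothetical helper name, and the exact behavior may depend on the installed compel version.

```python
def build_sdxl_embeddings(pipeline, prompt, negative_prompt):
    """Return (embeds, pooled, negative_embeds, negative_pooled) for SDXL."""
    # Imported lazily so the helper can be defined without compel installed.
    from compel import Compel, ReturnedEmbeddingsType

    compel = Compel(
        tokenizer=[pipeline.tokenizer, pipeline.tokenizer_2],
        text_encoder=[pipeline.text_encoder, pipeline.text_encoder_2],
        returned_embeddings_type=(
            ReturnedEmbeddingsType.PENULTIMATE_HIDDEN_STATES_NON_NORMALIZED
        ),
        requires_pooled=[False, True],  # only the second encoder is pooled
        truncate_long_prompts=False,    # keep tokens beyond CLIP's limit
    )
    # With requires_pooled set, calling compel returns (embeddings, pooled).
    conditioning = compel(prompt)
    negative_conditioning = compel(negative_prompt)
    # Only the per-token embeddings vary in length, so only they are padded.
    embeds, negative_embeds = compel.pad_conditioning_tensors_to_same_length(
        [conditioning[0], negative_conditioning[0]]
    )
    return embeds, conditioning[1], negative_embeds, negative_conditioning[1]
```

The four returned tensors map, in order, onto `prompt_embeds`, `pooled_prompt_embeds`, `negative_prompt_embeds`, and `negative_pooled_prompt_embeds` in the pipeline call.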