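Introduction
Lumina-Next-T2I is a cutting-edge image generation model that builds on the success of Lumina-T2I. It pairs a 2B-parameter Next-DiT with a Gemma-2B text encoder, giving faster inference, richer generation styles, and improved multilingual support.
Model architecture
The generative model in Lumina-Next-T2I is built on the Next-DiT backbone, the text encoder is the Gemma 2B model, and the VAE is the SDXL version fine-tuned by stabilityai.
- Generative model: Next-DiT
- Text encoder: Gemma-2B
- VAE: sdxl-vae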
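To get a rough feel for what these model sizes mean in practice, the sketch below estimates the memory needed just to hold the weights. It assumes the nominal "2B" figures taken from the model names and 2 bytes per parameter (bfloat16, the dtype used in the inference snippet later in this post). The function name `weight_memory_gib` is made up for illustration, and the estimate ignores the VAE, activations, and framework overhead, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for model weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


# Nominal sizes read off the model names (assumption); actual parameter
# counts may differ somewhat from these round numbers.
components = {
    "Next-DiT (2B)": 2e9,
    "Gemma-2B text encoder": 2e9,
}

total = 0.0
for name, n_params in components.items():
    gib = weight_memory_gib(n_params)  # 2 bytes per parameter for bfloat16
    total += gib
    print(f"{name}: ~{gib:.2f} GiB")
print(f"Total weights: ~{total:.2f} GiB")
```

Passing `bytes_per_param=4` gives the float32 figure, which is exactly twice as large.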
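News and updates
- May 12, 2024: the Lumina-Next-T2I model was released, offering faster image generation with lower memory usage.
Installation
- Create a conda environment and install PyTorch. Note: you may need to adjust the CUDA version to match your driver version.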
bash
conda create -n Lumina_T2X -y
conda activate Lumina_T2X
conda install python=3.11 pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia -y
- Install dependencies
bash
pip install diffusers huggingface_hub
pip install flash-attn --no-build-isolation
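Before moving on, it can help to confirm that the packages installed above are actually visible to the active environment. The following is a minimal sketch using only the standard library. The helper name `missing_packages` is hypothetical, and the list of import names (for example `flash_attn` for the `flash-attn` package) is an assumption based on the install commands above.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names assumed to correspond to the conda/pip installs above.
required = ["torch", "torchvision", "diffusers", "huggingface_hub", "flash_attn"]

missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

This only checks that each module can be located; it does not verify versions or that CUDA is usable.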
- Inference with Diffusers
python
from diffusers import LuminaText2ImgPipeline
import torch
pipeline = LuminaText2ImgPipeline.from_pretrained("/path/to/ckpt/Lumina-Next-SFT-diffusers", torch_dtype=torch.bfloat16).to("cuda")
# or you can download the model using code directly
# pipeline = LuminaText2ImgPipeline.from_pretrained("Alpha-VLLM/Lumina-Next-SFT-diffusers", torch_dtype=torch.bfloat16).to("cuda")
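# Generate one image from a text prompt (the two adjacent string literals are concatenated)
image = pipeline(prompt="Upper body of a young woman in a Victorian-era outfit with brass goggles and leather straps. "
                 "Background shows an industrial revolution cityscape with smoky skies and tall, metal structures").images[0]
# Save the result; the output is a PIL image by default, and the filename here is arbitrary
image.save("lumina_demo.png")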
Sample results
A winter landscape with a frozen lake, snow-covered pine trees, and a small cabin with smoke coming out of the chimney.
An astronaut standing on a moonlit alien planet, with purple mountains and two large moons in the sky.
A rustic farmhouse kitchen with a wooden table, a bowl of fresh apples, and a cat curled up on a chair.
This is the Lumina output, and I wanted to show it because it was cartoony