文章目录

SD
前期预备
一些惊喜
- TorchHijackForUnet
[Txt2Img 搭配 Lora 使用](#Txt2Img 搭配 Lora 使用)
- [单独运行 txt2img.py](#单独运行 txt2img.py)
- 获取所有资源
- - 代码地址
  - 参数
  - [sd model](#sd model)
- 主程序
- - 代码地址
  - 参数(同上)
  - 模型Inference
  - - LORA应用
    - 重构并使用LORA模型
    - [用Lora重构后的网络做 sampler](#用Lora重构后的网络做 sampler)
    - 后处理

以下内容是最近的学习笔记，如果有不对的地方，还望同志们指出~共勉

SD

前期预备

深入Stable diffusion时，可以不按照官方指导来。官方指导对于AIGC爱好者比较的友好~.

可以选择Anaconda 按照之前AI传统，安装环境（需要从代码包里找到所有需要安装的库，有点麻烦，但是能用），也可以直接运行webui.sh安装虚拟的python环境（个人推荐VENV，非常省事，删掉VENV重新安装，科学上网后耗时只在半小时内）。

安装完环境后，可以设置IDE的python环境，然后debug 'launch.py', voila, 你可以开始各种探索了。

一些惊喜

TorchHijackForUnet

CondFunc('modules.models.diffusion.ddpm_edit.LatentDiffusion.decode_first_stage', first_stage_sub, first_stage_cond)

用于dynamicly import modules.

Txt2Img 搭配 Lora 使用

单独运行 txt2img.py

可以自行改写以下(放在txt2img.py的最后)：

python 复制代码

if __name__ == "__main__":
    txt2img(id_task= 'task(lt4vr6pvvx26gfm)', 
            prompt= 'pandas', 
            negative_prompt= 'cats', 
            prompt_styles=[], 
            steps= 20, 
            sampler_index= 0, 
            restore_faces= False, 
            tiling= False, 
            n_iter= 1, #(对应GUI上的Batch count)
            batch_size= 1, 
            cfg_scale= 7, 
            seed= -1, 
            subseed= -1, 
            subseed_strength= 0, 
            seed_resize_from_h= 0, 
            seed_resize_from_w= 0, 
            seed_enable_extras= False, 
            height= 512, 
            width= 512, 
            enable_hr= False, 
            denoising_strength= 0.7, 
            hr_scale= 2, 
            hr_upscaler= 'Latent', 
            hr_second_pass_steps= 0, 
            hr_resize_x= 0, 
            hr_resize_y= 0, 
            hr_sampler_index= 0, 
            hr_prompt= '', 
            hr_negative_prompt='', 
            override_settings_texts=[], 
    )

获取所有资源

代码地址

modules\txt2img.py

参数

是一个Object processing.StableDiffusionProcessingTxt2Img

参数地址： Line 871 和 Line 105

modules\processing.py

python 复制代码

p = processing.StableDiffusionProcessingTxt2Img(
        sd_model=shared.sd_model,
        outpath_samples=opts.outdir_samples or opts.outdir_txt2img_samples,
        outpath_grids=opts.outdir_grids or opts.outdir_txt2img_grids,
        prompt=prompt,
        styles=prompt_styles,
        negative_prompt=negative_prompt,
        seed=seed,
        subseed=subseed,
        subseed_strength=subseed_strength,
        seed_resize_from_h=seed_resize_from_h,
        seed_resize_from_w=seed_resize_from_w,
        seed_enable_extras=seed_enable_extras,
        sampler_name=sd_samplers.samplers[sampler_index].name,
        batch_size=batch_size,
        n_iter=n_iter,
        steps=steps,
        cfg_scale=cfg_scale,
        width=width,
        height=height,
        restore_faces=restore_faces,
        tiling=tiling,
        enable_hr=enable_hr,
        denoising_strength=denoising_strength if enable_hr else None,
        hr_scale=hr_scale,
        hr_upscaler=hr_upscaler,
        hr_second_pass_steps=hr_second_pass_steps,
        hr_resize_x=hr_resize_x,
        hr_resize_y=hr_resize_y,
        hr_sampler_name=sd_samplers.samplers_for_img2img[hr_sampler_index - 1].name if hr_sampler_index != 0 else None,
        hr_prompt=hr_prompt,
        hr_negative_prompt=hr_negative_prompt,
        override_settings=override_settings,
    )

sd model

除了：

python 复制代码

unet = p.sd_model.model.diffusion_model
text_encoder = p.sd_model.cond_stage_model

sd model还包含：

first_stage_model
configure_shareded_model

主程序

process_images()
- process_images_inner()

代码地址

modules\processing.py

参数(同上)

是一个Object processing.StableDiffusionProcessingTxt2Img

参数地址： Line 871 和 Line 105

modules\processing.py

模型Inference

LORA应用

地址：Line 186

extensions\sd-webui-additional-networks\scripts\additional_networks.py

Line 236 - Line 251:

加载Lora模型到latest_networks 这个list中
通过 du_state_dict = load_file(model_path)导入LORA模型每层权重（类似于一个ZIP里包含每一个layer的权重文件，然后通过文件名的index对应到每一层的名字上（单独一个name list））
network, info = lora_compvis.create_network_and_apply_compvis(du_state_dict, weight_tenc, weight_unet, text_encoder, unet) 将权重和LORA模型进行结合：

dimension: {96}, alpha: {48.0}, multiplier_unet: 1, multiplier_tenc: 1
create LoRA for Text Encoder: 72 modules.
create LoRA for U-Net: 192 modules.
original forward/weights is backed up.
enable LoRA for text encoder
enable LoRA for U-Net
shapes for 0 weights are converted.

重构并使用LORA模型

地址：

extensions\sd-webui-additional-networks\scripts\lora_compvis.py

靠 model里是否包含"MultiheadAttention"判断LORA的版本。有：V2，没有：V1

modules 里的dim 和 alpha 都是传入的一个list，可以潜在不唯一（如果满足网络结构）

python 复制代码

comp_vis_loras_dim_alpha[comp_vis_lora_name] = (dim, alpha)

把SD与lora结合就需要根据几点去筛选出Unet和CLIP里有用的那几层，条件：

选择：Linear or Conv2d相关的
- 不选 _resblocks_23_, 即StabilityAi Text Encoder最后一个block
- 不选不在comp_vis_loras_dim_alpha这个dict的keys里的
选择：MultiheadAttention相关的
- 在这个范围内的每一层，都会通过加入 ["q_proj", "k_proj", "v_proj", "out_proj"] 这些suffix 来构建关于 Q，K， V， O 的重复项
- 不选 _resblocks_23_, 即StabilityAi Text Encoder最后一个block
- 不选不在comp_vis_loras_dim_alpha这个dict的keys里的

以上条件都满足的会通过：

python 复制代码

lora = LoRAModule(lora_name, child_module, multiplier, dim, alpha)

重构针对Lora使用的Unet 或者 CLIP的每一层/block，形成新的Unet & CLIP，也因此，我们的SD model 融入好了我们自己的LORA

用Lora重构后的网络做 sampler

地址：

modules\processing.py

通过 def sample() 来运行：

组装sampler
设定latent scale mode
create_random_tensors
- noise的shape =(4, 64, 64)
- noise 被 slerp(subseed_strength, noise, subnoise)做interpolation的平滑优化
  
  samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x)

后处理

地址：

modules\processing.py

通过以下代码得到最终模型输出得结果：

python 复制代码

with devices.without_autocast() if devices.unet_needs_upcast else devices.autocast():
                samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)

从Line 741 开始进入后处理得阶段（根据用户需要进行处理），以下为一个例子：

改颜色范围以及调换shape(从（3, 512, 512）==> (512, 512, 3)).

python 复制代码

x_sample = 255. * np.moveaxis(x_sample.cpu().numpy(), 0, 2)

Debug Stable Diffusion webui

文章目录

SD

前期预备

一些惊喜

TorchHijackForUnet

Txt2Img 搭配 Lora 使用

单独运行 txt2img.py

获取所有资源

代码地址

参数

sd model

主程序

代码地址

参数(同上)

模型Inference

LORA应用

重构并使用LORA模型

用Lora重构后的网络 做 sampler

后处理

用Lora重构后的网络做 sampler