Segmentation & Masks & Mattes
BRIA RMBG 1.4 is for non-commercial use.
ComfyUI-Inspyrenet-Rembg
Project: github.com/plemeri/InS...
Plugin: github.com/john-mnz/Co...
- Superior rembg quality compared to other methods (just give it a try!)
- Can take a batch of images as input
- Optimized for image batches, making it the fastest rembg node (perfect for video frames)
- Outputs both the image and the corresponding mask
- Shows progress in the terminal
Model download
```python
import os

class Remover:
    def __init__(self, mode="base", jit=False, device=None, ckpt=None, fast=None):
        # ckpt (str, optional): model checkpoint path; if not specified, use a previously
        # downloaded checkpoint or try to download one.
        ckpt_dir, ckpt_name = os.path.split(os.path.abspath(ckpt))
```
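For reference, a minimal usage sketch of the transparent-background package that the snippet above comes from; the file names are placeholders, and the return type of process() has varied between versions, so both cases are handled:

```python
from PIL import Image
from transparent_background import Remover

remover = Remover(mode="base")               # "fast" selects the smaller checkpoint
img = Image.open("input.jpg").convert("RGB")
out = remover.process(img, type="rgba")      # foreground with transparent background

if hasattr(out, "save"):                     # recent versions return a PIL image
    out.save("output.png")
else:                                        # older versions return a numpy array
    Image.fromarray(out).save("output.png")
```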
```shell
# path where the checkpoint is auto-downloaded
~\.transparent-background
```
```yaml
base:
  url: "https://github.com/plemeri/transparent-background/releases/download/1.2.12/ckpt_base.pth"
  md5: "d692e3dd5fa1b9658949d452bebf1cda"
  ckpt_name: "ckpt_base.pth"
  http_proxy: NULL
  base_size: [1024, 1024]
fast:
  url: "https://github.com/plemeri/transparent-background/releases/download/1.2.12/ckpt_fast.pth"
  md5: "9efdbfbcc49b79ef0f7891c83d2fd52f"
  ckpt_name: "ckpt_fast.pth"
  http_proxy: NULL
  base_size: [384, 384]
base-nightly:
  url: "https://github.com/plemeri/transparent-background/releases/download/1.2.12/ckpt_base_nightly.pth"
  md5: NULL
  ckpt_name: "ckpt_base_nightly.pth"
  http_proxy: NULL
  base_size: [1024, 1024]
```
ZHO-ZHO-ZHO/ComfyUI-BiRefNet-ZHO
Project: github.com/ZhengPeng7/...
Model: huggingface.co/ZhengPeng7/...
Plugin: github.com/ZHO-ZHO-ZHO...
- Unofficial implementation of BiRefNet
- Differences from the viperyl/ComfyUI-BiRefNet plugin:
  - Original plugin: only outputs a plain mask, which is inconvenient to use, and cannot process video
  - This plugin: 1) separates model loading from image processing for better speed (the same approach as my earlier BRIA RMBG in ComfyUI plugin); 2) can output a transparent-background PNG directly; 3) can matte video directly
- BiRefNet model: currently the best open-source, commercially usable background removal model
- Version: V1.0 supports both image and video processing
- Download all 6 models from [BiRefNet](ViperYX/BiRefNet at main (hf-mirror.com)) into ./models/BiRefNet
- Nodes:
  - 🧹BiRefNet Model Loader: loads the BiRefNet model automatically
  - 🧹BiRefNet: removes the background
Yoloworld-ESAM/camenduru/YoloWorld-EfficientSAM:
- Unofficial implementation of YOLO-World + EfficientSAM
- Uses the new YOLO-World and EfficientSAM for efficient object detection + segmentation
- Version: V2.0 adds mask separation + extraction, so a selected mask can be output on its own; supports both images and video (the V1.0 workflow is deprecated). Project: YOLO-World + EfficientSAM & YOLO-World
Model download: hf-mirror.com/camenduru/Y...
Features
- YOLO-World model loader | 🔎Yoloworld Model Loader
  - Supports the 3 official models yolo_world/l, yolo_world/m, yolo_world/s; they are downloaded and loaded automatically
- EfficientSAM model loader | 🔎ESAM Model Loader
  - Supports CUDA or CPU
- 🆕Detection + segmentation | 🔎Yoloworld ESAM
  - yolo_world_model: YOLO-World model input
  - esam_model: EfficientSAM model input
  - image: input image
  - categories: what to detect + segment
  - confidence_threshold: confidence threshold. Lowering it reduces missed detections and makes the model more sensitive to the desired objects; raising it minimizes false positives and keeps the model from identifying objects it should not
  - iou_threshold: IoU threshold. Lowering it reduces overlap between bounding boxes and makes detection stricter; raising it allows more boxes to overlap, covering a wider range of detections
  - box_thickness: detection-box line thickness
  - text_thickness: label text thickness
  - text_scale: label text scale
  - with_confidence: whether to show the confidence of detected objects
  - with_class_agnostic_nms: whether to suppress overlapping bounding boxes across classes (class-agnostic NMS)
  - with_segmentation: whether to run EfficientSAM for instance segmentation
  - mask_combined: whether to combine (stack) the masks; if "yes", all masks are merged into a single output image, otherwise every mask is output separately
  - mask_extracted: whether to extract the selected mask; if "yes", only the mask at mask_extracted_index is output
  - mask_extracted_index: index of the mask to extract
- 🆕Detection + segmentation | 🔎Yoloworld ESAM Detector Provider (contributed by ltdrdata, thanks!)
  - Can be used together with Impact-Pack
  - yolo_world_model: YOLO-World model input
  - esam_model: EfficientSAM model input
  - categories: what to detect + segment
  - iou_threshold: IoU threshold
  - with_class_agnostic_nms: whether to suppress overlapping bounding boxes across classes
Install
- Recommended: install via ComfyUI Manager (on the way)
- Manual install:

```shell
cd custom_nodes
git clone https://github.com/ZHO-ZHO-ZHO/ComfyUI-YoloWorld-EfficientSAM
cd ComfyUI-YoloWorld-EfficientSAM
pip install -r requirements.txt
```

- Restart ComfyUI
- Model download: download efficient_sam_s_cpu.jit and efficient_sam_s_gpu.jit from EfficientSAM into custom_nodes/ComfyUI-YoloWorld-EfficientSAM (a scripted alternative is sketched below)
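If you prefer to script the download, here is a sketch using huggingface_hub; the repo id camenduru/YoloWorld-EfficientSAM is inferred from the download link above and should be verified:

```python
from huggingface_hub import hf_hub_download

# repo id inferred from the model link above; adjust if your source differs
for name in ("efficient_sam_s_cpu.jit", "efficient_sam_s_gpu.jit"):
    hf_hub_download(
        repo_id="camenduru/YoloWorld-EfficientSAM",
        filename=name,
        local_dir="custom_nodes/ComfyUI-YoloWorld-EfficientSAM",
    )
```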
Q&A
- Node fails to load during initialization: missing supervision
```shell
Traceback (most recent call last):
  File "E:\vendor\cv\stablediffusion\ComfyUI-master\ComfyUI-master\nodes.py", line 1867, in load_custom_node
    module_spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "E:\vendor\cv\stablediffusion\ComfyUI-master\ComfyUI-master\custom_nodes\ComfyUI-YoloWorld-EfficientSAM\__init__.py", line 1, in <module>
    from . import YOLO_WORLD_EfficientSAM
  File "E:\vendor\cv\stablediffusion\ComfyUI-master\ComfyUI-master\custom_nodes\ComfyUI-YoloWorld-EfficientSAM\YOLO_WORLD_EfficientSAM.py", line 6, in <module>
    import supervision as sv
ModuleNotFoundError: No module named 'supervision'
```
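This simply means the supervision dependency is missing from the ComfyUI Python environment; installing it (it is listed in the plugin's requirements.txt) should resolve the error:

```shell
pip install supervision
```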
- module 'cv2' has no attribute 'FONT_HERSHEY_SIMPLEX'
```shell
  from . import YOLO_WORLD_EfficientSAM
  File "E:\vendor\cv\stablediffusion\ComfyUI-master\ComfyUI-master\custom_nodes\ComfyUI-YoloWorld-EfficientSAM\YOLO_WORLD_EfficientSAM.py", line 6, in <module>
    import supervision as sv
  File "D:\Users\leoli\miniconda3\envs\ComfyUI\lib\site-packages\supervision\__init__.py", line 9, in <module>
    from supervision.annotators.core import (
  File "D:\Users\leoli\miniconda3\envs\ComfyUI\lib\site-packages\supervision\annotators\core.py", line 19, in <module>
    from supervision.draw.utils import draw_polygon
  File "D:\Users\leoli\miniconda3\envs\ComfyUI\lib\site-packages\supervision\draw\utils.py", line 164, in <module>
    text_font: int = cv2.FONT_HERSHEY_SIMPLEX,
AttributeError: module 'cv2' has no attribute 'FONT_HERSHEY_SIMPLEX'
Cannot import E:\vendor\cv\stablediffusion\ComfyUI-master\ComfyUI-master\custom_nodes\ComfyUI-YoloWorld-EfficientSAM module for custom nodes: module 'cv2' has no attribute 'FONT_HERSHEY_SIMPLEX'
```
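This error is usually caused by a broken or conflicting OpenCV installation (for example a mix of opencv-python and opencv-python-headless, or a stray cv2 package shadowing the real one) rather than by the plugin itself. A common remedy, offered as a suggestion rather than an official fix, is to reinstall OpenCV cleanly in the same environment:

```shell
pip uninstall -y opencv-python opencv-python-headless opencv-contrib-python
pip install opencv-python
```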
SAM: storyicon/comfyui_segment_anything
SAM project links:
- SAM: GitHub - facebookresearch/segment-anything: The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
- SAM2: GitHub - facebookresearch/segment-anything-2: The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
- GitHub - kijai/ComfyUI-segment-anything-2: ComfyUI nodes to use segment-anything-2 (model folder: models/sam2)
- GitHub - neverbiasu/ComfyUI-SAM2: A ComfyUI extension for Segment-Anything 2 (model folder: models/sams)
Dependencies
- segment_anything
- timm
Segment Anything 2 models
Checkpoints can be downloaded with the download script in the SAM 2 repository, or individually from the links in its README.
SAM 2 can then be used in a few lines for image and video prediction, for example:
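A minimal image-prediction sketch adapted from the SAM 2 README; the checkpoint path, config name, image, and point prompts are placeholders:

```python
import torch
from sam2.build_sam import build_sam2
from sam2.sam2_image_predictor import SAM2ImagePredictor

checkpoint = "./checkpoints/sam2_hiera_large.pt"   # placeholder path
model_cfg = "sam2_hiera_l.yaml"
predictor = SAM2ImagePredictor(build_sam2(model_cfg, checkpoint))

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    predictor.set_image(image)                     # HWC uint8 numpy array or PIL image
    masks, scores, _ = predictor.predict(
        point_coords=point_coords,                 # e.g. [[x, y]]
        point_labels=point_labels,                 # 1 = foreground, 0 = background
    )
```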
Segment Anything models
Model files go in the comfyUI_root/models/sams folder.
name | size | model file |
---|---|---|
sam_vit_h | 2.56GB | download link |
sam_vit_l | 1.25GB | download link |
sam_vit_b | 375MB | download link |
sam_hq_vit_h | 2.57GB | download link |
sam_hq_vit_l | 1.25GB | download link |
sam_hq_vit_b | 379MB | download link |
mobile_sam | 39MB | download link |
Text embedding model: bert-base-uncased
Place it in the models/bert-base-uncased folder located in the root directory of ComfyUI:
```shell
ComfyUI
  models
    bert-base-uncased
      config.json
      model.safetensors
      tokenizer_config.json
      tokenizer.json
      vocab.txt
```
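One way to fetch exactly those files is a short huggingface_hub sketch; the target path assumes you run it from the ComfyUI root directory:

```python
from huggingface_hub import snapshot_download

# download only the files listed above into ComfyUI/models/bert-base-uncased
snapshot_download(
    repo_id="bert-base-uncased",
    local_dir="models/bert-base-uncased",
    allow_patterns=["config.json", "model.safetensors",
                    "tokenizer_config.json", "tokenizer.json", "vocab.txt"],
)
```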
GroundingDINO
GitHub - IDEA-Research/GroundingDINO: [ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection" (see also GitHub - Curiosity-Machines/GroundingDINO). GroundingDINO produces labeled boxes for objects matching a text prompt, so that SAM can then segment exactly the prompted objects.
Model and config files go in the comfyUI_root/models/grounding-dino folder.
name | size | config file | model file |
---|---|---|---|
GroundingDINO_SwinT_OGC | 694MB | download link | download link |
GroundingDINO_SwinB | 938MB | download link | download link |
CLIPSeg
GitHub: GitHub - biegert/ComfyUI-CLIPSeg (ComfyUI CLIPSeg)
This repository contains two custom nodes for ComfyUI that utilize the CLIPSeg model to generate masks for image inpainting tasks based on text prompts.
Model download
The model is downloaded automatically from Hugging Face; check the cache path if you need the files (by default ~/.cache/huggingface/hub, or %USERPROFILE%\.cache\huggingface\hub on Windows, overridable with HF_HOME).
1. CLIPSeg
The CLIPSeg node generates a binary mask for a given input image and text prompt.
Inputs:
- image: A torch.Tensor representing the input image.
- text: A string representing the text prompt.
- blur: A float value to control the amount of Gaussian blur applied to the mask.
- threshold: A float value to control the threshold for creating the binary mask.
- dilation_factor: A float value to control the dilation of the binary mask.
Outputs:
- tensor_bw: A torch.Tensor representing the binary mask.
- image_out_hm: A torch.Tensor representing the heatmap overlay on the input image.
- image_out_bw: A torch.Tensor representing the binary mask overlay on the input image.
2. CombineSegMasks
The CombineSegMasks node combines two or optionally three masks into a single mask to improve masking of different areas.
Inputs:
- image: A torch.Tensor representing the input image.
- mask1: A torch.Tensor representing the first mask.
- mask2: A torch.Tensor representing the second mask.
- mask3 (optional): A torch.Tensor representing the third mask. Defaults to None.
Outputs:
- combined_mask: A torch.Tensor representing the combined mask.
- image_out_hm: A torch.Tensor representing the heatmap overlay of the combined mask on the input image.
- image_out_bw: A torch.Tensor representing the binary mask overlay of the combined mask on the input image.
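Conceptually, combining the masks amounts to a clamped sum (a union of the masked areas); a minimal sketch, not the node's exact code:

```python
from typing import Optional
import torch

def combine_masks(mask1: torch.Tensor, mask2: torch.Tensor,
                  mask3: Optional[torch.Tensor] = None) -> torch.Tensor:
    # union of binary masks: add them and clamp back into [0, 1]
    combined = mask1 + mask2
    if mask3 is not None:
        combined = combined + mask3
    return combined.clamp(0, 1)
```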
Q&A
kijai/ComfyUI-Florence2
Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks. Florence-2 can interpret simple text prompts to perform tasks like captioning, object detection, and segmentation. It leverages the FLD-5B dataset, containing 5.4 billion annotations across 126 million images, to master multi-task learning. The model's sequence-to-sequence architecture enables it to excel in both zero-shot and fine-tuned settings, proving to be a competitive vision foundation model.
This fork includes support for Document Visual Question Answering (DocVQA) using the Florence2 model. DocVQA allows you to ask questions about the content of document images, and the model will provide answers based on the visual and textual information in the document. This feature is particularly useful for extracting information from scanned documents, forms, receipts, and other text-heavy images.
Model download
Supports most Florence2 models, which can be downloaded automatically with the DownloadAndLoadFlorence2Model node to ComfyUI/models/LLM:
Official:
Tested finetunes:
ComfyUI workflows
Segmentation & masks: set the Florence2Run node's task to referring_expression_segmentation or another segmentation task, and it will segment the image semantically according to the text in the prompt box and output a mask.
Caption interrogation: set the Florence2Run node's task to more_detailed_caption or another captioning task, and the caption output will return a description of the image.
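Outside ComfyUI, the same task prompts can be run with transformers; a sketch based on the Florence-2 model card, where the model id and the prompt text are only examples:

```python
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Florence-2-large"
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("input.png").convert("RGB")
task = "<REFERRING_EXPRESSION_SEGMENTATION>"   # or "<MORE_DETAILED_CAPTION>"
prompt = task + "a red car"                    # extra text is ignored for pure caption tasks

inputs = processor(text=prompt, images=image, return_tensors="pt")
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=1024,
    num_beams=3,
)
raw = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
result = processor.post_process_generation(raw, task=task, image_size=image.size)
print(result)   # polygons for segmentation tasks, text for caption tasks
```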
Human-Segmentation
Inpainting & Outpainting
The masked content option specifies whether the image inside the masked area is changed before inpainting:
- Fill: replace with the average color of the masked area.
- Original: no change.
- Latent noise: random noise only.
- Latent nothing: no color or noise (an all-zero latent).
Among SD checkpoints there are inpainting models designed specifically for this task. An inpainting model differs slightly from the standard model: its UNet has 5 extra input channels, representing the mask and the masked image.
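Concretely, the inpainting UNet takes 4 + 1 + 4 = 9 input channels. A sketch of how diffusers-style pipelines assemble that input; the tensor names and shapes are illustrative:

```python
import torch

noisy_latents = torch.randn(1, 4, 64, 64)         # regular 4-channel latent being denoised
mask = torch.zeros(1, 1, 64, 64)                   # 1 = repaint, 0 = keep (resized to latent size)
masked_image_latents = torch.randn(1, 4, 64, 64)   # VAE encoding of the image with the hole blanked out

# the 5 extra channels (mask + masked-image latent) are concatenated onto the noisy latent
unet_input = torch.cat([noisy_latents, mask, masked_image_latents], dim=1)
print(unet_input.shape)   # torch.Size([1, 9, 64, 64])
```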
VAE Encode (for Inpainting)
Use the built-in ComfyUI node latent->inpaint->VAEEncodeForInpaint (VAEEncodeForInpaint - Salt AI Docs (getsalt.ai)): the masked part of the image is encoded as an empty latent, and the mask is attached to the latent. Because the masked region is an empty latent, generation there is not influenced by the original image, which makes it suitable for painting content unrelated to the original.
This node is designed for encoding images into a latent representation suitable for inpainting tasks, incorporating additional preprocessing steps to adjust the input image and mask for optimal encoding by the VAE model.
Set Latent Noise Mask
Use the built-in ComfyUI node latent->inpaint->SetLatentNoiseMask (SetLatentNoiseMask - Salt AI Docs (getsalt.ai)): working in latent space, the masked part of the image keeps the original image's latent, and the mask is attached. Because the masked region still contains the original latent, generation there is influenced by the original image, which makes it suitable for fine adjustments to the original.
This node is designed to apply a noise mask to a set of latent samples. It modifies the input samples by integrating a specified mask, thereby altering their noise characteristics.
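A rough sketch of the difference between the two nodes, paraphrased from ComfyUI's behavior rather than copied from its source (mask growing/feathering and size rounding are omitted; vae stands in for ComfyUI's VAE object):

```python
import torch

def vae_encode_for_inpaint(vae, pixels, mask):
    # blank out the masked pixels before encoding, so the latent carries no
    # information about the original content inside the mask
    m = mask.unsqueeze(-1)                       # (B,H,W) -> (B,H,W,1) to broadcast over RGB
    blanked = pixels * (1.0 - m) + 0.5 * m       # masked area becomes neutral gray
    return {"samples": vae.encode(blanked), "noise_mask": mask}

def set_latent_noise_mask(latent, mask):
    # keep the original latent untouched and only attach the mask,
    # so the sampler still "sees" the original image inside the masked area
    out = dict(latent)
    out["noise_mask"] = mask
    return out
```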
VAE Encode (for Inpainting) & noisy latent
A workflow based on VAE Encode (for Inpainting) in which a Noisy Latent Image node generates noise that is injected into the masked area, so the otherwise empty latent inside the mask is filled with noise. Very random and hard to control.
Fill
Controlnet-inpaint
Use the ControlNet inpaint model with its inpaint preprocessor to set the mask; the mask information is carried in through the conditioning (CLIP) channel. Because the masked region still corresponds to the original image's latent, generation there is influenced by the original image, which makes it suitable for fine adjustments to the original.
nullquant/ComfyUI-BrushNet
Custom nodes for ComfyUI that allow inpainting with BrushNet: "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion".
Plugin: GitHub - nullquant/ComfyUI-BrushNet (ComfyUI BrushNet nodes)
Checkpoints of BrushNet can be downloaded from here.
The checkpoint in segmentation_mask_brushnet_ckpt provides checkpoints trained on BrushData, which has a segmentation prior (masks have the same shape as objects). The random_mask_brushnet_ckpt provides a more general checkpoint for random mask shapes.
segmentation_mask_brushnet_ckpt and random_mask_brushnet_ckpt contain BrushNet for SD 1.5 models, while segmentation_mask_brushnet_ckpt_sdxl_v0 and random_mask_brushnet_ckpt_sdxl_v0 are for SDXL.
You should place the diffusion_pytorch_model.safetensors files in your models/inpaint folder. You can also specify the inpaint folder in your extra_model_paths.yaml.
For PowerPaint you should download three files. Both diffusion_pytorch_model.safetensors and pytorch_model.bin from here should be placed in your models/inpaint folder.
You also need the SD 1.5 text encoder model model.safetensors. You can take it from here or from another place; the fp16 version also works. It should be placed in your models/clip folder.
This is the structure of my models/inpaint folder:
Yours can be different.
Model download
Quark netdisk share (quark.cn): place the model.fp16.safetensors model in the clip folder under models. This is an SD 1.5 CLIP text encoder; as the file size suggests, it is actually the same model as clip_l, just under a different name.
The node currently loads three kinds of models:
- The first is the PowerPaint model: as shown in the figure, PowerPaint must be paired with an SD 1.5 checkpoint and the CLIP model that PowerPaint requires. Its function is to remove whatever the mask covers in the image.
- The second is the BrushNet model with the random_mask checkpoint. In my experiments I used an SDXL checkpoint. With random_mask, the inpainted image does not strictly follow the mask contents; this checkpoint is mainly intended for inpainting with arbitrary, randomly shaped masks.
- The third is BrushNet with the segmentation checkpoint. The generated image strictly follows the masked region. However, the BrushNet paper notes that the segmentation checkpoint may introduce inaccuracies from interpolation, because resizing the mask to match the latent space can introduce errors; in other words, errors can creep in when the mask information is resized and passed through the VAE encoding.
1. BrushNet Loader node
This node loads the BrushNet model so it can be used in the rest of the image-processing workflow. Once loaded, BrushNet's capabilities are available for the detailed editing and enhancement tasks that follow.
2. BrushNet node
This node applies the BrushNet model to the image for tasks such as denoising, repair, and enhancement. By configuring and using the BrushNet model, high-quality results can be achieved. Use cases:
- Image denoising: remove noise from an image with BrushNet to improve quality.
- Image repair: fix flaws and damaged regions with BrushNet.
- Image enhancement: enhance detail and visual appeal to make the image clearer and more attractive.
- Automated processing: use BrushNet for efficient, accurate processing inside automated pipelines.
Using the BrushNet node brings the BrushNet model into the workflow efficiently and improves the precision and quality of the results.
3. Blend Inpaint node
This node focuses on repairing and filling images. Using advanced inpainting algorithms, it can fill in missing parts of an image or blend new content seamlessly into an existing image. Use cases:
- Image repair: fix damaged or missing parts of an image.
- Image filling: blend new content seamlessly into an existing image.
- Automated processing: apply repair and filling algorithms efficiently inside automated pipelines.
Using the Blend Inpaint node enables efficient repair and filling within the workflow and improves the precision and quality of the results.
4. Cut For Inpaint node
This node prepares image data for the subsequent repair and filling steps. By cutting out and processing a specific region of the image, it produces the image patch to repair or fill together with its corresponding mask. Use cases:
- Repair preparation: prepare the image region to be repaired or filled, providing suitable data for the downstream nodes.
- Filling preparation: prepare the region to be filled when new content must be blended seamlessly into an existing image.
- Automated processing: cut and process image regions efficiently inside automated pipelines.
Using the Cut For Inpaint node prepares image regions efficiently and provides suitable input for the subsequent repair and filling tasks, improving precision and quality for complex image-processing needs.
5. PowerPaint node
This node targets complex repair and enhancement tasks and uses advanced image-processing algorithms for high-quality edits. Its function parameter has five options: text guided, shape guided, object removal, context aware, and image outpainting, which respectively mean "guide with text", "guide with a shape", "remove an object", "fill from the surrounding context", and "extend the image". Use cases:
- Image repair: fix damaged or missing parts of an image.
- Image enhancement: enhance detail and visual appeal to make the image clearer and more attractive.
- Image filling: blend new content seamlessly into an existing image.
- Automated processing: apply repair and enhancement algorithms efficiently inside automated pipelines.
The PowerPaint node is a powerful editing tool designed for complex repair and enhancement tasks, offering advanced algorithms and flexible configuration for high-precision edits.
MimicBrush
Zero-shot Image Editing with Reference Imitation (The University of Hong Kong | Alibaba Group | Ant Group)
ComfyUI plugin: github.com
Download Checkpoints
Download SD-1.5 and SD-1.5-inpainting checkpoint:
- You could download them from HuggingFace stable-diffusion-v1-5 and stable-diffusion-inpainting
- However, the repos above contain many models that are not needed; a clean version is provided at cleansd
Download MimicBrush checkpoint, along with a VAE, a CLIP encoder, and a depth model
- Download the weights on ModelScope [xichen/MimicBrush] (www.modelscope.cn/models/xich...)
- The model is big because it contains two U-Nets.
You could use the following code to download them from modelscope
```python
from modelscope.hub.snapshot_download import snapshot_download as ms_snapshot_download

sd_dir = ms_snapshot_download('xichen/cleansd', cache_dir='./modelscope')
print('=== Pretrained SD weights downloaded ===')
model_dir = ms_snapshot_download('xichen/MimicBrush', cache_dir='./modelscope')
print('=== MimicBrush weights downloaded ===')
```
or from Hugging Face:
```python
from huggingface_hub import snapshot_download

snapshot_download(repo_id="xichenhku/cleansd", local_dir="./cleansd")
print('=== Pretrained SD weights downloaded ===')
snapshot_download(repo_id="xichenhku/MimicBrush", local_dir="./MimicBrush")
print('=== MimicBrush weights downloaded ===')
```
lllyasviel/Fooocus
Project: GitHub - lllyasviel/Fooocus (Focus on prompting and generating)
- Fooocus Inpaint: lllyasviel/Fooocus
- LaMa: advimman/lama
- MAT: fenglinglwb/MAT
- LaMa/MAT implementation: chaiNNer-org/spandrel
Nodes for better inpainting with ComfyUI: Fooocus inpaint model for SDXL, LaMa, MAT, and various other tools for pre-filling inpaint & outpaint areas.
Adds two nodes which allow using Fooocus inpaint model. It's a small and flexible patch which can be applied to your SDXL checkpoints and will transform them into an inpaint model. This model can then be used like other inpaint models to seamlessly fill and expand areas in an image.
Models
Fooocus Inpaint
Download models from lllyasviel/fooocus_inpaint to ComfyUI/models/inpaint.
Inpaint Models (LaMA, MAT)
This runs a small, fast inpaint model on the masked area. Models can be loaded with Load Inpaint Model and are applied with the Inpaint (using Model) node. This works well for outpainting or object removal.
The following inpaint models are supported; place them in ComfyUI/models/inpaint:
ComfyUI workflows
Fooocus Inpaint
Make sure to use the regular version of a checkpoint to create an inpaint model - distilled merges (Turbo, Lightning, Hyper) do not work.
Inpaint Conditioning
Fooocus inpaint can be used with ComfyUI's VAE Encode (for Inpainting) directly. However, this does not allow keeping existing content in the masked area; denoise strength must be 1.0.
InpaintModelConditioning can be used to combine inpaint models with existing content. The resulting latent can however not be used directly to patch the model using Apply Fooocus Inpaint. This repository adds a new node VAE Encode & Inpaint Conditioning which provides two outputs: latent_inpaint (connect this to Apply Fooocus Inpaint) and latent_samples (connect this to KSampler).
It's the same as using both VAE Encode (for Inpainting) and InpaintModelConditioning, but less overhead because it avoids VAE-encoding the image twice. Example workflow