《Towards Black-Box Membership Inference Attack for Diffusion Models》论文笔记

《Towards Black-Box Membership Inference Attack for Diffusion Models》

Abstract

  1. 识别艺术品是否用于训练扩散模型的挑战,重点是人工智能生成的艺术品中的成员推断攻击------copyright protection
  2. 不需要访问内部模型组件的新型黑盒攻击方法
  3. 展示了在评估 DALL-E 生成的数据集方面的卓越性能。

作者主张

previous methods are not yet ready for copyright protection in diffusion models.

Contributions(文章里有三点,我觉得只有两点)

  1. ReDiffuse:using the model's variation API to alter an image and compare it with the original one.
  2. A new MIA evaluation dataset:use the image titles from LAION-5B as prompts for DALL-E's API [31] to generate images of the same contents but different styles.

Algorithm Design

target model:DDIM

为什么要强行引入一个版权保护的概念???

定义black-box variation API

x ^ = V θ ( x , t ) \hat{x}=V_{\theta}(x,t) x^=Vθ(x,t)

细节如下:

总结为: x x x加噪变为 x t x_t xt,再通过DDIM连续降噪变为 x ^ \hat{x} x^

intuition

Our key intuition comes from the reverse SDE dynamics in continuous diffusion models.

one simplified form of the reverse SDE (i.e., the denoise step)
X t = ( X t / 2 − ∇ x log ⁡ p ( X t ) ) + d W t , t ∈ [ 0 , T ] (3) X_t=(X_t/2-\nabla_x\log p(X_t))+dW_t,t\in[0,T]\tag{3} Xt=(Xt/2−∇xlogp(Xt))+dWt,t∈[0,T](3)

The key guarantee is that when the score function is learned for a data point x, then the reconstructed image x ^ i \hat{x}_i x^i is an unbiased estimator of x x x.(算是过拟合的另一种说法吧)

Hence,averaging over multiple independent samples x ^ i \hat{x}_i x^i would greatly reduce the estimation error (see Theorem 1).

On the other hand, for a non-member image x ′ x' x′, the unbiasedness of the denoised image is not guaranteed.

details of algorithm:

  1. independently apply the black-box variation API n times with our target image x as input
  2. average the output images
  3. compare the average result x ^ \hat{x} x^ with the original image.

evaluate the difference between the images using an indicator function:
f ( x ) = 1 [ D ( x , x ^ ) < τ ] f(x)=1[D(x,\hat{x})<\tau] f(x)=1[D(x,x^)<τ]

A sample is classified to be in the training set if D ( x , x ^ ) D(x,\hat{x}) D(x,x^) is smaller than a threshold τ \tau τ ( D ( x , x ^ ) D(x,\hat{x}) D(x,x^) represents the difference between the two images)

ReDiffuse
Theoretical Analysis

什么是sampling interval???

MIA on Latent Diffusion Models

泛化到latent diffusion model,即Stable Diffusion

ReDiffuse+

variation API for stable diffusion is different from DDIM, as it includes the encoder-decoder process.
z = E n c o d e r ( x ) , z t = α ‾ t z + 1 − α ‾ t ϵ , z ^ = Φ θ ( z t , 0 ) , x ^ = D e c o d e r ( z ^ ) (4) z={\rm Encoder}(x),\quad z_t=\sqrt{\overline{\alpha}_t}z+\sqrt{1-\overline{\alpha}t}\epsilon,\quad \hat{z}=\Phi{\theta}(z_t,0),\quad \hat{x}={\rm Decoder}(\hat{z})\tag{4} z=Encoder(x),zt=αt z+1−αt ϵ,z^=Φθ(zt,0),x^=Decoder(z^)(4)
modification of the algorithm

independently adding random noise to the original image twice and then comparing the differences between the two restored images x ^ 1 \hat{x}_1 x^1 and x ^ 2 \hat{x}_2 x^2:
f ( x ) = 1 [ D ( x ^ 1 , x ^ 2 ) < τ ] f(x)=1[D(\hat{x}_1,\hat{x}_2)<\tau] f(x)=1[D(x^1,x^2)<τ]

Experiments

Evaluation Metrics
  1. AUC
  2. ASR
  3. TPR@1%FPR
same experiment's setup in previous papers [5, 18].
target model DDIM Stable Diffusion
version 《Are diffusion models vulnerable to membership inference attacks?》 original:stable diffusion-v1-5 provided by Huggingface
dataset CIFAR10/100,STL10-Unlabeled,Tiny-Imagenet member set:LAION-5B,corresponding 500 images from LAION-5;non-member set:COCO2017-val,500 images from DALL-E3
T 1000 1000
k 100 10
baseline methods [5]Are diffusion models vulnerable to membership inference attacks?: SecMIA [18]An efficient membership inference attack for the diffusion model by proximal initialization. [28]Membership inference attacks against diffusion models
publication International Conference on Machine Learning arXiv preprint 2023 IEEE Security and Privacy Workshops (SPW)
Ablation Studies
  1. The impact of average numbers
  2. The impact of diffusion steps
  3. The impact of sampling intervals
相关推荐
墨染 殇雪1 天前
文件上传漏洞基础及挖掘流程
网络安全·漏洞分析·漏洞挖掘·安全机制
网络安全大学堂1 天前
【网络安全入门基础教程】网络安全零基础学习方向及需要掌握的技能
网络·学习·安全·web安全·网络安全·黑客
君君思密达1 天前
网络安全初级-渗透测试
安全·网络安全
CV-杨帆1 天前
论文阅读:ACL 2022 Beyond Goldfish Memory∗: Long-Term Open-Domain Conversation
论文阅读
网安INF1 天前
【论文阅读】-《Besting the Black-Box: Barrier Zones for Adversarial Example Defense》
人工智能·深度学习·网络安全·黑盒攻击
有点不太正常1 天前
《A Study of Probabilistic Password Models》(IEEE S&P 2014)——论文阅读
论文阅读·密码学·口令猜测·马尔可夫链
Arong-tina1 天前
【论文阅读—深度学习处理表格数据】ResNet-like & FT Transformer
论文阅读·深度学习·transformer
张较瘦_1 天前
[论文阅读] 软件工程 | REST API模糊测试的“标准化革命”——WFC与WFD如何破解行业三大痛点
论文阅读
Vizio<1 天前
《D (R,O) Grasp:跨机械手灵巧抓取的机器人 - 物体交互统一表示》论文解读
论文阅读·人工智能·机器人
lingggggaaaa1 天前
小迪安全v2023学习笔记(七十七讲)—— 业务设计篇&隐私合规检测&重定向漏洞&资源拒绝服务
笔记·学习·安全·web安全·网络安全