PyTorch | Adversarial Attack on a CIFAR10 ResNet Classifier with PC-I-FGSM

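Earlier posts trained several classifiers on CIFAR10:
PyTorch | Building AlexNet from Scratch to Classify CIFAR10
PyTorch | Building Vgg from Scratch to Classify CIFAR10
PyTorch | Building GoogleNet from Scratch to Classify CIFAR10
PyTorch | Building ResNet from Scratch to Classify CIFAR10
PyTorch | Building MobileNet from Scratch to Classify CIFAR10
PyTorch | Building EfficientNet from Scratch to Classify CIFAR10
PyTorch | Building ParNet from Scratch to Classify CIFAR10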

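They also implemented a number of attack algorithms:
PyTorch | Adversarial Attack on a CIFAR10 ResNet Classifier with FGSM
PyTorch | Adversarial Attack on a CIFAR10 ResNet Classifier with BIM/I-FGSM
PyTorch | Adversarial Attack on a CIFAR10 ResNet Classifier with MI-FGSM
PyTorch | Adversarial Attack on a CIFAR10 ResNet Classifier with NI-FGSM
PyTorch | Adversarial Attack on a CIFAR10 ResNet Classifier with PI-FGSM
PyTorch | Adversarial Attack on a CIFAR10 ResNet Classifier with VMI-FGSM
PyTorch | Adversarial Attack on a CIFAR10 ResNet Classifier with VNI-FGSM
PyTorch | Adversarial Attack on a CIFAR10 ResNet Classifier with EMI-FGSM
PyTorch | Adversarial Attack on a CIFAR10 ResNet Classifier with AI-FGTM
PyTorch | Adversarial Attack on a CIFAR10 ResNet Classifier with I-FGSSM
PyTorch | Adversarial Attack on a CIFAR10 ResNet Classifier with SMI-FGRM
PyTorch | Adversarial Attack on a CIFAR10 ResNet Classifier with VA-I-FGSM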

In this post we implement PC-I-FGSM in PyTorch and use it to attack a ResNet classifier trained on CIFAR10.

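The CIFAR Dataset

CIFAR-10 is a widely used image recognition dataset, collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton and named after the Canadian Institute for Advanced Research (CIFAR). Its basic properties are:

  • Size: the dataset contains 60,000 color images in 10 classes, with 6,000 images per class. By convention, 50,000 images form the training set and 10,000 form the test set used to evaluate the model.
  • Image size: every image is 32×32 pixels. The small size keeps training and inference fast, but the low resolution also makes classification harder.
  • Classes: airplane (plane), automobile (car), bird, cat, deer, dog, frog, horse, ship, and truck. All ten are common real-world objects, which makes the dataset reasonably representative.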
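
The class balance described above can be checked directly from the label array. The following is a minimal sketch, assuming the labels are available as a 1-D integer tensor (with torchvision this could be built as `torch.tensor(trainset.targets)`). The helper name `class_counts` is made up for illustration, and the demo runs on synthetic labels rather than the real dataset:

```python
import torch

# Class names in label-index order (0..9), using the short names from the list above.
CIFAR10_CLASSES = ("plane", "car", "bird", "cat", "deer",
                   "dog", "frog", "horse", "ship", "truck")


def class_counts(labels, num_classes=10):
    """Return a {class_name: count} dict for a 1-D tensor of integer labels."""
    counts = torch.bincount(labels, minlength=num_classes)
    return {name: int(n) for name, n in zip(CIFAR10_CLASSES, counts)}


# Synthetic stand-in for the real labels: 3 samples of every class.
demo_labels = torch.arange(10).repeat(3)
demo_counts = class_counts(demo_labels)
print(demo_counts)
```

On the real training labels, each class would be expected to show 5,000 samples (6,000 per class minus the 1,000 per class held out for testing).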

Some sample images: (figure not included)

Introduction to PC-I-FGSM

PC-I-FGSM (Prediction-Correction Iterative Fast Gradient Sign Method) consists of two stages, prediction and correction, and generates adversarial examples over multiple iterations. The procedure is as follows:

Algorithm

  • Initialization:
    • Parameters
      • Given an original sample $x$ with true label $y$, a classifier $F$, and its loss function $L$.
      • Set the perturbation budget $\epsilon$, the number of predictions $K$, and the number of iterations $T$, and compute the step size $\alpha = \epsilon / T$.
    • Variables
      • Initialize the predicted example $x_{0}^{pre} = x$.
      • Initialize the accumulated prediction gradient $G_{0}^{pre} = 0$.
  • Prediction stage (core step):
    • Run $K$ prediction loops ($k = 0$ to $K - 1$):
      • Update the predicted example: $x_{k+1}^{pre} = \mathrm{clip}_{x}^{\epsilon}\{x_{k}^{pre} + \alpha \cdot \mathrm{sign}(\nabla_{x_{k}^{pre}} L(x_{k}^{pre}, y))\}$
        Here $\mathrm{clip}_{x}^{\epsilon}$ keeps the generated example inside the $\epsilon$-ball around the original image, and $\nabla_{x_{k}^{pre}} L(x_{k}^{pre}, y)$ is the gradient of the loss at the predicted example $x_{k}^{pre}$.
      • Compute the gradient at the new predicted example: $\nabla_{x_{k+1}^{pre}} L(x_{k+1}^{pre}, y)$.
      • Accumulate the prediction gradient: $G_{k+1}^{pre} = G_{k}^{pre} + \frac{\nabla_{x_{k+1}^{pre}} L(x_{k+1}^{pre}, y)}{\left\|\nabla_{x_{k+1}^{pre}} L(x_{k+1}^{pre}, y)\right\|_{1}}$
  • Correction stage (core step):
    • Compute the corrected gradient: $G = \frac{\nabla_{x} L(x, y)}{\left\|\nabla_{x} L(x, y)\right\|_{1}} + G_{K}^{pre}$. Here $\nabla_{x} L(x, y)$ is the gradient at the original sample $x$, and $G_{K}^{pre}$ is the sum of the normalized prediction gradients after $K$ predictions.
    • Generate the adversarial example: $x^{adv} = \mathrm{clip}_{x}^{\epsilon}\{x + \epsilon \cdot \mathrm{sign}(G)\}$
  • Iterative update:
    • Run $T$ iterations ($t = 0$ to $T - 1$):
      • Take the current adversarial example $x^{adv}$ as the new original sample $x$ and repeat the prediction and correction stages to obtain the new adversarial example $x_{t+1}^{adv}$.
  • Output:
    • Return the adversarial example $x^{adv}$ obtained after $T$ iterations.
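
Two operations recur throughout the steps above: the per-sample L1 normalization of a gradient and the projection $\mathrm{clip}_{x}^{\epsilon}$ onto the $\epsilon$-ball. The sketch below shows both on a tiny hand-written tensor so the arithmetic is easy to follow. The helper names `l1_normalize` and `clip_to_ball` and the `tiny` safeguard are illustrative assumptions, not names taken from the PC-I-FGSM paper:

```python
import torch


def l1_normalize(grad, tiny=1e-12):
    """Divide each sample's gradient by its L1 norm (sum of absolute values)."""
    norm = grad.abs().sum(dim=(1, 2, 3), keepdim=True)
    return grad / (norm + tiny)  # `tiny` avoids division by zero (an added safeguard)


def clip_to_ball(x_adv, x, epsilon):
    """Project x_adv onto the L-infinity ball of radius epsilon around x."""
    return torch.clamp(x_adv, x - epsilon, x + epsilon)


# Toy 1x1x2x2 "image" and a hand-written gradient, just to show the arithmetic.
x = torch.zeros(1, 1, 2, 2)
grad = torch.tensor([[[[1.0, -3.0], [0.0, 4.0]]]])

g = l1_normalize(grad)                      # entries: 1/8, -3/8, 0, 4/8
step = x + 0.3 * grad.sign()                # moves by 0.3 per pixel, 0 where grad is 0
x_pre = clip_to_ball(step, x, epsilon=0.1)  # pulled back to at most 0.1 from x
```

The normalization puts gradients from different points on a comparable scale before they are summed, so no single predicted example dominates $G$ just because its raw gradient is larger.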

PC-I-FGSM Code Implementation

PC-I-FGSM Algorithm Implementation

python
import torch
import torch.nn as nn


def _l1_normalize(grad, tiny=1e-12):
    # Per-sample L1 normalization; `tiny` guards against division by zero
    return grad / (grad.abs().sum(dim=(1, 2, 3), keepdim=True) + tiny)


def _input_gradient(model, criterion, images, labels):
    # Gradient of the loss with respect to the input images
    images = images.clone().detach().requires_grad_(True)
    loss = criterion(model(images), labels)
    return torch.autograd.grad(loss, images)[0]


def PC_I_FGSM(model, criterion, original_images, labels, epsilon, num_iterations=10, num_predictions=1):
    """
    PC-I-FGSM (Prediction-Correction Iterative Fast Gradient Sign Method)

    Args:
    - model: the model to attack
    - criterion: loss function
    - original_images: original images
    - labels: labels of the original images
    - epsilon: maximum perturbation size
    - num_iterations: number of iterations
    - num_predictions: number of prediction steps
    """
    alpha = epsilon / num_iterations
    original_images = original_images.clone().detach()
    perturbed_images = original_images.clone()

    # Normalized gradient at the original images, used in the correction stage
    original_gradient = _l1_normalize(_input_gradient(model, criterion, original_images, labels))

    for _ in range(num_iterations):
        # Prediction stage: start from the current adversarial examples
        accumulated_predicted_gradients = torch.zeros_like(original_images)
        predicted_images = perturbed_images
        data_grad = _input_gradient(model, criterion, predicted_images, labels)
        for _ in range(num_predictions):
            # Move to the next predicted example (prediction step)
            predicted_images = predicted_images + alpha * data_grad.sign()
            predicted_images = torch.clamp(predicted_images, original_images - epsilon, original_images + epsilon).detach()
            # Accumulate the normalized gradient at the new predicted example
            data_grad = _input_gradient(model, criterion, predicted_images, labels)
            accumulated_predicted_gradients += _l1_normalize(data_grad)

        # Correction stage
        corrected_gradient = original_gradient + accumulated_predicted_gradients
        # Update the adversarial examples (correction step)
        perturbed_images = original_images + epsilon * corrected_gradient.sign()
        perturbed_images = torch.clamp(perturbed_images, original_images - epsilon, original_images + epsilon)
        perturbed_images = perturbed_images.detach()

    return perturbed_images
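
Whatever the attack, its output should stay within ε of the clean images in the L∞ norm. The sketch below checks this for any attack that uses the call signature from this post. The names `linf_distance` and `fgsm_stand_in` are illustrative, the linear toy model stands in for ResNet18, and the one-step FGSM is only a placeholder where `PC_I_FGSM` could be passed instead:

```python
import torch
import torch.nn as nn


def linf_distance(adv_images, clean_images):
    """Largest absolute per-pixel change for each sample in the batch."""
    return (adv_images - clean_images).abs().flatten(1).max(dim=1).values


def fgsm_stand_in(model, criterion, images, labels, epsilon):
    """One-step FGSM, used here only as a placeholder attack."""
    images = images.clone().detach().requires_grad_(True)
    loss = criterion(model(images), labels)
    grad = torch.autograd.grad(loss, images)[0]
    return (images + epsilon * grad.sign()).detach()


torch.manual_seed(0)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # not a ResNet
toy_images = torch.rand(4, 3, 32, 32)
toy_labels = torch.randint(0, 10, (4,))
epsilon = 16 / 255

adv = fgsm_stand_in(toy_model, nn.CrossEntropyLoss(), toy_images, toy_labels, epsilon)
max_change = linf_distance(adv, toy_images)
print(max_change)  # every entry should be at most epsilon, up to float rounding
```

A value above ε here would point to a missing or misplaced clamp in the attack.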

Attack Results

Full Code

pcifgsm.py

This file contains the PC_I_FGSM function exactly as listed in the algorithm implementation section above.

train.py

python
import os

import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
from models import ResNet18


# Data preprocessing
transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))
])

transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))
])

# Load the CIFAR10 training and test sets (download=False expects the data to be under ./data already)
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=False, transform=transform_train)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=128, shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=False, transform=transform_test)
testloader = torch.utils.data.DataLoader(testset, batch_size=100, shuffle=False, num_workers=2)

# Select the device (GPU or CPU)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Initialize the model
model = ResNet18(num_classes=10)
model.to(device)

# Define the loss function and the optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

if __name__ == "__main__":
    # Train the model
    num_epochs = 10  # adjust the number of epochs as needed
    model.train()
    for epoch in range(num_epochs):
        running_loss = 0.0
        for i, data in enumerate(trainloader, 0):
            inputs, labels = data[0].to(device), data[1].to(device)

            optimizer.zero_grad()

            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            running_loss += loss.item()
            if i % 100 == 99:
                print(f'Epoch {epoch + 1}, Batch {i + 1}: Loss = {running_loss / 100}')
                running_loss = 0.0

    # Save the final weights once training is done (create the folder if it is missing)
    os.makedirs('weights', exist_ok=True)
    torch.save(model.state_dict(), f'weights/epoch_{num_epochs}.pth')
    print('Finished Training')
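
Because `transforms.Normalize` is applied before the images reach the model, an attack that perturbs those tensors measures ε in normalized units. With a per-channel std of about 0.2, a change of 16/255 on the normalized tensor corresponds to roughly 3/255 in raw pixel values. One common way to make ε refer to raw [0, 1] pixels is to move the normalization inside the model. The following is a minimal sketch of that idea, assuming the data pipeline would then use only `ToTensor()`. The wrapper class `NormalizedModel` is hypothetical and not part of the original code, and a linear toy backbone stands in for ResNet18:

```python
import torch
import torch.nn as nn


class NormalizedModel(nn.Module):
    """Wrap a classifier so that it accepts raw [0, 1] images."""

    def __init__(self, model, mean, std):
        super().__init__()
        self.model = model
        # Buffers follow the module across devices but are not trained.
        self.register_buffer("mean", torch.tensor(mean).view(1, -1, 1, 1))
        self.register_buffer("std", torch.tensor(std).view(1, -1, 1, 1))

    def forward(self, x):
        return self.model((x - self.mean) / self.std)


# Toy backbone standing in for ResNet18, with the statistics from train.py.
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
wrapped = NormalizedModel(backbone,
                          mean=(0.4914, 0.4822, 0.4465),
                          std=(0.2023, 0.1994, 0.2010))

raw = torch.rand(2, 3, 32, 32)  # pixels in [0, 1], like ToTensor() output
logits = wrapped(raw)
```

If the weights were saved from the bare network, they would need to be loaded into the inner model before wrapping, since wrapping changes the `state_dict` key names. An attack run on raw pixels would usually also clamp its output to [0, 1].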

advtest.py

python
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
from models import *
from attacks import *
import ssl
import os
from PIL import Image
import matplotlib.pyplot as plt

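# Disables HTTPS certificate verification globally. It is only relevant when the dataset
# is downloaded (download=True) and can be removed when the data is already on disk.
ssl._create_default_https_context = ssl._create_unverified_context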

# Data preprocessing (same normalization statistics as transform_test in train.py)
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))])

# Load the CIFAR10 test set
testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=False, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=128,
                                         shuffle=False, num_workers=2)

# Select the device (GPU if available)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = ResNet18(num_classes=10).to(device)

criterion = nn.CrossEntropyLoss()

# Load the model weights
weights_path = "weights/epoch_10.pth"
model.load_state_dict(torch.load(weights_path, map_location=device))


if __name__ == "__main__":
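    # Run the selected attack on the test set and evaluate the accuracy
    model.eval()  # switch to evaluation mode
    correct = 0
    total = 0
    epsilon = 16 / 255  # perturbation strength, measured on the normalized inputs; adjust as needed
    attack_name = 'PC-I-FGSM'  # the attack to run
    for data in testloader:
        original_images, labels = data[0].to(device), data[1].to(device)
        original_images.requires_grad = True
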
        if attack_name == 'FGSM':
            perturbed_images = FGSM(model, criterion, original_images, labels, epsilon)
        elif attack_name == 'BIM':
            perturbed_images = BIM(model, criterion, original_images, labels, epsilon)
        elif attack_name == 'MI-FGSM':
            perturbed_images = MI_FGSM(model, criterion, original_images, labels, epsilon)
        elif attack_name == 'NI-FGSM':
            perturbed_images = NI_FGSM(model, criterion, original_images, labels, epsilon)
        elif attack_name == 'PI-FGSM':
            perturbed_images = PI_FGSM(model, criterion, original_images, labels, epsilon)
        elif attack_name == 'VMI-FGSM':
            perturbed_images = VMI_FGSM(model, criterion, original_images, labels, epsilon)
        elif attack_name == 'VNI-FGSM':
            perturbed_images = VNI_FGSM(model, criterion, original_images, labels, epsilon)
        elif attack_name == 'EMI-FGSM':
            perturbed_images = EMI_FGSM(model, criterion, original_images, labels, epsilon)
        elif attack_name == 'AI-FGTM':
            perturbed_images = AI_FGTM(model, criterion, original_images, labels, epsilon)
        elif attack_name == 'I-FGSSM':
            perturbed_images = I_FGSSM(model, criterion, original_images, labels, epsilon)
        elif attack_name == 'SMI-FGRM':
            perturbed_images = SMI_FGRM(model, criterion, original_images, labels, epsilon)
        elif attack_name == 'VA-I-FGSM':
            perturbed_images = VA_I_FGSM(model, criterion, original_images, labels, epsilon)
        elif attack_name == 'PC-I-FGSM':
            perturbed_images = PC_I_FGSM(model, criterion, original_images, labels, epsilon)
        
        perturbed_outputs = model(perturbed_images)
        _, predicted = torch.max(perturbed_outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    accuracy = 100 * correct / total
    # Attack success rate, taken here as 100 minus the accuracy on the adversarial examples
    ASR = 100 - accuracy
    print(f'Load ResNet Model Weight from {weights_path}')
    print(f'epsilon: {epsilon:.4f}')
    print(f'ASR of {attack_name} : {ASR :.2f}%')
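
The script reports ASR as 100 minus the adversarial accuracy, which also counts samples the model already misclassified before any perturbation. A stricter variant, sketched here as an assumption rather than as the metric used in this post, counts only samples that were classified correctly on the clean input. The function name `attack_success_rate` and the toy logits are illustrative:

```python
import torch


def attack_success_rate(clean_logits, adv_logits, labels):
    """Percentage of originally correct samples that the attack flips."""
    clean_correct = clean_logits.argmax(dim=1) == labels
    adv_wrong = adv_logits.argmax(dim=1) != labels
    num_correct = int(clean_correct.sum())
    if num_correct == 0:
        return 0.0  # nothing to attack
    return 100.0 * int((clean_correct & adv_wrong).sum()) / num_correct


# Four toy samples with two classes; every true label is class 0.
labels = torch.tensor([0, 0, 0, 0])
clean_logits = torch.tensor([[2.0, 0.0], [2.0, 0.0], [2.0, 0.0], [0.0, 2.0]])  # last one is already wrong
adv_logits = torch.tensor([[0.0, 2.0], [2.0, 0.0], [0.0, 2.0], [0.0, 2.0]])

asr = attack_success_rate(clean_logits, adv_logits, labels)
naive = 100.0 * float((adv_logits.argmax(dim=1) != labels).float().mean())
print(f"{asr:.2f} {naive:.2f}")  # prints "66.67 75.00"
```

In the toy case the attack flips two of the three originally correct samples, while the naive figure is inflated by the sample that was wrong from the start.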