[FL]Adversarial Machine Learning (1)

Reading Alleviates Anxiety [Simba的阅读障碍治疗计划:#1]

Reading Notes for (NIST AI 100-2e2023)[https://csrc.nist.gov/pubs/ai/100/2/e2023/final\].

Adversarial Machine Learning

A Taxonomy and Terminology of Attacks and Mitigations

Section 1: Introduction

  • There are two broad classes of AI systems, based on their capabilities: Predictive AI (PredAI) and Generative AI (GenAI).

  • However, despite the signifcant progress that AI and machine learning (ML) have made in a number of different application domains, these technologies are also vulnerable to attacks that can cause spectacular failures with dire consequences.

  • Unlike cryptography, there are no information-theoretic security proofs for the widely used machine learning algorithms. Moreover, information-theoretic impossibility results have started to appear in the literature [102, 116] that set limits on the effectiveness of widely-used mitigation techniques. As a result, many of the advances in developing mitigations against different classes of attacks tend to be empirical and limited in nature.

Section 2: Pridictive AI Taxonomy

  • The attacker's objectives are shown as disjointed circles with the attacker's goal at the center of each circle: Availability breakdown , Integrity violations , and Privacy compromise.
  • Machine learning involves a TRAINING STAGE, in which a model is learned, and a DEPLOYMENT STAGE, in which the model is deployed on new, unlabeled data samples to generate predictions.

  • Adversarial machine learning literature predominantly considers adversarial attacks against AI systems that could occur at either the training stage or the ML deployment stage. During the ML training stage, the attacker might control part of the training data, their labels, the model parameters, or the code of ML algorithms, resulting in different types of poisoning attacks. During the ML deployment stage, the ML model is already trained, and the adversary could mount evasion attacks to create integrity violations and change the ML model's predictions, as well as privacy attacks to infer sensitive information about the training data or the ML model.

  • Training-time attacks. Poisoning Attack. Data poisoning attacks are applicable to all learning paradigms, while model poisoning attacks are most prevalent in federated learning, where clients send local model updates to the aggregating server, and in supply-chain attacks where malicious code may be added to the model by suppliers of model technology.

  • Deployment-time attack. Adversarial Example.

  • Attacker Goals and Objectives. Availability breakdown, Integrity violations, and Privacy compromise.

  • Privacy Compromise. Attackers might be interested in learning information about the training data (resulting in DATA PRIVACY attacks) or about the ML model (resulting in MODEL PRIVACY attacks). The attacker could have different objectives for compromising the privacy of training data, such as DATA RECONSTRUCTION (inferring content or features of training data), MEMBERSHIP-INFERENCE ATTACKS (inferring the presence of data in the training set), data EXTRACTION (ability to extract training data from generative models), and PROPERTY INFERENCE (inferring properties about the training data distribution). MODEL EXTRACTION is a model privacy attack in which attackers aim to extract information about the model.

  • Attacker Capabilities. Training Data Control. Model Control. Testing Data Control. Label Limit. Source Code Control. Query Access.

  • Attacker Knowledge. White-box attacks. These assume that the attacker operates with full knowledge about the ML system, including the training data, model architecture, and model hyper-parameters. Black-box attacks. These attacks assume minimal knowledge about the ML system. An adversary might get query access to the model, but they have no other information about how the model is trained. Gray-box attacks. There are a range of gray-box attacks that capture adversarial knowledge between black-box and white-box attacks. Suciu et al. introduced a framework to classify gray-box attacks. An attacker might know the model architecture but not its parameters, or the attacker might know the model and its parameters but not the training data.

  • Data Modality: Image. Text. Audio. Video. Cybersecurity. Tabular Data.

  • Recently, the use of ML models trained on multimodal data has gained traction, particularly the combination of image and text data modalities. Several papers have shown that multimodal models may provide some resilience against attacks, but other papers show that multimodal models themselves could be vulnerable to attacks mounted on all modalities at the same time.

  • An interesting open challenge is to test and characterize the resilience of a variety of multimodal ML against evasion, poisoning, and privacy attacks.

  • Evasion Attacks and Mitigations. Methods for creating adversarial examples in black-box settings include zeroth-order optimization, discrete optimization, and Bayesian optimization, as well as transferability, which involves the white-box generation of adversarial examples on a different model architecture before transferring them to the target model.

  • The most promising directions for mitigating the critical threat of evasion attacks are adversarial training (iteratively generating and inserting adversarial examples with their correct labels at training time); certifed techniques, such as randomized smoothing (evaluating ML predic-

    tion under noise); and formal verifcation techniques [112, 154] (applying formal method techniques to verify the model's output). Nevertheless, these methods come with different limitations, such as decreased accuracy for adversarial training and randomized smoothing, and computational complexity for formal methods. There is an inherent trade-off between robustness and accuracy [297, 302, 343]. Similarly, there are trade-offs between a model's robustness and fairness guarantees.

相关推荐
计算生物前沿27 分钟前
单细胞分析教程 | (二)标准化、特征选择、降为、聚类及可视化
人工智能·机器学习·聚类
kyle~1 小时前
Opencv---深度学习开发
人工智能·深度学习·opencv·计算机视觉·机器人
运器1231 小时前
【一起来学AI大模型】PyTorch DataLoader 实战指南
大数据·人工智能·pytorch·python·深度学习·ai·ai编程
超龄超能程序猿1 小时前
(5)机器学习小白入门 YOLOv:数据需求与图像不足应对策略
人工智能·python·机器学习·numpy·pandas·scipy
卷福同学1 小时前
【AI编程】AI+高德MCP不到10分钟搞定上海三日游
人工智能·算法·程序员
帅次1 小时前
系统分析师-计算机系统-输入输出系统
人工智能·分布式·深度学习·神经网络·架构·系统架构·硬件架构
AndrewHZ2 小时前
【图像处理基石】如何入门大规模三维重建?
人工智能·深度学习·大模型·llm·三维重建·立体视觉·大规模三维重建
5G行业应用2 小时前
【赠书福利,回馈公号读者】《智慧城市与智能网联汽车,融合创新发展之路》
人工智能·汽车·智慧城市
悟空胆好小2 小时前
分音塔科技(BABEL Technology) 的公司背景、股权构成、产品类型及技术能力的全方位解读
网络·人工智能·科技·嵌入式硬件
探讨探讨AGV2 小时前
以科技赋能未来,科聪持续支持青年创新实践 —— 第七届“科聪杯”浙江省大学生智能机器人创意竞赛圆满落幕
人工智能·科技·机器人