Original paper: PDF
Year: 2020
Citations: 6 (2020/10/04), 89 (2022/03/26)


Table of Contents

  • Anomalous Instance Detection in Deep Learning: A Survey
  • Abstract
  • 1. INTRODUCTION
    • 1.1. What are Anomalies?
      • 1.1.1. Unintentional: Novel and out-of-distribution examples
      • 1.1.2. Intentional: Adversarial Examples
    • 1.2. Challenges
    • 1.3. Related Work
    • 1.4. Our Contributions
    • 1.5. Organization
  • 2. UNINTENTIONAL ANOMALY DETECTION
    • 2.1. Supervised Approaches
    • 2.2. Semi-supervised Approaches
    • 2.3. Unsupervised Approaches
    • 2.4. Other Miscellaneous Techniques
  • 3. INTENTIONAL ANOMALY DETECTION
    • 3.1. Supervised Approaches
    • 3.2. Semi-supervised Approaches
    • 3.3. Unsupervised Approaches
    • 3.4. Other Miscellaneous Techniques
  • 4. RELATIVE STRENGTHS AND WEAKNESS
  • 5. APPLICATION DOMAINS
  • 6. CONCLUSION AND OPEN QUESTIONS
  • Acknowledgement

Anomalous Instance Detection in Deep Learning: A Survey

Abstract

Deep Learning (DL) is vulnerable to out-of-distribution and adversarial examples, resulting in incorrect outputs. To make DL more robust, several post-hoc anomaly detection techniques that detect (and discard) these anomalous samples have been proposed in the recent past. This survey provides a structured and comprehensive overview of the research on anomaly detection for DL based applications. We provide a taxonomy for existing techniques based on their underlying assumptions and adopted approaches. We discuss various techniques in each of the categories and provide the relative strengths and weaknesses of the approaches. Our goal is to facilitate an easier yet better understanding of the different categories of techniques in which research has been done on this topic. Finally, we highlight the unsolved research challenges in applying anomaly detection techniques to DL systems and present some high-impact future research directions.

1. INTRODUCTION

Deep Learning (DL) techniques provide incredible opportunities to answer some of the most important and difficult questions in a wide range of applications in science and engineering. Therefore, scientists and engineers are increasingly adopting DL for making potentially important decisions in applications of interest, such as bioinformatics, healthcare, cyber-security, and fully autonomous vehicles. Several of these applications are often high-regret (i.e., incurring significant costs) in nature. In such applications, incorrect decisions or predictions have significant costs, either in terms of experimental resources when testing drugs, lost opportunities to observe rare phenomena, or in health and safety when certifying parts. Most DL methods implicitly assume ideal conditions and rely on the assumption that test data comes from the “same distribution” as the training data. However, this assumption is not satisfied in many real-world applications, and virtually all problems require various levels of transformation of the DL output, as test data typically differs from the training data due to noise, adversarial corruptions, or other changes in distribution, possibly caused by temporal and spatial effects. These deviant (or out-of-distribution) data samples are often referred to as anomalies, outliers, or novelties in different domains. It is well known that DL models are highly sensitive to such anomalies, which often leads to unintended and potentially harmful consequences due to incorrect results generated by DL. Hence, it is critical to determine whether the incoming test data is so different from the training dataset that the output of the model cannot be trusted (referred to as the anomaly detection problem).

Due to its practical importance, anomaly detection has received a lot of attention from the statistics, signal processing, and machine learning communities. Recently, there has been a surge of interest in devising anomaly detection methods for DL applications. This survey aims to provide a structured overview of recent studies and approaches to anomaly detection in DL based high-regret applications. To the best of our knowledge, there has not been any comprehensive review of anomaly detection approaches in DL systems. Although a number of surveys have appeared for conventional machine learning applications, none of these are specifically for DL applications. This has motivated this survey paper, especially in light of recent research results in DL. We expect that this review will facilitate a better understanding of the different directions in which research has been carried out on this topic and potential high-impact future directions.

1.1. What are Anomalies?

The problem setup for anomaly detection in deep neural networks (DNNs) is as follows: the DNN is trained on in-distribution data and is asked to perform predictions on both in-distribution as well as out-of-distribution (OOD) test samples. In-distribution test samples are from the same distribution as the training data, and the trained DNN is expected to perform reliably on them. On the other hand, anomalous test samples are samples which do not conform to the distribution of the training data. Therefore, predictions of DNNs on these anomalous samples should not be trusted. The goal of the anomaly detection problem is to design post-hoc detectors to detect these nonconforming test samples (see Fig. 1).
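
To make this setup concrete, the following is a minimal sketch of the post-hoc detection interface (the function names and the "higher score means more anomalous" convention are our own illustrative assumptions, not from any specific paper): the pre-trained DNN stays untouched, and a separate detector decides whether its prediction can be trusted.

```python
def predict_or_reject(model, anomaly_score, x, threshold):
    """Post-hoc detection: run the fixed pre-trained model, but flag
    its output as untrusted when the anomaly score is high."""
    y_hat = model(x)                         # prediction of the fixed DNN
    trusted = anomaly_score(x) <= threshold  # detector's verdict
    return y_hat, trusted                    # trusted=False => nonconforming
```

The techniques surveyed below differ mainly in how `anomaly_score` is constructed and in what data is available to calibrate `threshold`.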

Next we discuss the types of anomalies and present their respective differences. We classify anomalies into (a) unintentional and (b) intentional types (see Fig. 2). Unintentional anomalies are independent of the DNN model, as opposed to intentional anomalies, which are intentionally designed by an attacker to force the DNN model to yield incorrect results and are therefore model dependent.

1.1.1. Unintentional: Novel and out-of-distribution examples

Unintentional anomalies are further classified into novel and OOD examples. Novelty detection is the identification of new or unknown in-distribution data that a machine learning system was not aware of during training. OOD examples, on the other hand, come from a distribution other than that of the training data. The distinction between novelties and OOD data is that novel data samples are typically incorporated into the normal model after being detected, whereas OOD samples are usually discarded. In Fig. 2, the blue circles outside class boundaries are OOD examples. The OOD examples do not belong to any of the classes. In other words, the classifier is either unaware of or does not recognize the OOD examples.

A related problem arises in domain adaptation (DA) and transfer learning [1], which deal with scenarios where a model trained on a source distribution is used in the context of a different (but related) target distribution. The difference between the DA and OOD problems is that DA techniques assume that the test/target distribution is related to the task (or distribution) of interest (and thus utilized during training). On the other hand, OOD techniques are designed to detect if incoming data is so different from (and unrelated to) the training data that the model cannot be trusted.

1.1.2. Intentional: Adversarial Examples

Intentional anomalies (also known as adversarial examples) are test inputs that are intentionally designed by an attacker to coerce the model into making a mistake. For example, an attacker can modify the input image to fool the DNN classifier, which could lead to unforeseen consequences, such as accidents of autonomous cars or possible bank frauds. In Fig. 2, the examples in red are adversarial in nature. Via small perturbations at the input, these examples have been moved to other class regions, leading to misclassification. The classifier may or may not have access to some of the labels of these examples, leading to different techniques in the literature.

1.2. Challenges

As mentioned above, anomalies are data samples that do not comply with the expected normal behavior. Hence, a naive approach for detecting anomalies is to define a region in the data space that represents normal behavior and declare an example an anomaly if it does not lie in this region. However, several factors make this seemingly simple method ineffective:

  • The boundary between the normal and anomalous regions is very difficult to define, especially, in complex DNN feature spaces.

  • Based on the type of application, the definition of an anomaly changes. For certain applications, a small deviation of the classification result from that of the normal input data may have far-reaching consequences and thus may be declared an anomaly. In other applications, the deviation needs to be large for the input to be declared an anomaly.

  • The success of some anomaly detection techniques in the literature depends on the availability of the labels for the training and/or testing data.

  • Anomaly detection is particularly difficult when the adversarial examples tend to disguise themselves as normal data.

The aforementioned difficulties make the anomaly detection problem difficult to solve in general. Therefore, most of the techniques in the literature tend to solve a specific instance of the general problem based on the type of application, type of input data and model, availability of labels for the training and/or testing data, and type of anomalies.

1.3. Related Work

Anomaly detection is the subject of various surveys, review articles, and books. In [2], a comprehensive survey of various categories of anomaly detection techniques for conventional machine learning as well as statistical models is presented. For each category of detection, various techniques and their respective assumptions, along with the advantages and disadvantages, are discussed. The computational complexity of each technique is also mentioned. A comprehensive survey of novelty detection techniques is presented in [3], where various techniques are classified based on the statistical models used and the complexity of methods. Recently, an elaborate survey is presented in [4] where DL based anomaly detection techniques are discussed. Here, two more categories of anomaly detection, namely, hybrid models as well as one-class DNN techniques, are also included. Note that our survey paper is different from [4] as our focus is on discussing unintentional and intentional anomalies specifically in the context of DNNs, whereas [4] discusses approaches which use DNN based detectors applied to conventional ML problems. In some sense, our survey paper is much broader in the context of DL applications. In [5], a survey of the data mining techniques used for anomaly detection is presented; the techniques discussed are clustering, regression, and rule learning. Furthermore, in [6], the authors discuss models that adapt to data coming from the dynamically changing characteristics of the environment and detect anomalies from the evolving data. Here, the techniques account for the change in the underlying data distribution, and the corresponding unsupervised techniques are reviewed. In [7], anomaly detection techniques are classified based on the type of data, namely, metric data, evolving data, and multi-structured data. The metric data anomaly detection techniques consider the use of metrics like distance, correlation, and distribution. The evolving data include discrete sequences and time series. In [8], various statistical techniques, data mining based techniques, and machine learning based techniques for anomaly detection are discussed. In [9, 10], existing techniques for anomaly detection, including statistical, neural network based, and other machine learning based techniques, are discussed. Various books [11, 12, 13, 14] also discuss techniques for anomaly detection.

1.4. Our Contributions

To the best of our knowledge, this survey is the first attempt to provide a structured and broad overview of extensive research on detection techniques spanning both unintentional and intentional anomalies in the context of DNNs. Most of the existing surveys on anomaly detection focus on (i) anomaly detection techniques for conventional machine learning algorithms and statistical models, (ii) novelty detection techniques for statistical models, and (iii) DL based anomaly detection techniques. In contrast, we provide a focused survey on post-hoc anomaly detection techniques for DL. We classify these techniques based on the availability of labels for the training data corresponding to anomalies, namely, supervised, semi-supervised, and unsupervised techniques. We discuss various techniques in each of the categories and provide the relative strengths and weaknesses of the approaches. We also briefly discuss anomaly detection techniques that do not fall in the post-hoc category, e.g., training-based, architecture design, etc.

1.5. Organization

This survey is organized mainly in three parts: detection of unintentional anomalies, detection of intentional anomalies, and applications. For both unintentional and intentional anomalies, we will discuss different types of approaches (as illustrated in Fig. 3). In Sec. 2, we present various post-hoc anomaly detection techniques which are used to detect unintentional anomalies. These techniques are classified based on the availability of labels. In Sec. 3, we present various post-hoc anomaly detection techniques which are used to detect intentional anomalies (or adversarial examples). The techniques are again classified based on the availability of labels. In Sec. 4, we discuss strengths and weaknesses of different categories of methods. In Sec. 5, we describe various application domains where anomaly detection is applied. Finally, we conclude and present open questions in this area in Sec. 6.

2. UNINTENTIONAL ANOMALY DETECTION

In this section, we discuss detection techniques that detect OOD examples given a pre-trained neural network. Most DL approaches assume that the test examples belong to the same distribution as the training examples. Consequently, neural networks are vulnerable to test examples which are OOD. Hence, we need techniques to improve the reliability of the predictions or determine whether a test example differs in distribution from the training dataset. Here, we concentrate on techniques that determine whether the test example is different in distribution from that of the training dataset, using the pre-trained DNN followed by a detector. We refer to this architecture as post-hoc anomaly detection. A topic related to OOD example detection is novelty detection [15, 16, 17], which aims at detecting previously unobserved (emergent, novel) patterns in the data. It should be noted that solutions for novelty detection related problems are often used for OOD detection and vice-versa, and hence we use these terms interchangeably in this survey. Based on the availability of labels for OOD data, techniques are classified as supervised, semi-supervised, and unsupervised, which are discussed next and summarized in Table 1.

2.1. Supervised Approaches

In this section, we review anomaly detection approaches for the setting where labels of both the in-distribution and the OOD examples are available to enable differentiation between them; we refer to this as the supervised anomaly detection problem. Any unseen test data sample is passed through the detector to determine which class (in-distribution vs. OOD) it belongs to.

In [18], an approach to measure the uncertainty of a neural network based on gradient information of the negative log-likelihood at the predicted class label is presented. The gradient metrics are computed from all the layers in this method and scalarized using norm or min/max operations. A large value of the gradient metrics indicates an incorrect classification or an OOD example. A convolutional neural network (CNN) is used as the classifier, trained on Extended MNIST digits [32]. EMNIST letters, CIFAR10 [33] images, as well as different types of noise are used as OOD data. The authors found that such an unsupervised scheme does not work well on all types of OOD data. Therefore, a supervised variant of this scheme is proposed, where an anomaly detector is trained on uncertainty metrics of some OOD samples. It was shown that the performance is improved considerably by utilizing the labeled OOD data.
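
As a rough illustration of this idea, the PyTorch sketch below backpropagates the loss at the predicted label for a single input and collects per-layer gradient norms as features for a separate detector; this is our reading of [18], not the authors' code, and all names are illustrative.

```python
import torch
import torch.nn.functional as F

def gradient_uncertainty_features(model, x):
    # Backpropagate the negative log-likelihood at the *predicted*
    # label, then scalarize each layer's gradient with norms/max.
    model.zero_grad()
    logits = model(x.unsqueeze(0))        # single input, batch of one
    y_pred = logits.argmax(dim=1)
    F.cross_entropy(logits, y_pred).backward()
    feats = []
    for p in model.parameters():
        if p.grad is not None:
            g = p.grad.detach()
            feats += [g.norm(1).item(), g.norm(2).item(), g.abs().max().item()]
    return feats  # input features for the supervised anomaly detector
```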

In [19], the high-level idea is to measure the probability density of a test sample in DNN feature spaces. Specifically, the authors fit class-conditional Gaussian distributions to pre-trained features. This is possible since the posterior distribution can be shown to be equivalent to the softmax classifier under Gaussian discriminant analysis. Next, a confidence score using the Mahalanobis distance with respect to the closest class-conditional distribution is defined. Its parameters are chosen to be the empirical class means and tied empirical covariance of training samples. To further improve the performance, confidence scores from different layers of the DNN are combined using weighted averaging. The weight of each layer is learned by training a logistic regression detector using labeled validation samples comprising both in-distribution and OOD data. The method is shown to be robust to OOD examples.
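
A minimal single-layer version of this score might look as follows (a NumPy sketch under our reading of [19]; the multi-layer weighting and logistic regression steps are omitted, and all names are ours):

```python
import numpy as np

def fit_class_gaussians(feats, labels, n_classes):
    # Class-conditional Gaussians with a tied (shared) covariance,
    # fit on features extracted by the pre-trained DNN.
    means = np.stack([feats[labels == c].mean(0) for c in range(n_classes)])
    centered = feats - means[labels]
    cov = centered.T @ centered / len(feats)
    return means, np.linalg.pinv(cov)

def mahalanobis_confidence(f, means, precision):
    # Confidence = negative Mahalanobis distance to the closest
    # class-conditional Gaussian; low values suggest an OOD input.
    d = f - means                                   # shape (C, D)
    dists = np.einsum('cd,de,ce->c', d, precision, d)
    return -dists.min()
```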

In [20], a detector was trained on representations derived from a set of classifier responses generated by applying different natural transformations to a given image. Analyzing the invariance of the classifier’s decision under various transformations establishes a measure of confidence in its decision. In other words, the softmax values of an OOD input should fluctuate across transformed versions, while those of an in-distribution image should be relatively stable. The authors trained a binary OOD detector on confidence scores under various transformations for in-distribution vs. OOD training data. A ResNet-based architecture is used as the classifier, the Self-Taught Learning (STL-10) dataset [34] is used as the in-distribution data, and the Street View House Numbers (SVHN) dataset [35] is used as the OOD data. The approach is shown to outperform other baselines.

In [21], a trust score is proposed to determine whether the prediction of a classifier on a test example can be trusted. This score is defined as the ratio between the distance from the test sample to the nearest class different from the predicted class (e.g., an OOD class) and the distance to the predicted class, where distances are measured to high-density sets (via the Hausdorff distance). To compute the trust score, the training data is pre-processed to find a high-density set of each class to filter outliers. The trust score is estimated based on this high-density set. The idea behind the approach is that if the classifier predicts a label that is considerably farther than the closest label, then it may be an OOD or unreliable example. For the task of identifying correctly/incorrectly classified examples, it was shown that the trust score performs well in low to medium dimensions. However, it performs similarly to classifiers’ own reported confidence (i.e., probabilities from the softmax layer) in high dimensions.
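
A simplified sketch of this ratio (omitting the high-density filtering step; function names are ours) could be:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def fit_class_neighbors(feats, labels, n_classes):
    # One nearest-neighbor index per class over training features
    # (the paper first filters each class to a high-density subset).
    return [NearestNeighbors(n_neighbors=1).fit(feats[labels == c])
            for c in range(n_classes)]

def trust_score(nn_per_class, f, y_pred):
    # Distance to the nearest *other* class divided by the distance
    # to the predicted class; values near or below 1 suggest the
    # prediction should not be trusted.
    d = [nn.kneighbors([f])[0][0, 0] for nn in nn_per_class]
    d_other = min(dist for c, dist in enumerate(d) if c != y_pred)
    return d_other / (d[y_pred] + 1e-12)
```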

2.2. Semi-supervised Approaches

We refer to anomaly detection techniques as semi-supervised if they utilize unlabeled contaminated data (or information) in addition to labeled instances of the in-distribution class. Since these techniques do not require knowing whether an unlabeled instance is in-distribution or OOD, they are more widely applicable than supervised techniques.

In [22], the algorithm uses knowledge of an upper bound on the number of anomalous examples in the training dataset to provide Probably Approximately Correct (PAC) guarantees for achieving a desired anomaly detection rate. The algorithm uses cumulative distribution functions (CDFs) over anomaly scores for the clean and contaminated training datasets to derive an anomaly threshold. An anomaly detector assigns a score to all the test examples and orders them according to how anomalous the examples are with respect to the in-distribution data. This ordered score vector is then compared to the threshold to detect the OOD examples. The threshold is computed such that it guarantees a specific anomaly detection rate. Empirical results on synthetic and standard datasets show that the algorithm achieves guaranteed performance on the OOD detection task given enough data.

In [23], a likelihood ratio-based method using deep generative models is presented to differentiate between in-distribution and OOD examples. The authors assume that the in-distribution data is comprised of both semantic and background parts. The authors found that the likelihood can be confounded by the background (e.g., an OOD input with the same background but a different semantic component). Using this information about OOD data, they propose to use a background model to correct for the background statistics and enhance the in-distribution specific features for OOD detection. Specifically, the background model is trained by adding the right amount of perturbations to inputs to corrupt the semantic structure in the data. Hence, the model trained on perturbed inputs captures only the population-level background statistics. The likelihood ratio is computed from the in-distribution model and the background model. If the likelihood ratio falls below a pre-specified threshold, the test example is likely OOD. The National Center for Biotechnology Information microbial genome dataset is utilized in [23] in the following manner. Various bacteria are grouped into classes which were discovered over the years. Specifically, the classes discovered before a given cutoff year are considered in-distribution classes and those discovered after the cutoff year are considered OOD classes. The proposed test improves the accuracy of OOD detection compared to state-of-the-art detection results.
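
Schematically, the test reduces to a difference of log-likelihoods (hypothetical function names; we assume, per our reading of [23], that in-distribution inputs receive a higher ratio than background-only inputs):

```python
def likelihood_ratio_is_ood(log_p_full, log_p_background, x, threshold):
    # `log_p_full`: generative model trained on in-distribution data.
    # `log_p_background`: same architecture trained on perturbed inputs,
    # so it captures only population-level background statistics.
    llr = log_p_full(x) - log_p_background(x)
    return llr < threshold  # little semantic evidence => likely OOD
```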

In [24], a semi-supervised OOD detection technique based on a two-head CNN was proposed. The idea is to train a two-head CNN consisting of one common feature extractor and two classifiers which have different decision boundaries but can classify in-distribution samples correctly. Further, unlabeled contaminated data is used to maximize the discrepancy between the two classifiers to push OOD samples outside the in-distribution manifold. This enables the detection of OOD samples that are far from the support of the in-distribution samples.

2.3. Unsupervised Approaches

We refer to the detection techniques as unsupervised if they only utilize in-distribution data for OOD detection.

In [25], noting that statistics derived from softmax distributions are informative, a softmax-based baseline method to determine whether or not a test example is OOD is proposed. The idea is that a well-trained network tends to assign higher predicted probability to in-distribution examples than to OOD examples. Hence, an OOD example can be detected by comparing the predicted softmax class probability of an example to a threshold. Specifically, the authors generated the training data by separating correctly and incorrectly classified test set examples and, for each example, computing the softmax probability of the predicted class, which was used to compute the threshold. The performance of this approach was evaluated on computer vision, natural language processing, and speech recognition tasks. The technique fails if the classifier does not separate the maximum values of the predictive distribution well enough between in-distribution and OOD examples. Therefore, the authors in [26] proposed a method based on the observation that using temperature scaling and adding small perturbations to the input can better separate the softmax score distributions between in- and out-of-distribution images. Wide ResNet [36] and DenseNet [37] architectures were used and trained using CIFAR-10 and CIFAR-100 [33] as in-distribution datasets. The OOD detector was tested on several different natural image datasets and synthetic noise datasets. It was shown that the approach significantly improves the detection performance and outperforms the baseline in [25].
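
Both scores are easy to sketch; below is a hedged PyTorch version of the maximum-softmax baseline of [25] and an ODIN-style score in the spirit of [26] (the temperature T and perturbation size eps are illustrative values, and the model is assumed to be in eval mode):

```python
import torch
import torch.nn.functional as F

def msp_score(model, x):
    # Baseline of [25]: maximum softmax probability of the prediction;
    # low values suggest an OOD input.
    with torch.no_grad():
        return F.softmax(model(x), dim=1).max(dim=1).values

def odin_score(model, x, T=1000.0, eps=0.0014):
    # [26]-style score: temperature scaling plus a small input
    # perturbation in the direction that increases the softmax score.
    x = x.clone().requires_grad_(True)
    loss = -F.log_softmax(model(x) / T, dim=1).max(dim=1).values.sum()
    loss.backward()
    x_pert = (x - eps * x.grad.sign()).detach()
    with torch.no_grad():
        return F.softmax(model(x_pert) / T, dim=1).max(dim=1).values
```

In both cases a test input is declared OOD when the score falls below a threshold calibrated on in-distribution data.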

In [27], a generative adversarial network (GAN) [38] based architecture is used in a reconstruction error based OOD detection method. The motivation is that the GAN will perform better when generating images of previously seen objects (i.e., in-distribution data) than when generating images of objects it has never seen before (i.e., OOD data). In this approach, the test image is first passed through the generator of the GAN, which produces bottleneck features and a reconstructed image. Next, the reconstructed image is passed through the encoder, producing another set of bottleneck features. The Euclidean distance between these two feature sets represents a measure of how much the generated image deviates from the original image and is used as an anomaly score.

In [28], the authors propose a degenerated prior network architecture, which can efficiently separate model-level uncertainty from data-level uncertainty via prior entropy. To better separate in-distribution and OOD images, they propose a concentration perturbation algorithm, which adaptively adds noise to the concentration parameters of the prior network. Through comprehensive experiments, it was shown that this method achieves state-of-the-art performance, especially on large-scale datasets. However, this method is found to be sensitive to different neural network architectures, which could sometimes lead to inferior performance.

In [29], the intuition is that learning to discriminate between geometric transformations applied to images helps in learning unique features of each class that are useful in anomaly detection. The authors train a multi-class classifier over a self-labeled dataset created by applying various geometric transformations to in-distribution images. At test time, transformed images are passed through this classifier, and an anomaly score derived from the distribution of softmax values of the in-distribution training images is used for detecting OOD data. The classifier used is the Wide Residual Network model [36] trained on the CIFAR dataset. The CatsvsDogs dataset [39], which contains 12,500 images each of cats and dogs, is treated as the OOD data. The method performs better than the baseline approaches in [25] for larger-sized images and is robust to the OOD examples, distinguishing between normal and OOD examples by a significant margin compared to the baseline methods.

The approach in [30] (and references therein) considers the problem of detecting OOD samples based on the reconstruction error. These methods assume that OOD data is composed of different factors than in-distribution data. Therefore, it is difficult to compress and reconstruct OOD data based on a reconstruction scheme optimized for in-distribution data. Specifically, [30] proposes to incorporate the Mahalanobis distance in latent space to better capture these OOD samples. They combine the Mahalanobis distance between the encoded test sample and the mean vector of the encoded training set with the reconstruction loss of the test sample to construct an anomaly score. A single digit class from MNIST [40] is used as in-distribution data and the other classes of MNIST are treated as OOD samples. The authors illustrate that including the latent distance helps in improving the detection of in-distribution and OOD examples.
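
A compact sketch of such a combined score (our illustrative reading of [30]; `encoder`, `decoder`, the latent statistics, and the weight `alpha` are all assumed given):

```python
import numpy as np

def latent_anomaly_score(encoder, decoder, x, mu, precision, alpha=1.0):
    # Reconstruction error of an autoencoder trained on in-distribution
    # data, plus the Mahalanobis distance of the latent code to the mean
    # (mu) and inverse covariance (precision) of encoded training data.
    z = np.asarray(encoder(x))
    rec_err = np.mean((np.asarray(x) - np.asarray(decoder(z))) ** 2)
    d = z - mu
    return rec_err + alpha * float(d @ precision @ d)  # high => likely OOD
```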

In [31], the predictions of a pre-trained DNN are audited to determine their reliability. The resampling uncertainty estimation (RUE) approach is proposed as an approximation to the bootstrap procedure. Intuitively, RUE estimates the amount by which a prediction would change if different training data from the same distribution were used. It quantifies uncertainty using the gradients and Hessian of the model’s loss on training data and bootstrap samples to produce an ensemble of predictions for a test input. This uncertainty score is compared to a threshold for detecting correct and incorrect predictions. A single hidden layer feedforward neural network is trained using eight common benchmark regression datasets [41] from the UCI dataset repository. The authors show that the uncertainty score detects inaccurate predictions more effectively than existing techniques for auditing reliability. This approach can also be used to detect OOD samples.

Note that the unsupervised methods discussed above require comparing proposed anomaly scores with a threshold. Although thresholds are computed solely based on in-distribution data, one can further improve the performance by optimally choosing thresholds based on OOD validation samples (if available).
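
A common quantile rule for picking such a threshold from in-distribution scores alone is sketched below (a generic example, not tied to a specific paper; it assumes higher scores mean "more anomalous"):

```python
import numpy as np

def threshold_at_tpr(in_dist_scores, keep_rate=0.95):
    # Choose the threshold so that, e.g., 95% of in-distribution
    # samples score below it; with labeled OOD validation data one
    # would instead sweep thresholds to maximize a detection metric.
    return np.quantile(np.asarray(in_dist_scores), keep_rate)
```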

2.4. Other Miscellaneous Techniques

In this section, we discuss various approaches that are different from the post-hoc anomaly detection techniques, e.g., training-based, architecture design, etc.

In [42], a new form of support vector machine (SVM) is presented that combines multi-class classification and OOD detection into a single step. Specifically, the authors augment the original SVM with an auxiliary zeroth class as the anomaly class for labeling OOD examples. The UCI datasets are used as the training examples. The authors demonstrate the trade-off between the ability to detect anomalies and the incorrect labeling of normal examples as anomalies.

A hybrid model for fake news detection in [43] consists of three steps: capturing the temporal pattern of user activity on a given article using a recurrent neural network (RNN), checking the credibility of the media source, and classifying the article as fake or not. In [44], an RNN is used to detect anomalous data, where the Numenta Anomaly Benchmark metric is used for early detection of anomalies.

The method presented in [45] modifies the output layer of DNNs. Specifically, instead of using logit scores for computing class probabilities, the cosine of the angle between the weights of a class and the features of the input is used. In other words, the class probabilities are obtained using the softmax of scaled cosine similarity. The detection of OOD samples is done by comparing the maximum of the cosine values across classes to a threshold. The method is hyperparameter-free and has high OOD detection performance; however, the trade-off is a degradation of the classification accuracy. The Wide Residual Network [36] is used as the classifier trained using the CIFAR dataset, and the Tiny ImageNet and SVHN datasets are considered OOD data. The approach achieves competitive detection performance even without tuning of hyperparameters, and the method requires only a single forward pass without the need for backpropagation for each input.
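
A minimal PyTorch module in this spirit is sketched below (our illustration; in our reading of [45] the scale is predicted by the network rather than fixed as here):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CosineClassifier(nn.Module):
    # Output layer producing scaled cosine similarities between the
    # feature vector and per-class weight vectors; the maximum cosine
    # value across classes can double as the OOD score.
    def __init__(self, in_dim, n_classes, scale=16.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(n_classes, in_dim))
        self.scale = scale

    def forward(self, feats):
        cos = F.normalize(feats, dim=1) @ F.normalize(self.weight, dim=1).t()
        return self.scale * cos  # softmax over these gives class probabilities
```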

In [46], a deep autoencoder is combined with a CNN to perform supervised OOD detection. The autoencoder is used as a pre-training method for supervised CNN training. The idea is to reconstruct high-dimensional features using the deep autoencoder and detect anomalies using CNNs. It was shown that this combination can improve the accuracy and efficiency of large-scale Android malware detection.

A novel training method is presented in [47], where two additional terms are added to the cross-entropy loss: they minimize the Kullback-Leibler (KL) divergence between the predictive distribution on OOD examples and the uniform distribution so as to assign less confident predictions to the OOD examples. In-distribution and OOD samples are then expected to be more separable. However, the loss function requires OOD examples for training, which are generated using a GAN architecture. Hence, the training involves alternately minimizing the classifier loss and the GAN loss.
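
The loss can be sketched as follows (a hedged PyTorch version; `beta` is our simplification, and minimizing the mean negative log-probability on OOD inputs equals minimizing the KL divergence to the uniform distribution up to a constant):

```python
import torch.nn.functional as F

def confidence_loss(logits_in, labels_in, logits_ood, beta=1.0):
    # Cross-entropy on in-distribution data plus a term pushing the
    # predictive distribution on (GAN-generated) OOD samples toward
    # the uniform distribution.
    ce = F.cross_entropy(logits_in, labels_in)
    kl_to_uniform = -F.log_softmax(logits_ood, dim=1).mean(dim=1).mean()
    return ce + beta * kl_to_uniform
```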

In [48], the algorithm comprises an ensemble of leave-out classifiers. Each classifier is trained using in-distribution examples as well as OOD examples. Here, the OOD examples are obtained by designating a random subset of the training dataset as OOD, with the rest treated as in-distribution. A novel margin-based loss function is presented that maintains a margin m between the average entropies of the OOD and in-distribution samples. Hence, the loss function is the cross-entropy loss along with the margin-based loss, and it is minimized to train the ensemble of classifiers. The OOD detection score is obtained by combining the softmax prediction score and the entropy with temperature scaling. The score is shown to be high for in-distribution examples and low for OOD examples.

Furthermore, [49] proposes leveraging alternative data sources to improve OOD detection by training anomaly detectors against an auxiliary dataset of outliers, an approach they call Outlier Exposure. The motivation is that while it is difficult to model every variant of the anomaly distribution, one can learn effective heuristics for detecting OOD samples by exposing the model to diverse OOD datasets, thereby learning a more conservative concept of the in-distribution and enabling anomaly detectors to generalize and detect unseen anomalies.

The key idea in [50] is that likelihood models can assign higher density values to OOD examples than to in-distribution examples. The authors propose generative ensembles to detect OOD examples by combining a density evaluation model with predictive uncertainty estimation on the density model via ensemble variance. Specifically, they use uncertainty estimates on randomly sampled GAN discriminators to de-correlate the OOD classification errors made by a single discriminator.

The authors in [51] proposed a permutation test statistic to detect OOD samples using deep generative models trained with batch normalization. They show that the training objective of generative models with batch normalization can be interpreted as maximum pseudo-likelihood over a different joint distribution. Over this joint distribution, the estimated likelihood of a batch of OOD samples is shown to be much lower than that of in-distribution samples.

In [52], benchmarking of some of the existing post-hoc calibration based OOD detection techniques is performed. The effect of OOD examples on the accuracy and calibration of classification tasks is investigated. The authors evaluate uncertainty not only on in-distribution examples but also on OOD examples. They utilize metrics such as negative log-likelihood and Brier scores to evaluate the model uncertainty or the accuracy of computed predicted probabilities. Using large-scale experiments, the authors show that the calibration error increases with increasing distribution shift and that post-hoc calibration does indeed fall short in detecting OOD examples.

3. INTENTIONAL ANOMALY DETECTION

In this section, we discuss techniques for detecting intentionally designed adversarial test examples given a pre-trained neural network. It is well known that DNNs are highly susceptible to test-time adversarial examples – human-imperceptible perturbations that, when added to any image, cause it to be misclassified with high probability [53, 54]. The imperceptibility constraint ensures that the test example belongs to the data manifold yet gets misclassified. Hence, we need techniques to improve the reliability of the predictions or determine whether a test example is adversarial or normal. Here, we focus on the latter, with the availability of a pre-trained DNN followed by a detector. Based on the availability of labels, the techniques are classified as supervised, semi-supervised, and unsupervised, elaborated as follows and summarized in Table 2.

3.1. Supervised Approaches

In this section, we discuss detection techniques that require the labels of both in-distribution and adversarial examples and refer to them as supervised anomaly detection techniques. Test examples are compared against the detector to determine whether they are normal or adversarial.

In [55], a binary adversarial example detector is proposed. The detector is trained on intermediate feature representations of a pre-trained classifier for both the original data set and adversarial examples. Although it may seem very difficult to train such a detector, results on CIFAR10 and a 10-class subset of ImageNet show that training such a detector is indeed possible. In fact, the detector achieves high accuracy in the detection of adversarial examples. Moreover, while the detector is trained on adversarial examples generated using a specific attack method, it is found that the detector generalizes to similar and weaker attack methods. A similar strategy was employed in [68], where the ML model was augmented with an additional class and trained to classify all adversarial inputs into that class using labeled data.

The authors in [56] proposed three methods to detect adversarial examples. The first method, which is based on density estimation, uses kernel density estimates of the training set in the feature space of the last hidden layer to detect adversarial examples. This method is meant to detect points that lie far from the data manifold. However, this strategy may not work well when an adversarial example is very near the benign submanifold. Therefore, the authors proposed a second approach which uses Bayesian uncertainty estimates from dropout neural networks to flag points that lie in low-confidence regions of the input space. They show that the dropout based method can detect adversarial samples in situations where density estimates cannot. Finally, they also build a combined detector, which is a simple logistic regression classifier with two features as input: the uncertainty and the density estimate. The combined detector is trained on a labeled training set which comprises uncertainty values and density estimates for both benign and adversarial examples generated using different adversarial attack methods. The authors report that the performance of the combined detector (detection accuracy of 85-93%) is better than detectors trained either on uncertainty or on density values alone, demonstrating that each feature is able to detect different qualities of adversarial features.
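
The density feature alone is easy to sketch with scikit-learn (our illustration; [56] fits the density in the last hidden layer's feature space conditioned on the predicted class, and the bandwidth here is arbitrary):

```python
from sklearn.neighbors import KernelDensity

def fit_feature_density(benign_feats):
    # Kernel density model of benign training features from the
    # last hidden layer (one model per class in the paper).
    return KernelDensity(bandwidth=1.0).fit(benign_feats)

def density_feature(kde, f):
    # Low log-density marks points far from the data manifold; in [56]
    # this feature is combined with dropout uncertainty in a logistic
    # regression detector.
    return kde.score_samples([f])[0]
```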

In [57], the idea is that the trajectory of the internal representations in the forward pass for adversarial examples is different from that of in-distribution examples. The internal representations of an input are embedded into feature distance spaces which capture the relative positions of an example with respect to given in-distribution examples in the feature space. The embedding enables compact encoding of the evolution of the activations through the forward pass of the network, facilitating the search for differences between the trajectories of in-distribution and adversarial inputs. An LSTM based binary detector is trained to analyze the sequence of deep features embedded in a distance space and detect adversarial examples. The experimental results show that the detection scheme is able to detect a variety of adversarial examples targeting the ResNet-50 classifier pre-trained on the ImageNet dataset.

In [58], an expansion-based measure of intrinsic dimensionality is used as an alternative to density measures to detect adversarial examples. The expansion model of dimensionality assesses the local dimensional structure of the data and characterizes the intrinsic dimensionality as a property of datasets. The Local Intrinsic Dimensionality (LID) generalizes this concept to the local distance distribution from a reference point to its neighbors – the dimensionality of the local data submanifold in the vicinity of the reference point is revealed by the growth characteristics of the cumulative distribution function. The authors use LID to characterize the intrinsic dimensionality of regions where adversarial examples lie, and use estimates of LID to detect adversarial examples. Note that LID is a function of the nearest neighbor distances, and it is found to be significantly higher for adversarial examples than for benign examples. A binary adversarial example detector is trained by using the training data to construct features for each sample, based on its LID across different layers, where the class label is assigned positive for adversarial examples and negative for in-distribution examples. Experiments on several attack strategies show that the LID based detector outperforms several state-of-the-art detection measures by large margins.
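
The commonly used maximum-likelihood LID estimate has a short closed form; the sketch below follows our reading of the estimator used in [58] (the reference set, e.g. a minibatch not containing the sample, and the choice of k are assumptions):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def lid_mle(sample, reference, k=20):
    # LID ≈ -1 / mean_i log(r_i / r_k), where r_1 <= ... <= r_k are the
    # distances from `sample` to its k nearest points in `reference`.
    nn = NearestNeighbors(n_neighbors=k).fit(reference)
    r = nn.kneighbors([sample])[0][0]
    return -1.0 / np.mean(np.log(r / r[-1] + 1e-12))
```

Per-layer LID values computed this way form the feature vector on which the binary detector is trained.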

In [59], a three-layer regression NN is used as a detector that takes logits of in-distribution and adversarial examples from a pre-trained DNN as input and predicts a confidence value, i.e., whether the classification is normal or adversarial. The classifier used is a pre-trained CNN trained using in-distribution datasets (MNIST and CIFAR), and the detector is trained on logits of both in-distribution and adversarial examples generated using different methods. This work shows that the logits of a pre-trained network provide relevant information to detect adversarial examples.

3.2. Semi-supervised Approaches

Semi-supervised anomaly detection techniques utilize unlabeled contaminated data (or information) in addition to labeled instances of the in-distribution class. Since these techniques do not require knowing whether an unlabeled instance is in-distribution or adversarial, they are more widely applicable than supervised techniques. However, we could not find any existing semi-supervised adversarial example detection approach in the literature. Note that this may be a worthwhile direction to pursue in future research.

3.3. Unsupervised Approaches

We refer to the detection techniques as unsupervised if they only utilize in-distribution data for adversarial detection.

In [60], the probabilities of all the training images under a generative model (such as PixelCNN) are computed. Then, for a test example, the probability density at the input is computed and its rank among the density values of all the training examples is evaluated. This rank can be used as a test statistic which gives a p-value for whether the example is normal or adversarial. The method improves the resilience of state-of-the-art methods against attacks and increases the detection accuracy by a significant margin. Further, the authors suggest purifying adversarial examples by searching for more probable images within a small distance of the original ones. By utilizing the L∞ distance, the true labels of the purified images remain unchanged. The resulting purified images have higher probability under the in-distribution, so that the classifier trained on normal images will have more reliable predictions on these purified images. This intuition is used to build a more effective defense against adversarial attacks.
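
The rank-based p-value itself is a one-liner once the training log-densities are available (a hedged NumPy sketch; names are ours):

```python
import numpy as np

def density_p_value(log_p_train, log_p_test):
    # Fraction of training examples whose density under the generative
    # model is lower than the test input's; very small p-values mark
    # candidate adversarial (or otherwise atypical) inputs.
    log_p_train = np.sort(np.asarray(log_p_train))
    rank = np.searchsorted(log_p_train, log_p_test)
    return rank / len(log_p_train)
```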

The motivation for the method in [61] is that adversarial examples should be both (a) “too atypical” (i.e., have atypically low likelihood) under the density model for the DNN-predicted class, and (b) “too typical” (i.e., have too high a likelihood) under some class other than the DNN-predicted class. While it may seem that one requires two detection thresholds, they instead propose a single decision statistic that captures both requirements. Specifically, they define (a) a two-class posterior evaluated with respect to the (density-based) null model, and (b) a corresponding two-class posterior evaluated via the DNN. Both deviations (“too atypical” and “too typical”) are captured by a Kullback-Leibler divergence decision statistic. A sample is declared adversarial if this statistic exceeds a preset threshold value.

The approach in [62] performs a kNN similarity search among the deep features of the training images for a given test image classified by the DNN. The score assigned by a kNN classifier to the class predicted by the DNN is then used as a measure of confidence in the classification. Note that this approach does not rely on the classification produced by the kNN classifier, but only uses the score assigned to the DNN prediction as a measure of confidence. The intuition behind this approach is that while a class correctly predicted by the DNN need not have the highest kNN score among all classes, it is implausible that a correct classification has a very low score. Results on the ImageNet dataset show that hidden layer activations can be used to detect misclassifications caused by various attacks.

In [63], intrinsic properties of the pre-trained DNN, i.e., the output distributions of the hidden neurons, are used to detect adversarial examples. The motivation is that when the DNN incorrectly assigns an adversarial example to a specific class label, the distribution of its hidden states is very different from that obtained from normal data of the same class. The authors use a Gaussian Mixture Model (GMM) to approximate the hidden state distribution of each class using benign training data. Likelihoods are then compared to the respective class thresholds to detect whether an example is adversarial or not. Experimental results on standard datasets (MNIST, F-MNIST, CIFAR-10) against several attack methods show that this approach can achieve state-of-the-art robustness in defending against black-box and gray-box attacks.
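
A minimal scikit-learn version of this detector might look as follows (our sketch; the number of mixture components and the per-class thresholds are assumptions to be calibrated on benign validation data):

```python
from sklearn.mixture import GaussianMixture

def fit_hidden_gmms(hidden, labels, n_classes, n_components=5):
    # One GMM per class over hidden-layer activations of benign data.
    return [GaussianMixture(n_components).fit(hidden[labels == c])
            for c in range(n_classes)]

def is_adversarial(gmms, h, y_pred, thresholds):
    # Flag the input when the likelihood of its hidden state under the
    # predicted class's GMM falls below that class's threshold.
    return gmms[y_pred].score_samples([h])[0] < thresholds[y_pred]
```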

The authors in [64] found that adversarial examples mainly exploit two attack channels: the provenance channel and the activation value distribution channel. The provenance channel reflects the instability of the DNN output to small changes in activation values, which eventually leads to misclassification. The activation value channel, on the other hand, reflects that while the provenance changes only slightly, the activation values of a layer may be substantially different from those induced by benign inputs. Exploiting these observations, they propose a method that extracts two kinds of invariants (probability distributions represented by models): value invariants to guard the value channel and provenance invariants to guard the provenance channel. This is achieved by training a set of models for individual layers to describe the activation and provenance distributions using only in-distribution inputs. In other words, the invariant models are trained as a One-Class Classification (OCC) problem in which all training samples are positive (i.e., in-distribution inputs in this context). At test time, an input is passed through all the invariant models, each providing an independent prediction of whether the input induces states that violate the invariant distributions. The final result is a joint decision based on all these predictions. Extensive experiments on various attacks, datasets, and models suggest that this method achieves consistently high detection accuracy across all types of attacks, while the performance of baseline detectors is not consistent.
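
In the spirit of this layer-wise OCC formulation, the sketch below trains one one-class model per layer on benign activations and makes a joint decision at test time. It is a simplification under stated assumptions: activation extraction is done elsewhere, and a one-class SVM stands in for the invariant models of [64].

```python
import numpy as np
from sklearn.svm import OneClassSVM

def train_invariants(benign_layer_activations):
    """benign_layer_activations: list of (n_samples, dim_l) arrays, one per layer.
    Every training sample is positive (in-distribution), as in an OCC problem."""
    return [OneClassSVM(nu=0.05, gamma="scale").fit(acts)
            for acts in benign_layer_activations]

def violates_invariants(models, test_layer_activations, max_violations=1):
    """Each layer model votes independently; -1 means the activation pattern
    falls outside the learned invariant distribution."""
    votes = [m.predict(act.reshape(1, -1))[0] == -1
             for m, act in zip(models, test_layer_activations)]
    return sum(votes) > max_violations  # joint decision over all layers
```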

In [65], the idea is that the inherent distance of an adversarial perturbation from the training data manifold will cause the overall network uncertainty to exceed that of a normal example. To this end, random sampling of the hidden units of each layer of a pre-trained network is used to introduce randomness, and the overall uncertainty of a test image is quantified in terms of the hidden layer components. A mutual information based thresholding test is used to detect adversarial examples. The performance is further improved by optimizing over the sampling probabilities to minimize uncertainty. Experiments with state-of-the-art deep CNNs on the CIFAR10 and cats-and-dogs datasets demonstrated the importance of optimizing the sampling parameters, which readily translates to improved attack detection.
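
The mutual-information statistic used for thresholding can be computed from repeated stochastic forward passes, as sketched below; `stochastic_forward` is an assumed user-supplied function that randomly subsamples hidden units before producing a softmax output.

```python
import numpy as np

def mutual_information(prob_samples: np.ndarray, eps: float = 1e-12) -> float:
    """prob_samples: (n_passes, n_classes) softmax outputs under random
    hidden-unit sampling. MI = H(mean prediction) - mean(H(prediction))."""
    mean_p = prob_samples.mean(axis=0)
    entropy_of_mean = -np.sum(mean_p * np.log(mean_p + eps))
    mean_entropy = -np.mean(np.sum(prob_samples * np.log(prob_samples + eps), axis=1))
    return float(entropy_of_mean - mean_entropy)

# probs = np.stack([stochastic_forward(x) for _ in range(30)])  # assumed function
# flag_adversarial = mutual_information(probs) > tau            # tau tuned on benign data
```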

Approaches such as [66] and [67] rely on projecting the test image onto the benign dataset manifold to detect adversarial examples. The underlying assumption in these approaches is that adversarial perturbations move the test image away from the benign image manifold, and the effect of the adversary can be nullified by projecting images back onto the benign manifold before classifying them. As the true image manifold is unknown, various estimation techniques are used. For example, [66] uses a sample approximation comprising a database of billions of natural images, whereas [67] uses a generative model trained on benign images to estimate the manifold. Given the estimated benign manifold, the projection is done by nearest neighbor search in [66] and by gradient-based search in [67]. These methods are found to be robust against gray-box and black-box attacks, where the adversary is unaware of the defense strategy.

3.4. Other Miscellaneous Techniques

Here we discuss some other techniques used for adversarial example detection that do not fall into the aforementioned categories of post-hoc processing.

In [69], various uncertainty measures for adversarial example detection are examined, e.g., entropy, mutual information, and softmax variance. Each of these measures captures a distinct type of uncertainty and is analyzed from the perspective of adversarial example detection. The authors showed that only mutual information yields useful detection performance on adversarial examples. In fact, most other measures of uncertainty seem to be worse than random guessing on the MNIST and Kaggle dogs-vs-cats classification datasets.

The approach in [70] is motivated by the observation that DNN feature spaces are often unnecessarily large, which provides extensive degrees of freedom for an attacker to construct adversarial examples. The authors propose to reduce these degrees of freedom by “squeezing” out unnecessary input features. Specifically, they compare the model’s prediction on the original test example with its prediction on the squeezed example, i.e., after reducing the color depth of the image and applying smoothing to reduce the variation among pixels. If the original and the squeezed inputs produce substantially different predictions, the example is declared adversarial.
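
A compact sketch of the squeezing test follows, assuming a grayscale image with pixel values in [0, 1] and a `model_predict` function returning a softmax vector; the bit depth, filter size, and threshold are illustrative.

```python
import numpy as np
from scipy.ndimage import median_filter

def reduce_color_depth(img: np.ndarray, bits: int = 4) -> np.ndarray:
    levels = 2 ** bits - 1
    return np.round(img * levels) / levels

def squeezing_score(img: np.ndarray, model_predict) -> float:
    """Largest L1 gap between the prediction on the raw input and on its
    squeezed versions; a large gap indicates an adversarial example."""
    p_raw = model_predict(img)
    p_depth = model_predict(reduce_color_depth(img))
    p_smooth = model_predict(median_filter(img, size=2))
    return float(max(np.abs(p_raw - p_depth).sum(),
                     np.abs(p_raw - p_smooth).sum()))

# flag_adversarial = squeezing_score(x, model_predict) > threshold  # tuned on benign data
```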

In [71], SafetyNet is proposed, which consists of the original classifier and an adversary detector that looks at the internal state of the later layers of the original classifier. Here, the ReLU outputs are quantized into a discrete code based on a set of thresholds. The authors claim that different code patterns appear for natural and adversarial examples. An adversarial example detector (an RBF-SVM) compares the code produced at test time with a collection of examples; an attacker must therefore make the network produce a code acceptable to the detector, which is shown to be hard.
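
The code-quantization step can be sketched as below; the thresholds are hypothetical, and the RBF-SVM detector (trained on codes from natural and adversarial examples) is indicated only in outline.

```python
import numpy as np
from sklearn.svm import SVC

def activation_code(relu_activations: np.ndarray,
                    thresholds=(0.0, 1.0)) -> np.ndarray:
    """Quantize each late-layer activation into a small integer code."""
    return np.digitize(relu_activations, bins=np.asarray(thresholds))

# Outline of the detector (codes_natural / codes_adv are assumed given):
# X = np.vstack([codes_natural, codes_adv])
# y = np.hstack([np.zeros(len(codes_natural)), np.ones(len(codes_adv))])
# detector = SVC(kernel="rbf").fit(X, y)
```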

In [72], the naive Bayes model used in many generative classifiers is improved by combining it with a variational auto-encoder. Three adversarial example detection methods are proposed. The first two use the learned generative model as a proxy for the data manifold and reject inputs that are far away from it. The third computes statistics on the classifier’s output probability vector and rejects inputs that lead to under-confident predictions. Experimental results suggest that deep Bayes classifiers are more robust than deep discriminative classifiers, and that the detection methods based on deep Bayes are effective against various attacks.

In [73], the authors propose to model the outputs of the various layers (deep features) with parametric probability distributions (Gaussian and Gaussian Mixture Models). At test time, the log-likelihood scores of a test sample’s features are calculated with respect to these distributions and used as an anomaly score to discriminate in-distribution samples (which should have high likelihood) from adversarial examples (which should have low likelihood).

The main idea in [74] is to combine a kNN based distance measure [75] with the influence function, which measures how much a test sample’s classification is affected by each training sample. The motivation is that for an in-distribution input, its kNN training samples (nearest neighbors in the embedding space) and its most helpful training samples (found using the influence function) should correlate. This correlation is much weaker for adversarial examples and thus serves as an indication of an attack.
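
One way to realize this test is a rank correlation between the two per-training-sample scores, as in the sketch below; the kNN scores and influence scores are assumed precomputed, and the cutoff is illustrative.

```python
import numpy as np
from scipy.stats import spearmanr

def knn_influence_correlation(knn_scores: np.ndarray,
                              influence_scores: np.ndarray) -> float:
    """Both arrays are indexed by training sample; higher values mean the
    sample is closer in the embedding (kNN) or more helpful (influence)."""
    rho, _ = spearmanr(knn_scores, influence_scores)
    return float(rho)

# flag_adversarial = knn_influence_correlation(knn_s, infl_s) < rho_min
# where rho_min is calibrated on benign inputs.
```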

The motivation in [76] is the observation that different neural networks presented with the same adversarial example will make different mistakes. The authors propose to use such mistake patterns for adversarial example detection. Experiments on the MNIST and CIFAR10 datasets show that such a detection approach generalizes well across different adversarial example generation methods.

In [77], robust feature alignment is used to detect adversarial examples. Using an object detector, the authors first extract the higher-level robust features contained in images. The approach then quantifies the similarity between the image’s extracted features and the expected features of its predicted class. A similarity threshold is finally used to classify a test sample as benign or adversarial.
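
The similarity test reduces to comparing a feature vector against a per-class prototype, as in this minimal sketch; the prototype construction and the threshold are assumptions for illustration, not details from [77].

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) /
                 (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# class_prototypes: (n_classes, d) array of expected robust features per class
# (e.g., the mean feature vector of benign training images of that class).
def is_benign(test_features, predicted_class, class_prototypes, sim_threshold=0.5):
    return cosine_similarity(test_features,
                             class_prototypes[predicted_class]) >= sim_threshold
```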

In [78], anomaly detection is performed by introducing random feature nullification in both the training and testing phases, which ensures the non-deterministic nature of the DNN. Here, the randomization introduced at test time ensures that the model’s processing of the input decreases the effectiveness of adversarial examples even if the attacker learns critical features.

In [79], three strategies are presented. First, regularized feature vectors are used to retrain the last layer of the CNN, which can be used to detect whether the input is adversarial. Second, histograms are created from the absolute values of the hidden layer outputs and combined into a vector that an SVM uses for classification. Third, the input is perturbed to reinforce the parts of the input example that are ignored by the DNN, which can then be used for adversarial example detection. Finally, the authors combine the best aspects of these methods to develop a more robust approach.

In [80], a framework is presented for enhancing the robustness of DNNs against adversarial examples. The idea is to transform examples with locality-preserving hash functions. The hash representations of the examples are reconstructed using a denoising auto-encoder (DAE), which enables the DNN classifier to retain the locality information in the latent space. Moreover, the DAE can detect adversarial examples that are far from the support of the underlying training distribution.

4. RELATIVE STRENGTHS AND WEAKNESS

Supervised techniques usually achieve higher performance than other methods because they use labeled examples from both the normal and anomaly classes. They are able to learn the boundary from the labeled training examples and then more easily classify unseen test examples as normal or anomalous. However, when the training data for anomalies (the known unknowns) does not represent the full spectrum of anomalies, supervised approaches may overfit and perform poorly on unseen anomalous data (the unknown unknowns). Furthermore, due to the lack of labeled anomalous examples, supervised techniques are not as popular as semi-supervised or unsupervised techniques.

Unsupervised techniques are quite flexible and broadly applicable as they do not rely on the availability of anomalous data and the corresponding labels. These techniques learn inherent characteristics or unique features solely from in-distribution data that are useful in separating normal from anomalous examples. Unfortunately, this flexibility comes at the cost of robustness: unsupervised techniques are very sensitive to noise and data corruption, and are often less accurate than supervised or semi-supervised techniques.

Semi-supervised techniques exploit unlabeled data in addition to labeled in-distribution data to improve the performance of unsupervised techniques. Although it is not known whether the unlabeled data are in-distribution or anomalous, unlabeled data are observed to help improve anomaly detection performance. Note that unlabeled data can be obtained easily in real-world applications, making semi-supervised techniques practical. These methods also suffer from the overfitting problem on unseen anomalies.

Distance-based methods, e.g., kNN approaches, require an appropriate distance measure to be defined a priori. Most distance measures are not effective in high dimensions. Further, such methods are typically heuristic and require manual selection of parameters. Projection-based methods, e.g., GAN approaches, are very flexible and address the high-dimensionality challenge. However, their performance depends heavily on the quality of the image manifold estimate, and in certain applications it may not be easy to estimate the image manifold with sample approximation or generative modeling. Probabilistic methods, e.g., density estimation approaches, make use of the distribution of the training data or features to determine the location of the anomaly boundary. The performance of such methods is very poor in the small-data regime, as reliable estimates cannot be obtained. Uncertainty-based methods, e.g., entropy approaches, require a metric that is sensitive enough to detect the effects of anomalies in the dataset. Although these methods are easy to implement in practice, their performance is highly dependent on the quality of the uncertainties. Uncertainty quantification in DL is an ongoing research topic, and higher quality uncertainty estimates will surely improve the performance of uncertainty-based methods.

The computational complexity of these methods is another important aspect to consider. In general, probabilistic and uncertainty-based methods have computationally expensive training phases but efficient testing. On the other hand, distance-based and projection-based methods are generally computationally expensive in the test phase. Depending on the application requirements, a user should choose the most appropriate anomaly detection method.

5. APPLICATION DOMAINS

In this section, we briefly discuss several applications of OOD and adversarial example detection. We also suggest future research needed in these application domains.

Intrusion Detection - An intrusion detection system monitors network traffic for suspicious activity and issues alerts when such activity is discovered. A key challenge for intrusion detection is the huge volume of data and the sophistication of malicious patterns. Therefore, DL techniques are quite promising in the intrusion detection application.

In [81], a neural network based intrusion detector is trained to identify intruders. In [82], a deep hierarchical model is proposed for intrusion detection; the model combines a restricted Boltzmann machine (RBM) for unsupervised feature learning with a supervised learning network called a backpropagation network. In [83], a network intrusion model is proposed where feature learning is performed by stacking dilated convolutional autoencoders; these features are then used to train a softmax classifier for supervised intrusion detection. In [84], an autoencoder based model combined with a stochastic anomaly threshold determination method is proposed for intrusion detection. The algorithm computes the threshold using the empirical mean and standard deviation of the errors obtained on the training set via the trained autoencoder.
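
As an illustration of the threshold determination in the spirit of [84], the sketch below derives the cutoff from the empirical mean and standard deviation of autoencoder reconstruction errors on the training set; the multiplier k is an assumed design parameter.

```python
import numpy as np

def fit_threshold(train_errors: np.ndarray, k: float = 3.0) -> float:
    """Threshold = empirical mean + k * standard deviation of the
    reconstruction errors of benign training traffic."""
    return float(train_errors.mean() + k * train_errors.std())

# train_errors[i] = ||x_i - decoder(encoder(x_i))||^2 over the training set
# is_intrusion = reconstruction_error(x) > fit_threshold(train_errors)
```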

As mentioned earlier, these DL based systems are equally susceptible to both OOD and adversarial examples [85, 86, 87]. In [85], the authors analyze the performance of state-of-the-art attack algorithms against DL-based intrusion detection. The susceptibility of the DNNs used in intrusion detection systems is validated by experiments, and the role of individual features is also explored. The authors in [86] demonstrated that an adversary can generate effective adversarial examples against DL based intrusion detection systems even when the internal information of the target model is unavailable to the adversary. Note that in intrusion detection applications, a large amount of labeled data corresponding to normal behavior is usually available, while labels for intrusions are not. Therefore, the semi-supervised and unsupervised OOD and adversarial example detection techniques discussed in the previous sections are worthwhile directions to pursue.

Fraud Detection - Fraud detection refers to the detection of fraudulent activities occurring in many e-commerce domains, such as banking, insurance, and law enforcement. A good fraud detection system should be able to identify fraudulent transactions accurately and make detection possible in real time. There is increasing interest in applying DL techniques in fraud detection systems. In [88], fraud detection is modeled as a sequence classification task: an LSTM is used to generate transaction sequences, and aggregation functions such as the mean and absolute value are incorporated to aggregate the learned features for fraud detection. Furthermore, in [89], feature sequencing is performed using CNNs for detecting transaction fraud. Recently, the authors in [90] analyzed the vulnerability of a deep fraud detector to adversarial examples, i.e., slight perturbations in input transactions designed to fool the fraud detector. They show that the deployed deep fraud detector is highly vulnerable to attacks, with the average precision decreasing from 90% to as low as 20%.

This motivates the study of the effect of unintentional and intentional anomalies in deep fraud detection systems. The techniques discussed in the previous sections are applicable to this problem and are potentially viable solutions for designing robust deep fraud detection systems.

Anomaly Detection in Healthcare and Industrial Domains - Anomaly detection in the healthcare domain tries to detect abnormal patient conditions or instrumentation errors. Anomaly detection is a very critical problem in this domain and requires a high degree of accuracy. Similarly, in industrial systems such as wind turbines, power plants, and storage devices, which are exposed to large amounts of stress on a daily basis, it is critical to detect any damage as quickly as possible. Medical abnormalities and industrial damage are rare events, and detecting them can be modeled as an anomaly detection problem. Therefore, there is a surge of interest in applying DL in both the medical [91] and industrial [92] application domains.

Unfortunately, similar to other DL applications, these systems are equally susceptible to OOD and adversarial examples. For example, the authors in [93] demonstrated that adversarial examples are capable of manipulating DL systems across three clinical domains: diabetic retinopathy from retinal fundoscopy, pneumothorax from chest X-rays, and melanoma from dermoscopic photographs.

This motivates the study of the effect of anomalies in DL based healthcare and industrial systems. Techniques discussed in the previous sections can be used for designing robust healthcare and damage detection systems.

Malware Detection - Malware detection focuses on detecting malicious software by monitoring the activity of computer systems and classifying it as normal or anomalous. The velocity, volume, and complexity of malware pose new challenges to the anti-malware community. Researchers have recently started applying machine learning and DL methods to malware analysis and detection [94]. In [78], malware detection is performed by introducing random feature nullification in both the training and testing phases, which ensures the non-deterministic nature of the DNNs. Intuitively, this non-determinism ensures that the model’s processing of the input decreases the effectiveness of adversarial examples even if the attacker learns critical features. Furthermore, in [95], a stacked autoencoder model is used for malware detection; it employs greedy layer-wise training for unsupervised feature learning followed by supervised parameter tuning. In [96], fake malware is generated with a novel GAN architecture and a model learns to distinguish it from real data.

The authors in [97, 98] expanded on existing adversarial example crafting algorithms to construct a highly effective attack against malware detection models. Using the augmented adversarial crafting algorithm, they managed to mislead the malware detection classifier on 63% of all malware samples. In [80], the authors analyzed the effect of several attacks on the Android malware classification task.

Given the susceptibility of state-of-the-art malware detection classifiers to adversarial examples, it will be useful to incorporate OOD and adversarial example detection techniques in deep malware detection systems.

Time Series and Video Surveillance Anomaly Detection - The task of detecting anomalies in multivariate time series data is quite challenging, and efficient detection of such anomalies is critical for fault diagnostics. RNN and LSTM based methods perform well in detecting anomalies in multivariate time series data. In [99], a generic DL based framework for detecting anomalies in multivariate time series data is presented. Deep attention based models are used in [100] for effective anomaly detection. Many works have applied deep learning models to video surveillance anomaly detection [101, 102, 103].

Unfortunately, some recent papers [104, 105] have shown that one can design adversarial examples against time-series classifiers as well. Thus, in our opinion, future researchers should incorporate OOD and adversarial example detectors in their time series classification systems to improve resilience, and consider model robustness as an evaluation metric.

6. CONCLUSION AND OPEN QUESTIONS

In this survey, we discussed various techniques for detecting OOD and adversarial examples given a pre-trained DNN. For each category of anomaly detection techniques, we discussed their relative strengths and weaknesses. Finally, we discussed various application domains where post-hoc processing as well as training based anomaly detection techniques are applicable.

There are several open issues and worthwhile directions for further research. Many of these are identified by analyzing and comparing the existing literature considered in this survey.

Methods: We classified anomaly detection algorithms based on the availability of labels for anomalous examples and the type of metric used. Based on label availability, techniques are classified as supervised, semi-supervised, and unsupervised. Based on the type of metric, techniques are classified as probability-based, distance-based, projection-based, and uncertainty-based. Each category of methods has its own strengths and weaknesses and faces different challenges, as discussed in Section 4. We conjecture that the exploration of ensemble detection approaches can be a worthwhile future direction. An ensemble approach combines the outputs of multiple detectors with complementary strengths into a single decision, yielding better performance than individual detectors.
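
As a sketch of what such an ensemble could look like in its simplest form, the snippet below combines the binary decisions of several detectors by majority vote; the detector functions themselves are placeholders.

```python
def ensemble_detect(x, detectors, min_votes=None):
    """detectors: callables returning True if x is flagged as anomalous.
    Flags x when at least min_votes detectors agree (default: majority)."""
    votes = sum(int(detector(x)) for detector in detectors)
    if min_votes is None:
        min_votes = len(detectors) // 2 + 1
    return votes >= min_votes

# Hypothetical usage, e.g., combining density-, distance-, and
# uncertainty-based detectors defined elsewhere:
# flagged = ensemble_detect(x, [density_detector, knn_detector, mi_detector])
```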

Defining Anomalies: The majority of research on detecting OOD and adversarial examples in DL focuses on detecting independent anomalies (e.g., adversarial examples generated independently from one another). However, anomalous behaviors can be much more complex, requiring more sophisticated detection approaches than currently available. An example is discussed in [106], where a simple correlated anomaly generation approach was presented and current defenses were shown to be incapable of defending against even this simple scheme. Further, defining collective and contextual anomalies [107] in the context of OOD and adversarial examples in DL can be very interesting, and detecting them will certainly require the development of a new class of detectors. We also want to emphasize that future research on anomaly detection must be cognizant of the fact that anomalies may not adhere to our definitions and assumptions and can have extremely complex unknown behavior. This is similar to the concept of unknown-unknowns [108]. We believe that research on domain generalization [109] and meta learning [110] can be used to address some of these issues.

Going beyond image classification: Most of the papers discussed in this survey (and in the literature) focus on the detection of anomalous examples in DNN based image classification problems. However, in recent years there has been a surge of interest in applying DL to other data types, e.g., text, graphs, trees, and manifolds. These data types are ubiquitous in several high-impact applications, including bioinformatics, neuroscience, social sciences, and molecular chemistry. Unfortunately, DL approaches on these data types also suffer from the existence of OOD and adversarial examples [111], and post-hoc detection of such anomalies has not received much attention. Furthermore, going beyond classification problems and exploring the design and detection of anomalies in DL based object detection, control, and planning problems can be a high-impact future research direction.

Performance Evaluation: Reliably evaluating the performance of OOD and adversarial example detection methods has proven to be extremely difficult. Previous evaluation methods have been found to be ineffective, performing incorrect or incomplete evaluations [112, 113]. The absence of a standard definition for anomalies makes this problem very challenging, and as anomalies become more sophisticated, it may become even harder to reliably evaluate detection performance. The majority of current approaches evaluate the performance of anomaly detectors on known OOD and adversarial examples. Since such data may not represent the full spectrum of anomalies, this evaluation approach raises the risk of overfitting. Ideally, one should adopt an evaluation method that can assess detection performance on adaptive and unseen anomalies (the unknown unknowns) over methods that can only assess detection performance on previously seen anomalies (the known unknowns). For these reasons, there is an immediate need for principled benchmarks to reliably evaluate anomaly detection performance [113, 114].

Theoretical Analysis and Fundamental Limits: Finally, efforts are needed on the theoretical front to understand the nature of the anomaly detection problem in DL-based systems. In the recent past, a pattern has emerged in which the majority of heuristic defenses (both post-hoc detection and training based) are easily broken by new attacks [115, 112]. Therefore, the development of a coherent theory and methodology that guides the practical design of anomaly detection in DL-based systems [116], and fundamental characterizations of the existence of adversarial examples [117], is of utmost importance. How to leverage special learning properties such as spatial and temporal consistencies to identify OOD examples [118, 119] is also worth further exploration.

To summarize, OOD and adversarial example detection in DL-based systems is an open problem. We highlighted several aspects of the problem that need to be understood on both the theoretical and algorithmic fronts to improve the effectiveness and feasibility of anomaly detection. We hope that this survey provides a comprehensive understanding of the different approaches, shows the bigger picture of the problem, and suggests a few promising directions for researchers to pursue in further investigations of anomaly detection in DL-based systems.

Acknowledgement

This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
