Nguyen, Anh, Jason Yosinski, and Jeff Clune. “Deep neural networks are easily fooled: High confidence predictions for unrecognizable images.” 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2015. (Citations: 190).

1 Motivation
Produce images that are completely unrecognizable to humans, yet that CNNs believe, with 99.99% confidence, to be recognizable objects.

2 Implication of Adversarial Examples

For example, one can imagine a security camera that relies on face or voice recognition being compromised. Swapping white noise for a face, fingerprint, or voice might be especially pernicious, since other humans nearby might not recognize that someone is attempting to compromise the system.

Another area of concern could be image-based search-engine rankings: background patterns that a visitor does not notice could fool a CNN-driven search engine into thinking a page is about an altogether different topic.

3 Gradient Based

As in [Simonyan et al. 2013], compute adversarial examples by optimization, but optimize with respect to the posterior probability (the softmax output) rather than the unnormalized class score.

Starting from random noise, repeat the gradient update until the CNN's confidence for the target class reaches 99.99%. Adding regularization makes the images more recognizable but results in slightly lower confidence scores. See the results in Fig.
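
A minimal sketch of this optimization in PyTorch (the model choice, learning rate, and L2 weight below are assumptions for illustration, not the paper's exact settings):

```python
import torch
import torchvision.models as models

# Any pretrained classifier serves for illustration; the paper used
# LeNet- and AlexNet-style networks on MNIST and ImageNet.
model = models.alexnet(pretrained=True).eval()

target_class = 99                                   # assumed target label
x = torch.rand(1, 3, 224, 224, requires_grad=True)  # start from random noise
opt = torch.optim.SGD([x], lr=0.1)                  # assumed learning rate
l2_weight = 1e-4                                    # assumed regularization

for step in range(10_000):
    opt.zero_grad()
    probs = torch.softmax(model(x), dim=1)
    confidence = probs[0, target_class]
    if confidence.item() >= 0.9999:                 # stop at 99.99%
        break
    # Ascend on the posterior probability; the L2 term is the optional
    # regularizer that trades a little confidence for recognizability.
    loss = -confidence + l2_weight * x.pow(2).sum()
    loss.backward()
    opt.step()
    x.data.clamp_(0.0, 1.0)                         # keep pixels valid
```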

4 GA Based
4.1 GA Approach
See Fig.


• Organisms: synthetic images used to fool the CNN.
• Fitness function: the highest prediction value (confidence) the CNN assigns to the image for a class.
• Selection: MAP-Elites, which keeps the best individual found so far for each objective (i.e., for each class); a minimal sketch follows this list.
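
A minimal MAP-Elites loop (the `evaluate`, `random_image`, and `mutate` callables are assumed interfaces around the CNN and the image encoding, not the paper's code):

```python
import random

def map_elites(num_classes, evaluate, random_image, mutate, generations):
    # archive maps each class (objective) to its best (fitness, image) pair.
    archive = {}
    for _ in range(generations):
        if archive:
            # Pick a random elite as the parent and perturb it.
            _, parent = random.choice(list(archive.values()))
            child = mutate(parent)
        else:
            child = random_image()
        scores = evaluate(child)  # per-class CNN confidences for the child
        for cls in range(num_classes):
            best = archive.get(cls)
            if best is None or scores[cls] > best[0]:
                archive[cls] = (scores[cls], child)  # new elite for this class
    return archive
```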

4.2 Direct Encoding
Three integers (H, S, V) for each pixel of the image. Each pixel value is initialized with uniform random noise in the [0, 255] range.

Each pixel value has probability p of being mutated by the polynomial mutation operator; p starts at 0.1 and is halved every 1000 generations.
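
A sketch of this mutation in NumPy (the distribution index `eta` of the polynomial mutation operator is an assumed value, not taken from the paper):

```python
import numpy as np

def mutate_direct(image, generation, eta=15):
    # Per-pixel mutation rate: starts at 0.1, halves every 1000 generations.
    p = 0.1 * 0.5 ** (generation // 1000)
    img = image.astype(float)
    mask = np.random.rand(*img.shape) < p
    # Polynomial mutation: a bounded perturbation whose size is governed
    # by the distribution index eta (larger eta -> smaller steps).
    u = np.random.rand(*img.shape)
    delta = np.where(u < 0.5,
                     (2.0 * u) ** (1.0 / (eta + 1)) - 1.0,
                     1.0 - (2.0 * (1.0 - u)) ** (1.0 / (eta + 1)))
    img[mask] += delta[mask] * 255.0  # scale to the [0, 255] pixel range
    return np.clip(img, 0, 255).astype(np.uint8)
```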

4.3 Indirect Encoding
Use a compositional pattern-producing network (CPPN). A CPPN is similar to an ordinary neural network: it takes the (x, y) position of a pixel as input and outputs a tuple of HSV values for that pixel. CPPNs can evolve complex, regular images that resemble natural and man-made objects.

Evolution determines the topology, weights, and activation functions of each CPPN in the population, so a single element of the genome can affect multiple parts of the image. CPPNs start with no hidden nodes, and nodes are added over time, encouraging evolution to first search for simple, regular images before adding complexity.
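
A toy fixed-topology CPPN renderer (a sketch only: in the paper the topology, weights, and activation functions are evolved with NEAT rather than fixed by hand):

```python
import numpy as np

def render_cppn(width, height, genome):
    # genome: one (weight_x, weight_y, activation) triple per HSV channel.
    xs, ys = np.meshgrid(np.linspace(-1, 1, width),
                         np.linspace(-1, 1, height))
    channels = [act(wx * xs + wy * ys) for wx, wy, act in genome]
    hsv = np.stack(channels, axis=-1)            # values in [-1, 1]
    return ((hsv + 1) / 2 * 255).astype(np.uint8)

# Periodic activations yield the repeated, regular patterns seen in the
# paper's evolved images.
img = render_cppn(64, 64, [(3.0, 0.0, np.sin),
                           (0.0, 3.0, np.cos),
                           (2.0, 2.0, np.tanh)])
```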

4.4 Results
See Fig. 10.4. The GA exploits specific discriminative features that the CNN has learned for each class: evolution need only produce features that are unique to, or discriminative for, a class, rather than an image that contains all of the typical features of that class. Many of the CPPN images feature a pattern repeated many times; the extra copies make the CNN more confident that the image belongs to the target class. This also shows that CNNs tend to learn low- and mid-level features rather than the global structure of objects.

Larger datasets are one way to make CNNs more difficult to fool. Because the dataset contains many closely related cat and dog classes, the GA had difficulty finding an image that scores high in one specific dog class (e.g., Japanese spaniel) but low in all related classes (e.g., Blenheim spaniel), which is necessary to produce high confidence given that the final CNN layer is a softmax. This suggests that datasets with more classes can help ameliorate fooling.

It is not easy to prevent CNNs from being fooled by retraining them on fooling images (labeled as a new “fooling images” class). While the retrained CNNs learn to classify the negative examples as fooling images, a new batch of fooling images can be evolved that fools the retrained networks, even after many retraining iterations.
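
The retraining experiment, as a sketch (`train_fn` and `evolve_fn` are assumed interfaces; the paper trains a network with one extra class and then re-runs the GA against it):

```python
def retraining_experiment(train_fn, evolve_fn, dataset, rounds):
    # fooling_pool accumulates every fooling image found so far; they are
    # all labeled as one extra "fooling images" class during training.
    fooling_pool = []
    model = None
    for _ in range(rounds):
        model = train_fn(dataset, fooling_pool)
        new_fooling = evolve_fn(model)  # evolve fresh images against the
        fooling_pool += new_fooling     # retrained network; they keep working
    return model, fooling_pool
```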

5 Analysis
Discriminative models learn p(y|X) directly. They create decision boundaries that partition the input space into classification regions, see Fig. In a high-dimensional input space, the region a discriminative model allocates to a class may be much larger than the region occupied by that class's training examples. Synthetic images far from the decision boundary and deep inside a classification region can therefore receive high-confidence predictions even though they are far from the natural images of the class. These large regions of high confidence exist in certain discriminative models due to a combination of their locally linear nature and the high-dimensional input space, as illustrated below.
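
A toy illustration of that locally linear argument (an assumed one-neuron model, not the paper's experiment): for a linear classifier, the confidence sigmoid(w·x) keeps growing as x moves along the weight direction, however far it gets from any training data.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10_000                           # high-dimensional input space
w = rng.normal(size=d) / np.sqrt(d)  # roughly unit-norm weight vector
for scale in (1, 10, 100):
    x = scale * w                    # step ever farther along w
    confidence = 1.0 / (1.0 + np.exp(-(w @ x)))
    print(f"||x|| ~ {scale}: confidence = {confidence:.6f}")
```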

In contrast, consider a generative model that represents the complete joint density p(X, y) = p(y|X)p(X). Such models may be harder to fool, because fooling images could be recognized by their low marginal probability p(X), and the CNN's confidence in a label prediction for such images could be discounted when p(X) is low.
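
One way to sketch that defense (GaussianMixture stands in for a real generative model of p(X); the threshold is an arbitrary assumption):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def gated_confidence(clf_probs, x, density_model, log_px_threshold=-50.0):
    # Discount the classifier's output when the input has low marginal
    # probability under the density model.
    log_px = density_model.score_samples(x.reshape(1, -1))[0]
    return clf_probs if log_px >= log_px_threshold else 0.0 * clf_probs

rng = np.random.default_rng(0)
gm = GaussianMixture(n_components=3).fit(rng.normal(size=(500, 20)))
probs = np.array([0.0001, 0.9999])                     # confident classifier
print(gated_confidence(probs, np.zeros(20), gm))       # in-distribution: kept
print(gated_confidence(probs, 100 * np.ones(20), gm))  # off-manifold: zeroed
```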

6 References
[1] Evolving AI Lab video: https://www.youtube.com/watch?v=M2IebCN9Ht4
[2] CVPR 2015 talk: http://techtalks.tv/talks/deep-neural-networks-are-easily-fooled-high-confidence-predictions-for-unrecog61573/
