How to Combine Deep Learning and Game Theory

Artificial Intelligence has enjoyed great success in recent years, providing us with powerful algorithms that use large datasets to make accurate predictions or classifications. These algorithms are increasingly used for different purposes, including high-stakes ones.

And yet, they are not infallible.

In fact, most of these algorithms are trained on data that can be deliberately manipulated by an adversary looking to mislead them into making errors.

Let’s take a simple example: email spam detection. At first, standard classifiers such as naïve Bayes were extremely effective in terms of accuracy. However, spammers quickly learned how to fool them by replacing “spam” words with synonyms and adding more “non-spam” words. Consequently, spam filters were changed to detect these tricks, but spammers responded with new ones. This leads to an endless game between the defender and the attacker until an equilibrium state is reached.

In this context, Game Theory can be very useful, as it provides the mathematical tools needed to model the behaviors of the defender and the adversary in terms of defense and attack strategies.

More specifically, game theory-based models make it possible to take into account:

  • the tradeoff made by the attacker between the cost of adapting to the classifier and the benefit he gains from his attack.

  • the tradeoff made by the defender between the benefit of a correct attack detection and the cost of a false alarm.

Thus, game theory-based models can determine what suitable strategy is needed to reduce the defender’s loss from adversarial attacks.

Spam filtering is not the only case for which these models can bring valuable information. This perspective can be used to describe many other situations with higher stakes: computer intrusion detection, fraud detection, aerial surveillance.

In this article, I will share my key findings about how to use Game Theory for Adversarial Machine Learning.

After reading this article, you will learn:

  • How can Game Theory be used in Machine Learning?
  • How can Game Theory help in addressing adversarial learning problems?
  • How can you make your Machine Learning algorithms robust against adversarial attacks?

An Example of a Game Theory-based Approach

Let’s start with a simple example: spam detection.

The following section describes the game-theoretical model for adversarial learning developed by W. Liu and S. Chawla in their paper.

General Setting

Spam detection can be modeled as a 2-player game between the Spammer (S) and the Defender (D).

  • The Spammer can choose 1) to attack the classifier by modifying spam emails so that they get through the spam filter, or 2) not to attack, knowing that some spam emails might get through anyway.

  • The Defender can choose 1) to retrain the classifier in order to maintain a low misclassification rate, or 2) not to retrain the classifier despite the potential increase in misclassified spam emails.

We will assume that the Spammer will be the one to make the first move.

There are 4 possible outcomes, as shown in the game tree below. Each scenario can be associated with a payoff for both players, reflecting how each of them ranks the final outcomes.

For instance, scenario 2 (the Spammer attacks and the Defender does not retrain) is the worst scenario for the Defender and the best for the Spammer, as his attacks on the non-retrained classifier will lead to a high number of misclassified spam emails.

Game Tree between the Spammer and the Defender
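
The ranking logic can be made concrete with a small sketch. The payoff numbers below are hypothetical ordinal ranks chosen for illustration (they are not taken from the paper); only their ordering matters, and scenario 2 is ranked best for the Spammer and worst for the Defender, as described above. Since the Spammer moves first, the tree is solved from the leaves upward.

```python
# Hypothetical ordinal payoffs (higher is better); not taken from the paper.
# Keys: (spammer_action, defender_action) -> (spammer_payoff, defender_payoff)
PAYOFFS = {
    ("attack", "retrain"): (3, 3),        # scenario 1
    ("attack", "no_retrain"): (4, 1),     # scenario 2: best for S, worst for D
    ("no_attack", "retrain"): (1, 2),     # scenario 3
    ("no_attack", "no_retrain"): (2, 4),  # scenario 4
}


def defender_best_response(spammer_action):
    """Action that maximizes the Defender's payoff once the Spammer has moved."""
    options = [d for (s, d) in PAYOFFS if s == spammer_action]
    return max(options, key=lambda d: PAYOFFS[(spammer_action, d)][1])


def backward_induction():
    """The Spammer picks the move whose anticipated outcome is best for him."""
    spammer_actions = sorted({s for (s, _) in PAYOFFS})
    best = max(
        spammer_actions,
        key=lambda s: PAYOFFS[(s, defender_best_response(s))][0],
    )
    return best, defender_best_response(best)


outcome = backward_induction()
```

With these assumed ranks, the Defender retrains whenever he is attacked, and the Spammer, anticipating this, still prefers to attack.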

Model Definition

The situation can be modeled as a Stackelberg game, i.e. a sequential game where there is a leader (here, the Spammer S) and a follower (the Defender D).

Stackelberg games are usually used to model strategic interactions between rational agents in markets where there is some hierarchical competition.

In this context, each player reacts to the other’s move by choosing an action from a set of possible actions: U for S and V for D. These sets are assumed to be bounded and convex.

Each outcome is associated with a payoff for each player, given by the payoff functions J_s and J_d. Each payoff function J_i is a twice-differentiable mapping J_i: U × V → ℝ, where ℝ denotes the real numbers.

Therefore, it is possible to anticipate the reaction R_i of a player i as the action that maximizes his payoff given the other player's move. For the follower D, for instance:

$$R_d(u) = \arg\max_{v \in V} J_d(u, v)$$

Moreover, the first player to make a move can anticipate how the follower would rationally react and take it into account in his first decision. This is known as rollback, or backward induction.

This means that the first action of the Spammer is the solution of the following optimization problem, where R_d(u) denotes the Defender's payoff-maximizing reaction to u:

$$u^* = \arg\max_{u \in U} J_s(u, R_d(u))$$

And therefore, the Defender will choose the optimal solution:

$$v^* = R_d(u^*)$$

This solution (u*, v*) is the Stackelberg equilibrium.
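
To see the two-step reasoning at work, here is a minimal sketch that finds a Stackelberg equilibrium by brute force on a grid. The quadratic payoffs, the weights ALPHA and BETA, and the action range [0, 1] are all assumptions made for illustration; they are not the payoffs used in the paper.

```python
ALPHA, BETA = 1.0, 2.0  # hypothetical cost weights for the leader and the follower

U_GRID = [i / 100 for i in range(101)]  # leader (Spammer) actions in [0, 1]
V_GRID = [i / 200 for i in range(201)]  # follower (Defender) actions in [0, 1]


def payoff_spammer(u, v):
    # Toy payoff: the attack pays less when the defender reacts, and costs ALPHA*u^2/2.
    return u * (1.0 - v) - 0.5 * ALPHA * u * u


def payoff_defender(u, v):
    # Toy payoff: reacting pays more against stronger attacks, and costs BETA*v^2/2.
    return u * v - 0.5 * BETA * v * v


def reaction_defender(u):
    """Follower's best response R_d(u), found by exhaustive search over V_GRID."""
    return max(V_GRID, key=lambda v: payoff_defender(u, v))


def stackelberg():
    """Leader maximizes his payoff while anticipating the follower's reaction."""
    u_star = max(U_GRID, key=lambda u: payoff_spammer(u, reaction_defender(u)))
    return u_star, reaction_defender(u_star)


u_star, v_star = stackelberg()
```

For these toy payoffs the follower's reaction is v = u / BETA, so the leader effectively maximizes u - u², and the search lands at roughly u = 0.5 and v = 0.25.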

Note that this differs from the Nash equilibrium, where the two players of the game act simultaneously; there, the solution of the simultaneous equations is (0,0), i.e. neither player reacts.

Model Specification

Now that we have defined the general setting, we still need to determine the players’ payoff functions in the particular case of a classification problem.

To simplify, we will first consider only one attribute. This can then be easily generalized to multiple attributes, assuming they are conditionally independent given the class label.

First, what is the impact of the players’ moves?

Let’s define the following distributions:

  • P(μ’, σ): distribution of spam emails

  • Q(μ, σ): distribution of non-spam emails

with μ’<μ, as there are more non-spam emails than spam emails

The adversary attacks by moving μ’ to μ’+u (i.e. towards μ). The defender reacts by shifting the decision boundary, initially at 1/2 (μ’+μ), by an amount v (also towards μ).

Second, what are the payoffs of the players?

To quantify the magnitude of a transformation u applied to the data, it is possible to use the Kullback–Leibler divergence (KLD), also called relative entropy. It measures how one probability distribution differs from a reference probability distribution.
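
For reference, here is the general definition, followed by the closed form it takes under the additional assumption (not stated explicitly above) that the spam distribution is Gaussian and that the attack only shifts its mean by u while keeping σ fixed:

```latex
\mathrm{KLD}(P \,\|\, Q) = \int p(x)\,\log\frac{p(x)}{q(x)}\,dx,
\qquad
\mathrm{KLD}\big(\mathcal{N}(\mu'+u,\sigma^2)\,\|\,\mathcal{N}(\mu',\sigma^2)\big) = \frac{u^2}{2\sigma^2}
```

Under that assumption, the cost of an attack grows quadratically with the size of the shift.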

Spammer’s payoff: Since his goal is to increase the number of misclassified spam emails, his payoff can be expressed as follows:
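
One way to write this payoff, consistent with the description here but stated as an assumption (in particular, which two distributions the KLD compares is my reading, not a quote from the paper), is:

```latex
J_s(u, v) = \mathrm{FNR}(u, v) - \alpha \, \mathrm{KLD}\big(P(\mu' + u, \sigma) \,\|\, P(\mu', \sigma)\big)
```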

where FNR is the increase in the False Negative Rate, and α represents the strength of the penalty for the cost of the attack.

Defender’s payoff: Since his goal is to increase the accuracy of the classification, his payoff can be expressed as follows:
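
A matching form for the Defender, again an assumption rather than the paper's exact expression, with c(v) standing for a hypothetical retraining cost that grows with the boundary shift v:

```latex
J_d(u, v) = \mathrm{TPR}(u, v) + \mathrm{TNR}(u, v) - \beta \, c(v)
```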

where TPR and TNR represent the increases in the True Positive and True Negative Rates, and β controls the strength of the cost of retraining the classifier.

In this context, the resulting Stackelberg optimization problem can be solved using a genetic algorithm.

The authors of the paper applied this methodology to find an equilibrium on synthetic and real data. Depending on the magnitude of the costs incurred to generate an attack and to retrain the classifier (modeled through α and β), they find different equilibria.
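
The whole pipeline can be sketched on a one-attribute toy problem. Everything specific below is an assumption made for illustration: Gaussian class distributions, the numeric values of the means, σ, ALPHA and BETA, a quadratic retraining cost, absolute rates instead of increases (which does not change either player's best action), and a very simple genetic algorithm (truncation selection, averaging crossover, Gaussian mutation, elitism). It is not the authors' implementation.

```python
import math
import random

MU_SPAM, MU_HAM, SIGMA = 0.0, 3.0, 1.0      # hypothetical 1-D class distributions
ALPHA, BETA = 0.2, 0.1                      # hypothetical cost weights
U_MAX = MU_HAM - MU_SPAM                    # the attack never moves past the non-spam mean
V_GRID = [i * 0.005 for i in range(301)]    # candidate boundary shifts in [0, 1.5]


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def rates(u, v):
    """FNR, TPR, TNR when spam ~ N(MU_SPAM + u, SIGMA) and x < boundary means 'spam'."""
    boundary = 0.5 * (MU_SPAM + MU_HAM) + v
    tpr = norm_cdf((boundary - (MU_SPAM + u)) / SIGMA)
    tnr = 1.0 - norm_cdf((boundary - MU_HAM) / SIGMA)
    return 1.0 - tpr, tpr, tnr


def payoff_spammer(u, v):
    fnr, _, _ = rates(u, v)
    return fnr - ALPHA * u * u / (2.0 * SIGMA ** 2)  # KLD of a Gaussian mean shift


def payoff_defender(u, v):
    _, tpr, tnr = rates(u, v)
    return tpr + tnr - BETA * v * v                  # assumed quadratic retraining cost


def reaction_defender(u):
    return max(V_GRID, key=lambda v: payoff_defender(u, v))


def fitness(u):
    """Leader's payoff once the follower's rational reaction is taken into account."""
    return payoff_spammer(u, reaction_defender(u))


def genetic_search(pop_size=30, generations=40, mutation_std=0.1, seed=0):
    rng = random.Random(seed)
    population = [rng.uniform(0.0, U_MAX) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]            # truncation selection
        children = [ranked[0]]                       # elitism: keep the best candidate
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = 0.5 * (a + b) + rng.gauss(0.0, mutation_std)  # crossover + mutation
            children.append(min(max(child, 0.0), U_MAX))
        population = children
    u_star = max(population, key=fitness)
    return u_star, reaction_defender(u_star)


u_star, v_star = genetic_search()
```

Changing ALPHA and BETA in this sketch moves the equilibrium, which mirrors the observation above that different cost levels lead to different equilibria.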

Other Variants of Game Theory-based Approaches

The previous example relies on a set of assumptions: it models a game where the attacker and the defender compete sequentially against one another. It assumes the attacker is the first to make a move. It also supposes that the two players are aware of the benefits and the costs of their respective adversaries.

Nevertheless, these assumptions don’t always hold. Fortunately, a wide range of models relying on a game-theoretic approach exists. They can be classified as follows.

Zero-sum vs. Non-zero-sum Game

In a 2-player zero-sum game, the attacker's gain is equal to the defender's loss, and vice versa. This means that the players’ utilities sum to 0. This assumption can be very pessimistic, as the defender's utility loss can be smaller than the adversary's utility gain.

Simultaneous Game vs. Sequential Game

In simultaneous-move games, players choose their strategies at the same time, without observing each other’s strategy. In sequential games, they choose their moves one after another.

Adversarial learning has mainly been modeled as a sequential game, the defender being the leader. In fact, it is often assumed that once the defender chooses a classifier the attacker can observe it and decide on his own strategy.

The model described in the previous section is one of the few models that consider the attacker as unable to observe the classifier before choosing his strategy.

Bayesian Game

A Bayesian game models a situation in which players have incomplete information about the other players. This is more realistic, as the defender might not know the exact cost of generating adversarial data and the attacker might not know the exact classification cost for the defender. They only have beliefs about these costs.

John C. Harsanyi, the well-known economist who made major contributions to Game Theory, specifically for incomplete-information settings, described a Bayesian game in the following way:

Each player in the game is associated with a set of types, with each type in the set corresponding to a possible payoff function for that player. In addition to the actual players in the game, there is a special player called Nature. Nature randomly chooses a type for each player according to a probability distribution across the players’ type spaces. This probability distribution is known by all players (the “common prior assumption”). This modeling approach transforms games of incomplete information into games of imperfect information.

Source: Wikipedia
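
A small sketch may help here. It assumes a simultaneous-move game in which the attacker's cost type ("low_cost" or "high_cost") is drawn by Nature from a common prior that the defender knows; all type names, probabilities and payoff numbers are hypothetical. The code enumerates pure strategies and keeps those where every attacker type best-responds and the defender maximizes his expected payoff.

```python
from itertools import product

PRIOR = {"low_cost": 0.6, "high_cost": 0.4}       # common prior over attacker types
ATTACK_COST = {"low_cost": 0.5, "high_cost": 3.0}  # hypothetical cost of attacking
A_ACTIONS = ("attack", "no_attack")
D_ACTIONS = ("retrain", "no_retrain")


def attacker_payoff(attacker_type, a, d):
    if a == "no_attack":
        return 0.0
    benefit = 1.0 if d == "retrain" else 4.0       # attacks pay less after retraining
    return benefit - ATTACK_COST[attacker_type]


def defender_payoff(a, d):
    retrain_cost = 1.0 if d == "retrain" else 0.0
    damage = 0.0 if a == "no_attack" else (1.0 if d == "retrain" else 4.0)
    return -retrain_cost - damage


def expected_defender_payoff(d, strategy):
    """Average over attacker types, weighted by the defender's beliefs (the prior)."""
    return sum(PRIOR[t] * defender_payoff(strategy[t], d) for t in PRIOR)


def pure_bayesian_equilibria():
    types = list(PRIOR)
    found = []
    for d in D_ACTIONS:
        for actions in product(A_ACTIONS, repeat=len(types)):
            strategy = dict(zip(types, actions))   # one action per attacker type
            attacker_ok = all(
                attacker_payoff(t, strategy[t], d) >= attacker_payoff(t, a, d)
                for t in types
                for a in A_ACTIONS
            )
            defender_ok = all(
                expected_defender_payoff(d, strategy)
                >= expected_defender_payoff(alt, strategy)
                for alt in D_ACTIONS
            )
            if attacker_ok and defender_ok:
                found.append((d, strategy))
    return found


equilibria = pure_bayesian_equilibria()
```

With these assumed numbers, the only pure equilibrium has the defender retraining, the low-cost attacker attacking, and the high-cost attacker staying out.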

How to Make Adversarial Learning Robust?

Adversarial learning techniques that rely on a game theory-based framework can be relevant, as they model the behaviors of the learner and the adversary based on the benefits and costs incurred for retraining the model and generating an attack.

But, what if the initial model was already robust to adversarial attacks?

One of the most common approaches to increase this robustness is to model, beforehand, the malicious data that the adversary could generate and to include it in the training phase.

In this section, we will consider 3 main techniques used for generating adversarial data: perturbation techniques, transferring adversarial examples, and Generative Adversarial Networks (GANs).

Perturbation Techniques

The idea is to produce synthetic adversarial data that can be used by potential adversaries. To do so, it is necessary to understand and anticipate how the adversary creates his attacks.

Most of these techniques rely on adding a small amount of noise or perturbation to a valid example. Let’s take a look at some well-known examples:

L-BFGS

Let’s denote:

  • f: the classifier mapping that takes an observation x consisting of m features and returns its class label

  • loss: the associated continuous loss function

  • r: the perturbation

If the goal of the attacker is to generate an example that is misclassified as l, he has to solve the following optimization problem:
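
In the notation above, and following the box-constrained formulation usually associated with this attack (the constraint assumes features normalized to [0, 1], which is not stated here), the problem reads:

```latex
\min_{r} \; \|r\|_2
\quad \text{subject to} \quad
f(x + r) = l, \qquad x + r \in [0, 1]^m
```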

This can be done by performing a line search to find the minimum c > 0 for which the minimizer r of the following problem satisfies f(x + r) = l:
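
Under the same assumption of features normalized to [0, 1], the penalized problem commonly used for this step is:

```latex
\min_{r} \; c\,\|r\|_2 + \mathrm{loss}_f(x + r,\, l)
\quad \text{subject to} \quad
x + r \in [0, 1]^m
```

The hard constraint f(x + r) = l is replaced by the loss term, and c trades off the size of the perturbation against how strongly the target label l is enforced.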

Note that this optimization problem does not have a closed-form solution for complex models such as deep neural networks. Iterative numerical methods can be used instead, which makes the generation slower. Nevertheless, the success rate of this attack is high.

Fast Gradient Sign Method (FGSM)

Let’s denote:

  • X: the clean observation

  • ∇J(X,y): the gradient of the model loss function with respect to X

  • ϵ: parameter controlling the magnitude of the adversarial perturbation

The method generates adversarial perturbations by increasing the value of the loss function, as follows:
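
In the notation just introduced, with X^{adv} (a symbol added here) denoting the adversarial example, the standard FGSM update is:

```latex
X^{adv} = X + \epsilon \, \mathrm{sign}\big(\nabla_X J(X, y)\big)
```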

Note that adding a perturbation in the direction of the gradient enables the observation to be intentionally altered so that the model misclassifies it.

This method is faster and computationally cheaper than the previous one. However, its success rate is lower.
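
Here is a minimal FGSM sketch on a logistic-regression classifier, written in plain Python so that the gradient is visible. The weights W and B, the clean point and the value of eps are hypothetical; for this model, the gradient of the cross-entropy loss with respect to the input is (p - y) * w.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, w, b):
    return int(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5)


def loss_gradient_wrt_input(x, y, w, b):
    """Gradient of the cross-entropy loss of a logistic model with respect to x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]


def sign(g):
    return (g > 0) - (g < 0)


def fgsm(x, y, w, b, eps):
    """One step of size eps in the direction of the sign of the loss gradient."""
    grad = loss_gradient_wrt_input(x, y, w, b)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]


W, B = [2.0, -1.0], 0.5            # hypothetical trained weights
x_clean, y_true = [0.3, 0.2], 1    # a point the model classifies correctly
x_adv = fgsm(x_clean, y_true, W, B, eps=0.4)
```

In this toy setting the clean point is classified as class 1, while the perturbed point, which differs by at most eps in each feature, is classified as class 0.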

The paper by Goodfellow et al. that describes this method also led to interesting observations, such as:

  • Using the direction of perturbation rather than the amount of perturbation is more efficient in creating adversarial examples

  • Training a classifier with adversarial examples is similar to regularization of the classifier

Iterative Fast Gradient Sign

It is also possible to apply FGSM several times with a small step size, clipping the result after each step so that the distortion between the clean and the adversarial example stays below ϵ.
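
The iterative variant can be sketched in the same spirit, again on a hypothetical logistic-regression model. The step size, the number of iterations and the use of an L-infinity ball for the clipping are assumptions made for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, w, b):
    return int(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5)


def loss_gradient_wrt_input(x, y, w, b):
    """Gradient of the cross-entropy loss of a logistic model with respect to x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]


def sign(g):
    return (g > 0) - (g < 0)


def iterative_fgsm(x, y, w, b, eps, step, n_iter):
    x_adv = list(x)
    for _ in range(n_iter):
        grad = loss_gradient_wrt_input(x_adv, y, w, b)
        # Small FGSM step from the current adversarial point.
        x_adv = [xa + step * sign(gi) for xa, gi in zip(x_adv, grad)]
        # Clip so that each feature stays within eps of the clean example.
        x_adv = [min(max(xa, xc - eps), xc + eps) for xa, xc in zip(x_adv, x)]
    return x_adv


W, B = [2.0, -1.0], 0.5            # hypothetical trained weights
x_clean, y_true = [0.3, 0.2], 1
x_adv_iter = iterative_fgsm(x_clean, y_true, W, B, eps=0.4, step=0.1, n_iter=10)
```

On a linear model like this one the iterations simply walk to the edge of the allowed region; the benefit of taking several small steps shows up on non-linear models, where the gradient direction changes along the way.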

Transferring Adversarial Examples

Most of the techniques described above assume that the attacker knows the model used. They belong to what is called White-box attacks as opposed to Black-box attacks. However, in real-life situations this is not always the case.

So, which techniques does the attacker usually resort to?

The adversary can reconstruct the model by probing, i.e. sending valid and adversarial examples to the model and observing the output. This enables him to build a dataset that he can use to train a substitute model. Adversarial examples can then be generated using White-box algorithms.

However, in some cases, probing can be limited by a maximal number of queries accepted or in terms of cost incurred by the adversary. To deal with this issue, the adversary can generate adversarial examples to fool a classifier and train, in parallel, another model. He can then reuse those same adversarial examples to fool multiple different classifiers.

Note that using adversarial examples generated on one model to fool a Black-box model is possible because adversarial examples tend to transfer from one model to another (the transferability property).
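
The probing-then-transfer idea can be sketched end to end on a toy problem. The black-box model, the number of queries, the substitute architecture (logistic regression trained by gradient descent) and the value of eps are all hypothetical choices made for illustration.

```python
import math
import random


def black_box(x):
    """Target classifier; the attacker only observes its output labels."""
    return int(2.0 * x[0] - 1.0 * x[1] + 0.5 > 0)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_substitute(queries, labels, lr=0.5, epochs=200):
    """Fit a logistic-regression substitute on (query, black-box label) pairs."""
    w, b = [0.0, 0.0], 0.0
    n = len(queries)
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in zip(queries, labels):
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w = [w[0] - lr * gw[0] / n, w[1] - lr * gw[1] / n]
        b -= lr * gb / n
    return w, b


def fgsm_on_substitute(x, y, w, b, eps):
    """White-box FGSM, computed on the substitute only."""
    p = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * ((gi > 0) - (gi < 0)) for xi, gi in zip(x, grad)]


rng = random.Random(0)
queries = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
labels = [black_box(q) for q in queries]             # probing step
w_sub, b_sub = train_substitute(queries, labels)

x_clean = [0.3, 0.2]
y_clean = black_box(x_clean)
x_adv = fgsm_on_substitute(x_clean, y_clean, w_sub, b_sub, eps=0.5)
transferred = black_box(x_adv) != y_clean             # did the attack transfer?
```

The attacker never reads the black-box weights: he only needs the substitute to point in roughly the same direction for the perturbation to carry over.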

Generative Adversarial Networks (GAN)

Generative Adversarial Networks (GANs) rely entirely on a Game Theory approach. In these models, perturbed examples are generated by an adversary and simultaneously used to train the learner’s model, as shown in the graph below.

Source: FreeCodeCamp, Thalles Silva

As shown in the figure above, the function used by the learner is called the discriminator, while the function used by the adversary is called the generator.

The discriminator and the generator interact through a zero-sum game as they are both looking to optimize a different and opposing objective function, or loss function.

In this context, the discriminator and the generator continuously adapt their prediction and data corruption mechanisms respectively.
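
For reference, the standard GAN objective (from the original GAN formulation, not spelled out in this article) makes the zero-sum structure explicit, with D the discriminator, G the generator, p_data the data distribution and p_z the noise distribution fed to the generator:

```latex
\min_{G} \max_{D} \; V(D, G) =
\mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
+ \mathbb{E}_{z \sim p_{z}}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator maximizes V while the generator minimizes the very same quantity, so whatever one player gains, the other loses.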

Conclusion

Nowadays, as individuals and businesses are embracing the digital revolution, Artificial Intelligence algorithms are increasingly used to solve complex problems in multiple contexts, some of which can have high stakes.

Therefore, it is important not to underestimate their weaknesses, since they can face adversarial attacks in hostile environments.

The examples are as numerous as they are revealing: image recognition systems used for accessing private spaces or valuable information, fraud detection algorithms protecting individuals and companies’ wealth, etc.

In this context, Game Theory provides useful tools to model the behavior of the adversary and the learner, as it accounts for, on the one hand, the adversary's benefit from attacking and his cost of generating the adversarial data and, on the other hand, the learner's cost of updating the model.

Thus, game theory-based approaches shed light on the tradeoffs that both adversaries and learners make, and can be used to assess the risks of implementing a specific technology. Consequently, Game Theory is a powerful decision-making tool that needs to be more widely used in similar contexts.

Translated from: https://towardsdatascience.com/a-game-theoretical-approach-for-adversarial-machine-learning-7523914819d5
