Deep learning problems are all non-convex: why is the loss function of a neural network non-convex?

Ian Goodfellow once answered this question on Quora. His original answer follows:

There are various ways to test for convexity.

One is to just plot a cross-section of the function and look at it. If it has a non-convex shape, you don’t need to write a proof; you have disproven convexity by counter-example.
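The cross-section idea can be tried without a plotting library. The sketch below is only an illustration and is not part of the quoted answer: `toy_loss` is a hypothetical two-weight linear network fitted to the single data point x = 1, y = 1, and `cross_section` and `violates_convexity` are helper names made up for this example. It samples the loss along a straight segment between two parameter settings and checks whether any sample rises above the chord joining the endpoints, which a convex function can never do.

```python
def toy_loss(w1, w2):
    """Squared error of a hypothetical net y_hat = w2 * w1 * x on x=1, y=1."""
    return (w1 * w2 - 1.0) ** 2


def cross_section(f, a, b, steps=11):
    """Sample f along the straight segment from point a to point b."""
    samples = []
    for i in range(steps):
        t = i / (steps - 1)
        point = [(1 - t) * ai + t * bi for ai, bi in zip(a, b)]
        samples.append((t, f(*point)))
    return samples


def violates_convexity(f, a, b, steps=11, tol=1e-12):
    """True if some sampled value lies above the chord between f(a) and f(b)."""
    fa, fb = f(*a), f(*b)
    return any(
        value > (1 - t) * fa + t * fb + tol
        for t, value in cross_section(f, a, b, steps)
    )


# Two parameter settings that both fit the data perfectly (loss 0).
a, b = (1.0, 1.0), (-1.0, -1.0)
for t, value in cross_section(toy_loss, a, b, steps=5):
    print(f"t={t:.2f}  loss={value:.4f}")
print("counter-example found:", violates_convexity(toy_loss, a, b))
```

Both endpoints have zero loss, yet the midpoint (0, 0) has loss 1, so this single cross-section already disproves convexity for the toy loss.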

If you want to do this with algebra, one way is just to take the second derivatives of a function. If the second derivative of a function in 1-D space is ever negative, the function isn’t convex.
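As a rough numerical counterpart to the algebraic test, the second derivative can be estimated with a central finite difference. This is a minimal sketch under assumptions of my own: the function `f(x) = x^4 - 2x^2` is chosen only as an example, and `second_derivative` is a hypothetical helper, not something from the quoted answer.

```python
def second_derivative(f, x, h=1e-3):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def f(x):
    """Example 1-D function; analytically f''(x) = 12 * x**2 - 4."""
    return x ** 4 - 2.0 * x ** 2


for x in (-2.0, 0.0, 2.0):
    print(f"x={x:+.1f}  estimated f''={second_derivative(f, x):+.4f}")
```

The estimate is negative at x = 0 (the exact value there is -4), so this `f` is not convex, even though its second derivative is positive farther from the origin.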

For neural nets, you have millions of parameters, so you need a test that works in high-dimensional space. In high-dimensional space, it turns out we can take the second derivative along one specific direction in space. For a unit vector d giving the direction and a Hessian matrix H of second derivatives, this is given by

$d^{T}Hd$

For most neural nets and most loss functions, it’s very easy to find a point in parameter space and a direction where $d^{T}Hd$ is negative.
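To make the directional test concrete, here is a small sketch that estimates $d^{T}Hd$ by finite differences, without forming the Hessian. The setup is assumed for illustration and does not come from the quoted answer: `toy_loss` is a hypothetical two-weight linear network with squared error on the single data point x = 1, y = 1, and `directional_second_derivative` is a made-up helper name.

```python
import math


def toy_loss(theta):
    """Squared error of a hypothetical net y_hat = w2 * w1 * x on x=1, y=1."""
    w1, w2 = theta
    return (w1 * w2 - 1.0) ** 2


def directional_second_derivative(f, theta, direction, h=1e-3):
    """Estimate d^T H d at theta, where d is `direction` scaled to unit length."""
    norm = math.sqrt(sum(c * c for c in direction))
    d = [c / norm for c in direction]
    plus = [p + h * c for p, c in zip(theta, d)]
    minus = [p - h * c for p, c in zip(theta, d)]
    # Second derivative of g(t) = f(theta + t * d) at t = 0.
    return (f(plus) - 2.0 * f(theta) + f(minus)) / (h * h)


theta = [0.0, 0.0]
neg = directional_second_derivative(toy_loss, theta, [1.0, 1.0])
pos = directional_second_derivative(toy_loss, theta, [1.0, -1.0])
print(f"along (1, 1):  d^T H d ~ {neg:+.4f}")
print(f"along (1, -1): d^T H d ~ {pos:+.4f}")
```

For this toy loss the Hessian at the origin is [[0, -2], [-2, 0]], so the exact values are -2 along (1, 1) and +2 along (1, -1). One negative direction is enough to rule out convexity; the mixed signs also show that the origin is a saddle point.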
