Symmetry Problem

If all the parameters $\omega^{(l)}_{i,j}$ of any layer $l$ of a neural network are initialized to the same value, then at every iteration of gradient descent the following holds:
$$\begin{cases} \omega^{(l-1)}_{1,j} = \omega^{(l-1)}_{2,j}, & 0 \le j \le s_{l-1}, \\ \omega^{(l)}_{i,1} = \omega^{(l)}_{i,2}, & 1 \le i \le s_{l+1}, \end{cases} \qquad 2 \le l \le L-1.$$
As an example, in the network diagram any two edges drawn in the same color represent equal weights.
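Before the proof, a minimal numerical sketch of the claim may help. Everything in it (layer sizes, data, learning rate, the cross-entropy output delta) is an illustrative assumption rather than part of the original derivation; it trains a one-hidden-layer logistic network from a constant initialization and checks that the hidden units remain exact copies of one another:

```python
import numpy as np

# Minimal sketch of the symmetry problem. All sizes and data here are
# illustrative assumptions: a one-hidden-layer network with logistic
# activations, trained by plain batch gradient descent, biases omitted.

def g(z):
    """Logistic function, as in the proof."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))               # 100 samples, 3 features
y = (X[:, 0] * X[:, 1] > 0).astype(float)   # a target one unit cannot fit

# Constant initialization: every weight in a layer starts equal.
W1 = np.full((4, 3), 0.5)    # hidden layer, s_l = 4 units
W2 = np.full((1, 4), 0.5)    # output layer

for _ in range(1000):
    a1 = g(X @ W1.T)                    # hidden activations a^(l)
    a2 = g(a1 @ W2.T).ravel()           # output activation
    d2 = (a2 - y)[:, None]              # output delta (cross-entropy loss)
    d1 = (d2 @ W2) * a1 * (1 - a1)      # backpropagated delta^(l)
    W2 -= 0.1 * (d2.T @ a1) / len(X)
    W1 -= 0.1 * (d1.T @ X) / len(X)

print(np.allclose(W1, W1[0]))       # True: all rows of W1 are still equal
print(np.allclose(a1, a1[:, :1]))   # True: every hidden unit computes the
                                    # same function, so the extra units
                                    # add no representational power
```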

Proof

We proceed by mathematical induction. The base case holds at initialization, since all weights in the layer start with the same value. For the induction step, assume the proposition holds before the current iteration.
Since
$$a^{(l)}_{i} = g\left( \sum_{j=0}^{s_{l-1}} \omega^{(l-1)}_{i,j}\, a^{(l-1)}_{j} \right), \quad 1 \le i \le s_{l},$$
where $g$ is the logistic function,
and the induction hypothesis gives $\omega^{(l-1)}_{1,j} = \omega^{(l-1)}_{2,j}$ for all $j$, the two sums coincide, so $a^{(l)}_{1} = a^{(l)}_{2}$.
Since
$$\frac{\partial J}{\partial \omega^{(l)}_{i,j}} = \delta^{(l+1)}_{i}\, a^{(l)}_{j}, \quad 1 \le i \le s_{l+1},$$
where $J$ is the loss function,
it follows from $a^{(l)}_{1} = a^{(l)}_{2}$ that
$$\frac{\partial J}{\partial \omega^{(l)}_{i,1}} = \frac{\partial J}{\partial \omega^{(l)}_{i,2}}, \quad 1 \le i \le s_{l+1}. \tag{1}$$
Since
$$\delta^{(l)}_{j} = a^{(l)}_{j}\left(1 - a^{(l)}_{j}\right) \sum_{i=1}^{s_{l+1}} \omega^{(l)}_{i,j}\, \delta^{(l+1)}_{i},$$
and $a^{(l)}_{1} = a^{(l)}_{2}$ while the induction hypothesis gives $\omega^{(l)}_{i,1} = \omega^{(l)}_{i,2}$ for all $i$, it follows that $\delta^{(l)}_{1} = \delta^{(l)}_{2}$.
Since
$$\frac{\partial J}{\partial \omega^{(l-1)}_{i,j}} = \delta^{(l)}_{i}\, a^{(l-1)}_{j}, \quad 1 \le i \le s_{l},$$
it follows that
$$\frac{\partial J}{\partial \omega^{(l-1)}_{1,j}} = \frac{\partial J}{\partial \omega^{(l-1)}_{2,j}}, \quad 0 \le j \le s_{l-1}. \tag{2}$$
By (1) and (2), weights that are equal before the gradient-descent update receive identical gradients and hence remain equal after it, so the proposition also holds before the next iteration. This completes the induction.
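The proof also explains the standard remedy, random initialization: if the weights start at distinct values, the base case of the induction fails and the symmetry never arises. Below is a minimal check of the gradient equalities (1) and (2), reusing the same illustrative network and assumptions as the earlier sketch; they hold exactly under constant initialization and fail under small random initialization:

```python
import numpy as np

def g(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))

def gradients(W1, W2, X, y):
    """One backprop pass; returns (dJ/dW1, dJ/dW2) for cross-entropy J."""
    a1 = g(X @ W1.T)
    a2 = g(a1 @ W2.T).ravel()
    d2 = (a2 - y)[:, None]              # delta at the output layer
    d1 = (d2 @ W2) * a1 * (1 - a1)      # delta recursion from the proof
    return (d1.T @ X) / len(X), (d2.T @ a1) / len(X)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = (X[:, 0] > 0).astype(float)

# Constant init: rows of dJ/dW1 coincide (eq. 2) and the entries of
# dJ/dW2 coincide (eq. 1), so the update preserves the symmetry.
gW1, gW2 = gradients(np.full((4, 3), 0.5), np.full((1, 4), 0.5), X, y)
print(np.allclose(gW1, gW1[0]), np.allclose(gW2, gW2[0, 0]))  # True True

# Small random init (the 0.01 scale is an arbitrary illustrative choice)
# breaks the base case, so the units receive different gradients.
W1 = 0.01 * rng.normal(size=(4, 3))
W2 = 0.01 * rng.normal(size=(1, 4))
gW1, _ = gradients(W1, W2, X, y)
print(np.allclose(gW1, gW1[0]))     # False: the symmetry is broken
```

Keeping the random values small leaves the logistic units away from saturation; schemes such as Xavier initialization choose the scale from the layer sizes, but any symmetry-breaking initialization avoids the problem proved above.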
