Neural Networks: Learning

5 questions

1.

You are training a three-layer neural network and would like to use backpropagation to compute the gradient of the cost function. In the backpropagation algorithm, one of the steps is to update

$\Delta^{(2)}_{ij} := \Delta^{(2)}_{ij} + \delta^{(3)}_i \cdot (a^{(2)})_j$

for every $i, j$. Which of the following is a correct vectorization of this step?

$\Delta^{(2)} := \Delta^{(2)} + \delta^{(3)} \cdot (a^{(2)})^T$

$\Delta^{(2)} := \Delta^{(2)} + (a^{(2)})^T \cdot \delta^{(3)}$

$\Delta^{(2)} := \Delta^{(2)} + \delta^{(2)} \cdot (a^{(3)})^T$

$\Delta^{(2)} := \Delta^{(2)} + (a^{(2)})^T \cdot \delta^{(2)}$
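
The element-wise update accumulates an outer product of the layer-3 error terms and the layer-2 activations. A minimal Octave sketch with illustrative, assumed dimensions (4 units in layer 3, 5 units in layer 2):

    delta3 = rand(4, 1);            % error terms for layer 3, one per unit
    a2 = rand(5, 1);                % activations of layer 2
    Delta2 = zeros(4, 5);           % gradient accumulator
    Delta2 = Delta2 + delta3 * a2'; % outer product: entry (i,j) is delta3(i)*a2(j)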

2.

Suppose Theta1 is a 5x3 matrix, and Theta2 is a 4x6 matrix. You set thetaVec=[Theta1(:);Theta2(:)]. Which of the following correctly recovers Theta2?

reshape(thetaVec(16:39),4,6)

reshape(thetaVec(15:38),4,6)

reshape(thetaVec(16:24),4,6)

reshape(thetaVec(15:39),4,6)

reshape(thetaVec(16:39),6,4)
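
Theta1 contributes 5x3 = 15 elements, so Theta2's 4x6 = 24 elements occupy entries 16 through 39 of thetaVec. A minimal Octave sketch of the unroll/recover round trip:

    Theta1 = rand(5, 3);                 % 15 elements -> thetaVec(1:15)
    Theta2 = rand(4, 6);                 % 24 elements -> thetaVec(16:39)
    thetaVec = [Theta1(:); Theta2(:)];   % 39x1 unrolled parameter vector
    Theta2_recovered = reshape(thetaVec(16:39), 4, 6);  % column-major, like (:)
    assert(isequal(Theta2, Theta2_recovered))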

3.

Let $J(\theta) = 2\theta^4 + 2$. Let $\theta = 1$ and $\epsilon = 0.01$. Use the formula $\dfrac{J(\theta + \epsilon) - J(\theta - \epsilon)}{2\epsilon}$ to numerically compute an approximation to the derivative at $\theta = 1$. What value do you get? (When $\theta = 1$, the true/exact derivative is $\dfrac{dJ(\theta)}{d\theta} = 8$.)

8.0008

7.9992

10

8
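
A minimal Octave check of the two-sided difference (the exact derivative is $8\theta^3 = 8$ at $\theta = 1$):

    J = @(theta) 2*theta.^4 + 2;   % cost function from the question
    theta = 1;
    epsilon = 0.01;
    approx = (J(theta + epsilon) - J(theta - epsilon)) / (2*epsilon)
    % approx = 8.0008, slightly above the exact value 8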

4.

Which of the following statements are true? Check all that apply.

Using gradient checking can help verify if one's implementation of backpropagation is bug-free.

For computational efficiency, after we have performed gradient checking to verify that our backpropagation code is correct, we usually disable gradient checking before using backpropagation to train the network.

Computing the gradient of the cost function in a neural network has the same efficiency whether we use backpropagation or numerically compute it using the method of gradient checking.

Gradient checking is useful if we are using one of the advanced optimization methods (such as in fminunc) as our optimization algorithm. However, it serves little purpose if we are using gradient descent.
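
A minimal sketch of the numerical gradient behind gradient checking (Octave; J is assumed to be a function handle returning the scalar cost for an unrolled parameter vector theta):

    function numgrad = computeNumericalGradient(J, theta)
      numgrad = zeros(size(theta));
      perturb = zeros(size(theta));
      e = 1e-4;                       % perturbation size
      for p = 1:numel(theta)
        perturb(p) = e;
        numgrad(p) = (J(theta + perturb) - J(theta - perturb)) / (2*e);
        perturb(p) = 0;
      end
    end

Each entry costs two full evaluations of the cost function, which is why the check is run once against the backpropagation gradient (e.g. comparing norm(numgrad - grad) / norm(numgrad + grad) to a small tolerance) and then disabled for training.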

5.

Which of the following statements are true? Check all that apply.

If we are training a neural network using gradient descent, one reasonable "debugging" step to make sure it is working is to plot J(Θ) as a function of the number of iterations, and make sure it is decreasing (or at least non-increasing) after each iteration.

Suppose you have a three layer network with parameters Θ(1) (controlling the function mapping from the inputs to the hidden units) and Θ(2) (controlling the mapping from the hidden units to the outputs). If we set all the elements of Θ(1) to be 0, and all the elements of Θ(2) to be 1, then this suffices for symmetry breaking, since the neurons are no longer all computing the same function of the input.

If we initialize all the parameters of a neural network to ones instead of zeros, this will suffice for the purpose of "symmetry breaking" because the parameters are no longer symmetrically equal to zero.

Suppose you are training a neural network using gradient descent. Depending on your random initialization, your algorithm may converge to different local optima (i.e., if you run the algorithm twice with different random initializations, gradient descent may converge to two different solutions).
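
A minimal Octave sketch of random initialization for symmetry breaking (epsilon_init and the layer sizes are illustrative assumptions):

    epsilon_init = 0.12;      % assumed small constant
    L_in = 3; L_out = 5;      % assumed layer sizes
    % each weight drawn uniformly from [-epsilon_init, epsilon_init]
    Theta1 = rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init;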
