Hung-yi Lee's "Learn Deep Learning in One Day" slides; SlideShare link: https://www.slideshare.net/tw_dsconf/ss-62245351?qid=108adce3-2c3d-4758-a830-95d0a57e46bc&v=&b=&from_search=3
The slides (with the full study notes attached) can also be downloaded from CSDN: https://download.csdn.net/download/wozaipermanent/11998637

1 Introduction of Deep Learning

1.1 Three Steps for Deep Learning

  • Step1: define a set of function (Neural Network)
  • Step2: goodness of function
  • Step3: pick the best function

1.2 Step1: Neural Network



1.2.1 Fully Connect Feedforward Network

1.2.2 Output Layer (Optional)

  • Softmax (normalized exponential function): it "squashes" a $k$-dimensional vector $Z$ of arbitrary real values into another $k$-dimensional vector $\sigma(Z)$, such that every element lies in the range (0, 1) and all elements sum to 1.
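
A minimal NumPy sketch of the softmax computation described above (the max-subtraction is only a common numerical-stability trick, not something from the slides):

import numpy as np

def softmax(z):
    # Subtract the max before exponentiating; this does not change the result.
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([3.0, 1.0, -2.0])
y = softmax(z)
print(y)        # every element lies in (0, 1)
print(y.sum())  # the elements sum to 1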

1.2.3 Example Application

  • Handwriting Digit Recognition

1.3 Step2: Goodness of Function

1.3.1 Learning Target

1.3.2 Loss

  • Total Loss:
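
As a rough illustration of the idea, the total loss is the sum of the per-example losses over the entire training set; a NumPy sketch using cross entropy (the function names and the choice of cross entropy here are my own, for illustration):

import numpy as np

def cross_entropy(y_hat, y):
    # Per-example loss between the softmax output y_hat and the one-hot target y.
    return -np.sum(y * np.log(y_hat))

def total_loss(predictions, targets):
    # Total loss L: sum the per-example losses over all training examples.
    return sum(cross_entropy(y_hat, y) for y_hat, y in zip(predictions, targets))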

1.4 Step3: Pick the Best Function

1.4.1 Gradient Descent

  • RBM (Restricted Boltzmann Machine): for background, see https://zhuanlan.zhihu.com/p/22794772

  • Then compute $\partial L / \partial w$: if it is negative, increase $w$; if it is positive, decrease $w$

  • $\eta$ is called the "learning rate" (a minimal update sketch follows this list)

Gradient Descent Diagram:

  • Randomly pick a starting point
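
Putting the bullets above together, a minimal sketch of gradient descent on a single parameter (the toy loss function, learning rate, and iteration count are illustrative assumptions):

import numpy as np

def loss(w):
    return (w - 3.0) ** 2        # toy loss with its minimum at w = 3

def grad(w):
    return 2.0 * (w - 3.0)       # dL/dw

eta = 0.1                        # learning rate
w = np.random.randn()            # randomly pick a starting point
for _ in range(100):
    w = w - eta * grad(w)        # negative gradient -> increase w; positive -> decrease w
print(w, loss(w))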

1.4.2 Gradient Descent Difficulty

  • Backpropagation: an efficient way to compute $\partial L / \partial w$; see the links below:

    • http://speech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015_2/Lecture/DNN%20backprop.ecm.mp4/index.html
    • https://www.jianshu.com/p/2e02bc6384a8

1.5 Deep is Better

1.5.1 Universality Theorem

  • Any continuous function $f: \mathbb{R}^N \rightarrow \mathbb{R}^M$ can be realized by a network with one hidden layer (given enough hidden neurons). Ref: http://neuralnetworksanddeeplearning.com/chap4.html

1.5.2 Thin + Tall is Better

  • A neural network consists of neurons

  • A network with a single hidden layer can represent any continuous function

  • Using multiple layers of neurons to represent some functions is much simpler

  • Fewer parameters, so less training data is needed

1.5.3 Modularization

1.6 Toolkit

1.6.1 Keras

  • Documentation: https://keras.io or https://morvanzhou.github.io/tutorials/machine-learning/keras/

1.6.2 Example of Handwriting Digit Recognition

Step1: define a set of function
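
A hedged Keras sketch of Step 1; the 500-unit hidden layers follow the handwriting example in the slides, while the data loading, preprocessing, and choice of ReLU activations are my own assumptions:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.datasets import mnist
from keras.utils import to_categorical

# Load MNIST, flatten the 28x28 images into 784-dim vectors, one-hot encode the labels.
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(-1, 784).astype('float32') / 255
x_test = x_test.reshape(-1, 784).astype('float32') / 255
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Define the set of functions: a 784 -> 500 -> 500 -> 10 feedforward network.
model = Sequential()
model.add(Dense(500, input_dim=28 * 28))
model.add(Activation('relu'))
model.add(Dense(500))
model.add(Activation('relu'))
model.add(Dense(10))
model.add(Activation('softmax'))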

Step2: goodness of function
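
Continuing the model above, Step 2 declares how the goodness of a function is measured; in Keras this is the loss passed to compile (the optimizer named in the same call already belongs to Step 3). A hedged sketch, not the slides' exact code:

# Goodness of function: cross entropy between the softmax output and the target.
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])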

Step3: pick the best function
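
Step 3 searches for the parameters that minimize the total loss via mini-batch gradient descent; the batch size and number of epochs below are illustrative assumptions:

# Pick the best function: train with mini-batches of 100 examples for 20 epochs.
model.fit(x_train, y_train, batch_size=100, epochs=20)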

Testing

score = model.evaluate(x_test, y_test)           # returns [loss, accuracy] for the metrics compiled above
print('Total loss on Testing Set: ', score[0])
print('Accuracy of Testing Set: ', score[1])
result = model.predict(x_test)                   # per-class probabilities for each test image

1.6.3 Using a GPU to Speed Up Training

  • Way1

    THEANO_FLAGS=device=gpu0 python YourCode.py
    
  • Way2

    import os
    os.environ["THEANO_FLAGS"] = "device=gpu0"
    

2 Tips for Training Deep Neural Network

2.1 Good Results on Training Data

2.1.1 Choosing Proper Loss

2.1.2 Mini-Batch

2.1.3 New Activation Function

Vanishing Gradient Problem

ReLU

model.add(Activation('sigmoid'))   # sigmoid activations make gradients vanish in deep networks
model.add(Activation('relu'))      # replace sigmoid with ReLU to avoid the vanishing gradient

ReLU - variant
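
Common variants are Leaky ReLU and Parametric ReLU, which keep a small non-zero slope for negative inputs. A hedged Keras sketch continuing the earlier model (the alpha value is an illustrative assumption):

from keras.layers import Dense, LeakyReLU

model.add(Dense(500))
model.add(LeakyReLU(alpha=0.01))   # f(x) = x for x > 0, alpha * x otherwise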

2.1.4 Adaptive Learning Rate

Learning Rates

  • If learning rate is too large, total loss may not decrease after each update
  • If learning rate is too small, training would be too slow

Adagrad

Notes:

  • The learning rate becomes smaller and smaller for every parameter as training proceeds
  • Parameters with smaller derivatives get a larger effective learning rate, and vice versa (see the sketch below)
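
The Adagrad update divides the base learning rate by the root of the accumulated squared gradients, which produces both behaviors in the notes above. A minimal per-parameter sketch (the toy gradient is an illustrative assumption, not the slides' code):

import numpy as np

eta = 0.1            # base learning rate
g_sq_sum = 0.0       # accumulated squared gradients for this parameter
w = 0.0

def grad(w):
    return 2.0 * (w - 3.0)   # gradient of a toy loss (w - 3)^2

for _ in range(100):
    g = grad(w)
    g_sq_sum += g ** 2
    w -= eta / np.sqrt(g_sq_sum + 1e-8) * g   # effective learning rate shrinks as gradients accumulate

In Keras, the same idea is available as the built-in optimizer keras.optimizers.Adagrad, which can be passed to model.compile.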

2.1.5 Momentum

  • Adam: RMSProp (advanced Adagrad) + Momentum. Adam (Adaptive Moment Estimation) is essentially RMSProp with a momentum term: it uses first- and second-moment estimates of the gradients to adapt each parameter's learning rate. After bias correction, the effective learning rate of each update stays within a bounded range, which keeps the parameter updates stable.
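
A hedged sketch of switching the earlier model to Adam in Keras (the learning rate shown is, to my knowledge, the Keras default and can be omitted):

from keras.optimizers import Adam

model.compile(loss='categorical_crossentropy',
              optimizer=Adam(lr=0.001),   # RMSProp-style adaptive step sizes plus momentum
              metrics=['accuracy'])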

2.2 Good Results on Testing Data

2.2.1 Early Stopping

Why Overfitting

  • The learning target is defined by the training data.
  • The parameters that achieve the learning target do not necessarily give good results on the testing data.

Early Stopping
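
Keras supports this through the EarlyStopping callback, which stops training once a monitored quantity stops improving. A hedged sketch continuing the earlier model (the validation split and patience value are illustrative assumptions):

from keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='val_loss', patience=3)   # stop when validation loss stops improving
model.fit(x_train, y_train,
          batch_size=100, epochs=100,
          validation_split=0.1,          # hold out part of the training data as a validation set
          callbacks=[early_stop])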

2.2.2 Weight Decay

Weight decay is one kind of regularization.

  • Our brain prunes away useless links between neurons.
  • Doing the same thing to the machine's "brain" improves performance.
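
In Keras, weight decay can be approximated by attaching an L2 penalty to a layer's weights; a hedged sketch (the 0.01 coefficient is an illustrative assumption, and kernel_regularizer is the Keras 2 argument name):

from keras.layers import Dense
from keras.regularizers import l2

model.add(Dense(500, kernel_regularizer=l2(0.01)))   # penalize large weights so they decay toward zero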

2.2.3 Dropout

Training

  • Each time before updating the parameters:

    • Each neuron has a p% chance of being dropped out

      • The structure of the network is changed.
    • Use the new (thinner) network for training
  • For each mini-batch, we resample the dropped-out neurons (see the Keras sketch below)
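
In Keras this is done by inserting Dropout layers between the Dense layers; a hedged sketch continuing the earlier model (the 0.5 rate is an illustrative assumption). The Dropout layer is only active during training, so no manual weight rescaling is needed at testing time:

from keras.layers import Dense, Activation, Dropout

model.add(Dense(500))
model.add(Activation('relu'))
model.add(Dropout(0.5))   # each neuron has a 50% chance of being dropped before each update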

Testing

Dropout - Intuitive Reason

Dropout is a Kind of Ensemble

Try It

2.2.4 Network Structure

e.g. CNN is a good example of adapting the network structure to the problem.

3 Variants of Neural Network

3.1 Convolutional Neural Network (CNN)

3.1.1 Why CNN for Image

  • When processing an image, the first layer of a fully connected network would be very large.
  • Some patterns are much smaller than the whole image, so a neuron does not have to see the whole image to discover a pattern.
  • The same patterns appear in different regions of the image.
  • Subsampling the pixels does not change the object, so we can subsample to make the image smaller.

3.1.2 Three Steps

Step1: Convolutional Neural Network

Convolution

Max Pooling

  • The pooled output is smaller than the original image.
  • The number of channels equals the number of filters.

Flatten

Summary
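
A hedged Keras sketch of the convolution -> max-pooling -> flatten pipeline summarized above (the filter counts, kernel sizes, and 28x28 single-channel input shape are illustrative assumptions):

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Activation

model = Sequential()
model.add(Conv2D(25, (3, 3), input_shape=(28, 28, 1)))   # 25 filters -> 25 output channels
model.add(MaxPooling2D((2, 2)))                          # the image becomes smaller
model.add(Conv2D(50, (3, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())                                     # flatten the feature maps into a vector
model.add(Dense(10))
model.add(Activation('softmax'))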

Step2: goodness of function & Step3: pick the best function

3.2 Recurrent Neural Network (RNN)

Step1: Recurrent Neural Network

LSTM

Step2: goodness of function

Step3: pick the best function

4 Next Wave

4.1 Supervised Learning

4.1.1 Ultra Deep Network

4.1.2 Attention Model

4.2 Reinforcement Learning

4.2.1 Scenario of Reinforcement Learning

4.2.2 Supervised vs. Reinforcement

4.2.3 Difficulties of Reinforcement Learning

  • It may be better to sacrifice immediate reward to gain more long-term reward.
  • Agent’s actions affect the subsequent data it receives.

4.3 Unsupervised Learning

4.3.1 Image: Realizing what the World Looks Like

4.3.2 Text: Understanding the Meaning of Words

  • The machine learns the meaning of words by reading a lot of documents, without supervision
  • A word can be understood from its context

4.3.3 Audio: Learning Human Language Without Supervision

  • An audio segment corresponding to an unknown word is mapped to a fixed-length vector
  • The audio segments corresponding to words with similar pronunciations are close to each other
