Computing Parameters Analytically

Normal Equation

Find the optimum $\theta$ without iteration

  • Minimize $J$ by explicitly taking its partial derivatives with respect to each $\theta_j$, setting them to zero, and solving for $\theta$.
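As a sketch of that derivation: these notes do not restate the cost function, so the usual least-squares cost for linear regression is assumed here, written in terms of the design matrix $X$ and the target vector $y$.

```latex
\begin{aligned}
J(\theta) &= \frac{1}{2m}\,(X\theta - y)^T (X\theta - y) \\
\nabla_\theta J(\theta) &= \frac{1}{m}\,X^T (X\theta - y) = 0 \\
X^T X\,\theta &= X^T y \\
\theta &= (X^T X)^{-1} X^T y \qquad \text{(when } X^T X \text{ is invertible)}
\end{aligned}
```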

Formula:

$$\theta = (X^TX)^{-1}X^Ty$$

Octave: pinv(X'*X)*X'*y
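For readers working outside Octave, the same one-liner can be sketched in Python. NumPy is an assumption here (the notes only use Octave), and the toy data and the function name `normal_equation` are made up for illustration.

```python
import numpy as np

def normal_equation(X, y):
    """Return theta = pinv(X^T X) X^T y for design matrix X and targets y."""
    return np.linalg.pinv(X.T @ X) @ X.T @ y

# Toy data generated from y = 1 + 2*x, so the least-squares answer is [1, 2].
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])  # first column is the bias feature x0 = 1
y = np.array([1.0, 3.0, 5.0, 7.0])

theta = normal_equation(X, y)
print(theta)
```

As in the Octave version, `pinv` computes a pseudo-inverse rather than a plain inverse, so it still returns a value when the matrix is singular.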

Design matrix (X)

$m$ examples $(x^{(1)}, y^{(1)}), \dots, (x^{(m)}, y^{(m)})$; $n$ features

$$x^{(i)} = \begin{bmatrix} x_0^{(i)} \\ x_1^{(i)} \\ \vdots \\ x_n^{(i)} \end{bmatrix} \in \mathbb{R}^{n+1}$$

$$X = \begin{bmatrix} \text{--}\,(x^{(1)})^T\,\text{--} \\ \text{--}\,(x^{(2)})^T\,\text{--} \\ \vdots \\ \text{--}\,(x^{(m)})^T\,\text{--} \end{bmatrix} \quad (m \times (n+1)\text{-dimensional})$$
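A minimal sketch of building such a design matrix in Python with NumPy (an assumed choice of library). It takes the bias feature to be $x_0^{(i)} = 1$, as is usual in this course. The helper name `design_matrix` and the feature values are hypothetical.

```python
import numpy as np

def design_matrix(features):
    """Prepend the bias column x0 = 1 to an (m, n) array of raw features."""
    features = np.asarray(features, dtype=float)
    if features.ndim == 1:
        # A single feature given as a flat list becomes an (m, 1) column.
        features = features.reshape(-1, 1)
    m = features.shape[0]
    return np.hstack([np.ones((m, 1)), features])

# Made-up data: m = 3 examples, n = 2 features each.
raw = [[2104, 3],
       [1600, 3],
       [2400, 4]]
X = design_matrix(raw)
print(X.shape)  # (3, 3), i.e. m x (n + 1)
```

Each row of `X` is one transposed example $(x^{(i)})^T$, matching the layout above.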

With the normal equation, there is no need to do feature scaling.
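One way to check this claim is to fit the same data with and without scaling and compare the predictions. The sketch below assumes NumPy and uses made-up numbers. The fitted $\theta$ differs between the two runs, but the predictions should agree up to floating-point error.

```python
import numpy as np

def fit(X, y):
    """Normal equation: theta = pinv(X^T X) X^T y."""
    return np.linalg.pinv(X.T @ X) @ X.T @ y

# Made-up data with features on very different scales.
size = np.array([2104.0, 1600.0, 2400.0, 1416.0, 3000.0])
rooms = np.array([3.0, 3.0, 4.0, 2.0, 4.0])
y = np.array([400.0, 330.0, 369.0, 232.0, 540.0])

ones = np.ones_like(size)
X_raw = np.column_stack([ones, size, rooms])
X_scaled = np.column_stack([
    ones,
    (size - size.mean()) / size.std(),    # mean normalization and scaling
    (rooms - rooms.mean()) / rooms.std(),
])

pred_raw = X_raw @ fit(X_raw, y)
pred_scaled = X_scaled @ fit(X_scaled, y)
same = bool(np.allclose(pred_raw, pred_scaled))
print(same)
```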

Comparison of gradient descent and normal equation:

$$
\begin{array}{|l|l|}
\hline
\text{Gradient Descent} & \text{Normal Equation} \\
\hline
\text{Need to choose } \alpha & \text{No need to choose } \alpha \\
\hline
\text{Needs many iterations} & \text{No need to iterate} \\
\hline
\mathcal{O}(kn^2) & \mathcal{O}(n^3)\text{, need to calculate the inverse of } X^TX \\
\hline
\text{Works well when } n \text{ is large} & \text{Slow if } n \text{ is very large} \\
\hline
\end{array}
$$

With the normal equation, computing the inverse has complexity $\mathcal{O}(n^3)$. So if we have a very large number of features, the normal equation will be slow.
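To make the comparison concrete, the sketch below runs both methods on the same toy data. It assumes NumPy and the standard batch gradient descent update for the least-squares cost. The learning rate `alpha=0.1` and the iteration count are hand-picked for this data, not general recommendations.

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, iterations=2000):
    """Batch gradient descent: needs a learning rate and many iterations."""
    m, num_params = X.shape
    theta = np.zeros(num_params)
    for _ in range(iterations):
        gradient = X.T @ (X @ theta - y) / m
        theta -= alpha * gradient
    return theta

def normal_equation(X, y):
    """Closed form: no learning rate, no loop, but inverts an (n+1) x (n+1) matrix."""
    return np.linalg.pinv(X.T @ X) @ X.T @ y

# Toy data generated from y = 1 + 2*x.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

theta_gd = gradient_descent(X, y)
theta_ne = normal_equation(X, y)
print(theta_gd, theta_ne)
```

Both should land on approximately the same $\theta$. The difference is in how the cost grows: the loop repeats cheap steps, while the closed form pays once for the matrix inversion.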

Normal Equation Noninvertibility

If $X^TX$ is noninvertible, the common causes are:

  • Redundant features, where two features are very closely related (i.e. they are linearly dependent)
  • Too many features (e.g. m ≤ n). In this case, delete some features or use “regularization” (to be explained in a later lesson).

Solutions

  • Delete a feature that is linearly dependent on another (redundant features).
  • Delete one or more features, or use regularization, when there are too many features (e.g. $m \leq n$).
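The redundant-feature case can be sketched as follows, assuming NumPy and made-up data in which one feature is exactly twice another. The explicit `rcond` and `tol` cutoffs are choices made for this illustration so that near-zero singular values are treated as zero.

```python
import numpy as np

# Hypothetical data: x2 is a redundant copy of x1 (x2 = 2 * x1).
x1 = np.array([1.0, 2.0, 3.0, 4.0])
x2 = 2.0 * x1
X = np.column_stack([np.ones(4), x1, x2])
y = np.array([3.0, 5.0, 7.0, 9.0])  # generated from y = 1 + 2*x1

gram = X.T @ X
rank = np.linalg.matrix_rank(gram, tol=1e-8)  # 2, not 3: gram is singular

# The pseudo-inverse still returns a usable theta for the singular matrix.
theta = np.linalg.pinv(gram, rcond=1e-10) @ X.T @ y

# Deleting the redundant feature makes X^T X invertible again.
X_reduced = X[:, :2]
theta_reduced = np.linalg.pinv(X_reduced.T @ X_reduced) @ X_reduced.T @ y
print(rank, X @ theta, theta_reduced)
```

With the redundant column present, $\theta$ is not unique and the pseudo-inverse picks one of the valid solutions. After deleting the column, the solution is unique.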
