Computing Parameters Analytically
Normal Equation
Find the optimal $\theta$ without iteration.
- Minimize $J$ by explicitly taking its derivatives with respect to the $\theta_j$'s and setting them to zero.
Formula:
$\theta = (X^TX)^{-1}X^Ty$
Octave: `pinv(X'*X)*X'*y`
Design matrix (X)
$m$ examples $(x^{(1)}, y^{(1)}), \ldots, (x^{(m)}, y^{(m)})$; $n$ features.
$x^{(i)}=\begin{bmatrix} x_0^{(i)}\\ x_1^{(i)}\\ \vdots\\ x_n^{(i)} \end{bmatrix}\in\mathbb{R}^{n+1}$
$X=\begin{bmatrix} -(x^{(1)})^T-\\ -(x^{(2)})^T-\\ \vdots\\ -(x^{(m)})^T- \end{bmatrix}$ ($m\times(n+1)$-dimensional)
There is no need to do feature scaling.
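The construction above can be sketched in NumPy rather than Octave (the feature values below are made-up housing-style data for illustration): build the design matrix by prepending the $x_0 = 1$ column, then apply the normal equation directly.

```python
import numpy as np

# Hypothetical data: m = 4 examples, n = 2 features (e.g. size, bedrooms).
X_raw = np.array([[2104.0, 5.0],
                  [1416.0, 3.0],
                  [1534.0, 3.0],
                  [852.0,  2.0]])
y = np.array([460.0, 232.0, 315.0, 178.0])

# Design matrix: prepend x0 = 1 to each example -> shape (m, n+1).
X = np.column_stack([np.ones(len(X_raw)), X_raw])

# Normal equation: theta = (X^T X)^{-1} X^T y.
theta = np.linalg.pinv(X.T @ X) @ X.T @ y

# Sanity check: the result should match NumPy's least-squares solver.
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Note that the raw features are used as-is: no feature scaling is applied, and none is needed.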
Comparison of gradient descent and normal equation:
| Gradient Descent | Normal Equation |
| --- | --- |
| Need to choose $\alpha$ | No need to choose $\alpha$ |
| Needs many iterations | No need to iterate |
| $\mathcal{O}(kn^2)$ | $\mathcal{O}(n^3)$, need to calculate inverse of $X^TX$ |
| Works well when $n$ is large | Slow if $n$ is very large |
With the normal equation, computing the inverse of $X^TX$ has complexity $\mathcal{O}(n^3)$. So if we have a very large number of features, the normal equation will be slow.
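The trade-off in the table can be seen on a small synthetic problem (data and step size below are made up for illustration): gradient descent needs a learning rate and thousands of iterations, while the normal equation recovers the same $\theta$ in one shot.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 100, 2
# Features already on [0, 1], so gradient descent converges without scaling.
X = np.column_stack([np.ones(m), rng.uniform(0.0, 1.0, (m, n))])
y = X @ np.array([1.0, 2.0, 3.0]) + 0.01 * rng.standard_normal(m)

# Normal equation: one shot, no alpha, but an O(n^3) inversion.
theta_ne = np.linalg.pinv(X.T @ X) @ X.T @ y

# Batch gradient descent: must choose alpha and iterate many times.
theta_gd = np.zeros(n + 1)
alpha = 0.5
for _ in range(5000):
    theta_gd -= (alpha / m) * X.T @ (X @ theta_gd - y)
```

Both routes land on (numerically) the same parameters; they differ only in how much work it takes to get there as $n$ grows.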
Normal Equation Noninvertibility
If $X^TX$ is noninvertible, the common causes are:
- Redundant features, where two features are very closely related (i.e. they are linearly dependent)
- Too many features (e.g. $m \leq n$). In this case, delete some features or use "regularization" (to be explained in a later lesson).
Solutions
- Delete a feature that is linearly dependent on another (redundant features).
- Delete one or more features, or use regularization, when there are too many features (e.g. $m \leq n$).
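The redundant-feature case can be demonstrated with a small made-up design matrix in NumPy: when one column is an exact multiple of another, $X^TX$ is singular, so a plain inverse fails, but the pseudoinverse (what Octave's `pinv` computes) still yields a workable solution.

```python
import numpy as np

# Hypothetical design matrix with a redundant feature:
# the third column is exactly 2x the second, so X^T X is singular.
X = np.array([[1.0, 1.0, 2.0],
              [1.0, 2.0, 4.0],
              [1.0, 3.0, 6.0],
              [1.0, 4.0, 8.0]])
y = np.array([2.0, 3.0, 4.0, 5.0])

A = X.T @ X
# np.linalg.inv(A) would raise LinAlgError here, since A is singular.
# pinv returns the minimum-norm least-squares solution instead.
theta = np.linalg.pinv(A) @ X.T @ y

# The fit still reproduces y, even though theta itself is not unique.
residual = np.linalg.norm(X @ theta - y)
```

Dropping the redundant column restores full rank and makes the ordinary inverse usable again, which is the first solution listed above.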