Machine Learning (Andrew Ng) Assignment Code (Exercises 1~2)
Programming Exercise 1: Linear Regression
Linear Regression with One Variable
warmUpExercise
Task: return the 5x5 identity matrix. A single call to eye(5,5) does the job.
function A = warmUpExercise()
%WARMUPEXERCISE Example function in octave
%   A = WARMUPEXERCISE() is an example function that returns the 5x5 identity matrix

A = [];
% ============= YOUR CODE HERE ==============
% Instructions: Return the 5x5 identity matrix
%               In octave, we return values by defining which variables
%               represent the return values (at the top of the file)
%               and then set them accordingly.

A = eye(5,5);

% ===========================================
end
plotData
Task: read in a set of data points (x, y) and draw them as a scatter plot, using MATLAB's plot() command.
function plotData(x, y)
%PLOTDATA Plots the data points x and y into a new figure
%   PLOTDATA(x,y) plots the data points and gives the figure axes labels of
%   population and profit.

% ====================== YOUR CODE HERE ======================
% Instructions: Plot the training data into a figure using the
%               "figure" and "plot" commands. Set the axes labels using
%               the "xlabel" and "ylabel" commands. Assume the
%               population and revenue data have been passed in
%               as the x and y arguments of this function.
%
% Hint: You can use the 'rx' option with plot to have the markers
%       appear as red crosses. Furthermore, you can make the
%       markers larger by using plot(..., 'rx', 'MarkerSize', 10);

figure; % open a new figure window
plot(x, y, 'rx', 'MarkerSize', 10); % plot the data as red crosses
xlabel('Population of City in 10,000s');
ylabel('Profit in $10,000s');

% ============================================================
end
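A typical way to call this function, assuming the exercise's ex1data1.txt data file (first column population, second column profit), which the original version loaded inside the function; loading it in the driver script keeps plotData reusable:

data = load('ex1data1.txt'); % assumed data file from the exercise
x = data(:, 1);              % population
y = data(:, 2);              % profit
plotData(x, y);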
Output: a scatter plot of the training data.
computeCost
Task: read in \(m\) training examples (X, y) and compute the mean squared error \(J(\theta)\) of fitting them with \(y=\theta^T X\), where \(\theta=(\theta_0,\theta_1)^T\) and \(X=(1,x^{(i)})^T\):
\[J(\theta)=\frac 1 {2m}\sum_{i=1}^m(\theta^T X^{(i)}-y^{(i)})^2\]
function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
%   J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y
%   Inputs: X is an m x 2 matrix whose first column is all ones and whose
%   second column holds the x values; y is an m-dimensional column vector;
%   theta is a 2-dimensional column vector.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.

for i = 1:m
    J = J + (theta' * X(i,:)' - y(i))^2;
end
J = J / (2*m);

% =========================================================================
end
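The same cost can be computed without the loop; a vectorized sketch equivalent to the code above:

% X*theta is the m x 1 vector of predictions
J = sum((X * theta - y) .^ 2) / (2 * m);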
gradientDescent
Task: given m training examples \((x^{(i)},y^{(i)})\) and a learning rate \(\alpha\), run gradient descent for num_iters iterations, then return the final parameters \(\theta\) and the per-iteration mean squared error J_history.
Derivation:
\[J(\theta)=\frac 1 {2m}\sum_{i=1}^m(\theta_0 +\theta_1 x^{(i)}-y^{(i)})^2\]
\[\frac{\partial J(\theta)}{\partial \theta_0}= \frac 1 m \sum_{i=1}^m(\theta_0 +\theta_1 x^{(i)}-y^{(i)}) \]
\[\frac{\partial J(\theta)}{\partial \theta_1}= \frac 1 m \sum_{i=1}^m(\theta_0 +\theta_1 x^{(i)}-y^{(i)})x^{(i)} \]
During gradient descent, each iteration updates \(\theta_0\) and \(\theta_1\) simultaneously:
\[\theta _0 := \theta _0- \alpha \frac{\partial J(\theta)}{\partial \theta_0}\]
\[\theta _1 := \theta _1- \alpha \frac{\partial J(\theta)}{\partial \theta_1}\]
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.

    dJ_dtheta0 = 0;
    dJ_dtheta1 = 0;
    for i = 1:m
        dJ_dtheta0 = dJ_dtheta0 + (theta' * X(i,:)' - y(i));
        dJ_dtheta1 = dJ_dtheta1 + (theta' * X(i,:)' - y(i)) * X(i,2);
    end
    dJ_dtheta0 = dJ_dtheta0 / m;
    dJ_dtheta1 = dJ_dtheta1 / m;
    % update both parameters simultaneously
    theta = theta - alpha * [dJ_dtheta0; dJ_dtheta1];

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);
end
end
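Because X already carries the column of ones, the whole gradient step vectorizes into one line; a sketch equivalent to the per-parameter loop above:

% X' * (X*theta - y) stacks both partial-derivative sums into a 2 x 1 vector
theta = theta - (alpha / m) * X' * (X * theta - y);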
Final run results:
Fig 1. The straight line fitted by gradient-descent linear regression
Fig 2. Surface plot of \(J(\theta)\)
Fig 3. Contour plot of \(J(\theta)\); the red cross marks the point where \(J(\theta)\) attains its minimum
Linear Regression with Multiple Variables
featureNormalize
Task: given the features of m input examples (here each example has 2 features), i.e. an m x 2 matrix X, apply Z-score normalization to the input data; after normalization most values fall roughly within [-1, 1]. Note that the column of ones has not been added to X at this point.
Z-score normalization:
For the i-th feature, compute the mean \(\mu\) and standard deviation \(\sigma\) of that feature over the m examples, then
\[x_i^{(t)}:=\frac {x_i^{(t)}-\mu}{\sigma}\]
After normalization, every feature has mean 0 and standard deviation 1.
function [X_norm, mu, sigma] = featureNormalize(X)
%FEATURENORMALIZE Normalizes the features in X
%   FEATURENORMALIZE(X) returns a normalized version of X where
%   the mean value of each feature is 0 and the standard deviation
%   is 1. This is often a good preprocessing step to do when
%   working with learning algorithms.

% You need to set these values correctly
X_norm = X;
mu = zeros(1, size(X, 2));
sigma = zeros(1, size(X, 2));

% ====================== YOUR CODE HERE ======================
% Instructions: First, for each feature dimension, compute the mean
%               of the feature and subtract it from the dataset,
%               storing the mean value in mu. Next, compute the
%               standard deviation of each feature and divide
%               each feature by its standard deviation, storing
%               the standard deviation in sigma.
%
%               Note that X is a matrix where each column is a
%               feature and each row is an example. You need
%               to perform the normalization separately for
%               each feature.
%
% Hint: You might find the 'mean' and 'std' functions useful.

mu(1,1) = mean(X(:,1));
mu(1,2) = mean(X(:,2));
sigma(1,1) = std(X(:,1));
sigma(1,2) = std(X(:,2));
X_norm(:,1) = (X_norm(:,1) - mu(1,1)) / sigma(1,1);
X_norm(:,2) = (X_norm(:,2) - mu(1,2)) / sigma(1,2);

% ============================================================
end
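The two hard-coded columns generalize directly to any number of features; a vectorized sketch (bsxfun keeps it compatible with older Octave releases that lack implicit broadcasting):

mu = mean(X);       % 1 x n vector of per-feature means
sigma = std(X);     % 1 x n vector of per-feature standard deviations
X_norm = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);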
computeCostMulti
Task: given m input examples (an m x 3 matrix X whose first column is all ones) and the true outputs y, compute the mean squared error \(J(\theta)\) of fitting the data with \(y=\theta^TX^{(i)T}\):
\[J(\theta)=\frac 1 {2m}\sum_{i=1}^m(\theta^TX^{(i)T}-y^{(i)})^2\]
function J = computeCostMulti(X, y, theta)
%COMPUTECOSTMULTI Compute cost for linear regression with multiple variables
%   J = COMPUTECOSTMULTI(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.

for i = 1:m
    J = J + (theta' * X(i,:)' - y(i))^2; % X(i,:) is the i-th example
end
J = J / (2*m);

% =========================================================================
end
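The multivariate cost also vectorizes; a sketch equivalent to the loop above:

err = X * theta - y;          % m x 1 vector of residuals
J = (err' * err) / (2 * m);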
gradientDescentMulti
Task: given m training examples \((x^{(i)},y^{(i)})\) and a learning rate \(\alpha\), run gradient descent for num_iters iterations, then return the final parameters \(\theta\) and the per-iteration mean squared error J_history.
Note that the first feature of every input example is 1 (appended when the data is loaded).
Derivation:
For the t-th parameter \(\theta_t\), the update rule is:
\[\theta _t := \theta _t- \alpha \frac{\partial J(\theta)}{\partial \theta_t}\]
\[\frac{\partial J(\theta)}{\partial \theta_t}= \frac 1 m \sum_{i=1}^m(\theta^TX^{(i)T}-y^{(i)})x_t^{(i)}\]
function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters)
%GRADIENTDESCENTMULTI Performs gradient descent to learn theta
%   theta = GRADIENTDESCENTMULTI(x, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCostMulti) and gradient here.

    paramsize = length(theta);
    dJ_dtheta = zeros(paramsize, 1);
    % accumulate the partial derivative for every parameter
    for t = 1:paramsize
        for j = 1:m
            dJ_dtheta(t,1) = dJ_dtheta(t,1) + (theta' * X(j,:)' - y(j)) * X(j,t);
        end
    end
    % the full gradient is computed first, so all parameters are
    % updated simultaneously
    for t = 1:paramsize
        dJ_dtheta(t,1) = dJ_dtheta(t,1) / m;
        theta(t) = theta(t) - alpha * dJ_dtheta(t,1);
    end

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCostMulti(X, y, theta);
end
end
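A sketch of how these pieces fit together on the multivariate housing data (ex1data2.txt is assumed to be the exercise's data file with square footage, bedroom count, and price per row; alpha and num_iters are illustrative values):

data = load('ex1data2.txt');            % assumed: [size in sq-ft, bedrooms, price]
X = data(:, 1:2);
y = data(:, 3);
m = length(y);

[X, mu, sigma] = featureNormalize(X);   % normalize before gradient descent
X = [ones(m, 1) X];                     % prepend the column of ones

alpha = 0.01;                           % illustrative learning rate
num_iters = 400;                        % illustrative iteration count
theta = zeros(3, 1);
[theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters);

% predict the price of a 1650 sq-ft, 3-bedroom house: normalize the query
% with the training-set mu and sigma, then prepend the 1 for the intercept
price = [1, ([1650, 3] - mu) ./ sigma] * theta;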
Final test results:
Fig 1. Convergence curve
Solving for \(\theta\) by Least Squares (Projection)
For the single-variable linear regression problem, the closed-form least-squares (normal equation) solution is
\[\theta=(X^TX)^{-1}X^Ty\]
The result closely matches that of gradient descent.
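A minimal sketch of the corresponding computation, in the shape of the exercise's normalEqn function (pinv is used instead of inv so the expression stays stable when \(X^TX\) is close to singular):

function theta = normalEqn(X, y)
%NORMALEQN Closed-form least-squares solution for linear regression
%   X is assumed to already contain the leading column of ones
theta = pinv(X' * X) * X' * y;
end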
Programming Exercise 2: Logistic Regression
Binary Classification with Logistic Regression
plotData
function plotData(X, y)
%PLOTDATA Plots the data points X and y into a new figure
%   PLOTDATA(x,y) plots the data points with + for the positive examples
%   and o for the negative examples. X is assumed to be a Mx2 matrix.

% Create New Figure
figure; hold on;

% ====================== YOUR CODE HERE ======================
% Instructions: Plot the positive and negative examples on a
%               2D plot, using the option 'k+' for the positive
%               examples and 'ko' for the negative examples.

pos = find(y == 1); % indices of the positive examples
neg = find(y == 0); % indices of the negative examples
plot(X(pos,1), X(pos,2), 'k+', 'LineWidth', 2, 'MarkerSize', 7);
plot(X(neg,1), X(neg,2), 'ko', 'MarkerFaceColor', 'y', 'MarkerSize', 7);

% =========================================================================
hold off;
end
sigmoid
The sigmoid function:
\[Sigmoid(x)=\frac 1 {1+e^{-x}}\]
function g = sigmoid(z)
%SIGMOID Compute sigmoid function
%   g = SIGMOID(z) computes the sigmoid of z.

% You need to return the following variables correctly
g = zeros(size(z));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the sigmoid of each value of z (z can be a matrix,
%               vector or scalar).

g = 1 ./ (1 + exp(-z)); % element-wise, so z may be a matrix, vector or scalar

% =============================================================
end
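A quick sanity check of the element-wise behavior (run at the Octave prompt):

sigmoid(0)              % ans = 0.5
sigmoid([-100 0 100])   % ans is approximately [0 0.5 1]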
costFunction
Logistic regression uses the cross-entropy cost function:
\[J(\theta)=\frac 1 m \sum_{i=1}^m[-y^{(i)}\log(h_\theta(X^{(i)}))-(1-y^{(i)})\log(1-h_\theta(X^{(i)}))]\]
Gradient derivation:
\[Sigmoid'(x)=Sigmoid(x)(1-Sigmoid(x))\]
\[\frac{\partial J(\theta)}{\partial \theta_t}=\frac 1 m \sum_{i=1}^m\ [-y^{(i)}\frac {g(\theta^T X^{(i)})(1-g(\theta^T X^{(i)}))}{{h_\theta(X^{(i)})}}-(1-y^{(i)})\frac {-g(\theta^T X^{(i)})(1-g(\theta^T X^{(i)}))}{1-h_\theta (X^{(i)})}]x_t^{(i)}\]
\[=\frac 1 m \sum_{i=1}^m\ [-y^{(i)}(1-g(\theta^T X^{(i)}))+(1-y^{(i)})g(\theta^T X^{(i)})]x_t^{(i)}\]
\[=\frac 1 m \sum_{i=1}^m\ (g(\theta^T X^{(i)})-y^{(i)})x_t^{(i)}\]
function [J, grad] = costFunction(theta, X, y)
%COSTFUNCTION Compute cost and gradient for logistic regression
%   J = COSTFUNCTION(theta, X, y) computes the cost of using theta as the
%   parameter for logistic regression and the gradient of the cost
%   w.r.t. to the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
%
% Note: grad should have the same dimensions as theta

for i = 1:m
    J = J + (-y(i) * log(sigmoid(theta' * X(i,:)')) ...
             - (1 - y(i)) * log(1 - sigmoid(theta' * X(i,:)')));
end
J = J / m;

for t = 1:length(theta)
    for i = 1:m
        grad(t) = grad(t) + (sigmoid(theta' * X(i,:)') - y(i)) * X(i,t);
    end
    grad(t) = grad(t) / m;
end

% =============================================================
end
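The exercise script minimizes this cost with Octave's built-in fminunc rather than hand-written gradient descent; a typical call (the 400-iteration limit follows the course script, and initial_theta is assumed to be a zero vector of the right size):

options = optimset('GradObj', 'on', 'MaxIter', 400);
initial_theta = zeros(size(X, 2), 1);
% fminunc uses the gradient returned by costFunction because GradObj is on
[theta, cost] = fminunc(@(t) costFunction(t, X, y), initial_theta, options);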
predict
function p = predict(theta, X)
%PREDICT Predict whether the label is 0 or 1 using learned logistic
%regression parameters theta
%   p = PREDICT(theta, X) computes the predictions for X using a
%   threshold at 0.5 (i.e., if sigmoid(theta'*x) >= 0.5, predict 1)

m = size(X, 1); % Number of training examples

% You need to return the following variables correctly
p = zeros(m, 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Complete the following code to make predictions using
%               your learned logistic regression parameters.
%               You should set p to a vector of 0's and 1's

for i = 1:m
    % sigmoid(z) >= 0.5 exactly when z >= 0, so threshold on z directly
    if theta' * X(i,:)' >= 0
        p(i) = 1;
    else
        p(i) = 0;
    end
end

% =========================================================================
end
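Equivalently, the whole loop collapses into one vectorized comparison (>= returns a logical 0/1 vector):

p = sigmoid(X * theta) >= 0.5; % m x 1 vector of 0s and 1s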
Final test results:
Binary Classification with Regularized Logistic Regression
costFunctionReg
In regularized logistic regression, a regularization term is added to the cost function:
\[J(\theta)=\frac 1 m \sum_{i=1}^m[-y^{(i)}\log(h_\theta(X^{(i)}))-(1-y^{(i)})\log(1-h_\theta(X^{(i)}))]+\frac \lambda {2m} \sum_{i=1}^n \theta_i^2\]
Here \(\lambda\) is the penalty parameter: the larger \(\lambda\) is, the smaller \(\theta_1\cdots \theta_n\) become, so \(h_\theta(X)\) approaches \(Sigmoid(\theta_0)\); the model becomes less prone to overfitting and more prone to underfitting. Note that \(\theta_0\) is not included in the regularization term.
Gradient derivation:
\[Sigmoid'(x)=Sigmoid(x)(1-Sigmoid(x))\]
\[\frac{\partial J(\theta)}{\partial \theta_0}=\frac 1 m \sum_{i=1}^m\ [-y^{(i)}\frac {g(\theta^T X^{(i)})(1-g(\theta^T X^{(i)}))}{{h_\theta(X^{(i)})}}-(1-y^{(i)})\frac {-g(\theta^T X^{(i)})(1-g(\theta^T X^{(i)}))}{1-h_\theta (X^{(i)})}]x_0^{(i)}\]
\[=\frac 1 m \sum_{i=1}^m\ [-y^{(i)}(1-g(\theta^T X^{(i)}))+(1-y^{(i)})g(\theta^T X^{(i)})]x_0^{(i)}\]
\[=\frac 1 m \sum_{i=1}^m\ (g(\theta^T X^{(i)})-y^{(i)})x_0^{(i)}\]
\[\frac{\partial J(\theta)}{\partial \theta_t}=\frac 1 m \sum_{i=1}^m\ [-y^{(i)}\frac {g(\theta^T X^{(i)})(1-g(\theta^T X^{(i)}))}{{h_\theta(X^{(i)})}}-(1-y^{(i)})\frac {-g(\theta^T X^{(i)})(1-g(\theta^T X^{(i)}))}{1-h_\theta (X^{(i)})}]x_t^{(i)}+\frac \lambda m \theta_t\]
\[=\frac 1 m \sum_{i=1}^m\ [-y^{(i)}(1-g(\theta^T X^{(i)}))+(1-y^{(i)})g(\theta^T X^{(i)})]x_t^{(i)}+\frac \lambda m \theta_t\]
\[=\frac 1 m \sum_{i=1}^m\ (g(\theta^T X^{(i)})-y^{(i)})x_t^{(i)}+\frac \lambda m \theta_t,\ \ \ \ t\ge 1\]
function [J, grad] = costFunctionReg(theta, X, y, lambda)
%COSTFUNCTIONREG Compute cost and gradient for logistic regression with regularization
%   J = COSTFUNCTIONREG(theta, X, y, lambda) computes the cost of using
%   theta as the parameter for regularized logistic regression and the
%   gradient of the cost w.r.t. to the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta

% cross-entropy part of the cost
for i = 1:m
    tmp = sigmoid(theta' * X(i,:)');
    J = J + (-y(i) * log(tmp) - (1 - y(i)) * log(1 - tmp));
end
J = J / m;

% regularization term; theta(1) (i.e. theta_0) is not regularized
regsum = 0;
for i = 2:length(theta)
    regsum = regsum + theta(i)^2;
end
J = J + regsum * lambda / (2*m);

% gradient for theta_0 (no regularization)
for i = 1:m
    grad(1) = grad(1) + (sigmoid(theta' * X(i,:)') - y(i)) * X(i,1);
end
grad(1) = grad(1) / m;

% gradient for the remaining parameters (with regularization)
for t = 2:length(theta)
    for i = 1:m
        grad(t) = grad(t) + (sigmoid(theta' * X(i,:)') - y(i)) * X(i,t);
    end
    grad(t) = grad(t) / m + lambda * theta(t) / m;
end

% =============================================================
end
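For reference, a vectorized sketch equivalent to the loops above:

h = sigmoid(X * theta);                          % m x 1 vector of predictions
J = (-y' * log(h) - (1 - y)' * log(1 - h)) / m ...
    + (lambda / (2 * m)) * sum(theta(2:end) .^ 2);
grad = (X' * (h - y)) / m;                       % unregularized gradient
grad(2:end) = grad(2:end) + (lambda / m) * theta(2:end); % skip theta_0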
Final test results:
Fig. Decision boundaries obtained with different values of \(\lambda\)
Reposted from: https://www.cnblogs.com/qpswwww/p/9273830.html