Programming Exercise 1: Linear Regression

Linear Regression with One Variable

warmUpExercise

Task: return the 5x5 identity matrix.
Simply call eye(5) (or eye(5,5)).

function A = warmUpExercise()
%WARMUPEXERCISE Example function in octave
%   A = WARMUPEXERCISE() is an example function that returns the 5x5 identity matrix

A = [];
% ============= YOUR CODE HERE ==============
% Instructions: Return the 5x5 identity matrix
%               In octave, we return values by defining which variables
%               represent the return values (at the top of the file)
%               and then set them accordingly.

A = eye(5, 5);

% ===========================================
end

plotData

Task: read in a set of data points (x, y) and draw them as a scatter plot.

MATLAB's plot() command is all that is needed.

function plotData(x, y)
%PLOTDATA Plots the data points x and y into a new figure
%   PLOTDATA(x,y) plots the data points and gives the figure axes labels of
%   population and profit.

% ====================== YOUR CODE HERE ======================
% Instructions: Plot the training data into a figure using the
%               "figure" and "plot" commands. Set the axes labels using
%               the "xlabel" and "ylabel" commands. Assume the
%               population and revenue data have been passed in
%               as the x and y arguments of this function.
%
% Hint: You can use the 'rx' option with plot to have the markers
%       appear as red crosses. Furthermore, you can make the
%       markers larger by using plot(..., 'rx', 'MarkerSize', 10);

figure; % open a new figure window
plot(x, y, 'rx', 'MarkerSize', 10);   % plot the points passed in as red crosses
xlabel('X Axis');
ylabel('Y Axis');

% ============================================================
end
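Since plotData uses the x and y passed to it, the caller loads the data file; a usage sketch following the exercise's ex1data1.txt layout (first column population, second column profit):

data = load('ex1data1.txt');   % one training example per row
x = data(:, 1);                % population
y = data(:, 2);                % profit
plotData(x, y);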

Output:

computeCost

Task: read in \(m\) training examples (X, y) and compute the mean squared error \(J(\theta)\) of fitting them with \(y=\theta^TX\), where \(\theta=(\theta_0,\theta_1)^T\) and \(X=(1,x^{(i)})^T\).

\[J(\theta)=\frac 1 {2m}\sum_{i=1}^m(\theta^T X^{(i)}-y^{(i)})^2\]

function J = computeCost(X, y, theta)
% Inputs: X is an m x 2 matrix whose first column is all ones and whose second
% column holds the x values; y is an m-dimensional vector; theta is a
% 2-dimensional column vector
%COMPUTECOST Compute cost for linear regression
%   J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.

for i = 1:m
    J = J + (theta' * X(i,:)' - y(i))^2;
end
J = J / (2 * m);

% =========================================================================
end
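The loop can also be replaced by a fully vectorized computation; a minimal sketch, equivalent to the version above:

errors = X * theta - y;            % m x 1 vector of residuals
J = (errors' * errors) / (2 * m);  % sum of squared residuals over 2m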

gradientDescent

Task: given \(m\) training examples \((x^{(i)},y^{(i)})\) and a learning rate \(\alpha\), run gradient descent for num_iters iterations, then return the final parameters \(\theta\) and the mean squared error after each iteration in J_history.

Derivation:
\[J(\theta)=\frac 1 {2m}\sum_{i=1}^m(\theta_0 +\theta_1 x^{(i)}-y^{(i)})^2\]

\[\frac{\partial J(\theta)}{\partial \theta_0}= \frac 1 m \sum_{i=1}^m(\theta_0 +\theta_1 x^{(i)}-y^{(i)}) \]

\[\frac{\partial J(\theta)}{\partial \theta_1}= \frac 1 m \sum_{i=1}^m(\theta_0 +\theta_1 x^{(i)}-y^{(i)})x^{(i)} \]

During gradient descent, each iteration updates \(\theta_0\) and \(\theta_1\) simultaneously:

\[\theta _0 := \theta _0- \alpha \frac{\partial J(\theta)}{\partial \theta_0}\]

\[\theta _1 := \theta _1- \alpha \frac{\partial J(\theta)}{\partial \theta_1}\]

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %
    dJ_dtheta0 = 0;
    dJ_dtheta1 = 0;
    for i = 1:m
        dJ_dtheta0 = dJ_dtheta0 + (theta' * X(i,:)' - y(i));
        dJ_dtheta1 = dJ_dtheta1 + (theta' * X(i,:)' - y(i)) * X(i,2);
    end
    dJ_dtheta0 = dJ_dtheta0 / m;
    dJ_dtheta1 = dJ_dtheta1 / m;
    theta = theta - alpha * [dJ_dtheta0; dJ_dtheta1];   % simultaneous update of both parameters

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);
end

end
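For reference, the entire gradient step inside the loop can be collapsed into one vectorized line; a minimal sketch, equivalent to the element-wise computation above:

% Vectorized simultaneous update: X' * (X*theta - y) collects both partial-derivative sums at once
theta = theta - (alpha / m) * (X' * (X * theta - y));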

Final results

Fig 1. The straight line fitted by gradient-descent linear regression

Fig 2. Surface plot of \(J(\theta)\)

Fig 3. Contour plot of \(J(\theta)\); the red cross marks the point where \(J(\theta)\) attains its minimum

Linear Regression with Multiple Variables

featureNormalize

Task: given the features of m training examples (here each example has 2 features), i.e., an m-by-2 matrix X, apply Z-score normalization so that every feature has zero mean and unit standard deviation. Note that the column of ones has not yet been appended to X at this point.

Z-score normalization:
For the i-th feature, compute the mean \(\mu\) and standard deviation \(\sigma\) of that feature over the m examples; then

\[x_i^{(t)}:=\frac {x_i^{(t)}-\mu}{\sigma}\]

After normalization, every feature has mean 0 and standard deviation 1.

function [X_norm, mu, sigma] = featureNormalize(X)
%FEATURENORMALIZE Normalizes the features in X
%   FEATURENORMALIZE(X) returns a normalized version of X where
%   the mean value of each feature is 0 and the standard deviation
%   is 1. This is often a good preprocessing step to do when
%   working with learning algorithms.

% You need to set these values correctly
X_norm = X;
mu = zeros(1, size(X, 2));
sigma = zeros(1, size(X, 2));

% ====================== YOUR CODE HERE ======================
% Instructions: First, for each feature dimension, compute the mean
%               of the feature and subtract it from the dataset,
%               storing the mean value in mu. Next, compute the
%               standard deviation of each feature and divide
%               each feature by it's standard deviation, storing
%               the standard deviation in sigma.
%
%               Note that X is a matrix where each column is a
%               feature and each row is an example. You need
%               to perform the normalization separately for
%               each feature.
%
% Hint: You might find the 'mean' and 'std' functions useful.
%

mu(1,1) = mean(X(:,1));
mu(1,2) = mean(X(:,2));
sigma(1,1) = std(X(:,1));
sigma(1,2) = std(X(:,2));
X_norm(:,1) = (X_norm(:,1) - mu(1,1)) / sigma(1,1);
X_norm(:,2) = (X_norm(:,2) - mu(1,2)) / sigma(1,2);

% ============================================================
end
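The code above hard-codes two features. A more general sketch that normalizes every column at once (bsxfun is used so it also runs on Octave/MATLAB versions without implicit expansion):

mu = mean(X);        % 1 x n row vector of per-feature means
sigma = std(X);      % 1 x n row vector of per-feature standard deviations
X_norm = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);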

computeCostMulti

Task: given m input examples (an m-by-3 matrix X whose first column is all ones) and the true outputs y, compute the mean squared error \(J(\theta)\) of fitting them with \(y=\theta^TX^{(i)T}\).

\[J(\theta)=\frac 1 {2m}\sum_{i=1}^m(\theta^TX^{(i)T}-y^{(i)})^2\]

function J = computeCostMulti(X, y, theta)
%COMPUTECOSTMULTI Compute cost for linear regression with multiple variables
%   J = COMPUTECOSTMULTI(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.

for i = 1:m
    J = J + (theta' * X(i,:)' - y(i))^2;   % X(i,:) selects the whole i-th example
end
J = J / (2 * m);

% =========================================================================
end

gradientDescentMulti

Task: given \(m\) training examples \((x^{(i)},y^{(i)})\) and a learning rate \(\alpha\), run gradient descent for num_iters iterations, then return the final parameters \(\theta\) and the mean squared error after each iteration in J_history.

Note that the first feature of every input example is 1 (the column of ones appended earlier).

Derivation:
For the t-th parameter \(\theta_t\), the update rule is:
\[\theta _t := \theta _t- \alpha \frac{\partial J(\theta)}{\partial \theta_t}\]

\[\frac{\partial J(\theta)}{\partial \theta_t}= \frac 1 m \sum_{i=1}^m(\theta^TX^{(i)T}-y^{(i)})x_t^{(i)}\]

function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters)
%GRADIENTDESCENTMULTI Performs gradient descent to learn theta
%   theta = GRADIENTDESCENTMULTI(x, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCostMulti) and gradient here.
    %
    paramsize = length(theta);
    dJ_dtheta = zeros(paramsize, 1);
    % Accumulate every partial derivative using the current (not yet updated) theta
    for i = 1:paramsize
        for j = 1:m
            dJ_dtheta(i,1) = dJ_dtheta(i,1) + (theta' * X(j,:)' - y(j)) * X(j,i);
        end
    end
    % Update all parameters simultaneously
    for i = 1:paramsize
        dJ_dtheta(i,1) = dJ_dtheta(i,1) / m;
        theta(i) = theta(i) - alpha * dJ_dtheta(i);
    end
    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCostMulti(X, y, theta);
end

end

Final test results

Fig 1. Convergence curve

Solving for \(\theta\) by least squares (the projection method)

For the single-variable linear regression problem, the closed-form least-squares (normal equation) solution is \[\theta=(X^TX)^{-1}X^Ty\]

The result agrees closely with that of gradient descent.
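A minimal Octave sketch of this closed-form solution, assuming X already contains the leading column of ones (pinv guards against a singular X'*X):

theta = pinv(X' * X) * X' * y;   % normal equation: theta = (X^T X)^(-1) X^T y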

Programming Exercise 2: Logistic Regression

Binary Classification with Logistic Regression

plotData

function plotData(X, y)
%PLOTDATA Plots the data points X and y into a new figure
%   PLOTDATA(x,y) plots the data points with + for the positive examples
%   and o for the negative examples. X is assumed to be a Mx2 matrix.

% Create New Figure
figure; hold on;

% ====================== YOUR CODE HERE ======================
% Instructions: Plot the positive and negative examples on a
%               2D plot, using the option 'k+' for the positive
%               examples and 'ko' for the negative examples.
%
pos = find(y == 1);   % indices of the positive examples
neg = find(y == 0);   % indices of the negative examples
plot(X(pos,1), X(pos,2), '+', 'LineWidth', 2, 'MarkerSize', 7);
plot(X(neg,1), X(neg,2), 'o', 'MarkerFaceColor', 'y', 'MarkerSize', 7);

% =========================================================================
hold off;

end

sigmoid

The sigmoid function:

\[Sigmoid(x)=\frac 1 {1+e^{-x}}\]

function g = sigmoid(z)
%SIGMOID Compute sigmoid function
%   J = SIGMOID(z) computes the sigmoid of z.

% You need to return the following variables correctly
g = zeros(size(z));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the sigmoid of each value of z (z can be a matrix,
%               vector or scalar).

g = 1 ./ (1 + exp(-z));   % element-wise division, so z may be a scalar, vector, or matrix

% =============================================================
end
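A quick sanity check of the element-wise behavior (illustrative values only):

sigmoid(0)               % returns 0.5
sigmoid([-100 0 100])    % returns approximately [0 0.5 1]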

costFunction

Logistic regression uses the cross-entropy cost function:

\[J(\theta)=\frac 1 m \sum_{i=1}^m[-y^{(i)}log(h_\theta(X^{(i)}))-(1-y^{(i)})log(1-h_\theta(X^{(i)}))]\]

Gradient derivation:

\[Sigmoid'(x)=Sigmoid(x)(1-Sigmoid(x))\]

\[\frac{\partial J(\theta)}{\partial \theta_t}=\frac 1 m \sum_{i=1}^m\ [-y^{(i)}\frac {g(\theta^T X^{(i)})(1-g(\theta^T X^{(i)}))}{{h_\theta(X^{(i)})}}-(1-y^{(i)})\frac {-g(\theta^T X^{(i)})(1-g(\theta^T X^{(i)}))}{1-h_\theta (X^{(i)})}]x_t^{(i)}\]

\[=\frac 1 m \sum_{i=1}^m\ [-y^{(i)}(1-g(\theta^T X^{(i)}))+(1-y^{(i)})g(\theta^T X^{(i)})]x_t^{(i)}\]

\[=\frac 1 m \sum_{i=1}^m\ (g(\theta^T X^{(i)})-y^{(i)})x_t^{(i)}\]

function [J, grad] = costFunction(theta, X, y)
%COSTFUNCTION Compute cost and gradient for logistic regression
%   J = COSTFUNCTION(theta, X, y) computes the cost of using theta as the
%   parameter for logistic regression and the gradient of the cost
%   w.r.t. to the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
%
% Note: grad should have the same dimensions as theta
%
for i = 1:m
    J = J + (-y(i) * log(sigmoid(theta' * X(i,:)')) - (1 - y(i)) * log(1 - sigmoid(theta' * X(i,:)')));
end
J = J / m;

for t = 1:length(theta)
    for i = 1:m
        grad(t) = grad(t) + (sigmoid(theta' * X(i,:)') - y(i)) * X(i,t);
    end
    grad(t) = grad(t) / m;
end

% =============================================================
end
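The same cost and gradient can be computed without loops; a minimal vectorized sketch, where h is the m-by-1 vector of predictions:

h = sigmoid(X * theta);                           % m x 1 vector of predicted probabilities
J = (-y' * log(h) - (1 - y)' * log(1 - h)) / m;   % cross-entropy cost
grad = (X' * (h - y)) / m;                        % gradient w.r.t. every parameter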

predict

function p = predict(theta, X)
%PREDICT Predict whether the label is 0 or 1 using learned logistic
%regression parameters theta
%   p = PREDICT(theta, X) computes the predictions for X using a
%   threshold at 0.5 (i.e., if sigmoid(theta'*x) >= 0.5, predict 1)

m = size(X, 1); % Number of training examples

% You need to return the following variables correctly
p = zeros(m, 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Complete the following code to make predictions using
%               your learned logistic regression parameters.
%               You should set p to a vector of 0's and 1's
%
for i = 1:m
    tmp = theta' * X(i,:)';
    if (tmp >= 0)          % sigmoid(tmp) >= 0.5 exactly when tmp >= 0
        p(i) = 1;
    else
        p(i) = 0;
    end
end

% =========================================================================
end
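Equivalently, the whole prediction can be done in one vectorized line:

p = sigmoid(X * theta) >= 0.5;   % logical 0/1 vector of predictions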

Final test results

Regularized Logistic Regression for Binary Classification

costFunctionReg

In regularized logistic regression, a regularization term is added to the cost function:

\[J(\theta)=\frac 1 m \sum_{i=1}^m[-y^{(i)}log(h_\theta(X^{(i)}))-(1-y^{(i)})log(1-h_\theta(X^{(i)}))]+\frac \lambda {2m} \sum_{i=1}^n \theta_i^2\]

Here \(\lambda\) is the regularization (penalty) parameter: the larger \(\lambda\) is, the smaller \(\theta_1,\cdots,\theta_n\) become and the closer \(h_\theta(X)\) gets to \(Sigmoid(\theta_0)\), so the model is less prone to overfitting and tends instead toward underfitting. Note that \(\theta_0\) is not included in the regularization term.

Gradient derivation:

\[Sigmoid'(x)=Sigmoid(x)(1-Sigmoid(x))\]

\[\frac{\partial J(\theta)}{\partial \theta_0}=\frac 1 m \sum_{i=1}^m\ [-y^{(i)}\frac {g(\theta^T X^{(i)})(1-g(\theta^T X^{(i)}))}{{h_\theta(X^{(i)})}}-(1-y^{(i)})\frac {-g(\theta^T X^{(i)})(1-g(\theta^T X^{(i)}))}{1-h_\theta (X^{(i)})}]x_0^{(i)}\]

\[=\frac 1 m \sum_{i=1}^m\ [-y^{(i)}(1-g(\theta^T X^{(i)}))+(1-y^{(i)})g(\theta^T X^{(i)})]x_0^{(i)}\]

\[=\frac 1 m \sum_{i=1}^m\ (g(\theta^T X^{(i)})-y^{(i)})x_0^{(i)}\]

\[\frac{\partial J(\theta)}{\partial \theta_t}=\frac 1 m \sum_{i=1}^m\ [-y^{(i)}\frac {g(\theta^T X^{(i)})(1-g(\theta^T X^{(i)}))}{{h_\theta(X^{(i)})}}-(1-y^{(i)})\frac {-g(\theta^T X^{(i)})(1-g(\theta^T X^{(i)}))}{1-h_\theta (X^{(i)})}]x_t^{(i)}+\frac \lambda m \theta_t\]

\[=\frac 1 m \sum_{i=1}^m\ [-y^{(i)}(1-g(\theta^T X^{(i)}))+(1-y^{(i)})g(\theta^T X^{(i)})]x_t^{(i)}+\frac \lambda m \theta_t\]

\[=\frac 1 m \sum_{i=1}^m\ (g(\theta^T X^{(i)})-y^{(i)})x_t^{(i)}+\frac \lambda m \theta_t,\ \ \ \ t\ge 1\]

function [J, grad] = costFunctionReg(theta, X, y, lambda)
%COSTFUNCTIONREG Compute cost and gradient for logistic regression with regularization
%   J = COSTFUNCTIONREG(theta, X, y, lambda) computes the cost of using
%   theta as the parameter for regularized logistic regression and the
%   gradient of the cost w.r.t. to the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta

for i = 1:m
    tmp = sigmoid(theta' * X(i,:)');
    J = J + (-y(i) * log(tmp) - (1 - y(i)) * log(1 - tmp));
end
J = J / m;

% Regularization term (theta(1), i.e. theta_0, is not regularized)
regsum = 0;
for i = 2:length(theta)
    regsum = regsum + theta(i) * theta(i);
end
regsum = regsum * lambda / (2 * m);
J = J + regsum;

% Gradient for theta_0 (no regularization)
for i = 1:m
    grad(1) = grad(1) + (sigmoid(theta' * X(i,:)') - y(i)) * X(i,1);
end
grad(1) = grad(1) / m;

% Gradient for the remaining parameters (with regularization)
for t = 2:length(theta)
    for i = 1:m
        grad(t) = grad(t) + (sigmoid(theta' * X(i,:)') - y(i)) * X(i,t);
    end
    grad(t) = grad(t) / m + lambda * theta(t) / m;
end

% =============================================================
end
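A vectorized sketch of the same computation; \(\theta_0\) is kept out of the penalty by zeroing the first entry of a copy of theta:

h = sigmoid(X * theta);
theta_reg = [0; theta(2:end)];          % copy of theta with theta_0 zeroed out
J = (-y' * log(h) - (1 - y)' * log(1 - h)) / m + (lambda / (2 * m)) * (theta_reg' * theta_reg);
grad = (X' * (h - y)) / m + (lambda / m) * theta_reg;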

Final test results

Fig. Decision boundaries obtained for different values of \(\lambda\)




Reposted from: https://www.cnblogs.com/qpswwww/p/9273830.html
