Andrew Ng Coursera ML Lecture 2: Summary + Assignment Answers
Preface
Learning in order to apply, and applying in order to learn: these notes summarize the lecture to consolidate what was learned and to review the new concepts.
Table of Contents
- Preface
- Table of Contents
- Main Text
- Linear Model
- Model Evaluation Criterion
- Cost Function Explained
- Cost Function Explained, Part 2
- Gradient Descent
- Gradient Descent Explained
- Linear Regression with Gradient Descent
- Additional Terminology
- Programming Assignment
- ex1.m
- computeCost.m
- featureNormalize.m
- gradientDescent.m
- computeCostMulti.m
- gradientDescentMulti.m
Main Text
This week's topic: the linear model.
Linear Model
The first example of supervised learning: predicting house prices from house size.
The problem can be modeled as y = f(x), where y is the house price and x is the size, with f(x) = wx; the weight w is what we want to learn from the data.
Training produces the model hypothesis h.
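For reference, the course's notation also carries an intercept term, so the hypothesis actually used throughout the assignment is:

$$h_\theta(x) = \theta_0 + \theta_1 x$$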
Model Evaluation Criterion
w can be any number, so what makes a given w good?
We want the trained model to predict house prices as accurately as possible, so the criterion is to choose the w that minimizes the prediction error.
Cost Function Explained
The formula on this slide looks rather abstract at first.
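Since the slide itself is not reproduced here, this is the formula it shows: the squared-error cost over the m training examples $(x^{(i)}, y^{(i)})$:

$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$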
This figure visualizes the relationship between the model parameter and the value of the cost function.
This figure shows that by plotting the cost as a function of the parameter, we can easily read off the optimal parameter.
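A minimal Octave sketch of that idea; the toy data and the grid of candidate slopes below are made up purely for illustration:

% Toy data: sizes x and prices y (illustrative values only)
x = [1; 2; 3; 4];
y = [2.1; 3.9; 6.2; 8.1];
m = length(y);

w_vals = linspace(0, 4, 101);            % candidate slopes
J_vals = zeros(size(w_vals));
for k = 1:length(w_vals)
    err = w_vals(k) * x - y;             % prediction errors for this w
    J_vals(k) = sum(err.^2) / (2*m);     % squared-error cost
end

plot(w_vals, J_vals);                    % the cost curve
[~, best] = min(J_vals);
fprintf('best w on the grid: %f\n', w_vals(best));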
Cost Function Explained, Part 2
This figure shows the surface of a cost function with two parameters.
This figure shows the fitted model, and the corresponding point on the cost plot, once the optimal parameters have been found.
Gradient Descent
Now that we have a criterion, how do we actually find the optimal parameters?
This figure depicts the search graphically: by repeatedly stepping downhill, gradient descent eventually reaches the optimal parameters.
The gradient descent update formula is shown below.
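This is the update rule from the course, repeated until convergence, with all parameters updated simultaneously:

$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1), \qquad j = 0, 1$$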
Gradient Descent Explained
The gradient descent algorithm can be summarized as in the figure.
An intuitive graphical explanation of why a gradient descent step moves the current point toward the optimum, no matter where that point starts.
This shows the influence of alpha, gradient descent's key parameter (the learning rate): too small and convergence is slow; too large and the iterates can overshoot or even diverge.
A further note on gradient descent: as the parameters approach a minimum, the derivative term shrinks, so the steps automatically become smaller; there is no need to decrease alpha over time.
Linear Regression with Gradient Descent
To summarize the theory covered this week: the gradient descent algorithm and the linear regression model.
A concrete look at what gradient descent computes when applied to the linear model.
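Substituting the linear hypothesis into the update rule and taking the derivatives gives the concrete updates implemented in the assignment:

$$\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)$$

$$\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}$$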
Additional Terminology
Batch means that every step of gradient descent uses all of the training examples.
Programming Assignment
ex1.m
%% Machine Learning Online Class - Exercise 1: Linear Regression
%
%  Instructions
%  ------------
%
%  This file contains code that helps you get started on the
%  linear exercise. You will need to complete the following functions
%  in this exercise:
%
%     warmUpExercise.m
%     plotData.m
%     gradientDescent.m
%     computeCost.m
%     gradientDescentMulti.m
%     computeCostMulti.m
%     featureNormalize.m
%     normalEqn.m
%
%  For this exercise, you will not need to change any code in this file,
%  or any other files other than those mentioned above.
%
% x refers to the population size in 10,000s
% y refers to the profit in $10,000s
%

%% Initialization
clear ; close all; clc

%% ==================== Part 1: Basic Function ====================
% Complete warmUpExercise.m
fprintf('Running warmUpExercise ... \n');
fprintf('5x5 Identity Matrix: \n');
warmUpExercise()

fprintf('Program paused. Press enter to continue.\n');
pause;

%% ======================= Part 2: Plotting =======================
fprintf('Plotting Data ...\n')
data = load('ex1data1.txt');
X = data(:, 1); y = data(:, 2);
m = length(y); % number of training examples

% Plot Data
% Note: You have to complete the code in plotData.m
plotData(X, y);

fprintf('Program paused. Press enter to continue.\n');
pause;

%% =================== Part 3: Cost and Gradient descent ===================

X = [ones(m, 1), data(:,1)]; % Add a column of ones to x
theta = zeros(2, 1); % initialize fitting parameters

% Some gradient descent settings
iterations = 1500;
alpha = 0.01;

fprintf('\nTesting the cost function ...\n')
% compute and display initial cost
J = computeCost(X, y, theta);
fprintf('With theta = [0 ; 0]\nCost computed = %f\n', J);
fprintf('Expected cost value (approx) 32.07\n');

% further testing of the cost function
J = computeCost(X, y, [-1 ; 2]);
fprintf('\nWith theta = [-1 ; 2]\nCost computed = %f\n', J);
fprintf('Expected cost value (approx) 54.24\n');

fprintf('Program paused. Press enter to continue.\n');
pause;

fprintf('\nRunning Gradient Descent ...\n')
% run gradient descent
theta = gradientDescent(X, y, theta, alpha, iterations);

% print theta to screen
fprintf('Theta found by gradient descent:\n');
fprintf('%f\n', theta);
fprintf('Expected theta values (approx)\n');
fprintf(' -3.6303\n  1.1664\n\n');

% Plot the linear fit
hold on; % keep previous plot visible
plot(X(:,2), X*theta, '-')
legend('Training data', 'Linear regression')
hold off % don't overlay any more plots on this figure

% Predict values for population sizes of 35,000 and 70,000
predict1 = [1, 3.5] * theta;
fprintf('For population = 35,000, we predict a profit of %f\n',...
    predict1*10000);
predict2 = [1, 7] * theta;
fprintf('For population = 70,000, we predict a profit of %f\n',...
    predict2*10000);

fprintf('Program paused. Press enter to continue.\n');
pause;

%% ============= Part 4: Visualizing J(theta_0, theta_1) =============
fprintf('Visualizing J(theta_0, theta_1) ...\n')

% Grid over which we will calculate J
theta0_vals = linspace(-10, 10, 100);
theta1_vals = linspace(-1, 4, 100);

% initialize J_vals to a matrix of 0's
J_vals = zeros(length(theta0_vals), length(theta1_vals));

% Fill out J_vals
for i = 1:length(theta0_vals)
    for j = 1:length(theta1_vals)
        t = [theta0_vals(i); theta1_vals(j)];
        J_vals(i,j) = computeCost(X, y, t);
    end
end

% Because of the way meshgrids work in the surf command, we need to
% transpose J_vals before calling surf, or else the axes will be flipped
J_vals = J_vals';
% Surface plot
figure;
surf(theta0_vals, theta1_vals, J_vals)
xlabel('\theta_0'); ylabel('\theta_1');

% Contour plot
figure;
% Plot J_vals as 20 contours spaced logarithmically between 0.01 and 1000
contour(theta0_vals, theta1_vals, J_vals, logspace(-2, 3, 20))
xlabel('\theta_0'); ylabel('\theta_1'); zlabel('J value')
hold on;
plot(theta(1), theta(2), 'rx', 'MarkerSize', 10, 'LineWidth', 2);
computeCost.m
function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
% J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.

predict = X * theta;          % m x 1 vector of predictions
error = predict - y;          % per-example prediction errors
J = sum(error.^2) / (2*m);    % squared-error cost

% =========================================================================

end
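A quick sanity check of computeCost (a toy example of my own, not part of the assignment): when theta fits the data exactly, the cost should be zero.

X = [ones(3,1), (1:3)'];       % 3 examples with a bias column
y = [2; 4; 6];
theta = [0; 2];                % y = 2x fits this data exactly
J = computeCost(X, y, theta)   % should display J = 0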
featureNormalize.m
function [X_norm, mu, sigma] = featureNormalize(X)
%FEATURENORMALIZE Normalizes the features in X
% FEATURENORMALIZE(X) returns a normalized version of X where
% the mean value of each feature is 0 and the standard deviation
% is 1. This is often a good preprocessing step to do when
%   working with learning algorithms.

% You need to set these values correctly
X_norm = X;
mu = zeros(1, size(X, 2));
sigma = zeros(1, size(X, 2));

% ====================== YOUR CODE HERE ======================
% Instructions: First, for each feature dimension, compute the mean
%               of the feature and subtract it from the dataset,
%               storing the mean value in mu. Next, compute the
%               standard deviation of each feature and divide
%               each feature by its standard deviation, storing
%               the standard deviation in sigma.
%
%               Note that X is a matrix where each column is a
%               feature and each row is an example. You need
%               to perform the normalization separately for
%               each feature.
%
% Hint: You might find the 'mean' and 'std' functions useful.
%

for i = 1:size(X, 2)
    mu(i) = mean(X(:, i));       % per-feature mean
    sigma(i) = std(X(:, i));     % per-feature standard deviation
    X_norm(:, i) = (X_norm(:, i) - mu(i)) / sigma(i);   % zero mean, unit std
end

% ============================================================

end
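The loop can also be written without indexing; this is a vectorized sketch, assuming implicit broadcasting (available in Octave and in MATLAB R2016b and later):

mu = mean(X);                 % 1 x n row vector of per-feature means
sigma = std(X);               % 1 x n row vector of per-feature standard deviations
X_norm = (X - mu) ./ sigma;   % broadcast the normalization across all m rows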
gradientDescent.m
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
% theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
% taking num_iters gradient steps with learning rate alpha
% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);
for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %

    % Accumulate the gradient for each parameter over all m examples
    error_0 = 0;
    error_1 = 0;
    for i = 1:m
        error_0 = error_0 + (X(i,:) * theta - y(i)) * X(i,1);
        error_1 = error_1 + (X(i,:) * theta - y(i)) * X(i,2);
    end
    % Update both parameters simultaneously
    theta(1) = theta(1) - alpha * error_0 / m;
    theta(2) = theta(2) - alpha * error_1 / m;

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);

end
end
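The two accumulators can be collapsed into a single vectorized step; this sketch is equivalent to the loop body above and works for any number of parameters:

for iter = 1:num_iters
    grad = X' * (X * theta - y) / m;   % gradient of J with respect to theta
    theta = theta - alpha * grad;      % simultaneous update of all parameters
    J_history(iter) = computeCost(X, y, theta);
end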
computeCostMulti.m
function J = computeCostMulti(X, y, theta)
%COMPUTECOSTMULTI Compute cost for linear regression with multiple variables
% J = COMPUTECOSTMULTI(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.

% Fully vectorized squared-error cost
J = 1/(2*m) * (X*theta - y)' * (X*theta - y);

% =========================================================================

end
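Since computeCostMulti is just the vectorized form of computeCost, the two should agree to machine precision on the same inputs; a quick check with toy values of my own:

X = [ones(3,1), (1:3)'];
y = [2; 4; 6];
theta = [1; 1];
abs(computeCost(X, y, theta) - computeCostMulti(X, y, theta))   % effectively 0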
gradientDescentMulti.m
function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters)
%GRADIENTDESCENTMULTI Performs gradient descent to learn theta
% theta = GRADIENTDESCENTMULTI(x, y, theta, alpha, num_iters) updates theta by
% taking num_iters gradient steps with learning rate alpha
% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);
for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCostMulti) and gradient here.
    %

    % Accumulate the gradient over all m examples, for every feature at once
    error = zeros(size(X, 2), 1);
    for i = 1:m
        error = error + (X(i,:) * theta - y(i)) * X(i,:)';
    end
    theta = theta - alpha * error / m;

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCostMulti(X, y, theta);

end

end
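For context, here is a minimal driver showing how these pieces fit together, in the spirit of ex1_multi.m (which is not reproduced above). ex1data2.txt is the course's multi-variable dataset (house size, number of bedrooms, price); the alpha and iteration settings below are illustrative:

data = load('ex1data2.txt');
X = data(:, 1:2); y = data(:, 3);
m = length(y);

[X_norm, mu, sigma] = featureNormalize(X);   % scale features before descent
X_norm = [ones(m, 1), X_norm];               % add the bias column

alpha = 0.01; num_iters = 400;               % illustrative settings
theta = zeros(3, 1);
[theta, J_history] = gradientDescentMulti(X_norm, y, theta, alpha, num_iters);

% Predict the price of a 1650 sq-ft, 3-bedroom house; the query must be
% normalized with the same mu and sigma computed from the training data
x_query = ([1650, 3] - mu) ./ sigma;
price = [1, x_query] * theta;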